Taking Smart Notes With Org-mode

Flink

March 20, 2020 · 1 min · Gray King
  • tags: Bigdata,Dataflow Model,Streaming

Links to this note


    Streaming 102: The world beyond batch

    tags: Bigdata,Flink,Dataflow Model,Streaming
    source: “Streaming 102: The World beyond Batch – O’Reilly.” Accessed January 5, 2022. https://www.oreilly.com/radar/the-world-beyond-batch-streaming-102/.
    Three more concepts:
      • Watermarks: Useful for event-time windowing. A watermark asserts that all input data with event times earlier than the watermark have been observed.
      • Triggers: The signal for a window to produce output.
      • Accumulation: The way to handle multiple results that are observed for the same window.
    Streaming 101 Redux
      • What: Transformations.
      • Where: Windowing. Draw a temporal boundary over an unbounded data source. ...

    January 5, 2022 · 2 min · Gray King
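
The watermark and trigger ideas from the Streaming 102 note above can be illustrated with a minimal pure-Python sketch. It does not use Flink or Beam APIs. It assumes tumbling event-time windows, a heuristic watermark equal to the largest event time seen minus a fixed bound, and a single trigger that fires when the watermark passes the end of a window. The names `WINDOW_SIZE`, `max_out_of_orderness`, and `run_windows` are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict

WINDOW_SIZE = 10  # window length in seconds (assumed value for illustration)


def window_start(event_time):
    """Start of the tumbling window that contains event_time."""
    return event_time - event_time % WINDOW_SIZE


def run_windows(events, max_out_of_orderness=3):
    """Sum values per event-time window.

    events: iterable of (event_time, value) pairs in arrival order.
    A window is emitted once, when the heuristic watermark passes its end.
    Events that arrive for an already emitted window are dropped.
    """
    sums = defaultdict(int)
    emitted = []
    watermark = float("-inf")
    for event_time, value in events:
        start = window_start(event_time)
        if start + WINDOW_SIZE <= watermark:
            continue  # late data: its window has already fired
        sums[start] += value
        # Heuristic watermark (an assumption): we guess that nothing older
        # than the largest seen event time minus the bound will still arrive.
        watermark = max(watermark, event_time - max_out_of_orderness)
        # Trigger: emit every window whose end the watermark has passed.
        for ready in sorted(s for s in sums if s + WINDOW_SIZE <= watermark):
            emitted.append((ready, sums.pop(ready)))
    return emitted


events = [(1, 5), (4, 1), (12, 2), (3, 7), (14, 4), (25, 1)]
results = run_windows(events)
```

Here the out-of-order event `(3, 7)` still lands in the first window because the watermark has not yet passed that window's end. Since each window fires only once in this sketch, the accumulation question (how to combine multiple results for one window) does not arise; it would if late events were allowed to re-fire a window.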

    Streaming 101: The world beyond batch

    tags: Bigdata,Flink,Streaming
    source: Akidau, Tyler. “Streaming 101: The World beyond Batch.” O’Reilly Media, August 5, 2015. https://www.oreilly.com/radar/the-world-beyond-batch-streaming-101/.
    Streaming: a type of data processing engine that is designed with infinite data sets in mind.
    Other common uses of “streaming” that will be avoided in the rest of the post:
      • Unbounded data: A type of ever-growing, essentially infinite data set.
      • Unbounded data processing: An ongoing mode of data processing, applied to the aforementioned type of unbounded data.
      • Low-latency, approximate, and/or speculative results: These types of results are most often associated with streaming engines.
    Limitations of streaming
    To beat batch at its own game, you really only need two things: ...

    January 5, 2022 · 2 min · Gray King

    Stateful Stream Processing

    tags: Stream processing,Flink
    source:
      • https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/learn-flink/overview/#stateful-stream-processing
      • https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/concepts/stateful-stream-processing/
    Stateful stream processing means that how one event is handled can depend on the accumulated effect of all the events that came before it.
    How does stateful stream processing work on a distributed cluster?
    The set of parallel instances of a stateful operator is effectively a sharded key-value store. Each parallel instance is responsible for handling events for a specific group of keys, and the state for those keys is kept locally. ...

    January 4, 2022 · 2 min · Gray King
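
The "sharded key-value store" picture from the Stateful Stream Processing note above can be sketched in plain Python. This is an illustration only, not Flink's API or its actual partitioning scheme: routing by `zlib.crc32(key) % parallelism` is an assumption made here for determinism, and `CountingOperatorInstance`, `instance_for`, and `run_keyed_count` are hypothetical names.

```python
import zlib
from collections import defaultdict


class CountingOperatorInstance:
    """One parallel instance of a stateful operator with local keyed state."""

    def __init__(self):
        self.state = defaultdict(int)  # key -> number of events seen so far

    def process(self, key):
        # The output depends on the accumulated effect of all earlier
        # events with the same key.
        self.state[key] += 1
        return key, self.state[key]


def instance_for(key, parallelism):
    """Pick the instance that owns a key (assumed hashing scheme)."""
    return zlib.crc32(key.encode("utf-8")) % parallelism


def run_keyed_count(keys, parallelism=3):
    instances = [CountingOperatorInstance() for _ in range(parallelism)]
    outputs = [instances[instance_for(k, parallelism)].process(k) for k in keys]
    return instances, outputs


instances, outputs = run_keyed_count(["a", "b", "a", "c", "a", "b"])
```

Every event for a given key reaches the same instance, so that instance can read and update the key's state locally without coordinating with the others.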

    Timely Stream Processing

    tags: Stream processing,Flink
    source: https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/learn-flink/overview/#timely-stream-processing
    Flink supports timely stream processing by using the event timestamps recorded in the data stream, rather than the clocks of the machines processing the data.

    January 4, 2022 · 1 min · Gray King
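
The difference between timestamps recorded in the data and the clock of the processing machine, as described in the Timely Stream Processing note above, can be shown with a small sketch. It is plain Python rather than Flink code, and the record fields `event_time` and `arrival_time` as well as the helper `count_per_window` are hypothetical.

```python
from collections import Counter


def count_per_window(records, time_field, window_size=60):
    """Count records per tumbling window, keyed by window start time."""
    counts = Counter()
    for record in records:
        start = record[time_field] // window_size * window_size
        counts[start] += 1
    return dict(counts)


records = [
    {"event_time": 10, "arrival_time": 15},
    {"event_time": 50, "arrival_time": 70},  # delayed in transit
    {"event_time": 65, "arrival_time": 66},
]

by_event_time = count_per_window(records, "event_time")
by_arrival_time = count_per_window(records, "arrival_time")
```

Grouping by `event_time` puts the delayed record into the window in which it actually happened, so the result does not depend on network delays or on when the job runs. Grouping by `arrival_time` shifts that record into the next window.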

    Flink Parallel Dataflows

    tags: Flink
    Streams can transport data between two operators in a one-to-one (or forwarding) pattern, or in a redistributing pattern:

    January 4, 2022 · 1 min · Gray King
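
The two patterns named in the Flink Parallel Dataflows note above can be contrasted with a rough sketch. It models each operator subtask as a Python list and is not Flink code. Redistribution by key is only one possible redistributing pattern, the hash function is an assumption, and `forward` and `redistribute_by_key` are hypothetical helper names.

```python
import zlib


def forward(upstream_partitions):
    """One-to-one: subtask i sends everything to downstream subtask i,
    so partitioning and element order are preserved."""
    return [list(partition) for partition in upstream_partitions]


def redistribute_by_key(upstream_partitions, parallelism, key_of):
    """Redistributing: each element goes to the downstream subtask chosen
    by a hash of its key, so the partitioning changes."""
    downstream = [[] for _ in range(parallelism)]
    for partition in upstream_partitions:
        for element in partition:
            key = str(key_of(element)).encode("utf-8")
            downstream[zlib.crc32(key) % parallelism].append(element)
    return downstream


upstream = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]
forwarded = forward(upstream)
redistributed = redistribute_by_key(upstream, 2, key_of=lambda e: e[0])
```

After forwarding, each downstream subtask sees exactly what its upstream counterpart produced. After redistribution, all elements with key `"a"` end up in one subtask, even though they came from two different upstream subtasks.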

    Stream processing

    tags: Flink
    Stream processing, unlike batch processing, involves unbounded data streams. Conceptually, at least, the input may never end, and so you are forced to continuously process the data as it arrives.

    January 4, 2022 · 1 min · Gray King

    Zhihu: Flink Real-Time Computing - Understanding Checkpoint and Savepoint in Depth

    tags: Flink,Flink State Snapshots,Flink Checkpoint,Flink Savepoint
    source: https://zhuanlan.zhihu.com/p/79526638

    January 4, 2022 · 1 min · Gray King