- tags: Bigdata,Dataflow Model,Streaming
Flink
Links to this note
Streaming 102: The world beyond batch
tags: Bigdata,Flink,Dataflow Model,Streaming source: “Streaming 102: The World beyond Batch – O’Reilly.” Accessed January 5, 2022. https://www.oreilly.com/radar/the-world-beyond-batch-streaming-102/. Three more concepts: Watermarks: Useful for event-time windowing. A watermark asserts that all input data with event times less than the watermark have been observed. Triggers: Signal for a window to produce output. Accumulation: The way to handle multiple results that are observed for the same window. Streaming 101 Redux What: Transformations. Where: Windowing. Make a temporal boundary for an unbounded data source....
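The three concepts above can be sketched in a few lines of plain Python. This is only an illustration, not Flink or Dataflow code: the function name `tumbling_window_sums`, the heuristic watermark (largest event time seen minus a fixed `max_out_of_orderness`), and the choice to drop late records are all assumptions made for the example.

```python
from collections import defaultdict

def tumbling_window_sums(events, window_size, max_out_of_orderness):
    """Sum values per fixed event-time window.

    events: iterable of (event_time, value) pairs in arrival order.
    A window fires (the trigger) once the watermark reaches its end.
    Records for an already-fired window are treated as late and dropped.
    """
    open_windows = defaultdict(int)   # window_start -> running sum
    emitted, late = [], []
    watermark = float("-inf")
    max_event_time = float("-inf")
    for event_time, value in events:
        window_start = event_time - (event_time % window_size)
        if window_start + window_size <= watermark:
            late.append((event_time, value))  # its window has already fired
            continue
        open_windows[window_start] += value
        # Assumed heuristic watermark: lag behind the largest event time seen.
        max_event_time = max(max_event_time, event_time)
        watermark = max_event_time - max_out_of_orderness
        # Trigger: emit every window whose end is at or below the watermark.
        for start in sorted(open_windows):
            if start + window_size <= watermark:
                emitted.append((start, open_windows.pop(start)))
    return emitted, late

events = [(1, 10), (4, 20), (12, 5), (3, 7), (17, 1), (2, 100), (25, 2)]
emitted, late = tumbling_window_sums(events, window_size=10, max_out_of_orderness=5)
```

In this run the out-of-order record `(3, 7)` still counts because the watermark has not passed its window, while `(2, 100)` arrives after the window fired and is set aside. A different accumulation choice would instead re-emit a refined result for that window.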
Streaming 101: The world beyond batch
tags: Bigdata,Flink,Streaming source: Akidau, Tyler. “Streaming 101: The World beyond Batch.” O’Reilly Media, August 5, 2015. https://www.oreilly.com/radar/the-world-beyond-batch-streaming-101/. Streaming: a type of data processing engine that is designed with infinite data sets in mind. Other common uses of “streaming” that will be avoided in the rest of the post: Unbounded data: A type of ever-growing, essentially infinite data set. Unbounded data processing: An ongoing mode of data processing, applied to the aforementioned type of unbounded data....
Stateful Stream Processing
tags: Stream processing,Flink source: https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/learn-flink/overview/#stateful-stream-processing https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/concepts/stateful-stream-processing/ Stateful processing means that how one event is handled can depend on the accumulated effect of all the events that came before it. How does stateful stream processing work on a distributed cluster? The set of parallel instances of a stateful operator is effectively a sharded key-value store. Each parallel instance is responsible for handling events for a specific group of keys, and the state for those keys is kept locally....
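The "sharded key-value store" view can be made concrete with a toy model. The sketch below is an assumption-laden illustration in plain Python: `ParallelInstance`, `route`, and `run` are hypothetical names, and the CRC32-modulo routing only stands in for key partitioning in general; it is not Flink's actual key assignment scheme.

```python
import zlib

class ParallelInstance:
    """One parallel instance of a stateful operator with local keyed state."""

    def __init__(self):
        self.state = {}  # key -> running count, kept locally on this instance

    def process(self, key):
        # The result depends on all earlier events seen for the same key.
        self.state[key] = self.state.get(key, 0) + 1
        return self.state[key]

def route(key, parallelism):
    # Assumed routing rule: a stable hash of the key picks the owning instance.
    return zlib.crc32(key.encode("utf-8")) % parallelism

def run(events, parallelism):
    instances = [ParallelInstance() for _ in range(parallelism)]
    outputs = []
    for key in events:
        owner = instances[route(key, parallelism)]
        outputs.append((key, owner.process(key)))
    return instances, outputs

instances, outputs = run(["a", "b", "a", "c", "a", "b"], parallelism=3)
```

Because routing is deterministic, every event for a given key reaches the same instance, so each key's state lives in exactly one shard and no cross-instance coordination is needed to update it.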
Timely Stream Processing
tags: Stream processing,Flink source: https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/learn-flink/overview/#timely-stream-processing Flink supports timely stream processing by using the event timestamps recorded in the data stream, rather than the clocks of the machines processing the data.
Flink Parallel Dataflows
tags: Flink Streams can transport data between two operators in a one-to-one (or forwarding) pattern, or in a redistributing pattern:
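As a rough illustration of the two patterns, the sketch below models each operator's parallel subtasks as Python lists. The helper names `forward` and `redistribute_by_key` are hypothetical, and hashing by a key function is just one assumed way of redistributing.

```python
def forward(upstream_partitions):
    """One-to-one: subtask i of the producer feeds subtask i of the consumer,
    so partitioning and record order are preserved."""
    return [list(partition) for partition in upstream_partitions]

def redistribute_by_key(upstream_partitions, parallelism, key_fn):
    """Redistributing: each producer subtask may send to any consumer subtask,
    here chosen by key_fn(record) modulo the consumer parallelism."""
    downstream = [[] for _ in range(parallelism)]
    for partition in upstream_partitions:
        for record in partition:
            downstream[key_fn(record) % parallelism].append(record)
    return downstream

upstream = [[1, 2, 3], [4, 5, 6]]
forwarded = forward(upstream)
shuffled = redistribute_by_key(upstream, parallelism=2, key_fn=lambda r: r)
```

After redistribution, ordering is only meaningful per producer and consumer pair: records from different upstream subtasks interleave in whatever order they arrive at the receiving subtask.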
Stream processing
tags: Flink Stream processing, on the other hand, involves unbounded data streams. Conceptually, at least, the input may never end, and so you are forced to continuously process the data as it arrives.
Zhihu: Flink Real-Time Computing - Understanding Checkpoint and Savepoint in Depth
tags: Flink,Flink State Snapshots,Flink Checkpoint,Flink Savepoint source: https://zhuanlan.zhihu.com/p/79526638