- tags: Bigdata
Streaming
Links to this note
Why local state is a fundamental primitive in stream processing
tags: Bigdata,Streaming,Stateful Stream Processing source: Kreps, Jay. “Why Local State Is a Fundamental Primitive in Stream Processing - O’Reilly Radar.” Accessed January 5, 2022. http://radar.oreilly.com/2014/07/why-local-state-is-a-fundamental-primitive-in-stream-processing.html. Explains why local state is much faster than a distributed database. Local state can easily be restored with the help of middleware such as Kafka: every change is written to a Kafka topic, which can be replayed to rebuild the state.
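A minimal sketch of the changelog idea in this note, not the article's own code. A plain Python list stands in for the Kafka topic, and the names `ChangelogBackedStore`, `restore`, and `count_events` are hypothetical. Every update is appended to the log before it is applied locally, so replaying the log rebuilds the state after a failure.

```python
class ChangelogBackedStore:
    """Local key-value state that logs every update to an append-only changelog."""

    def __init__(self, changelog):
        # Assumption: `changelog` is a plain list standing in for a Kafka topic.
        self.changelog = changelog
        self.state = {}

    def put(self, key, value):
        # Record the change in the log first, then apply it to local state.
        self.changelog.append((key, value))
        self.state[key] = value

    def get(self, key, default=None):
        # Reads are served from local memory, with no remote call.
        return self.state.get(key, default)

    @classmethod
    def restore(cls, changelog):
        # Rebuild local state by replaying the log from the beginning;
        # later entries for a key overwrite earlier ones.
        store = cls(changelog)
        for key, value in changelog:
            store.state[key] = value
        return store


def count_events(store, events):
    # Example stateful operation: count occurrences per key.
    for key in events:
        store.put(key, store.get(key, 0) + 1)
```

For example, after `count_events(store, ["a", "b", "a"])`, a fresh store created with `ChangelogBackedStore.restore(log)` holds the same counts as the original.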
Streaming 102: The world beyond batch
tags: Bigdata,Flink,Dataflow Model,Streaming source: “Streaming 102: The World beyond Batch – O’Reilly.” Accessed January 5, 2022. https://www.oreilly.com/radar/the-world-beyond-batch-streaming-102/. Three more concepts: Watermarks: Useful for event-time windowing. A watermark asserts that all input data with event times less than the watermark have been observed. Triggers: Signal for a window to produce output. Accumulation: The way to handle multiple results that are observed for the same window. Streaming 101 Redux What: Transformations Where: Windowing. Makes a temporal boundary for an unbounded data source....
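A toy sketch of how the three concepts fit together, written in plain Python rather than against any real Dataflow, Beam, or Flink API. The class `FixedWindowSum` and its methods are hypothetical. It assumes integer event times, fixed windows, a trigger that fires once the watermark passes the end of a window, and one extra firing per late element. The `accumulating` flag selects whether a later firing reports the running total or only the new delta.

```python
from collections import defaultdict


class FixedWindowSum:
    """Sums values per fixed event-time window (illustrative only)."""

    def __init__(self, size, accumulating=True):
        self.size = size                  # window length in event-time units
        self.accumulating = accumulating  # True: emit totals, False: emit deltas
        self.totals = defaultdict(int)    # window start -> sum emitted so far
        self.pending = defaultdict(int)   # window start -> sum not yet emitted
        self.fired = set()                # windows that have produced output
        self.watermark = float("-inf")

    def on_element(self, event_time, value):
        start = event_time - event_time % self.size
        self.pending[start] += value
        if start in self.fired:
            # Late element: the window already fired, so fire it again.
            return [self._fire(start)]
        return []

    def on_watermark(self, watermark):
        # Watermark: all data with event time < watermark has been observed.
        self.watermark = max(self.watermark, watermark)
        out = []
        for start in sorted(self.pending):
            # Trigger: fire a window once the watermark reaches its end.
            if start + self.size <= self.watermark:
                out.append(self._fire(start))
        return out

    def _fire(self, start):
        delta = self.pending.pop(start, 0)
        self.totals[start] += delta
        self.fired.add(start)
        value = self.totals[start] if self.accumulating else delta
        return (start, start + self.size, value)
```

With `size=10`, elements at event times 1 and 3 are held until `on_watermark(10)` emits the window `[0, 10)`. A late element at time 4 then produces a second result for the same window, which is the situation the accumulation mode addresses.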
Dataflow Model
tags: Bigdata,Streaming source: Akidau, Tyler, Robert Bradshaw, Craig Chambers, Slava Chernyak, Rafael J. Fernández-Moctezuma, Reuven Lax, Sam McVeety, et al. “The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, out-of-Order Data Processing.” Proceedings of the VLDB Endowment 8, no. 12 (August 2015): 1792–1803. https://doi.org/10.14778/2824032.2824076.
Streaming 101: The world beyond batch
tags: Bigdata,Flink,Streaming source: Akidau, Tyler. “Streaming 101: The World beyond Batch.” O’Reilly Media, August 5, 2015. https://www.oreilly.com/radar/the-world-beyond-batch-streaming-101/. Streaming: a type of data processing engine that is designed with infinite data sets in mind. Other common uses of “streaming” that will be avoided in the rest of the post: Unbounded data: A type of ever-growing, essentially infinite data set. Unbounded data processing: An ongoing mode of data processing, applied to the aforementioned type of unbounded data....
Flink
tags: Bigdata,Dataflow Model,Streaming