Taking Smart Notes With Org-mode

Streaming

January 5, 2022 · 1 min · Gray King
  • tags: Bigdata

Links to this note


    Why local state is a fundamental primitive in stream processing

    tags: Bigdata,Streaming,Stateful Stream Processing
    source: Kreps, Jay. “Why Local State Is a Fundamental Primitive in Stream Processing - O’Reilly Radar.” Accessed January 5, 2022. http://radar.oreilly.com/2014/07/why-local-state-is-a-fundamental-primitive-in-stream-processing.html.
    Why: local state is much faster than a distributed database. Local state can also be restored easily with middleware such as Kafka, by writing each change to a Kafka topic.

    January 5, 2022 · 1 min · Gray King
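The restore idea in the local-state note above can be sketched in a few lines. This is a minimal illustration, not Kreps's implementation: `ChangelogTopic` is a hypothetical in-memory stand-in for a Kafka topic, and `LocalCounterState` is an invented name for a per-key counter that mirrors every change to that log.

```python
class ChangelogTopic:
    """Hypothetical in-memory stand-in for a Kafka topic (an append-only log)."""

    def __init__(self):
        self.records = []

    def append(self, key, value):
        self.records.append((key, value))


class LocalCounterState:
    """Per-key counters kept in process memory and mirrored to a changelog."""

    def __init__(self, changelog):
        self.changelog = changelog
        self.counts = {}

    def increment(self, key):
        # Update local state first, then record the new value in the changelog.
        new_value = self.counts.get(key, 0) + 1
        self.counts[key] = new_value
        self.changelog.append(key, new_value)
        return new_value

    @classmethod
    def restore(cls, changelog):
        # Rebuild state by replaying the log in order; the last write per key wins.
        state = cls(changelog)
        for key, value in changelog.records:
            state.counts[key] = value
        return state


changelog = ChangelogTopic()
state = LocalCounterState(changelog)
for user in ["alice", "bob", "alice"]:
    state.increment(user)

# Simulate a crash: discard `state` and rebuild it from the changelog alone.
recovered = LocalCounterState.restore(changelog)
```

Reads and writes touch only process memory here, which is where the speed advantage over a remote database would come from; the changelog is consulted only on recovery.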

    Streaming 102: The world beyond batch

    tags: Bigdata,Flink,Dataflow Model,Streaming
    source: “Streaming 102: The World beyond Batch – O’Reilly.” Accessed January 5, 2022. https://www.oreilly.com/radar/the-world-beyond-batch-streaming-102/.
    Three more concepts:
      • Watermarks: useful for event-time windowing. A watermark asserts that all input data with event times less than the watermark have been observed.
      • Triggers: the signal for a window to produce output.
      • Accumulation: the way to handle multiple results that are observed for the same window.
    Streaming 101 Redux
      • What: transformations.
      • Where: windowing. Make a temporal boundary for an unbounded data source.
    ...

    January 5, 2022 · 2 min · Gray King
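The watermark and trigger concepts from the Streaming 102 note above can be sketched as a toy operator. This is a simplified illustration, not the Dataflow or Flink API: `WindowedSum`, `WINDOW_SIZE`, and the choice to drop late data are all assumptions made for the example, and each window fires only once, so accumulation modes do not come into play.

```python
from collections import defaultdict

WINDOW_SIZE = 60  # seconds; fixed event-time windows (assumed for this sketch)


def window_start(event_time):
    """Map an event time to the start of its fixed window."""
    return event_time - (event_time % WINDOW_SIZE)


class WindowedSum:
    """Sums values per event-time window and fires when the watermark passes."""

    def __init__(self):
        self.sums = defaultdict(int)  # window start -> running sum
        self.watermark = float("-inf")

    def on_element(self, event_time, value):
        start = window_start(event_time)
        if start + WINDOW_SIZE <= self.watermark:
            return False  # the window already closed; this sketch drops late data
        self.sums[start] += value
        return True

    def on_watermark(self, watermark):
        # The watermark claims every event with a smaller event time has arrived,
        # so any window ending at or before it is complete and can be emitted.
        self.watermark = max(self.watermark, watermark)
        ready = sorted(s for s in self.sums if s + WINDOW_SIZE <= self.watermark)
        return [(s, self.sums.pop(s)) for s in ready]


op = WindowedSum()
op.on_element(5, 1)
op.on_element(70, 10)
op.on_element(30, 2)          # arrives out of order, but before the watermark
first = op.on_watermark(60)   # window [0, 60) is complete
late = op.on_element(10, 100) # event time is behind the watermark
second = op.on_watermark(120) # window [60, 120) is complete
```

A real engine would usually offer more than one trigger and a policy for late data (for example, re-firing the window); dropping it keeps the sketch short.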

    Dataflow Model

    tags: Bigdata,Streaming
    source: Akidau, Tyler, Robert Bradshaw, Craig Chambers, Slava Chernyak, Rafael J. Fernández-Moctezuma, Reuven Lax, Sam McVeety, et al. “The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, out-of-Order Data Processing.” Proceedings of the VLDB Endowment 8, no. 12 (August 2015): 1792–1803. https://doi.org/10.14778/2824032.2824076.

    January 5, 2022 · 1 min · Gray King

    Streaming 101: The world beyond batch

    tags: Bigdata,Flink,Streaming
    source: Akidau, Tyler. “Streaming 101: The World beyond Batch.” O’Reilly Media, August 5, 2015. https://www.oreilly.com/radar/the-world-beyond-batch-streaming-101/.
    Streaming: a type of data processing engine that is designed with infinite data sets in mind.
    Other common uses of “streaming” that will be avoided in the rest of the post:
      • Unbounded data: a type of ever-growing, essentially infinite data set.
      • Unbounded data processing: an ongoing mode of data processing, applied to the aforementioned type of unbounded data.
      • Low-latency, approximate, and/or speculative results: these types of results are most often associated with streaming engines.
    Limitations of streaming
    To beat batch at its own game, you really only need two things: ...

    January 5, 2022 · 2 min · Gray King

    Flink

    tags: Bigdata,Dataflow Model,Streaming

    March 20, 2020 · 1 min · Gray King