/dev/reading
Category

Data Streaming

5 books
by David Kjerrumgaard

Deliver lightning-fast and reliable messaging for your distributed applications with the flexible and resilient Apache Pulsar platform.

In Apache Pulsar in Action you will learn how to:

  • Publish from Apache Pulsar into third-party data repositories and platforms
  • Design and develop Apache Pulsar functions
  • Perform interactive SQL queries against data stored in Apache Pulsar

Apache Pulsar in Action is a comprehensive and practical guide to building high-traffic applications with Pulsar. You’ll learn to use this mature and battle-tested platform to deliver extreme levels of speed and durability to your messaging. Apache Pulsar committer David Kjerrumgaard teaches you to apply Pulsar’s seamless scalability through hands-on case studies, including IoT analytics applications and a microservices app based on Pulsar functions.

Real-time event processing
by Josh Fischer and Ning Wang

A friendly, framework-agnostic tutorial that will help you grok how streaming systems work—and how to build your own!

In Grokking Streaming Systems you will learn how to:

  • Implement and troubleshoot streaming systems
  • Design streaming systems for complex functionalities
  • Assess parallelization requirements
  • Spot networking bottlenecks and resolve back pressure
  • Group data for high-performance systems
  • Handle delayed events in real-time systems

Grokking Streaming Systems is a simple guide to the complex concepts behind streaming systems. This friendly and framework-agnostic tutorial teaches you how to handle real-time events, and even design and build your own streaming job that’s a perfect fit for your needs. Each new idea is carefully explained with diagrams, clear examples, and fun dialogue between perplexed personalities!

Strategies for real-time event processing
by Sean T. Allen, Matthew Jankowski and Peter Pathirana

Storm Applied is a practical guide to using Apache Storm for the real-world tasks associated with processing and analyzing real-time data streams. This immediately useful book starts by building a solid foundation of Storm essentials so that you learn how to think about designing Storm solutions the right way from day one. It then moves quickly into real-world case studies that bring the novice up to speed on running Storm in production.

Fundamentals, Implementation, and Operation of Streaming Applications
by Fabian Hueske and Vasiliki Kalavri

Get started with Apache Flink, the open source framework that powers some of the world’s largest stream processing applications. With this practical book, you’ll explore the fundamental concepts of parallel stream processing and discover how this technology differs from traditional batch data processing.

Longtime Apache Flink committers Fabian Hueske and Vasia Kalavri show you how to implement scalable streaming applications with Flink’s DataStream API and continuously run and maintain these applications in operational environments. Stream processing is ideal for many use cases, including low-latency ETL, streaming analytics, and real-time dashboards as well as fraud detection, anomaly detection, and alerting. You can process continuous data of any kind, including user interactions, financial transactions, and IoT data, as soon as it is generated.

  • Learn concepts and challenges of distributed stateful stream processing
  • Explore Flink’s system architecture, including its event-time processing mode and fault-tolerance model
  • Understand the fundamentals and building blocks of the DataStream API, including its time-based and stateful operators
  • Read data from and write data to external systems with exactly-once consistency
  • Deploy and configure Flink clusters
  • Operate continuously running streaming applications

Understanding the real-time pipeline
by Andrew G. Psaltis

Streaming Data introduces the concepts and requirements of streaming and real-time data systems. The book is an idea-rich tutorial that teaches you to think about how to efficiently interact with fast-flowing data.