Data streaming is a new software category to process data in motion. Apache Kafka is the de facto standard used by over 100,000 organizations. Plenty of vendors offer Kafka platforms and cloud services. Many complementary stream processing engines like Apache Flink and SaaS offerings have emerged. And competitive technologies like Pulsar and Redpanda try to get market share. This blog post explores the data streaming landscape of 2023 to summarize existing solutions and market trends.
The concepts and architectures of a data warehouse, a data lake, and data streaming are complementary to solving business problems. Unfortunately, the underlying technologies are often misunderstood, overused for monolithic and inflexible architectures, and pitched for wrong use cases by vendors. Let’s explore this dilemma in a blog series. This is part 4: Case Studies for cloud-native data streaming and data warehouses.
At OOP 2018 conference in Munich, I presented an updated version of my talk about building scalable, mission-critical…
Apache Kafka Streams to build Real Time Streaming Microservices. Apply Machine Learning / Deep Learning using Spark, TensorFlow, H2O.ai, etc. to add AI. Embed Kafka Streams into Java App, Docker, Kubernetes, Mesos, anything else.