Amazon MSK Archives - Kai Waehner

The Importance of Focus for Software and Cloud Vendors - Data Streaming with Apache Kafka and Flink

15 minute read

The Importance of Focus: Why Software Vendors Should Specialize Instead of Doing Everything (Example: Data Streaming)

By Kai Waehner
7 April 2025

As real-time technologies reshape IT architectures, software vendors face a critical decision: specialize deeply in one domain or build a broad, general-purpose stack. This blog examines why a focused approach—particularly in the world of data streaming—delivers greater innovation, scalability, and reliability. It compares leading platforms and strategies, from specialized providers like Confluent to generalist cloud ecosystems, and highlights the operational risks of fragmented tools. With data streaming emerging as its own software category, enterprises need clarity, consistency, and deep expertise. In this post, we argue that specialization—not breadth—is what powers mission-critical, real-time applications at global scale.

21 minute read

The Data Streaming Landscape 2025

By Kai Waehner
4 December 2024

Data streaming is a new software category. It has grown from niche adoption to become a fundamental part of modern data architecture, leveraging open source technologies like Apache Kafka and Flink. With real-time data processing transforming industries, the ecosystem of tools, platforms, and cloud services has evolved significantly. This blog post explores the data streaming landscape of 2025, analyzing the key players, trends, and market dynamics shaping this space.

Data Streaming Trends for 2025 - Leading with Apache Kafka and Flink

18 minute read

Top Trends for Data Streaming with Apache Kafka and Flink in 2025

By Kai Waehner
2 December 2024

Apache Kafka and Apache Flink are leading open-source frameworks for data streaming that serve as the foundation for cloud services, enabling organizations to unlock the potential of real-time data. Over recent years, trends have shifted from batch-based data processing to real-time analytics, scalable cloud-native architectures, and improved data governance powered by these technologies. Looking ahead to 2025, the data streaming ecosystem is set to undergo even greater changes. Here are the top trends shaping the future of data streaming for businesses.

3 minute read

When NOT to Use Apache Kafka? (Lightboard Video)

By Kai Waehner
26 March 2024

Apache Kafka is the de facto standard for data streaming and for processing data in motion. As its adoption grows across all industries, I get a very valid question every week: When should you NOT use Apache Kafka? What limitations does the event streaming platform have? When does Kafka simply not provide the needed capabilities? How do you rule Kafka out when it is not the right tool for the job? This blog post contains a lightboard video that gives you a twenty-minute explanation of the dos and don'ts.

Is Amazon MSK Serverless for Apache Kafka a Self-Driving Car or just a Car Engine

13 minute read

When NOT to choose Amazon MSK Serverless for Apache Kafka?

By Kai Waehner
30 August 2022

Apache Kafka has become the de facto standard for data streaming. Various cloud offerings have emerged and improved in recent years. Amazon MSK Serverless is the latest Kafka product from AWS. This blog post looks at its capabilities to explore how it relates to the “normal”, partially managed Amazon MSK, when the serverless version is a good choice, and when other fully managed cloud services like Confluent Cloud are the better option.

Data Warehouse vs Data Lake vs Data Streaming Comparison

10 minute read

Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?

By Kai Waehner
27 June 2022

The concepts and architectures of a data warehouse, a data lake, and data streaming complement each other in solving business problems. Unfortunately, the underlying technologies are often misunderstood, overused for monolithic and inflexible architectures, and pitched by vendors for the wrong use cases. Let’s explore this dilemma in a blog series. This is part 1: Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?

JMS Message Queue vs Apache Kafka Comparison

19 minute read

Comparison: JMS Message Queue vs. Apache Kafka

By Kai Waehner
12 May 2022

Comparing JMS-based message queue (MQ) infrastructures with Apache Kafka-based data streaming is a common topic. Unfortunately, the battle is an apples-to-oranges comparison that often includes misinformation and FUD from vendors. This blog post explores the differences, trade-offs, and architectures of JMS message brokers and Kafka deployments. Learn how to choose between JMS brokers like IBM MQ or RabbitMQ and open-source Kafka or serverless cloud services like Confluent Cloud.

13 minute read

Is Apache Kafka an iPaaS or is Event Streaming its own Software Category?

By Kai Waehner
3 November 2021

This post explores why Apache Kafka is the new black for integration projects, how Kafka fits into the discussion around cloud-native iPaaS solutions, and why event streaming is a new software category. A concrete real-world example shows the difference between event streaming and traditional integration platforms or iPaaS.

Serverless Kafka for Data in Motion as Rescue for Data at Rest in the Data Lake

12 minute read

Serverless Kafka in a Cloud-native Data Lake Architecture

By Kai Waehner
25 June 2021

Apache Kafka has become the de facto standard for processing data in motion. Kafka is open, flexible, and scalable. Unfortunately, that scalability makes operations a challenge for many teams. Ideally, teams can use a serverless Kafka SaaS offering to focus on business logic. However, hybrid scenarios require a cloud-native platform that provides automated and elastic tooling to reduce the operations burden. This blog post explores how to leverage cloud-native and serverless Kafka offerings in a hybrid cloud architecture. We start from the perspective of data at rest with a data lake and explore its relation to data in motion with Kafka.

De Facto Standard API - Amazon S3 for Object Storage and Apache Kafka for Event Streaming

13 minute read

Kafka API is the De Facto Standard API for Event Streaming like Amazon S3 for Object Storage

By Kai Waehner
9 May 2021

Real-time data beats slow data in most use cases across industries. The rise of event-driven architectures and data in motion powered by Apache Kafka enables enterprises to build real-time infrastructure and applications. This blog post explores why the Kafka API became the de facto standard API for event streaming, just as the Amazon S3 API did for object storage, and the trade-offs of these standards and the corresponding frameworks, products, and cloud services.
