Data streaming is a new software category to process data in motion. Apache Kafka is the de facto standard used by over 100,000 organizations. Plenty of vendors offer Kafka platforms and cloud services. Many complementary stream processing engines like Apache Flink and SaaS offerings have emerged. And competitive technologies like Pulsar and Redpanda try to get market share. This blog post explores the data streaming landscape of 2023 to summarize existing solutions and market trends.
The concepts and architectures of a data warehouse, a data lake, and data streaming are complementary to solving business problems. Unfortunately, the underlying technologies are often misunderstood, overused for monolithic and inflexible architectures, and pitched for wrong use cases by vendors. Let’s explore this dilemma in a blog series. This is part 5: Best Practices for Building a Cloud-Native Data Warehouse or Data Lake.
The concepts and architectures of a data warehouse, a data lake, and data streaming are complementary to solving business problems. Unfortunately, the underlying technologies are often misunderstood, overused for monolithic and inflexible architectures, and pitched for wrong use cases by vendors. Let’s explore this dilemma in a blog series. This is part 1: Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?
This blog post explores why software vendors (try to) introduce new solutions for Reverse ETL, when Reverse ETL is really needed, and how it fits into the enterprise architecture. The involvement of event streaming to process data in motion is a key piece of Reverse ETL for real-time use cases.
This blog series explores use cases and architectures for Apache Kafka in the cybersecurity space, including situational awareness, threat intelligence, forensics, air-gapped and zero trust environments, and SIEM / SOAR modernization. This post is part six: SIEM / SOAR modernization and integration.
Real-time beats slow data in most use cases across industries. The rise of event-driven architectures and data in motion powered by Apache Kafka enables enterprises to build real-time infrastructure and applications. This blog post explores why the Kafka API became the de facto standard API for event streaming like Amazon S3 for object storage, and the tradeoffs of these standards and corresponding frameworks, products, and cloud services.
Apache Kafka became the de facto standard for event streaming. Various vendors added Kafka and related tooling to their offerings or provide a Kafka cloud service. This blog post uses the car analogy – from the motor engine to the self-driving car – to explore the different Kafka offerings available on the market. The goal is not a feature-by-feature comparison. Instead, the intention is to educate about the different deployment models, product strategies, and trade-offs from the available options.
Introducing new features in Confluent Platform 5.4 and Apache Kafka 2.4… I just want to share a presentation…
Can and should Apache Kafka replace a database? How long can and should I store data in Kafka?…
The combination of streaming machine learning (ML), Apache Kafka and Confluent Tiered Storage enables you to build one…