Data Mesh Archives - Kai Waehner

4.1K views
7 minute read

Policy Enforcement and Data Quality for Apache Kafka with Schema Registry

ByKai Waehner
16. October 2023
1 share
No comments

Good data quality is one of the most critical requirements in decoupled architectures, like microservices or data mesh. Apache Kafka became the de facto standard for these architectures. But Kafka is a dumb broker that only stores byte arrays. The Schema Registry enforces message structures. This blog post looks at enhancements to leverage data contracts for policies and rules to enforce good data quality on field-level and advanced use cases like routing malicious messages to a dead letter queue.

4.9K views
6 minute read

Apache Kafka for Data Consistency (and Real-Time Data Streaming)

ByKai Waehner
27. December 2022
10 shares
No comments

Real-time data beats slow data in almost all use cases. But as essential is data consistency across all systems, including non-real-time legacy systems and modern request-response APIs. Apache Kafka’s most underestimated feature is the storage component based on the append-only commit log. It enables loose coupling for domain-driven design with microservices and independent data products in a data mesh. This blog post explores how Kafka enables data consistency with a real-world case study from financial services.

4.3K views
5 minute read

Decentralized Data Mesh with Data Streaming in Financial Services

ByKai Waehner
28. October 2022
No comments

Digital transformation requires agility and fast time to market as critical factors for success in any enterprise. The decentralization with a data mesh separates applications and business units into independent domains. Data sharing in real-time with data streaming helps to provide information in the proper context to the correct application at the right time. This blog post explores a case study from the financial services sector where a data mesh was built across countries for loosely coupled data sharing but standardized enterprise-wide data governance.

Real-Time Supply Chain Control Tower with Apache Kafka

3.5K views
6 minute read

A Real-Time Supply Chain Control Tower powered by Kafka

ByKai Waehner
23. September 2022
No comments

A modern supply chain requires just-in-time production, global logistics, and complex manufacturing processes. This blog post explores a solution that ingests all information flows into a unified central nervous system. The idea of the Supply Chain Control Tower becomes a reality: An integrated data cockpit with real-time access to all levels and systems of the supply chain.

5.6K views
5 minute read

The Heart of the Data Mesh Beats Real-Time with Apache Kafka

ByKai Waehner
28. July 2022
1 share
No comments

If there were a buzzword of the hour, it would undoubtedly be “data mesh”! This new architectural paradigm unlocks analytic and transactional data at scale and enables rapid access to an ever-growing number of distributed domain datasets for various usage scenarios. The data mesh addresses the most common weaknesses of the traditional centralized data lake or data platform architecture. And the heart of a decentralized data mesh infrastructure must be real-time, reliable, and scalable. Learn how the de facto standard for data streaming, Apache Kafka, plays a crucial role in building a data mesh.

Best Practices for Data Analytics with AWS Azure Googel BigQuery Spark Kafka Confluent Databricks

3.6K views
10 minute read

Best Practices for Building a Cloud-Native Data Warehouse or Data Lake

ByKai Waehner
21. July 2022
No comments

The concepts and architectures of a data warehouse, a data lake, and data streaming are complementary to solving business problems. Unfortunately, the underlying technologies are often misunderstood, overused for monolithic and inflexible architectures, and pitched for wrong use cases by vendors. Let’s explore this dilemma in a blog series. This is part 5: Best Practices for Building a Cloud-Native Data Warehouse or Data Lake.

Data Warehouse vs Data Lake vs Data Streaming Comparison

7.2K views
9 minute read

Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?

ByKai Waehner
27. June 2022
No comments

The concepts and architectures of a data warehouse, a data lake, and data streaming are complementary to solving business problems. Unfortunately, the underlying technologies are often misunderstood, overused for monolithic and inflexible architectures, and pitched for wrong use cases by vendors. Let’s explore this dilemma in a blog series. This is part 1: Data Warehouse vs. Data Lake vs. Data Streaming – Friends, Enemies, Frenemies?

Stream Exchange for Data Sharing with Apache Kafka in a Data Mesh

5.4K views
10 minute read

Streaming Data Exchange with Kafka and a Data Mesh in Motion

ByKai Waehner
14. November 2021
No comments

Data Mesh is a new architecture paradigm that gets a lot of buzzes these days. This blog post looks into this principle deeper to explore why no single technology is the perfect fit to build a Data Mesh. Examples show why an open and scalable decentralized real-time platform like Apache Kafka is often the heart of the Data Mesh infrastructure, complemented by many other data platforms to solve business problems.

Technology Evangelist

Kai Waehner

Data Mesh

Policy Enforcement and Data Quality for Apache Kafka with Schema Registry

Technology Evangelist

Apache Kafka vs. Middleware (MQ, ETL, ESB) – Slides + Video

Deep Learning Example: Apache Kafka + Python + Keras + TensorFlow + Deeplearning4j

Apache Iceberg – The Open Table Format for Lakehouse AND Data Streaming

The Digitalization of Airport and Airlines with IoT and Data Streaming using Kafka and Flink

Energy Trading with Apache Kafka and Flink