Integration Archives - Kai Waehner

10.6K views
8 minute read

The Shift Left Architecture – From Batch and Lakehouse to Real-Time Data Products with Data Streaming

By Kai Waehner
15. June 2024
No comments

Data integration is a hard challenge in every enterprise. Batch processing and Reverse ETL are common practices in a data warehouse, data lake or lakehouse. Data inconsistency, high compute cost, and stale information are the consequences. This blog post introduces a new design pattern to solve these problems: The Shift Left Architecture enables a data mesh with real-time data products to unify transactional and analytical workloads with Apache Kafka, Flink and Iceberg. Consistent information is handled with stream processing or ingested into Snowflake, Databricks, Google BigQuery, or any other analytics / AI platform to increase flexibility, reduce cost and enable a data-driven company culture with faster time-to-market for innovative software applications.

1.3K views
3 minute read

Apache Kafka in Manufacturing at Automotive Supplier Brose for Industrial IoT Use Cases

By Kai Waehner
13. June 2024
No comments

Data streaming unifies OT/IT workloads by connecting information from sensors, PLCs, robotics and other manufacturing systems at the edge with business applications and the big data analytics world in the cloud. This blog post explores how the global automotive supplier Brose deploys a hybrid industrial IoT architecture using Apache Kafka in combination with Eclipse Kura, OPC-UA, MuleSoft and SAP.

2.9K views
8 minute read

Snowflake Data Integration Options for Apache Kafka (including Iceberg)

By Kai Waehner
22. April 2024
No comments

The integration between Apache Kafka and Snowflake is often cumbersome. Options include near real-time ingestion with a Kafka Connect connector, batch ingestion from large files, or leveraging a standard table format like Apache Iceberg. This blog post explores the alternatives and discusses their trade-offs. The final section shows how data streaming helps with hybrid architectures where data needs to be ingested from the private data center into Snowflake in the public cloud.

2.5K views
13 minute read

Customer Loyalty and Rewards Platform with Apache Kafka

By Kai Waehner
14. January 2024
No comments

Loyalty and rewards platforms are crucial for customer retention and revenue growth for many enterprises across industries. Apache Kafka provides context-specific real-time data and consistency across all applications and databases for a modern and flexible enterprise architecture. This blog post looks at case studies from Albertsons (retail), Globe Telecom (telco), Virgin Australia (aviation), Disney+ Hotstar (sports and gaming), and Porsche (automotive) to explain the value of data streaming for improving customer loyalty.

2.7K views
6 minute read

The State of Data Streaming for Healthcare with Apache Kafka and Flink

By Kai Waehner
27. November 2023
No comments

This blog post explores the state of data streaming for the healthcare industry powered by Apache Kafka and Apache Flink. It covers IT modernization and innovation with pioneering technologies like sensors, telemedicine, or AI/machine learning. I look at enterprise architectures and customer stories from Humana, Recursion, BHG (formerly Bankers Healthcare Group), and more. A complete slide deck and on-demand video recording are included.

7.8K views
6 minute read

Message Broker and Apache Kafka: Trade-Offs, Integration, Migration

By Kai Waehner
2. March 2023
2 comments

A message broker has very different characteristics and use cases than a data streaming platform like Apache Kafka. Data integration, processing, governance, and security must be reliable and scalable across the business process. This blog post explores the capabilities of message brokers, the relation to the JMS standard, trade-offs compared to data streaming with Apache Kafka, and typical integration and migration scenarios.

64.8K views
15 minute read

Error Handling via Dead Letter Queue in Apache Kafka

By Kai Waehner
30. May 2022
3 comments

Recognizing and handling errors is essential for any reliable data streaming pipeline. This blog post explores best practices for implementing error handling using a Dead Letter Queue in Apache Kafka infrastructure. The options include a custom implementation, Kafka Streams, Kafka Connect, the Spring framework, and the Parallel Consumer. Real-world case studies show how Uber, CrowdStrike, Santander Bank, and Robinhood build reliable real-time error handling at an extreme scale.

4.5K views
4 minute read

Apache Kafka in the Healthcare Industry

By Kai Waehner
28. March 2022
No comments

IT modernization and innovative new technologies change the healthcare industry significantly. This blog series explores real-world examples of data streaming with Apache Kafka to increase efficiency, reduce cost, and improve the human experience across the healthcare value chain including pharma, insurance, providers, retail, and manufacturing. This is part one: Overview.

10.4K views
15 minute read

OPC UA, MQTT, and Apache Kafka – The Trinity of Data Streaming in IoT

By Kai Waehner
11. February 2022
No comments

In the IoT world, MQTT and OPC UA have established themselves as open and platform-independent standards for data exchange in Industrial IoT and Industry 4.0 use cases. Data streaming with Apache Kafka is the data hub for integrating and processing massive volumes of data at any scale in real time. This blog post explores the relationship between Kafka and the IoT protocols, when to use which technology, and why sometimes HTTP/REST is the better choice. The post closes with real-world case studies from Audi and BMW.

14.6K views
13 minute read

When to use Apache Camel vs. Apache Kafka?

By Kai Waehner
28. January 2022
4 comments

Should I use Apache Camel or Apache Kafka for my next integration project? The question is valid and comes up regularly. This blog post explores both open-source frameworks and explains the difference between application integration and event streaming. The comparison discusses when to use Kafka or Camel, when to combine them, and when not to use them at all. A decision tree shows how you can quickly rule out one in favor of the other.
