Analytics Archives - Kai Waehner

2.1K views
12 minute read

How Microsoft Fabric Lakehouse Complements Data Streaming (Apache Kafka, Flink, et al.)

By Kai Waehner
12. October 2024

In today’s data-driven world, understanding data at rest versus data in motion is crucial for businesses. Data streaming frameworks like Apache Kafka and Apache Flink enable real-time data processing. Meanwhile, lakehouses like Snowflake, Databricks, and Microsoft Fabric excel in long-term data storage and detailed analysis, perfect for reports and AI training. This blog post delves into how these technologies complement each other in enterprise architecture.

10.5K views
8 minute read

The Shift Left Architecture – From Batch and Lakehouse to Real-Time Data Products with Data Streaming

By Kai Waehner
15. June 2024

Data integration is a hard challenge in every enterprise. Batch processing and Reverse ETL are common practices in a data warehouse, data lake or lakehouse. Data inconsistency, high compute cost, and stale information are the consequences. This blog post introduces a new design pattern to solve these problems: The Shift Left Architecture enables a data mesh with real-time data products to unify transactional and analytical workloads with Apache Kafka, Flink and Iceberg. Consistent information is handled with stream processing or ingested into Snowflake, Databricks, Google BigQuery, or any other analytics/AI platform. This increases flexibility, reduces cost, and enables a data-driven company culture with faster time-to-market for innovative software applications.

1.3K views
3 minute read

Apache Kafka in Manufacturing at Automotive Supplier Brose for Industrial IoT Use Cases

By Kai Waehner
13. June 2024

Data streaming unifies OT/IT workloads by connecting information from sensors, PLCs, robotics and other manufacturing systems at the edge with business applications and the big data analytics world in the cloud. This blog post explores how the global automotive supplier Brose deploys a hybrid industrial IoT architecture using Apache Kafka in combination with Eclipse Kura, OPC-UA, MuleSoft and SAP.

6.8K views
8 minute read

Apache Kafka and Tinybird (ClickHouse) for Streaming Analytics HTTP APIs

By Kai Waehner
4. April 2024

Apache Kafka became the de facto standard for data streaming. However, the combination of an event-driven architecture with request-response APIs is crucial for most enterprise architectures. This blog post explores how Tinybird innovates with a REST/HTTP layer on top of the open source analytics database ClickHouse in the cloud. Integrating Kafka with Tinybird, the benefits of fully managed services like Confluent Cloud, and customer stories from Factorial and FanDuel show why Kafka and analytics databases complement each other for more innovation and faster time-to-market.

2.0K views
3 minute read

When NOT to Use Apache Kafka? (Lightboard Video)

By Kai Waehner
26. March 2024

Apache Kafka is the de facto standard for data streaming to process data in motion. With its significant adoption growth across all industries, I get a very valid question every week: When NOT to use Apache Kafka? What limitations does the event streaming platform have? When does Kafka simply not provide the needed capabilities? How do you qualify Kafka out when it is not the right tool for the job? This blog post contains a lightboard video that gives you a twenty-minute explanation of the dos and don'ts.

2.7K views
6 minute read

The State of Data Streaming for Healthcare with Apache Kafka and Flink

By Kai Waehner
27. November 2023

This blog post explores the state of data streaming for the healthcare industry powered by Apache Kafka and Apache Flink. It covers IT modernization and innovation with pioneering technologies like sensors, telemedicine, and AI/machine learning. I look at enterprise architectures and customer stories from Humana, Recursion, BHG (formerly Bankers Healthcare Group), and more. A complete slide deck and on-demand video recording are included.

3.4K views
4 minute read

Modernizing SCADA Systems and OT/IT Integration with Data Streaming

By Kai Waehner
10. September 2023

SCADA control systems are a vital component of IT/OT modernization. Legacy IT/OT infrastructure and SCADA systems are monolithic, proprietary, not scalable, and lack open APIs based on standard interfaces. This post explains the modernization of such a system based on the real-life use case of 50Hertz, a transmission system operator for electricity in Germany. A lightboard video is included.

2.7K views
8 minute read

Transforming the Global Supply Chain with Data Streaming and IoT

By Kai Waehner
6. February 2023

The research company IoT Analytics found eight key technologies transforming the future of the global supply chain. This article explores how data streaming helps to innovate in this area. Real-world case studies from global players such as BMW, Bosch, and Walmart show the value of real-time data streaming to improve the supply chain by building use cases such as automated intralogistics, track and trace of vehicles, and proactive and context-specific decision-making with MES and ERP integration.

4.4K views
6 minute read

A Real-Time Supply Chain Control Tower powered by Kafka

By Kai Waehner
23. September 2022

A modern supply chain requires just-in-time production, global logistics, and complex manufacturing processes. This blog post explores a solution that ingests all information flows into a unified central nervous system. The idea of the Supply Chain Control Tower becomes a reality: An integrated data cockpit with real-time access to all levels and systems of the supply chain.

6.6K views
5 minute read

The Heart of the Data Mesh Beats Real-Time with Apache Kafka

By Kai Waehner
28. July 2022

If there were a buzzword of the hour, it would undoubtedly be “data mesh”! This new architectural paradigm unlocks analytic and transactional data at scale and enables rapid access to an ever-growing number of distributed domain datasets for various usage scenarios. The data mesh addresses the most common weaknesses of the traditional centralized data lake or data platform architecture. And the heart of a decentralized data mesh infrastructure must be real-time, reliable, and scalable. Learn how the de facto standard for data streaming, Apache Kafka, plays a crucial role in building a data mesh.
