Data Streaming Archives

892 views
9 minute read

Energy Trading with Apache Kafka and Flink

By Kai Waehner
28. June 2024
No comments

Energy trading and data streaming are connected because real-time data helps traders make better decisions in the fast-moving energy markets. This data includes things like price changes, supply and demand, smart IoT meters and sensors, and weather, which help traders react quickly and plan effectively. As a result, data streaming with Apache Kafka and Apache Flink makes the market clearer, speeds up information sharing, and improves forecasting and risk management. This blog post explores the use cases and architectures for scalable and reliable real-time energy trading, including real-world deployments from Uniper, re.alto and Powerledger.

1.8K views
5 minute read

Real-Time GenAI with RAG using Apache Kafka and Flink to Prevent Hallucinations

By Kai Waehner
30. May 2024
No comments

How do you prevent hallucinations from large language models (LLMs) in GenAI applications? LLMs need real-time, contextualized, and trustworthy data to generate the most reliable outputs. This blog post explains how RAG and a data streaming platform with Apache Kafka and Flink make that possible. A lightboard video shows how to build a context-specific real-time RAG architecture. Also, learn how the travel agency Expedia leverages data streaming with Generative AI using conversational chatbots to improve the customer experience and reduce the cost of service agents.

2.9K views
11 minute read

Open Standards for Data Lineage: OpenLineage for Batch AND Streaming

By Kai Waehner
13. May 2024
No comments

One of the greatest wishes of companies is end-to-end visibility into their operational and analytical workflows. Where does data come from? Where does it go? To whom am I giving access? How can I track data quality issues? The capability to follow the data flow to answer these questions is called data lineage. This blog post explores market trends, efforts to provide an open standard with OpenLineage, and how data governance solutions from vendors such as IBM, Google, Confluent and Collibra help fulfil the enterprise-wide data governance needs of most companies, including data streaming technologies such as Apache Kafka and Flink.

17.8K views
12 minute read

The Past, Present and Future of Stream Processing

By Kai Waehner
20. March 2024
No comments

Stream processing has existed for decades. Adoption is growing with open source frameworks like Apache Kafka and Flink in combination with fully managed cloud services. This blog post explores the past, present and future of stream processing, including its relation to machine learning and GenAI, streaming databases, and the integration between data streaming and data lakes with Apache Iceberg.

5.6K views
10 minute read

GenAI Demo with Kafka, Flink, LangChain and OpenAI

By Kai Waehner
29. January 2024
No comments

Generative AI (GenAI) enables automation and innovation across industries. This blog post explores a simple but powerful architecture and demo that combines Python and LangChain with an OpenAI LLM, Apache Kafka for event streaming and data integration, and Apache Flink for stream processing. The use case shows how data streaming and GenAI help to correlate data from Salesforce CRM, search for lead information in public datasets like Google and LinkedIn, and recommend ice-breaker conversations for sales reps.

1.9K views
13 minute read

Customer Loyalty and Rewards Platform with Apache Kafka

By Kai Waehner
14. January 2024
No comments

Loyalty and rewards platforms are crucial for customer retention and revenue growth for many enterprises across industries. Apache Kafka provides context-specific real-time data and consistency across all applications and databases for a modern and flexible enterprise architecture. This blog post looks at case studies from Albertsons (retail), Globe Telecom (telco), Virgin Australia (aviation), Disney+ Hotstar (sports and gaming), and Porsche (automotive) to explain the value of data streaming for improving customer loyalty.

12.9K views
12 minute read

Apache Kafka + Vector Database + LLM = Real-Time GenAI

By Kai Waehner
8. November 2023
No comments

Generative AI (GenAI) enables advanced AI use cases and innovation but also changes what the enterprise architecture looks like. Large Language Models (LLM), Vector Databases, and Retrieval Augmented Generation (RAG) require new data integration patterns. Data streaming with Apache Kafka and Apache Flink processes incoming data sets in real time at scale, connects various platforms, and enables decoupled data products.

2.4K views
6 minute read

The State of Data Streaming for Gaming in 2023

By Kai Waehner
1. November 2023
No comments

This blog post explores the state of data streaming for the gaming industry in 2023, including customer stories from Kakao Games, Mobile Premier League (MPL), Demonware / Blizzard, and more. A complete slide deck and on-demand video recording are included.

4.4K views
5 minute read

How Lufthansa uses Apache Kafka for Middleware and Analytics

By Kai Waehner
24. September 2023
No comments

Aviation and travel are notoriously vulnerable to social, economic, and political events, as well as the ever-changing expectations of consumers. The coronavirus was just one piece of the challenge. This post explores how Lufthansa leverages data streaming powered by Apache Kafka as cloud-native middleware for mission-critical data integration projects and as a data fabric for AI/machine learning scenarios such as real-time predictions in fleet management. An on-demand video of an interactive conversation with Lufthansa is included at the end if you want to learn more.

2.4K views
8 minute read

The State of Data Streaming for Energy & Utilities

By Kai Waehner
1. September 2023
No comments

The evolution of utility infrastructure, energy distribution, customer services, and new business models requires real-time end-to-end visibility, reliable and intuitive B2B and B2C communication, and integration with pioneering technologies like 5G for low latency or augmented reality for innovation. I look at trends in the utilities sector to explore how data streaming helps as a business enabler, including customer stories from SunPower, 50hertz, Powerledger, and more. A complete slide deck and on-demand video recording are included.

Technology Evangelist

Kai Waehner
