Apache Flink Archives

1.1K views
14 minute read

Real-Time Model Inference with Apache Kafka and Flink for Predictive AI and GenAI

By Kai Waehner
1. October 2024

Artificial Intelligence (AI) and Machine Learning (ML) are transforming business operations by enabling systems to learn from data and make intelligent decisions for predictive and generative AI use cases. Two essential components of AI/ML are model training and inference. This blog post explores how data streaming with Apache Kafka and Flink enhances the performance and reliability of model predictions. Whether for real-time fraud detection, smart customer service applications or predictive maintenance, understanding the value of data streaming for model inference is crucial for leveraging AI/ML effectively.

888 views
11 minute read

Industrial IoT Middleware for Edge and Cloud OT/IT Bridge powered by Apache Kafka and Flink

By Kai Waehner
20. September 2024

As industries continue to adopt digital transformation, the convergence of Operational Technology (OT) and Information Technology (IT) has become essential. The OT/IT Bridge is a key concept in industrial automation: it connects real-time operational processes with business-oriented IT systems, ensuring seamless data flow and coordination. By leveraging Industrial IoT middleware and data streaming technologies like Apache Kafka and Flink, businesses can manage production processes and higher-level business operations in a unified way to drive greater efficiency, predictive maintenance, and streamlined decision-making.

1.3K views
8 minute read

Unified Commerce in Retail and eCommerce with Apache Kafka and Flink for Real-Time Customer 360

By Kai Waehner
30. August 2024

Delivering a seamless and personalized customer experience across all touchpoints is essential for staying competitive in today’s rapidly evolving retail and eCommerce landscape. Unified commerce integrates all sales channels and backend systems into a single platform to ensure real-time consistency in customer interactions, inventory management, and order fulfillment. This blog post explores how Apache Kafka and Flink can be pivotal in achieving real-time Customer 360 in the unified commerce ecosystem and how unified commerce differs from traditional omnichannel approaches.

8.3K views
11 minute read

Apache Iceberg – The Open Table Format for Lakehouse AND Data Streaming

By Kai Waehner
13. July 2024

An open table format framework like Apache Iceberg is essential in the enterprise architecture to ensure reliable data management and sharing, seamless schema evolution, efficient handling of large-scale datasets and cost-efficient storage. This blog post explores market trends, adoption of table format frameworks like Iceberg, Hudi, Paimon, Delta Lake and XTable, and the product strategy of leading vendors of data platforms such as Snowflake, Databricks (Apache Spark), Confluent (Apache Kafka / Flink), Amazon Athena and Google BigQuery.

2.5K views
10 minute read

The Digitalization of Airport and Airlines with IoT and Data Streaming using Kafka and Flink

By Kai Waehner
9. July 2024

The vision for a digitalized airport includes seamless passenger experiences, optimized operations, consistent integration with airlines and retail stores, and enhanced security through advanced technologies like IoT, AI, and real-time data analytics. This blog post shows the relevance of data streaming with Apache Kafka and Flink in the aviation industry to enable data-driven business process automation and innovation while modernizing the IT infrastructure with a cloud-native hybrid cloud architecture.

2.0K views
9 minute read

Energy Trading with Apache Kafka and Flink

By Kai Waehner
28. June 2024

Energy trading and data streaming are connected because real-time data helps traders make better decisions in the fast-moving energy markets. This data includes things like price changes, supply and demand, smart IoT meters and sensors, and weather, which help traders react quickly and plan effectively. As a result, data streaming with Apache Kafka and Apache Flink makes the market clearer, speeds up information sharing, and improves forecasting and risk management. This blog post explores the use cases and architectures for scalable and reliable real-time energy trading, including real-world deployments from Uniper, re.alto and Powerledger.

10.7K views
8 minute read

The Shift Left Architecture – From Batch and Lakehouse to Real-Time Data Products with Data Streaming

By Kai Waehner
15. June 2024

Data integration is a hard challenge in every enterprise. Batch processing and Reverse ETL are common practices in a data warehouse, data lake or lakehouse. Data inconsistency, high compute cost, and stale information are the consequences. This blog post introduces a new design pattern to solve these problems: The Shift Left Architecture enables a data mesh with real-time data products to unify transactional and analytical workloads with Apache Kafka, Flink and Iceberg. Consistent information is handled with stream processing or ingested into Snowflake, Databricks, Google BigQuery, or any other analytics / AI platform to increase flexibility, reduce cost and enable a data-driven company culture with faster time-to-market for building innovative software applications.

5.0K views
5 minute read

Real-Time GenAI with RAG using Apache Kafka and Flink to Prevent Hallucinations

By Kai Waehner
30. May 2024

How do you prevent hallucinations from large language models (LLMs) in GenAI applications? LLMs need real-time, contextualized, and trustworthy data to generate the most reliable outputs. This blog post explains how RAG and a data streaming platform with Apache Kafka and Flink make that possible. A lightboard video shows how to build a context-specific real-time RAG architecture. Also, learn how the travel agency Expedia leverages data streaming with Generative AI using conversational chatbots to improve the customer experience and reduce the cost of service agents.

4.6K views
11 minute read

Open Standards for Data Lineage: OpenLineage for Batch AND Streaming

By Kai Waehner
13. May 2024

One of the greatest wishes of companies is end-to-end visibility into their operational and analytical workflows. Where does data come from? Where does it go? To whom am I giving access? How can I track data quality issues? The capability to follow the data flow to answer these questions is called data lineage. This blog post explores market trends, efforts to provide an open standard with OpenLineage, and how data governance solutions from vendors such as IBM, Google, Confluent and Collibra help fulfil the enterprise-wide data governance needs of most companies, including data streaming technologies such as Apache Kafka and Flink.

3.2K views
11 minute read

My Data Streaming Journey with Kafka & Flink: 7 Years at Confluent

By Kai Waehner
3. May 2024

Time flies… I joined Confluent seven years ago when Apache Kafka was mainly used by a few tech giants and the company had ~100 employees. This blog post explores my data streaming journey, including Kafka becoming a de facto standard for over 100,000 organizations, Confluent doing an IPO on the NASDAQ stock exchange, 5000+ customers adopting a data streaming platform, and emerging new design approaches and technologies like data mesh, GenAI, and Apache Flink. I look at the past, present and future of my personal data streaming journey, both from the perspective of evolving technology trends and from my journey as an employee of Confluent, which started as a Silicon Valley startup and is now a global software and cloud company.
