Deep Learning KSQL UDF for Streaming Anomaly Detection of MQTT IoT Sensor Data

Posted in Analytics, Apache Kafka, Big Data, Cloud, Cloud-Native, Confluent, Deep Learning, Integration, Internet of Things, Java / JEE, Kafka Connect, Kafka Streams, KSQL, Machine Learning, Microservices, MQTT, Open Source on August 2nd, 2018 by Kai Wähner

I built a scenario for a hybrid machine learning infrastructure leveraging Apache Kafka as a scalable central nervous system. The public cloud is used for training analytic models at extreme scale (e.g. using TensorFlow and TPUs on Google Cloud Platform (GCP) via Google ML Engine). The predictions (i.e. model inference) are executed on premises at the edge in a local Kafka infrastructure (e.g. leveraging Kafka Streams or KSQL for streaming analytics).

This post focuses on the on-premises deployment. I created a GitHub project with a KSQL UDF for sensor analytics. It leverages the new API features of KSQL to build UDFs and UDAFs easily with Java for continuous stream processing on incoming events.
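To give a rough idea of what the scoring logic inside such a UDF could look like, here is a minimal sketch in plain Java. It is not the code from the GitHub project: the class name, the method names, the stand-in model and the threshold are all hypothetical, and the mean squared reconstruction error is just one common way to score anomalies with an autoencoder-style model. In a real KSQL UDF, the class and the scoring method would additionally carry the @UdfDescription and @Udf annotations from the KSQL UDF API, which are left out here so the sketch compiles without any Kafka dependencies.

```java
import java.util.function.UnaryOperator;

// Hypothetical sketch: none of these names come from the original project.
final class AnomalyUdfSketch {

    // Stand-in for a trained model that maps sensor readings to their reconstruction.
    private final UnaryOperator<double[]> model;

    // Assumed cut-off: errors above this value are treated as anomalies.
    private final double threshold;

    AnomalyUdfSketch(UnaryOperator<double[]> model, double threshold) {
        this.model = model;
        this.threshold = threshold;
    }

    // The method a UDF would expose: called once per incoming event.
    boolean isAnomaly(double[] readings) {
        double[] reconstructed = model.apply(readings);
        return reconstructionError(readings, reconstructed) > threshold;
    }

    // Mean squared error between the input and its reconstruction.
    static double reconstructionError(double[] input, double[] reconstructed) {
        if (input.length == 0 || input.length != reconstructed.length) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        double sum = 0.0;
        for (int i = 0; i < input.length; i++) {
            double diff = input[i] - reconstructed[i];
            sum += diff * diff;
        }
        return sum / input.length;
    }
}
```

Once a function like this is packaged and registered as a UDF, it could be called from a KSQL query on the sensor stream just like a built-in function, so every incoming event is scored continuously.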


Apache Kafka vs. ESB / ETL / MQ

Posted in Apache Kafka, Big Data, Confluent, EAI, ESB, Integration, Kafka Connect, Kafka Streams, KSQL, Messaging, Microservices, Middleware, Open Source, SOA, Stream Processing on July 18th, 2018 by Kai Wähner

Apache Kafka and Enterprise Service Bus (ESB) are complementary, not competitive!

By now, Apache Kafka is much more than messaging. It has evolved into a streaming platform including Kafka Connect, Kafka Streams, KSQL and many other open source components. Kafka leverages events as a core principle. You think in data flows of events and process the data while it is in motion. Many concepts, such as event sourcing, and design patterns, such as Enterprise Integration Patterns (EIPs), are based on event-driven architecture.


Deep Learning at Extreme Scale with the Apache Kafka Open Source Ecosystem

Posted in Analytics, Apache Kafka, Big Data, Cloud, Confluent, Deep Learning, Integration, Kafka Connect, Kafka Streams, KSQL, Kubernetes, Machine Learning, Microservices, Open Source on May 9th, 2018 by admin

I presented a new talk at “Codemotion Amsterdam 2018” this week. I discussed how Apache Kafka and Machine Learning relate, and how to combine them to build a Machine Learning infrastructure for extreme scale.

Long version of the title:

Deep Learning at Extreme Scale (in the Cloud) with the Apache Kafka Open Source Ecosystem – How to Build a Machine Learning Infrastructure with Kafka, Connect, Streams, KSQL, etc.

As always, I want to share the slide deck. The talk was also recorded. I will share the video as soon as the organizer publishes it.


Video Recording – Apache Kafka as Event-Driven Open Source Streaming Platform (Voxxed Zurich 2018)

Posted in Apache Kafka, Big Data, Cloud, Docker, EAI, ESB, Integration, Java / JEE, Kafka Connect, Kafka Streams, KSQL, Kubernetes, Messaging, Microservices, Middleware, Open Source, SOA, Stream Processing on March 13th, 2018 by admin

I spoke at Voxxed Zurich 2018 about Apache Kafka as Event-Driven Open Source Streaming Platform. The talk includes an intro to Apache Kafka and its open source ecosystem (Kafka Streams, Connect, KSQL, Schema Registry, etc.). I just want to share the video recording of my talk.

Abstract

This session introduces Apache Kafka, an event-driven open source streaming platform. Apache Kafka goes far beyond scalable, high volume messaging. In addition, you can leverage Kafka Connect for integration and the Kafka Streams API for building lightweight stream processing microservices in autonomous teams. The open source Confluent Platform adds further components such as KSQL, Schema Registry, REST Proxy, clients for different programming languages and connectors for different technologies and databases. Live demos included.


Apache Kafka Streams + Machine Learning (Spark, TensorFlow, H2O.ai)

Posted in Analytics, Apache Kafka, Apache Spark, Big Data, Confluent, Hadoop, Integration, Kafka Connect, Kafka Streams, Machine Learning, Messaging, Microservices, Open Source, Stream Processing on May 23rd, 2017 by Kai Wähner

I started at Confluent in May 2017 as a Technology Evangelist focusing on topics around the open source framework Apache Kafka. I think Machine Learning is one of the hottest buzzwords these days as it can add huge business value in any industry. Therefore, you will see various other posts from me around Apache Kafka (messaging), Kafka Connect (integration), Kafka Streams (stream processing), and Confluent’s additional open source add-ons on top of Kafka (Schema Registry, Replicator, Auto Balancer, etc.). I will explain how to leverage all this for machine learning and other big data technologies in real world production scenarios.
