Deep Learning KSQL UDF for Streaming Anomaly Detection of MQTT IoT Sensor Data

Posted in Analytics, Apache Kafka, Big Data, Cloud, Cloud-Native, Confluent, Deep Learning, Integration, Internet of Things, Java / JEE, Kafka Connect, Kafka Streams, KSQL, Machine Learning, Microservices, MQTT, Open Source on August 2nd, 2018 by Kai Wähner

I built a scenario for a hybrid machine learning infrastructure leveraging Apache Kafka as a scalable central nervous system. The public cloud is used for training analytic models at extreme scale (e.g. using TensorFlow and TPUs on Google Cloud Platform (GCP) via Google ML Engine). The predictions (i.e. model inference) are executed on premise at the edge in a local Kafka infrastructure (e.g. leveraging Kafka Streams or KSQL for streaming analytics).

This post focuses on the on-premise deployment. I created a GitHub project with a KSQL UDF for sensor analytics. It leverages the new API features of KSQL to build UDFs and UDAFs easily with Java for continuous stream processing of incoming events.
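To make the idea of a per-event anomaly function concrete, here is a minimal sketch in plain Java. It is not the code from the GitHub project: the class name `AnomalySketch`, the `EXPECTED` baseline, and the threshold are hypothetical, and a fixed baseline with a mean squared error check stands in for the trained deep learning model. In a real KSQL UDF, the class and method would additionally carry KSQL's `@UdfDescription` and `@Udf` annotations, which are left out here so the example compiles without KSQL on the classpath.

```java
// Hypothetical sketch: a per-event anomaly check of the kind a KSQL UDF could wrap.
final class AnomalySketch {

    // Assumption: a fixed "normal" sensor profile stands in for a trained model's output.
    static final double[] EXPECTED = {1.0, 2.0, 3.0};

    // Mean squared error between the incoming reading and the expected profile.
    static double meanSquaredError(double[] reading, double[] expected) {
        if (reading.length != expected.length) {
            throw new IllegalArgumentException("reading and expected profile differ in length");
        }
        double sum = 0.0;
        for (int i = 0; i < reading.length; i++) {
            double diff = reading[i] - expected[i];
            sum += diff * diff;
        }
        return sum / reading.length;
    }

    // Takes one event as a comma-separated string of sensor values and
    // returns true if its error against the profile exceeds the threshold.
    // In KSQL, a method like this would be the one marked with @Udf.
    static boolean isAnomaly(String csvReading, double threshold) {
        String[] parts = csvReading.split(",");
        double[] reading = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            reading[i] = Double.parseDouble(parts[i].trim());
        }
        return meanSquaredError(reading, EXPECTED) > threshold;
    }

    public static void main(String[] args) {
        System.out.println(isAnomaly("1.0,2.0,3.1", 0.5)); // close to the profile
        System.out.println(isAnomaly("1.0,2.0,9.0", 0.5)); // far from the profile
    }
}
```

The function is stateless and works on one event at a time, which is what lets a streaming engine apply it continuously to every incoming message.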


Apache Kafka vs. ESB / ETL / MQ

Posted in Apache Kafka, Big Data, Confluent, EAI, ESB, Integration, Kafka Connect, Kafka Streams, KSQL, Messaging, Microservices, Middleware, Open Source, SOA, Stream Processing on July 18th, 2018 by Kai Wähner

Apache Kafka and Enterprise Service Bus (ESB) are complementary, not competitive!

Apache Kafka has become much more than messaging. It has evolved into a streaming platform that includes Kafka Connect, Kafka Streams, KSQL and many other open source components. Kafka leverages events as a core principle. You think in data flows of events and process the data while it is in motion. Many concepts, such as event sourcing, and design patterns, such as Enterprise Integration Patterns (EIPs), are based on event-driven architecture.


Model Serving: Stream Processing vs. RPC / REST with Java, gRPC, Apache Kafka, TensorFlow

Posted in Analytics, Apache Kafka, Big Data, Confluent, Deep Learning, Java / JEE, Kafka Streams, KSQL, Machine Learning, Microservices, Open Source, Stream Processing on July 9th, 2018 by Kai Wähner

Machine Learning / Deep Learning models can be used in different ways to make predictions. My preferred way is to deploy an analytic model directly into a stream processing application (like Kafka Streams or KSQL). You could, for example, use the TensorFlow for Java API. This gives the best latency and independence from external services. Several examples can be found in my GitHub project: Model Inference within Kafka Streams Microservices using TensorFlow, H2O.ai, Deeplearning4j (DL4J).


Apache Kafka + KSQL Live Demo (Video Recording) using CSV, JSON, Apache Avro

Posted in Apache Kafka, Big Data, Kafka Streams, KSQL, Messaging, Microservices, Open Source, Stream Processing on June 25th, 2018 by Kai Wähner

KSQL is the open-source, Apache 2.0 licensed streaming SQL engine on top of Apache Kafka from Confluent. KSQL makes stream processing available to everyone. Even though it is simple to use because there is no need to write source code, KSQL is built for mission-critical and scalable production deployments (using Kafka Streams under the hood).

Live Demo – KSQL with CSV, JSON and Apache Avro

The following video shows a live demo using Delimited, JSON and Avro data to create STREAMs and TABLEs for continuous stream processing of events in Apache Kafka:

 


Deep Learning at Extreme Scale with the Apache Kafka Open Source Ecosystem

Posted in Analytics, Apache Kafka, Big Data, Cloud, Confluent, Deep Learning, Integration, Kafka Connect, Kafka Streams, KSQL, Kubernetes, Machine Learning, Microservices, Open Source on May 9th, 2018 by admin

I presented a new talk at “Codemotion Amsterdam 2018” this week. I discussed the relation of Apache Kafka and Machine Learning to build a Machine Learning infrastructure for extreme scale.

Long version of the title:

Deep Learning at Extreme Scale (in the Cloud) with the Apache Kafka Open Source Ecosystem – How to Build a Machine Learning Infrastructure with Kafka, Connect, Streams, KSQL, etc.

As always, I want to share the slide deck. The talk was also recorded. I will share the video as soon as the organizer publishes it.


Video Recording – Apache Kafka as Event-Driven Open Source Streaming Platform (Voxxed Zurich 2018)

Posted in Apache Kafka, Big Data, Cloud, Docker, EAI, ESB, Integration, Java / JEE, Kafka Connect, Kafka Streams, KSQL, Kubernetes, Messaging, Microservices, Middleware, Open Source, SOA, Stream Processing on March 13th, 2018 by admin

I spoke at Voxxed Zurich 2018 about Apache Kafka as Event-Driven Open Source Streaming Platform. The talk includes an intro to Apache Kafka and its open source ecosystem (Kafka Streams, Connect, KSQL, Schema Registry, etc.). I just want to share the video recording of my talk.

Abstract

This session introduces Apache Kafka, an event-driven open source streaming platform. Apache Kafka goes far beyond scalable, high volume messaging. In addition, you can leverage Kafka Connect for integration and the Kafka Streams API for building lightweight stream processing microservices in autonomous teams. The open source Confluent Platform adds further components such as KSQL, Schema Registry, REST Proxy, clients for different programming languages and connectors for different technologies and databases. Live demos included.


Deep Learning in Real Time with TensorFlow, H2O.ai and Kafka Streams (Slides from JavaOne 2017)

Posted in Analytics, Apache Kafka, Big Data, Business Intelligence, Confluent, Deep Learning, Docker, Java / JEE, Kafka Streams, Machine Learning, Microservices, Open Source, Stream Processing on October 4th, 2017 by Kai Wähner

Early October… Like every year in October, it is time for JavaOne and Oracle Open World in San Francisco… I am glad to be back at this huge event again. My talk at JavaOne 2017 was all about the deployment of analytic models to scalable production systems leveraging Apache Kafka and Kafka Streams. Let’s first look at the abstract. After that, I attach the slides and refer to further material around this topic.


Kafka Streams + H2O.ai + TensorFlow (Video Recording / Live Demo)

Posted in Analytics, Apache Kafka, Big Data, Kafka Streams, Machine Learning, Open Source, Stream Processing on September 7th, 2017 by Kai Wähner

I do a lot of presentations these days at meetups and conferences with one focus: How to leverage Apache Kafka and Kafka Streams to apply analytic models (built with H2O, TensorFlow, DeepLearning4J and other frameworks) to scalable, mission-critical environments. As many attendees have asked me, I created a video recording of this talk (focusing on live demos).


Apache Kafka Streams + Machine Learning (Spark, TensorFlow, H2O.ai)

Posted in Analytics, Apache Kafka, Apache Spark, Big Data, Confluent, Hadoop, Integration, Kafka Connect, Kafka Streams, Machine Learning, Messaging, Microservices, Open Source, Stream Processing on May 23rd, 2017 by Kai Wähner

I started at Confluent in May 2017 to work as a Technology Evangelist focusing on topics around the open source framework Apache Kafka. I think Machine Learning is one of the hottest buzzwords these days as it can add huge business value in any industry. Therefore, you will see various other posts from me around Apache Kafka (messaging), Kafka Connect (integration), Kafka Streams (stream processing) and Confluent’s additional open source add-ons on top of Kafka (Schema Registry, Replicator, Auto Balancer, etc.). I will explain how to leverage all this for machine learning and other big data technologies in real world production scenarios.


Why I Move (Back) to Open Source for Messaging, Integration and Stream Processing

Posted in Analytics, API Management, Big Data, Blockchain, Cloud, Cloud-Native, Docker, ESB, Hadoop, Internet of Things, Java / JEE, Machine Learning, Microservices, Middleware, SOA on May 1st, 2017 by Kai Wähner

After three great years at TIBCO Software, I move back to open source and join Confluent, a company focusing on the open source project Apache Kafka to build mission-critical, scalable infrastructures for messaging, integration and streaming analytics. Confluent is a Silicon Valley startup, still at the beginning of its journey, with 700% business growth in 2016, and is expected to grow significantly again in 2017.

In this blog post, I want to share why I see the future for middleware and big data analytics in open source technologies, why I really like Confluent, what I will focus on in the next months, and why I am so excited about this next step in my career.
