Deep Learning KSQL UDF for Streaming Anomaly Detection of MQTT IoT Sensor Data

Posted in Analytics, Apache Kafka, Big Data, Cloud, Cloud-Native, Confluent, Deep Learning, Integration, Internet of Things, Java / JEE, Kafka Connect, Kafka Streams, KSQL, Machine Learning, Microservices, MQTT, Open Source on August 2nd, 2018 by Kai Wähner

I built a scenario for a hybrid machine learning infrastructure leveraging Apache Kafka as a scalable central nervous system. The public cloud is used for training analytic models at extreme scale (e.g. using TensorFlow and TPUs on Google Cloud Platform (GCP) via Google ML Engine). The predictions (i.e. model inference) are executed on premises at the edge in a local Kafka infrastructure (e.g. leveraging Kafka Streams or KSQL for streaming analytics).

This post focuses on the on-premises deployment. I created a GitHub project with a KSQL UDF for sensor analytics. It leverages the new API features of KSQL to build UDFs and UDAFs easily with Java to do continuous stream processing on incoming events.
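
To give a feel for what such a UDF does per event, here is a minimal sketch in plain Java. It is not the code from the GitHub project: the class name, the comma-separated input format, the `BASELINE` values and the `THRESHOLD` are all hypothetical, and a simple distance-from-baseline score stands in for the trained deep learning model. In a real KSQL UDF, the class would additionally carry the `@UdfDescription` annotation and the scoring method the `@Udf` annotation (both from `io.confluent.ksql.function.udf`), which are left out here so the sketch runs without any KSQL dependency.

```java
// Minimal sketch of per-event anomaly scoring, as a KSQL UDF might do it.
// Assumption: a fixed baseline replaces the trained model used in practice.
class SensorAnomalySketch {

    // Hypothetical "normal" values for three sensor channels.
    static final double[] BASELINE = {60.0, 1.0, 36.5};

    // Hypothetical cut-off: mean squared error above this counts as anomalous.
    static final double THRESHOLD = 25.0;

    // Mean squared deviation of one reading from the baseline.
    static double meanSquaredError(double[] reading) {
        if (reading.length != BASELINE.length) {
            throw new IllegalArgumentException(
                "expected " + BASELINE.length + " values, got " + reading.length);
        }
        double sum = 0.0;
        for (int i = 0; i < reading.length; i++) {
            double diff = reading[i] - BASELINE[i];
            sum += diff * diff;
        }
        return sum / reading.length;
    }

    // The method a KSQL query would call once per event (it would be
    // annotated with @Udf in a real UDF). Input: comma-separated values.
    static boolean isAnomaly(String sensorValues) {
        String[] parts = sensorValues.split(",");
        double[] reading = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            reading[i] = Double.parseDouble(parts[i].trim());
        }
        return meanSquaredError(reading) > THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(isAnomaly("60.0,1.0,36.5"));
        System.out.println(isAnomaly("120.0,1.0,36.5"));
    }
}
```

Once packaged and registered, a function like this would be invoked from a KSQL `SELECT` on the stream of incoming sensor events, so every message is scored as it arrives.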


Machine Learning Trends of 2018 combined with the Apache Kafka Ecosystem

Posted in Analytics, Apache Kafka, Apache Spark, Big Data, Business Intelligence, Confluent, Deep Learning, Kafka Streams, KSQL, Kubernetes, Machine Learning, Microservices, Open Source, Stream Processing on February 13th, 2018 by Kai Wähner

At the OOP 2018 conference in Munich, I presented an updated version of my talk about building scalable, mission-critical microservices with the Apache Kafka ecosystem and Deep Learning frameworks like TensorFlow, DeepLearning4J or H2O. I want to share the updated slide deck and discuss the newest trends that I incorporated into the talk.
