Apache Kafka + KSQL + TensorFlow for Data Scientists via Python + Jupyter Notebook

Posted in Analytics, Apache Kafka, Big Data, Confluent, Deep Learning, Integration, Jupyter, Kafka Connect, Kafka Streams, KSQL, Machine Learning, Open Source, Python, Stream Processing, TensorFlow on January 18th, 2019 by Kai Wähner

Why would a data scientist use Kafka, Jupyter, Python, KSQL and TensorFlow all together in a single notebook?

There is an impedance mismatch between model development using Python and its Machine Learning tool stack and a scalable, reliable data platform. The former is what you need for quick and easy prototyping to build analytic models. The latter is what you need for data ingestion, preprocessing, model deployment and monitoring at scale, with low latency, high throughput, zero data loss and 24/7 availability.
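As a minimal sketch of how the two worlds can meet in one notebook, the snippet below pulls rows from a KSQL push query into Python for prototyping. It assumes a KSQL server on localhost:8088 and the community ksql-python client (`pip install ksql`); the stream name `payments` and its columns are hypothetical placeholders.

```python
from ksql import KSQLAPI  # community ksql-python client

# Connect to the (assumed) KSQL REST endpoint
client = KSQLAPI('http://localhost:8088')

# Push query against a hypothetical stream; LIMIT lets the query terminate
# so the results can be collected for interactive exploration.
for chunk in client.query('SELECT amount, merchant FROM payments LIMIT 10'):
    print(chunk)  # each chunk is a JSON-encoded string describing one row
```

From there, the rows can be loaded into a pandas DataFrame and fed into the usual Python Machine Learning tool stack, such as TensorFlow.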


Visual Analytics + Open Source Deep Learning Frameworks

Posted in Analytics, Big Data, Cloud, Hadoop, Machine Learning on April 24th, 2017 by Kai Wähner

Deep Learning is gaining more and more traction. It focuses on one subfield of Machine Learning: Artificial Neural Networks. This article explains why Deep Learning is a game changer in analytics, when to use it, and how Visual Analytics allows business analysts to leverage the analytic models built by a (citizen) data scientist.
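As a small illustration of the kind of model at the core of Deep Learning, here is a minimal feed-forward Artificial Neural Network in TensorFlow/Keras; the synthetic data, layer sizes and training settings are illustrative assumptions only, not taken from the article.

```python
import numpy as np
import tensorflow as tf

# Synthetic data: 1000 samples with 20 features and a binary label
X = np.random.rand(1000, 20).astype('float32')
y = (X.sum(axis=1) > 10.0).astype('float32')

# A small fully connected network: two hidden layers, sigmoid output
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
```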


Comparison: Data Preparation vs. Inline Data Wrangling in Machine Learning and Deep Learning Projects

Posted in Analytics, Big Data, Business Intelligence, Hadoop on February 13th, 2017 by Kai Wähner

I want to highlight a new presentation about Data Preparation in Data Science projects:

“Comparison of Programming Languages, Frameworks and Tools for Data Preprocessing and (Inline) Data Wrangling in Machine Learning / Deep Learning Projects”

Data Preparation as Key for Success in Data Science Projects

A key task in creating appropriate analytic models with machine learning or deep learning is the integration and preparation of data sets from various sources such as files, databases, big data stores, sensors or social networks. This step can take up to 80% of the whole project.
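To make that 80% concrete, the sketch below shows a few typical preparation steps with pandas and scikit-learn; the file name and column names are hypothetical placeholders.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical source file with mixed-quality sensor data
df = pd.read_csv('sensor_readings.csv')

df = df.drop_duplicates()                         # remove exact duplicates
df['temperature'] = df['temperature'].fillna(     # impute missing values
    df['temperature'].median())
df = pd.get_dummies(df, columns=['device_type'])  # encode categoricals

# Scale the numeric columns so they are comparable for model training
numeric_cols = df.select_dtypes(include='number').columns
df[numeric_cols] = StandardScaler().fit_transform(df[numeric_cols])
```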
