Real Time Analytics with Apache Kafka in the Healthcare Industry

Real Time Analytics and Machine Learning with Apache Kafka in Healthcare

IT modernization and innovative new technologies change the healthcare industry significantly. This blog series explores how data streaming with Apache Kafka enables real-time data processing and business process automation. Real-world examples show how traditional enterprises and startups increase efficiency, reduce cost, and improve the human experience across the healthcare value chain, including pharma, insurance, providers, retail, and manufacturing. This is part four: Real-Time Analytics. Examples include Cerner, Celmatix, CDC/Centers for Disease Control and Prevention.


Blog Series – Kafka in Healthcare

Many healthcare companies leverage Kafka today. Use cases exist in every domain across the healthcare value chain. Most companies deploy data streaming in several business domains, and use cases often overlap. I categorized a few real-world deployments into different technical scenarios and added real-world examples for each.

Stay tuned for a dedicated blog post for each of these topics as part of this blog series. I will link the blogs here as soon as they are available (in the next few weeks). Subscribe to my newsletter to get an email after each publication (no spam or ads).

Real-Time Analytics with Apache Kafka

Real-time analytics (aka stream processing, streaming analytics, or complex event processing) is a data processing technology used to collect, store, and manage continuous data streams as they are produced or received.

Stream processing has many use cases. Examples include backend processes for claims processing, billing, logistics, manufacturing, fulfillment, or fraud detection. Data processing often needs to be decoupled from the frontend, where users click buttons and expect things to happen.

The de facto standard for real-time analytics is Apache Kafka. Kafka is like a central data hub that holds shared events and keeps services in sync. Its distributed cluster technology provides availability, resiliency, and performance properties that strengthen the architecture. Developers only need to write and deploy client applications, which then run load-balanced and highly available.

Figure: Real-time analytics with data streaming, stream processing, and Apache Kafka

Technologies for real-time analytics with the Kafka ecosystem include Kafka-native stream processing with Kafka Streams or ksqlDB, or 3rd party add-ons like Apache Flink, Spark Streaming, or commercial streaming analytics cloud services.
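To make the idea of stream processing more concrete, here is a minimal sketch of a tumbling window count, the kind of aggregation that engines such as Kafka Streams, ksqlDB, or Flink provide out of the box. It is plain Python without any Kafka client. The function name `tumbling_window_counts` and the claim events are assumptions for illustration only.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_ms):
    """Count events per (key, window start) using fixed, non-overlapping windows.

    events: iterable of (timestamp_ms, key) tuples, standing in for records
    that a stream processor would read from a topic.
    """
    counts = defaultdict(int)
    for timestamp_ms, key in events:
        # Align the event time to the start of its window.
        window_start = timestamp_ms - (timestamp_ms % window_ms)
        counts[(key, window_start)] += 1
    return dict(counts)


# Hypothetical claim events: (event time in ms, claim type)
claim_events = [
    (1_000, "pharmacy"),
    (4_000, "dental"),
    (59_000, "pharmacy"),
    (61_000, "pharmacy"),
]

window_counts = tumbling_window_counts(claim_events, window_ms=60_000)
for (claim_type, window_start), count in sorted(window_counts.items()):
    print(claim_type, window_start, count)
```

A real stream processor adds what this sketch leaves out: fault-tolerant state, handling of late events, and scaling across partitions.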

The critical difference with the Kafka ecosystem is that you leverage a single platform for data integration and processing at scale in real-time. There is no need to combine several platforms to achieve this. The result is a Kappa architecture that enables real-time but also batch workloads with a single integration architecture.
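The Kappa idea can be sketched in a few lines: one append-only log serves both the continuously running consumer and a batch-style replay, and both use the same processing logic. The `EventLog` class below is a hypothetical in-memory stand-in for a Kafka topic partition, not a Kafka API, and the billing amounts are made up.

```python
class EventLog:
    """Append-only, in-memory stand-in for a topic partition (hypothetical)."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def read_from(self, offset):
        return self._records[offset:]


def total_billed(records, start=0.0):
    """Shared processing logic, used by both the real-time and the replay path."""
    return start + sum(record["amount"] for record in records)


log = EventLog()

# Real-time path: process each record as soon as it is appended.
running_total = 0.0
next_offset = 0
for amount in (120.0, 80.0, 45.5):
    log.append({"amount": amount})
    new_records = log.read_from(next_offset)
    running_total = total_billed(new_records, running_total)
    next_offset += len(new_records)

# Batch-style path: replay the same log from offset 0 with the same logic.
replayed_total = total_billed(log.read_from(0))

print(running_total, replayed_total)
```

Because both paths read the same log with the same function, there is no second batch pipeline to keep in sync.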

Let’s look at a few real-world deployments in the healthcare sector.

Cerner – Sepsis Alerting in Real-Time

Cerner is a supplier of health information technology services, devices, and hardware. About 30% of all US healthcare data resides in a Cerner solution.

Sepsis kills. In fact, it kills up to 52,000 people every year in the UK alone. With sepsis alerting, the key to saving lives is early identification, especially the need to administer antibiotics within that first critical ‘golden hour’. Quick alerts make a significant impact. Cerner’s sepsis alert, coupled with the care plans developed with the big room approach, means that patients are now 71% more likely to receive timely antibiotics.

Cerner leverages a Kafka-powered central event streaming platform for sepsis alerting in real-time to save lives. The legacy systems had hit a wall: they could not process data any faster and missed SLAs. With Kafka, data processing time dropped from minutes to seconds.
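As a rough illustration of rule-based alerting on a stream of vitals, the sketch below flags a patient when at least two simple criteria are met. This is not Cerner's algorithm, which is not described here. The thresholds are loosely modeled on the commonly cited SIRS criteria and are assumptions for illustration, and all field names and patient records are hypothetical.

```python
def criteria_met(vitals):
    """Count how many illustrative criteria a single vitals reading meets."""
    checks = [
        vitals["temp_c"] > 38.0 or vitals["temp_c"] < 36.0,  # abnormal temperature
        vitals["heart_rate"] > 90,                            # elevated heart rate
        vitals["resp_rate"] > 20,                             # elevated respiratory rate
    ]
    return sum(checks)


def sepsis_alerts(vitals_stream, min_criteria=2):
    """Yield an alert for every reading that meets at least min_criteria checks."""
    for vitals in vitals_stream:
        met = criteria_met(vitals)
        if met >= min_criteria:
            yield {"patient_id": vitals["patient_id"], "criteria_met": met}


# Hypothetical readings, as they might arrive one by one from a topic.
readings = [
    {"patient_id": "A", "temp_c": 37.0, "heart_rate": 72, "resp_rate": 14},
    {"patient_id": "B", "temp_c": 38.6, "heart_rate": 104, "resp_rate": 18},
    {"patient_id": "C", "temp_c": 36.5, "heart_rate": 95, "resp_rate": 16},
]

alerts = list(sepsis_alerts(readings))
print(alerts)
```

The generator evaluates each reading as it arrives instead of waiting for a batch, which is where the move from minutes to seconds comes from.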

Figure: Real-time sepsis alerting at Cerner with Apache Kafka

Cerner is a long-term Kafka user and early adopter in the healthcare sector. Learn more about this use case in their Kafka Summit talk from 2016.

Celmatix – Reproductive Health Care

Celmatix is a preclinical-stage biotech company that provides digital tools and genetic insights focused on fertility. They offer personalized information to disrupt how women approach their lifelong reproductive health journey.

The streaming platform provides real-time aggregation of heterogeneous data collected from Electronic Medical Records (EMRs) and genetic data collected from partners through their Personalized Reproductive Medicine (PReM) Initiative.
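Real-time aggregation of heterogeneous sources boils down to joining events by a shared key. Here is a minimal sketch that merges EMR and genetic events per patient as they arrive. The event layout, field names, and values are assumptions for illustration and do not reflect Celmatix's actual data model.

```python
def merge_patient_streams(events):
    """Join events from two sources by patient ID.

    events: iterable of (source, patient_id, payload) tuples, where source is
    "emr" or "genetic". A merged record is emitted whenever the latest payload
    from both sources is known for a patient.
    """
    state = {}  # patient_id -> latest payload per source
    for source, patient_id, payload in events:
        record = state.setdefault(patient_id, {})
        record[source] = payload
        if "emr" in record and "genetic" in record:
            yield patient_id, {**record["emr"], **record["genetic"]}


# Hypothetical interleaved events from two sources.
events = [
    ("emr", "p1", {"age": 34}),
    ("genetic", "p2", {"variant": "variant_b"}),
    ("genetic", "p1", {"variant": "variant_a"}),
    ("emr", "p1", {"age": 35}),
]

merged = list(merge_patient_streams(events))
for patient_id, profile in merged:
    print(patient_id, profile)
```

Patient `p2` never appears in the output because no EMR event has arrived for that key yet. In a real deployment, the join state would live in a fault-tolerant store instead of a Python dictionary.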

Proactive reproductive health decisions are enabled by real-time genomics data and by applying technologies such as big data analytics, machine learning, AI, and whole-genome DNA sequencing.

Figure: Celmatix reproductive health care, Electronic Medical Records (EMR) processing with Apache Kafka

Data governance for security and compliance is critical in such a healthcare application. “Apache Kafka and Confluent are invaluable investments to scale the way we want to and future-proof our business,” says the lead data architect at Celmatix. Learn more in the Confluent case study.

CDC – Covid-19 Electronic Lab Reporting

The Centers for Disease Control and Prevention (CDC) built Covid-19 Electronic Lab Reporting (CELR) with the Kafka ecosystem. Use cases include case notifications, lab reporting, and healthcare interoperability.

The threat of the COVID-19 virus is tracked in real-time to provide comprehensive data for local, state, and federal responses. The application helps them better understand which locations show an increase in incidence.

With the true decoupling of the data streaming platform, the CDC can rapidly aggregate, validate, transform, and distribute laboratory testing data submitted by public health departments and other partners:

Figure: Centers for Disease Control and Prevention (CDC) Covid analytics with Kafka
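The validate, transform, and aggregate steps can be sketched as a small pipeline. The following is a simplified illustration in plain Python, not the CDC's implementation. The required fields, the accepted result values, and the sample reports are assumptions. Rejected reports are collected separately, similar to a dead letter queue.

```python
from collections import defaultdict

REQUIRED_FIELDS = ("state", "test_date", "result")  # assumed schema
ACCEPTED_RESULTS = ("positive", "negative")         # assumed value set


def process_lab_reports(reports):
    """Validate, normalize, and aggregate lab reports per state.

    Returns (per-state counts, rejected reports).
    """
    counts_by_state = defaultdict(lambda: {"positive": 0, "total": 0})
    rejected = []
    for report in reports:
        # Validate: every required field must be present and non-empty.
        if any(not report.get(field) for field in REQUIRED_FIELDS):
            rejected.append(report)
            continue
        # Transform: normalize the free-text result value.
        result = report["result"].strip().lower()
        if result not in ACCEPTED_RESULTS:
            rejected.append(report)
            continue
        # Aggregate: update the running counts for the state.
        bucket = counts_by_state[report["state"]]
        bucket["total"] += 1
        if result == "positive":
            bucket["positive"] += 1
    return dict(counts_by_state), rejected


# Hypothetical submissions from public health departments.
reports = [
    {"state": "GA", "test_date": "2020-07-01", "result": "Positive"},
    {"state": "GA", "test_date": "2020-07-01", "result": "negative "},
    {"state": "NY", "test_date": "", "result": "positive"},
    {"state": "NY", "test_date": "2020-07-02", "result": "pending"},
]

counts, rejected = process_lab_reports(reports)
print(counts)
print(len(rejected))
```

In a Kafka-based design, each step would typically read from one topic and write to another, so downstream consumers such as state or federal dashboards stay decoupled from the submitters.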

Real-Time Analytics with Kafka for Smart Healthcare Applications at any Scale

Think about IoT sensor analytics, cybersecurity, patient communication, insurance, research, and many other domains. Real-time data beats slow data in the healthcare supply chain almost everywhere.

This blog post explored the capabilities of the Apache Kafka ecosystem for real-time analytics. Real-world deployments from Cerner, Celmatix and the Centers for Disease Control and Prevention showed how enterprises successfully deploy Kafka for different enterprise architecture use cases.

How do you leverage data streaming with Apache Kafka in the healthcare industry? What architecture does your platform use? Which products do you combine with data streaming? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.
