
Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter and follow me on LinkedIn or X (former Twitter) to stay in touch. And download my free book about data streaming use cases, including customer stories from the manufacturing and automotive industry.
Real-Time Data Streaming in the Automotive Industry
The modern car is a rolling data center. It generates location data, sensor readings, engine diagnostics, driver behavior, and much more. Traditional batch processing and central data lakes are not fast enough to power connected car services. Automotive companies are now turning to real-time data streaming to support new business models, smarter mobility, and better customer engagement.

Real-time data streaming allows car manufacturers and mobility providers to:
- Process telemetry as it is created
- Detect and respond to critical events instantly
- Build predictive services using up-to-date information
- Support advanced analytics and AI with fresh data
This transformation is explained in more detail in two blog posts:
- Driving the Future: How Real-Time Data Streaming Is Powering Automotive Innovation
- Streaming the Automotive Future: Real-Time Infrastructure for Vehicle Data
These blogs highlight how automakers are building streaming architectures with Kafka, Flink, and Confluent Cloud to support both operational decisions and strategic initiatives. Real-time data is used across production, charging, remote diagnostics, fleet management, and personalization.
Rivian is one of the leading examples of this shift. Their use of data streaming shows how to scale real-time intelligence across a connected fleet.
What Are Rivian and RV Tech, and How Do They Relate to Volkswagen Group
Rivian is a US-based electric vehicle manufacturer focused on adventure and sustainability. Its lineup includes pickup trucks, SUVs, and commercial vans. Every Rivian vehicle is software-defined, with always-on connectivity and frequent over-the-air updates.
To power the connected experience, Rivian created Rivian Tech, a dedicated software and data organization. RV Tech builds the cloud infrastructure that supports vehicle telemetry, diagnostics, real-time monitoring, mobile notifications, and data-driven services.
In 2025, Rivian announced a strategic partnership with Volkswagen Group to create RV Tech, a new joint venture. The goal is to develop a shared software platform for both companies. Volkswagen will use this platform across multiple brands and vehicle types. This partnership puts RV Tech at the center of next-generation vehicle innovation.

RV Tech’s Data Platform for Electric Vehicles
Rivian’s data platform for electric vehicles (EV) processes hundreds of megabytes per second from over 150,000 connected vehicles. It handles tens of millions of requests and supports over 5,500 different types of telemetry signals. These numbers are growing as more vehicles go live.
The Challenge of Streaming Vehicle Telemetry at Scale
Every Rivian vehicle streams over 5,500 telemetry signals every five seconds. This creates a firehose of raw data. Only a small fraction of these signals are relevant for downstream use cases such as push notifications or anomaly detection. Still, every Flink job had to consume the entire Kafka topic, filtering out 99.9 percent of the data internally.
This caused several problems. Compute costs increased. Flink jobs competed for resources. Kafka clusters became harder to scale. Adding new pipelines meant even more duplication of filtering logic. Teams were hitting both cost and stability limits.
The situation was not sustainable.
Rivian’s Mega Filter: Shift Left Architecture to Reduce Traffic Volume by 88 Percent
Rivian vehicles stream over 5,500 signals every five seconds, but only some correlated information is needed for real-time use cases like push notifications or alerts. Without filtering and aggregations, downstream systems were flooded. Kafka topics ballooned, Flink jobs wasted resources, and EKS clusters scaled up unnecessarily.
To solve this, Rivian built a shift left architecture: Instead of pushing the full firehose through every Flink job, they introduced a stateful pre-filtering layer called Mega Filter.

The results are massive. Daily data volume dropped by 88 percent from 288 TB to 34 TB per day. Kafka stays lean. Flink jobs focus on logic, not filtering. EKS clusters need fewer nodes.
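As a quick sanity check, the quoted reduction can be recomputed from the two daily volumes given above. The short Python snippet below uses only those two figures; the variable names are illustrative.

```python
# Daily volumes quoted in the text above, in terabytes per day.
before_tb = 288
after_tb = 34

# Share of traffic removed by the pre-filtering layer, and share that remains.
reduction = (before_tb - after_tb) / before_tb
remaining = after_tb / before_tb

print(f"reduction: {reduction:.1%}")   # reduction: 88.2%
print(f"remaining: {remaining:.1%}")   # remaining: 11.8%
```

In other words, downstream Kafka topics and Flink jobs see roughly one eighth of the original volume.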

This is what a modern data streaming architecture looks like: efficient, adaptable, and built for scale. More importantly, data quality improves. Teams now work with relevant, well-defined telemetry from the beginning.
The Shift Left Architecture from Apps, Databases and IoT Telemetry to Lakehouse Analytics
The Shift Left pattern is explained in more depth in this blog post: The Shift Left Architecture: From Batch and Lakehouse to Real-Time Data Products with Data Streaming.

In Rivian’s case, it means filtering early, enriching early, and letting downstream teams focus on business logic instead of cleanup.
Rivian’s Mega Filter is built with Apache Flink and RocksDB. It reads from Kafka, uses metadata from a Kafka control topic and DynamoDB, and outputs a filtered stream to a separate Kafka topic. It updates in real time when teams add or remove signal specifications using a REST API.
The architecture is fully automated. New signals can be added on demand. Deprecated signals are removed with a simple cancel request. Flink state updates immediately to reflect the changes. The system remains flexible, even as signal definitions evolve over time.
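The idea of a filter whose signal list changes at runtime can be illustrated with a small plain-Python simulation. This is a sketch of the concept only, not Rivian's implementation: the class `SignalFilter`, the message fields (`action`, `spec_id`, `signals`), and the signal names are all hypothetical, and the real system runs on Flink with RocksDB state rather than an in-memory dictionary. In Flink, one common way to build this kind of dynamically configured filter is the broadcast state pattern; the source does not say whether Mega Filter uses it.

```python
from dataclasses import dataclass, field


@dataclass
class SignalFilter:
    """Keeps only signals that at least one downstream spec has requested."""
    # signal name -> set of spec ids that requested it (hypothetical layout)
    wanted: dict = field(default_factory=dict)

    def on_control(self, msg: dict) -> None:
        """Apply an 'add' or 'cancel' message from the control stream."""
        spec_id = msg["spec_id"]
        if msg["action"] == "add":
            for name in msg["signals"]:
                self.wanted.setdefault(name, set()).add(spec_id)
        elif msg["action"] == "cancel":
            # Drop the spec everywhere; forget signals nobody wants anymore.
            for name in list(self.wanted):
                self.wanted[name].discard(spec_id)
                if not self.wanted[name]:
                    del self.wanted[name]

    def on_telemetry(self, event: dict):
        """Return a trimmed copy of the event, or None if nothing is wanted."""
        kept = {k: v for k, v in event["signals"].items() if k in self.wanted}
        if not kept:
            return None
        return {"vehicle_id": event["vehicle_id"], "ts": event["ts"], "signals": kept}


f = SignalFilter()
f.on_control({"action": "add", "spec_id": "door-alert",
              "signals": ["door_open", "speed_kph"]})

raw = {"vehicle_id": "v1", "ts": 0,
       "signals": {"door_open": 1, "speed_kph": 12.0, "cabin_temp_c": 21.5}}
out = f.on_telemetry(raw)
print(out["signals"])        # {'door_open': 1, 'speed_kph': 12.0}

f.on_control({"action": "cancel", "spec_id": "door-alert"})
print(f.on_telemetry(raw))   # None
```

Tracking which spec requested each signal means a cancel request only removes a signal from the output once no remaining spec depends on it.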
Mega Filter is positioned directly before Event Watch, the core stream processing layer at Rivian.
Rivian’s Data Streaming Platform “Event Watch” for Focus on Business Logic with No Code
Event Watch is Rivian’s data streaming platform built on Kafka and Flink. It enables teams to focus on business logic using no-code and low-code tools. Over 120 Flink pipelines power features like:
- Mobile push notifications
- Geofence alerts
- Anomaly detection
- Vehicle activity monitoring
Events are processed with low latency and high reliability. Data is stored in Delta tables for analytics and served through time series databases for dashboards and alerts.
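To make one of these features concrete, here is a minimal sketch of how a geofence alert could be derived from a stream of positions. It is an illustration under stated assumptions, not Event Watch logic: the circular fence, the function names, and the sample coordinates are all hypothetical. Distance is computed with the standard haversine formula, and an alert is emitted only when a vehicle crosses the fence boundary.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def geofence_alerts(positions, center, radius_km):
    """Yield ('enter' | 'exit', ts) whenever the vehicle crosses the fence."""
    inside = None  # unknown until the first position arrives
    for ts, lat, lon in positions:
        now_inside = haversine_km(lat, lon, *center) <= radius_km
        if inside is not None and now_inside != inside:
            yield ("enter" if now_inside else "exit", ts)
        inside = now_inside


home = (40.0, -88.0)  # hypothetical fence center (lat, lon)
track = [(0, 40.000, -88.000), (5, 40.001, -88.000),
         (10, 40.050, -88.000), (15, 40.000, -88.000)]
alerts = list(geofence_alerts(track, home, radius_km=1.0))
print(alerts)   # [('exit', 10), ('enter', 15)]
```

Emitting only on transitions, rather than on every position inside the fence, keeps the notification volume low.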
The real-time data streaming architecture uses Kafka and Flink as the communication backbone between the vehicle and the cloud:

The example above shows a bidirectional event flow: signals are ingested from the car, processed and correlated in the backend, and messages are sent back to the vehicle in real time. For example, this ensures customers are billed only while the vehicle is actively charging.
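The charging example can be sketched as a small correlation over state-change events. This is a simplified illustration, not Rivian's billing logic: the state names (`plugged_in`, `charging`, `paused`, `complete`) and the function `billable_seconds` are assumptions made for the example.

```python
def billable_seconds(events):
    """Sum the time spent in the 'charging' state.

    events: iterable of (timestamp_seconds, state), sorted by timestamp.
    Each event closes the interval opened by the previous one.
    """
    total = 0
    prev_ts, prev_state = None, None
    for ts, state in events:
        if prev_state == "charging":
            total += ts - prev_ts
        prev_ts, prev_state = ts, state
    return total


session = [
    (0, "plugged_in"),    # connected, not yet drawing power
    (60, "charging"),
    (1860, "paused"),     # session interrupted
    (2000, "charging"),
    (2600, "complete"),
]
print(billable_seconds(session))   # 2400
```

Only the two intervals spent in the `charging` state (1,800 and 600 seconds) count toward the total; time spent plugged in or paused does not.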
From Raw Telemetry Signals to Curated Data Products for Real-Time Automotive Use Cases
Rivian’s streaming architecture supports more than 250 unique Kafka consumers, showing the value of Kafka as an event broker and the power of true decoupling through event-driven architecture.
The processing of raw telemetry from over 5,500 signals per vehicle every few seconds is the foundation of various use cases across many business units. Rivian filters, enriches, and transforms the telemetry data early in the pipeline to provide curated data products for business units. The result is clean, relevant event streams that can be consumed by many teams across the business.
Curated Data Products and Use Cases for 250 Unique Kafka Consumers
Each use case receives only the data it needs. Whether consumed via streaming, APIs, or batch, every stream is available as a curated data product. This reduces system load, eliminates redundant logic, and allows teams to focus on business outcomes instead of filtering or preprocessing.

Key real-time data consumers include:
- Fleet operations: Monitor vehicle usage and performance in logistics and commercial use cases
- Mobile applications: Deliver live vehicle updates and notifications to improve the end user experience
- Vehicle controls and dynamics: Tune driving systems using real-time feedback from telemetry
- Cybersecurity teams: Detect unusual behavior and protect against threats using live anomaly detection
- Connectivity teams: Analyze WiFi drop-offs and signal quality to improve vehicle connectivity
- Autonomy and ADAS: Use high-frequency telemetry to validate models and improve assisted driving features
- Service and diagnostics: Detect and resolve issues remotely to reduce vehicle downtime
- Factory and manufacturing: Track quality metrics and detect calibration issues during assembly and testing
- Safety and compliance: Monitor safety-critical systems and maintain real-time audit trails
- Privacy and governance: Enforce rules for data usage and geolocation in compliance with regulations
- Charging and energy management: Optimize charging behavior and predict charge duration based on live data
- Mapping and navigation: Use GPS and sensor data to improve routing, live maps, and trip planning
This approach turns raw, noisy telemetry into high-quality data products that drive business value across engineering, operations, and customer experience. Kafka and Flink provide the foundation for scalable, real-time communication between the car and the cloud. The event-driven architecture is built for flexibility, performance, and growth.
Architecture Evolution from Amazon Kinesis and Redshift to Apache Kafka, Flink and Druid
Rivian’s early architecture relied on Amazon Kinesis, Firehose, and Redshift to process and store telemetry data. Signals were streamed, written to Parquet files, and then loaded into Redshift for analysis. This worked for batch use cases but quickly ran into limits with real-time scale, flexibility, and cost.
As connected vehicle volume grew, the system became harder to manage. Latency was too high, pipelines were tightly coupled, and every new consumer added overhead. Apache Spark was evaluated but did not meet the low latency needs of always-on vehicle telemetry.
To address these challenges, Rivian transitioned to a modern event-driven architecture using Kafka and Flink, supported by Apache Druid for real-time analytics.

Today, the platform supports a wide range of consumption patterns, including streaming, APIs, batch, and AI tools, without duplicating processing or losing data quality. This shift enabled RV Tech to scale real-time use cases with a unified, future-ready data infrastructure.
Videos and Slides from Rivian’s P99 CONF and Current Talks
Rivian presented the use cases, architecture, and tech evolution in two excellent talks.
At Confluent’s Current 2025, Rupesh More and Guruguha Marur Sreenivasa gave a full overview of Rivian’s vehicle-to-cloud platform. They described how Apache Kafka and Flink form the foundation for streaming telemetry, powering everything from low-latency alerts to AI-driven services.
At P99 CONF, Marcus Kim and Saahil Khurana focused on performance. They showed how Mega Filter enables low-latency notifications without overloading the system. They explained how the filter protects Flink clusters from noisy neighbors and stabilizes session workloads.
Driving Business Value with Real-Time Data Streaming Using Kafka and Flink Across the Automotive Lifecycle
Rivian’s streaming platform powers real-time use cases across vehicle development, production, and operations. Each domain depends on accurate, fresh, and continuous data to deliver business value at scale:
- Service and diagnostics teams detect issues in the field to enable proactive maintenance, reducing repair costs and improving vehicle uptime for customers.
- Factory operations monitor vehicle quality during production to catch defects early, resulting in lower rework rates and higher manufacturing efficiency.
- Safety and compliance processes rely on real-time alerts and audit trails to ensure regulatory compliance and faster incident response, reducing legal and financial risk.
- Privacy teams govern how vehicle data is accessed and used, enforcing policies in real time to maintain customer trust and legal compliance.
- Charging and energy services optimize infrastructure and power use based on real-time telemetry, enabling better load balancing, lower energy costs, and improved charger availability.
- Mapping systems ingest live vehicle feedback to improve routing and road awareness, delivering better navigation and driver experience.
- Field reliability engineering uses real-world telemetry to continuously improve product design, helping reduce failure rates and accelerate hardware and software iterations.
All these use cases rely on a real-time data streaming platform to act on events as they happen, not hours or days later. This leads to faster decisions, better customer experience, and tighter operational control across the business.
From Raw Telemetry to Real-Time Use Cases in the Automotive Industry with Data Streaming
Rivian’s data streaming platform shows how to handle high-frequency telemetry in a way that scales across teams and systems. By filtering and shaping data early in the pipeline, engineering and business teams can work with clean, relevant streams without unnecessary overhead.
This architecture supports many real-time applications across the automotive lifecycle. It also enables shared platforms like RV Tech to serve multiple brands and vehicle types with consistent, high-quality data.
