Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter, and follow me on LinkedIn or X (formerly Twitter) to stay in touch. And download my free book about data streaming use cases, including customer stories from the manufacturing and automotive industry.
The modern car is a rolling data center. It generates location data, sensor readings, engine diagnostics, driver behavior, and much more. Traditional batch processing and central data lakes are not fast enough to power connected car services. Automotive companies are now turning to real-time data streaming to support new business models, smarter mobility, and better customer engagement.
Real-time data streaming allows car manufacturers and mobility providers to act on vehicle data the moment it is generated, rather than hours or days later.
This transformation is explained in more detail in two related blog posts.
These blogs highlight how automakers are building streaming architectures with Kafka, Flink, and Confluent Cloud to support both operational decisions and strategic initiatives. Real-time data is used across production, charging, remote diagnostics, fleet management, and personalization.
Rivian is one of the leading examples of this shift. Their use of data streaming shows how to scale real-time intelligence across a connected fleet.
Rivian is a US-based electric vehicle manufacturer focused on adventure and sustainability. Its lineup includes pickup trucks, SUVs, and commercial vans. Every Rivian vehicle is software-defined, with always-on connectivity and frequent over-the-air updates.
To power the connected experience, Rivian created Rivian Tech, a dedicated software and data organization that builds the cloud infrastructure supporting vehicle telemetry, diagnostics, real-time monitoring, mobile notifications, and data-driven services.
In 2025, Rivian announced a strategic partnership with Volkswagen Group to create RV Tech, a new joint venture. The goal is to develop a shared software platform for both companies. Volkswagen will use this platform across multiple brands and vehicle types. This partnership puts RV Tech at the center of next-generation vehicle innovation.
Rivian’s data platform for electric vehicles (EV) processes hundreds of megabytes per second from over 150,000 connected vehicles. It handles tens of millions of requests and supports over 5,500 different types of telemetry signals. These numbers are growing as more vehicles go live.
Every Rivian vehicle streams over 5,500 telemetry signals every five seconds. This creates a firehose of raw data. Only a small fraction of these signals are relevant for downstream use cases such as push notifications or anomaly detection. Still, every Flink job had to consume the entire Kafka topic, filtering out 99.9 percent of the data internally.
This caused several problems. Compute costs increased. Flink jobs competed for resources. Kafka clusters became harder to scale. Adding new pipelines meant even more duplication of filtering logic. Teams were hitting both cost and stability limits.
The situation was not sustainable.
Rivian vehicles stream over 5,500 signals every five seconds, but only a small, correlated subset of that data is needed for real-time use cases like push notifications or alerts. Without filtering and aggregations, downstream systems were flooded. Kafka topics ballooned, Flink jobs wasted resources, and EKS clusters scaled up unnecessarily.
To solve this, Rivian built a Shift Left architecture: instead of pushing the full firehose through every Flink job, they introduced a stateful pre-filtering layer called Mega Filter.
The results are massive. Daily data volume dropped by 88 percent from 288 TB to 34 TB per day. Kafka stays lean. Flink jobs focus on logic, not filtering. EKS clusters need fewer nodes.
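The headline reduction is easy to verify with a quick back-of-the-envelope check of the daily figures above:

```python
# Verify the reported data-volume reduction after the Mega Filter rollout.
before_tb = 288  # TB/day entering the pipeline before pre-filtering
after_tb = 34    # TB/day after the Mega Filter

reduction = 1 - after_tb / before_tb
print(f"Reduction: {reduction:.1%}")  # → Reduction: 88.2%
```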
This is what a modern data streaming architecture looks like: efficient, adaptable, and built for scale. More importantly, data quality improves. Teams now work with relevant, well-defined telemetry from the beginning.
The Shift Left pattern is explained in more depth in this blog post: The Shift Left Architecture: From Batch and Lakehouse to Real-Time Data Products with Data Streaming.
In Rivian’s case, it means filtering early, enriching early, and letting downstream teams focus on business logic instead of cleanup.
Rivian’s Mega Filter is built with Apache Flink and RocksDB. It reads from Kafka, uses metadata from a Kafka control topic and DynamoDB, and outputs a filtered stream to a separate Kafka topic. It updates in real time when teams add or remove signal specifications using a REST API.
The architecture is fully automated. New signals can be added on demand. Deprecated signals are removed with a simple cancel request. Flink state updates immediately to reflect the changes. The system remains flexible, even as signal definitions evolve over time.
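The mechanics of such a dynamically configurable pre-filter can be sketched in a few lines. This is not Rivian's actual code: the real Mega Filter runs as an Apache Flink job with RocksDB-backed state and reads signal specifications from a Kafka control topic; the class and field names below are illustrative.

```python
# Minimal simulation of a stateful pre-filter in the spirit of Mega Filter:
# a control stream adds or cancels signal specifications at runtime, and
# only telemetry records matching an active specification pass downstream.

class MegaFilter:
    def __init__(self):
        self.active_signals = set()  # signal specs, updated while running

    def on_control_event(self, event):
        """Apply an add/cancel request arriving via the control topic."""
        if event["action"] == "add":
            self.active_signals.add(event["signal"])
        elif event["action"] == "cancel":
            self.active_signals.discard(event["signal"])

    def on_telemetry(self, record):
        """Forward only records some downstream consumer registered for."""
        if record["signal"] in self.active_signals:
            return record  # would be produced to the filtered Kafka topic
        return None        # dropped early, never reaching downstream jobs

# Example: only battery_soc is registered, so tire_pressure is dropped.
f = MegaFilter()
f.on_control_event({"action": "add", "signal": "battery_soc"})
assert f.on_telemetry({"signal": "battery_soc", "value": 81}) is not None
assert f.on_telemetry({"signal": "tire_pressure", "value": 2.4}) is None
```

In Flink itself this shape maps naturally onto the broadcast state pattern: the control topic is broadcast to all parallel filter instances, so a single cancel request takes effect across the whole job.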
Mega Filter is positioned directly before Event Watch, the core stream processing layer at Rivian.
Event Watch is Rivian’s data streaming platform built on Kafka and Flink. It enables teams to focus on business logic using no-code and low-code tools. Over 120 Flink pipelines power features across the connected vehicle experience.
Events are processed with low latency and high reliability. Data is stored in Delta tables for analytics and served through time series databases for dashboards and alerts.
The real-time data streaming architecture uses Kafka and Flink as the communication backbone between the vehicle and the cloud:
The example above shows an event flow that enables bidirectional data exchange: signals are ingested from the car, processed and correlated in the backend, and messages are sent back to the vehicle in real time. For example, it ensures customers are only charged while the vehicle is actively charging.
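The charging example boils down to a simple stream correlation. The sketch below is illustrative, not Rivian's implementation: it assumes a hypothetical ordered event stream where meter readings count toward the bill only between a charging-started and charging-stopped signal.

```python
# Hedged sketch of the charging correlation: billing is applied only while
# the vehicle reports an active charging session. Event names are invented.

def bill_charging_session(events):
    """Correlate charging-state signals with meter readings."""
    charging = False
    billed_kwh = 0.0
    for e in events:
        if e["type"] == "charging_started":
            charging = True
        elif e["type"] == "charging_stopped":
            charging = False
        elif e["type"] == "meter" and charging:
            billed_kwh += e["kwh"]
    return billed_kwh

stream = [
    {"type": "meter", "kwh": 1.0},   # not charging yet: not billed
    {"type": "charging_started"},
    {"type": "meter", "kwh": 5.0},   # active session: billed
    {"type": "charging_stopped"},
    {"type": "meter", "kwh": 2.0},   # session over: not billed
]
assert bill_charging_session(stream) == 5.0
```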
Rivian’s streaming architecture supports more than 250 unique Kafka consumers, showing the value of Kafka as an event broker and the power of true decoupling through event-driven architecture.
The processing of raw telemetry from over 5,500 signals per vehicle every few seconds is the foundation of various use cases across many business units. Rivian filters, enriches, and transforms the telemetry data early in the pipeline to provide curated data products for business units. The result is clean, relevant event streams that can be consumed by many teams across the business.
Each use case receives only the data it needs. Whether consumed via streaming, APIs, or batch, every stream is available as a curated data product. This reduces system load, eliminates redundant logic, and allows teams to focus on business outcomes instead of filtering or preprocessing.
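The "each use case receives only the data it needs" idea can be sketched as a routing table from signals to data products. This is an illustrative toy, not Rivian's code; the product names and signal names are hypothetical.

```python
# Illustrative sketch: one filtered telemetry stream fans out into
# per-use-case data products, each carrying only registered signals.

SUBSCRIPTIONS = {            # hypothetical data-product subscriptions
    "push_notifications": {"door_open", "charge_complete"},
    "anomaly_detection":  {"battery_temp", "motor_current"},
}

def route(record):
    """Return the data-product topics that should receive this record."""
    return [product for product, signals in SUBSCRIPTIONS.items()
            if record["signal"] in signals]

assert route({"signal": "charge_complete"}) == ["push_notifications"]
assert route({"signal": "battery_temp"}) == ["anomaly_detection"]
assert route({"signal": "wiper_speed"}) == []   # no consumer needs it
```

Because routing happens once, upstream, adding a new consumer is a subscription change rather than yet another full-firehose Flink job.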
Key real-time data consumers span production, charging, remote diagnostics, fleet management, and personalization.
This approach turns raw, noisy telemetry into high-quality data products that drive business value across engineering, operations, and customer experience. Kafka and Flink provide the foundation for scalable, real-time communication between the car and the cloud. The event-driven architecture is built for flexibility, performance, and growth.
Rivian’s early architecture relied on Amazon Kinesis, Firehose, and Redshift to process and store telemetry data. Signals were streamed, written to Parquet files, and then loaded into Redshift for analysis. This worked for batch use cases but quickly ran into limits with real-time scale, flexibility, and cost.
As connected vehicle volume grew, the system became harder to manage. Latency was too high, pipelines were tightly coupled, and every new consumer added overhead. Apache Spark was evaluated but did not meet the low latency needs of always-on vehicle telemetry.
To address these challenges, Rivian transitioned to a modern event-driven architecture using Kafka and Flink, supported by Apache Druid for real-time analytics.
Today, the platform supports a wide range of consumption patterns, including streaming, APIs, batch, and AI tools, without duplicating processing or losing data quality. This shift enabled RV Tech to scale real-time use cases with a unified, future-ready data infrastructure.
Rivian presented the use cases, architecture, and tech evolution in two excellent talks.
At Confluent’s Current 2025, Rupesh More and Guruguha Marur Sreenivasa gave a full overview of Rivian’s vehicle-to-cloud platform. They described how Apache Kafka and Flink form the foundation for streaming telemetry, powering everything from low-latency alerts to AI-driven services.
At P99 CONF, Marcus Kim and Saahil Khurana focused on performance. They showed how Mega Filter enables low-latency notifications without overloading the system. They explained how the filter protects Flink clusters from noisy neighbors and stabilizes session workloads.
Rivian’s streaming platform powers real-time use cases across vehicle development, production, and operations. Each domain depends on accurate, fresh, and continuous data to deliver business value at scale.
All these use cases rely on a real-time data streaming platform to act on events as they happen, not hours or days later. This leads to faster decisions, better customer experience, and tighter operational control across the business.
Rivian’s data streaming platform shows how to handle high-frequency telemetry in a way that scales across teams and systems. By filtering and shaping data early in the pipeline, engineering and business teams can work with clean, relevant streams without unnecessary overhead.
This architecture supports many real-time applications across the automotive lifecycle. It also enables shared platforms like RV Tech to serve multiple brands and vehicle types with consistent, high-quality data.