CARIAD’s Unified Data Platform: A Data Streaming Automotive Success Story Behind Volkswagen’s Software-Defined Vehicles

The Automotive Industry Is Becoming a Software Industry

The industry is changing fast. Cars are no longer just machines. They are connected computers on wheels. The shift to the software-defined vehicle (SDV) is a huge transformation. It touches every part of the car, from autonomous driving and over-the-air updates to new in-car experiences and direct-to-consumer services. Drivers want more: real-time maps, driver assistance, entertainment, and constant improvements over the lifetime of their vehicle. Manufacturers want to deliver faster. They need platforms that support AI, cloud computing, and real-time analytics. This shift is not just technical. It is also cultural and organizational. Leaders like Volkswagen Group are not only adapting. They are setting the pace. This blog post introduces the CARIAD success story within the Volkswagen Group. It shows how data streaming with Apache Kafka and Flink acts as the central nervous system between connected vehicles and a multi-cloud IT landscape.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter, and follow me on LinkedIn or X (formerly Twitter) to stay in touch. You can also download my free book about data streaming use cases, including customer stories from the manufacturing and automotive industry.

Real-Time Data Streaming in Automotive and Manufacturing

Automotive and manufacturing innovation needs fast data and continuous processing of technical and business events. Not in batches. In real time.

That’s why companies are turning to event-driven architectures and data streaming. Apache Kafka and Apache Flink enable continuous streaming for transactional workloads, ETL and analytics.

A data streaming platform (DSP) connects, moves, and processes data from many sources in real time, including vehicles on the road, equipment in factories, and IT systems across the enterprise. This real-time streaming unlocks new use cases across development, production, and after-sales, from predictive maintenance and fleet monitoring to supply chain optimization and operational intelligence.

Data streaming technologies help meet regulatory requirements such as the EU Data Act and GDPR, as well as other frameworks like ISO 27001 and HIPAA. The EU Data Act gives users the right to access and share data generated by connected products, while GDPR focuses on protecting personal data and privacy. A data streaming platform supports governance, traceability, and secure sharing of sensitive data.

For more details about automotive use cases and success stories from Tesla, BMW, and others, see the full blog post: Driving the Future: How Real-Time Data Streaming is Powering Automotive Innovation.

Let’s explore the latest success story presented by Chetan Alatagi from CARIAD, a subsidiary of the Volkswagen Group, at Confluent’s Data Streaming World Tour 2025 in Frankfurt, Germany.

The Event-Driven Platform at the Volkswagen Group by CARIAD

To keep pace with the growing complexity of software-defined vehicles, CARIAD set out to build a unified, event-driven data platform for the entire Volkswagen Group. The goal: connect millions of vehicles, systems, and teams with one scalable foundation powered by real-time data streaming. The result is a major step forward in how automotive software is developed, deployed, and monetized at global scale.

Source: CARIAD

CARIAD: Software Powerhouse of Volkswagen Group

CARIAD is the software company of the Volkswagen Group. Its main focus is to develop a unified, scalable software stack for all Group brands that makes mobility safer, more sustainable, and more comfortable by bringing together automotive engineering and software development. CARIAD supports leading brands like Audi, Porsche, SEAT, and VW itself.

Over 45 million vehicles are already connected through CARIAD technology. In 2024 alone, 14 million new vehicles launched with CARIAD software. This scale demands both technical precision and operational excellence.

CARIAD drives the digital transformation of the group. It builds shared software platforms and tools to deliver better features, faster updates, and new digital services.

The Problem: Data Chaos and Silos

Before CARIAD’s transformation, data was scattered. Different teams had different systems. Testing data, customer data, and development data lived in silos.

Handling vehicle data was slow, expensive, and manual. This blocked innovation. It also made it hard to meet legal demands like traceability or compliance.

The company needed a unified solution.

The Solution: Unified Data Ecosystem (UDE)

CARIAD created the Unified Data Ecosystem (UDE). This is the single source of truth for all vehicle-related data, pre- and post-production.

Confluent serves as the central nervous system of the architecture. The data streaming platform enables event-driven data flow across the ecosystem. It integrates with solutions such as Databricks for analytics and Elasticsearch for search and log analysis.

The infrastructure runs in a global, multi-region setup. Serverless Confluent Cloud operates in North America and Europe, while a self-managed Confluent Platform is deployed in Mainland China.
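A multi-region setup like this implies a routing decision for every vehicle: which cluster may receive its data. The following is a minimal sketch of that idea in Python, not CARIAD's implementation. The market codes, cluster names, and the `cluster_for` helper are all hypothetical; only the split between Confluent Cloud (North America, Europe) and a self-managed deployment (Mainland China) comes from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str        # hypothetical label, not a real endpoint
    deployment: str  # "confluent-cloud" or "self-managed"


# Hypothetical market-to-cluster mapping. Only the deployment types
# per region are taken from the article; the names are made up.
CLUSTERS = {
    "na": Cluster("na-cluster", "confluent-cloud"),
    "eu": Cluster("eu-cluster", "confluent-cloud"),
    "cn": Cluster("cn-cluster", "self-managed"),
}


def cluster_for(market: str) -> Cluster:
    """Return the cluster that receives data from a vehicle's home market."""
    try:
        return CLUSTERS[market.lower()]
    except KeyError:
        # Fail loudly instead of falling back to a default region,
        # so data never lands in a region it was not meant for.
        raise ValueError(f"no cluster configured for market {market!r}") from None
```

Raising an error for unknown markets, rather than defaulting to one region, is a deliberate choice in this sketch: with regional data regulations in play, a silent fallback would be the riskier behavior.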

Apache Kafka and Flink are the core of the data platform. They ingest, transport, and process petabytes of vehicle data from development vehicles, test benches, and customer fleets. Everything flows through one event-driven platform.

Use cases range from simulation and reprocessing of test drives to fleet health monitoring and data monetization.

The Unified Platform: From Ingestion to Insight

CARIAD built an advanced architecture with:

  • Multi-region architecture to ensure high availability, low latency, and compliance with regional data regulations
  • Streaming pipelines that enrich, validate, and normalize data
  • External APIs to integrate vehicle data with any IT and SaaS application
  • Stream processing to power real-time use cases
  • Data sharing across multiple consumers, applications, and use cases
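To make the "enrich, validate, and normalize" step more concrete, here is a minimal sketch in plain Python. It is an illustration only: in production this logic would typically run in a stream processor such as Apache Flink, and the event fields (`vin`, `speed`, `unit`), the lookup table, and all function names are assumptions rather than CARIAD's schema.

```python
from typing import Iterable, Iterator

# Hypothetical reference data for enrichment (VIN prefix -> brand).
BRAND_BY_VIN_PREFIX = {"WAU": "Audi", "WP0": "Porsche", "WVW": "Volkswagen"}

MPH_TO_KMH = 1.609344


def validate(event: dict) -> bool:
    """Accept only events with a 17-character VIN, a numeric speed, and a known unit."""
    vin = event.get("vin")
    return (
        isinstance(vin, str)
        and len(vin) == 17
        and isinstance(event.get("speed"), (int, float))
        and event.get("unit") in ("km/h", "mph")
    )


def normalize(event: dict) -> dict:
    """Convert speed to km/h so downstream consumers see a single unit."""
    speed = event["speed"]
    if event["unit"] == "mph":
        speed = speed * MPH_TO_KMH
    return {"vin": event["vin"], "speed_kmh": round(speed, 1)}


def enrich(event: dict) -> dict:
    """Add the brand derived from the VIN prefix."""
    brand = BRAND_BY_VIN_PREFIX.get(event["vin"][:3], "unknown")
    return {**event, "brand": brand}


def process(events: Iterable[dict]) -> Iterator[dict]:
    """Validate, normalize, and enrich events one at a time."""
    for event in events:
        if not validate(event):
            # A real pipeline would likely route these to a dead-letter topic.
            continue
        yield enrich(normalize(event))
```

Because `process` is a generator, it handles events one by one as they arrive, which mirrors the continuous (not batch) processing model the article describes.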

The Kafka log acts as a central backbone in the architecture, enabling true decoupling between producers and consumers, handling backpressure, and supporting a wide range of scenarios: from real-time processing to batch analytics and reprocessing – all on the same foundation. Development data and customer data live in the same ecosystem, fully governed and secured. Consent management, legal compliance, and scalability were key design drivers.
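The decoupling and reprocessing described above follow from one property of a log: records stay in place, and each consumer group only moves its own read position. The sketch below models that property with a small in-memory class. It is a teaching aid under simplifying assumptions (one partition, no persistence, no retention), and `EventLog` and its methods are hypothetical names, not the Kafka client API.

```python
class EventLog:
    """Append-only in-memory log; each consumer group tracks its own offset."""

    def __init__(self) -> None:
        self._records: list = []
        self._offsets: dict[str, int] = {}

    def append(self, record) -> int:
        """Store a record and return its offset. Producers never wait for consumers."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, group: str, max_records: int = 10) -> list:
        """Return the next records for this group and advance only its offset."""
        start = self._offsets.get(group, 0)
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch

    def seek(self, group: str, offset: int) -> None:
        """Rewind (or skip) a group, for example to reprocess a test drive."""
        if not 0 <= offset <= len(self._records):
            raise ValueError(f"offset {offset} out of range")
        self._offsets[group] = offset


log = EventLog()
for i in range(5):
    log.append(f"event-{i}")

realtime = log.poll("realtime", max_records=2)  # a fast consumer reading small batches
analytics = log.poll("analytics")               # a batch consumer reading everything so far
log.seek("analytics", 0)                        # reprocessing: rewind without touching "realtime"
replayed = log.poll("analytics")
```

A slow consumer simply falls behind in its own offset, which is how the log absorbs backpressure without slowing producers or other consumers.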

Business Outcomes and Use Cases

With the Unified Data Ecosystem, CARIAD reduced data management costs. Teams now have faster access to quality data. That speeds up development, testing, and delivery.

Examples of use cases include:

  • Battery and Charging Analytics
  • ADAS Function Development
  • Predictive Maintenance
  • Anomaly Detection
  • Remote Diagnosis and Vehicle Health
  • EU Data Act Compliance
  • Fleet Data Services via Data Hub

CARIAD’s event-driven platform is NOT just about technology. It’s about delivering value to the business. CARIAD is helping Volkswagen Group stay competitive in a market where software defines the brand.

The CARIAD story offers a clear path: transform data chaos into a streamlined, governed, and scalable real-time platform. A modern Data Streaming Platform enables:

  • Fast access to quality data for all teams
  • Reduced operational costs through consolidation
  • Compliance with strict regulatory frameworks
  • New revenue streams through data products

The Future of CARIAD’s Event-Driven Architecture

CARIAD is evolving its platform to extract more value from real-time data and better unify operational and analytical systems:

  • Open table format integration with Apache Iceberg / Delta Lake leveraging Confluent Tableflow will unify data streaming with the lakehouse. This gives all teams better access to reliable, governed data across the end-to-end data pipeline.
  • Apache Kafka’s new queueing feature Queues for Kafka (QfK) will support task-based workloads, improving flexibility for background jobs and worker processes.
  • AI is being integrated into the data pipeline to improve data quality, automate streaming ETL tasks, and support real-time decision-making.
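The queueing item in the list above is easier to picture with a small model of queue semantics. In a classic Kafka consumer group, each partition is read by one consumer, so parallelism is bounded by the partition count. With queue semantics, workers take individual records, acknowledge them when done, and release them for redelivery if they fail. The sketch below illustrates only that behavior; `TaskQueue` and its methods are hypothetical and are not the Queues for Kafka API.

```python
from collections import deque
from typing import Optional


class TaskQueue:
    """Toy work queue: any worker can take the next task; failed tasks are redelivered."""

    def __init__(self, tasks) -> None:
        self._available = deque(tasks)
        self._in_flight: dict[str, str] = {}  # task -> worker currently holding it

    def acquire(self, worker: str) -> Optional[str]:
        """Hand the next available task to a worker, or None if nothing is waiting."""
        if not self._available:
            return None
        task = self._available.popleft()
        self._in_flight[task] = worker
        return task

    def ack(self, task: str) -> None:
        """Mark a task as successfully processed."""
        del self._in_flight[task]

    def release(self, task: str) -> None:
        """Give a task back (for example after a failure) so another worker can take it."""
        del self._in_flight[task]
        self._available.append(task)

    def done(self) -> bool:
        return not self._available and not self._in_flight
```

In this model, adding workers increases throughput without repartitioning, which is the flexibility that background jobs and worker processes usually need.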

CARIAD is shifting quality and governance earlier across both development and operations, following the shift-left architecture approach. This improves feedback loops, reduces risk, and accelerates delivery across all the various connected car use cases.

Driving Automotive Transformation with a Unified Data Strategy

The transformation at CARIAD highlights how real-time data streaming is a key enabler of innovation in both the automotive and manufacturing industries. With data streaming powered by Apache Kafka and Flink at the core, Volkswagen’s CARIAD built a unified, event-driven platform that connects vehicles, test systems, and cloud services in real time. This allows teams to streamline development, improve production processes, and deliver new digital services faster and more reliably.

Beyond just data integration, the platform supports compliance, reduces operational costs, unlocks new opportunities from predictive maintenance to data services, and serves as an enabler for Artificial Intelligence.

CARIAD’s success shows how a modern data streaming platform drives business value and keeps manufacturers competitive in a software-defined world.

To learn more about data streaming adoption in the automotive industry, check out my other articles featuring insights from OEMs like BMW and Tesla, suppliers like Michelin and Brose, and ecosystem players such as Virta in the EV charging space.

CARIAD, like many others in the automotive and industrial space, combines Apache Kafka with MQTT to connect millions of vehicles and edge devices — even in poor network conditions. Learn how this powerful combo enables real-time data flows for connected cars, machines, and robots in my blog series.

Kai Waehner

bridging the gap between technical innovation and business value for real-time data streaming, processing and analytics
