
The Rise of Diskless Kafka: Rethinking Brokers, Storage, and the Kafka Protocol

Apache Kafka has come a long way from being just a scalable data ingestion layer for data lakes. Today, it’s the backbone of real-time transactional applications. In many organizations, Kafka serves as the central nervous system that connects both operational and analytical workloads. Over time, the architecture has shifted significantly: from brokers managing all storage, to Tiered Storage, and now toward a new paradigm called Diskless Kafka. Diskless Kafka refers to a Kafka architecture in which brokers use no local disk storage. Instead, all event data is stored directly in cloud object storage such as Amazon S3, Google Cloud Storage, or Azure Blob Storage.

This shift redefines Kafka’s role: not just a messaging platform, but a scalable, long-term, and cost-efficient storage layer for event-driven architectures. This post explores that journey, the business value behind it, and what it means to operate Kafka without brokers.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter, and follow me on LinkedIn or X (formerly Twitter) to stay in touch. And make sure to download my free book about data streaming use cases, including various Kafka architectures and best practices.

Kafka Protocol vs. Apache Kafka Open Source Framework

Kafka is now more than just an open-source project. It has become the de facto standard protocol for streaming data. Many companies still use open-source Apache Kafka or solutions built on top of it. However, others are adopting Kafka-compatible services and products that separate the protocol from traditional broker and storage infrastructure.

This approach enables producers and consumers to continue using Kafka’s familiar APIs while relying on alternative storage solutions behind the scenes. In this new world, Kafka brokers may no longer be required for certain workloads.

As outlined in the Data Streaming Landscape, the Kafka protocol has become the foundation of modern data streaming platforms leveraging an event-driven architecture. And as storage and retrieval methods evolve, the focus shifts from infrastructure management to protocol consistency.

Some of these innovations eventually return to the open-source project. Diskless Kafka, for instance, might be added to Apache Kafka. Several KIPs are under discussion to evolve Kafka’s storage model:

  • Slack’s KIP-1176 proposes fast-tiering by offloading active log segments to cloud storage like S3, reducing cross-AZ replication traffic while keeping Kafka’s core architecture intact.
  • Aiven’s KIP-1150 introduces diskless topics, but requires deeper architectural changes and still faces design challenges.
  • AutoMQ’s KIP-1183 aims to support their proprietary storage backend but is still too vendor-specific to gain traction in its current form.

All three KIPs reflect growing momentum to modernize Kafka’s storage. Yet they also show how complex and long the path to adoption can be.

But let’s take a step back first.

Tiered Storage: The First Step Toward Cost-Efficient Kafka

The introduction of Tiered Storage marked a turning point in Kafka’s evolution. It separates short-term and long-term storage by allowing Kafka to offload older data from local disks to object storage.
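
To make that local/remote split concrete, here is a minimal sketch using Kafka’s Java AdminClient. It assumes a Kafka 3.6+ cluster with Tiered Storage enabled on the brokers (remote.log.storage.system.enable=true); the topic name, bootstrap address, and retention values are illustrative, not recommendations.

```java
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Map;
import java.util.Properties;

public class TieredStorageTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address

        try (Admin admin = Admin.create(props)) {
            // Keep one hour on local broker disks; retain everything else in object storage.
            NewTopic orders = new NewTopic("orders", 6, (short) 3).configs(Map.of(
                    "remote.storage.enable", "true", // offload closed log segments (Kafka 3.6+)
                    "local.retention.ms", "3600000", // 1 hour on local disk
                    "retention.ms", "-1"             // unlimited total retention
            ));
            admin.createTopics(List.of(orders)).all().get();
        }
    }
}
```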

Business Value of Tiered Storage for Apache Kafka:

  • Cost Reduction: Older data is stored on cheaper services like Amazon S3, Google Cloud Storage, or Azure Blob, rather than expensive local disks.
  • Improved Scalability: Brokers only manage recent data with low-latency needs. Historical data is fetched directly from object storage when needed.
  • Long-Term Retention: Kafka becomes a permanent store for event data, not just a transient buffer. This unlocks use cases such as event sourcing, reprocessing historical data, and model training (see the replay sketch after this list).
  • Simplified Operations: Scaling Kafka clusters becomes easier. There’s no need to shuffle large volumes of data when resizing the cluster.
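
As a sketch of the long-term retention point above: with Tiered Storage, historical reads use the exact same consumer API, and Kafka transparently fetches older segments from object storage. The topic name, bootstrap address, and the 30-day window are illustrative assumptions.

```java
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
import org.apache.kafka.common.TopicPartition;

import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

public class HistoricalReplay {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("group.id", "replay-job");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            List<TopicPartition> partitions = new ArrayList<>();
            consumer.partitionsFor("orders").forEach(p ->
                    partitions.add(new TopicPartition("orders", p.partition())));
            consumer.assign(partitions);

            // Find the first offset at or after "30 days ago" on every partition.
            long startTs = Instant.now().minus(Duration.ofDays(30)).toEpochMilli();
            Map<TopicPartition, Long> query = new HashMap<>();
            partitions.forEach(tp -> query.put(tp, startTs));

            Map<TopicPartition, OffsetAndTimestamp> offsets = consumer.offsetsForTimes(query);
            offsets.forEach((tp, ot) -> {
                if (ot != null) consumer.seek(tp, ot.offset()); // older segments come from object storage
            });

            consumer.poll(Duration.ofSeconds(5)).forEach(record ->
                    System.out.printf("%d: %s%n", record.offset(), record.value()));
        }
    }
}
```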

Tiered Storage helped many organizations lower their total cost of ownership while expanding the functional value of Kafka. But the journey didn’t stop there.

Tiered Storage started as a proprietary feature and is now available through an open interface in Apache Kafka. “Why Tiered Storage for Apache Kafka is a BIG THING” explores the evolution and concepts in more detail.

Diskless Kafka: The Next Evolution without Brokers

The next stage is more radical: Diskless Kafka.

In this model, Kafka brokers disappear completely. Producers and consumers still interact using the Kafka protocol, but the storage and control plane are entirely reimagined.

How It Works:

  • Events are published and consumed using the Kafka protocol.
  • All data is stored directly in object storage like S3 or GCS.
  • A lightweight control plane manages metadata and offsets.
  • There are no brokers involved in data storage or transport.

This approach removes the operational burden of managing Kafka brokers while maintaining API compatibility. It changes the game.
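
Because compatibility lives at the protocol level, the client code does not change at all. Here is a minimal sketch of a standard Java producer that works against a classic broker cluster or a Kafka-compatible diskless service alike; the bootstrap address is a placeholder, and security settings are omitted for brevity.

```java
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

import java.util.Properties;

public class DisklessProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder endpoint: point this at any Kafka-compatible service,
        // including a diskless one that persists directly to object storage.
        props.put("bootstrap.servers", "diskless-kafka.example.com:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all"); // acknowledged once the service has made the data durable

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("sensor-events", "machine-42", "{\"temperature\": 71.3}"));
            producer.flush();
        }
    }
}
```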

WarpStream explained as early as May 2024 why Diskless Kafka is better for the end user when it showcased its architecture.

(Architecture diagram source: WarpStream)

Real-World Implementations of Diskless Kafka

Companies are already pioneering brokerless Kafka models. Some have operated them in production for several quarters. Others are just getting started or are new startups focused entirely on this architecture.

WarpStream (BYOC Kafka)

WarpStream offers a Kafka API-compatible solution without brokers, relying fully on object storage. Deployed directly in a customer’s cloud account, it dramatically lowers infrastructure and operational costs.

WarpStream also emphasizes security and a zero-trust architecture based on the Bring Your Own Cloud (BYOC) concept, allowing deployments within private environments such as a customer’s VPC or on-premises infrastructure. Learn more in their documentation.

I explored the benefits of Bring Your Own Cloud (BYOC) in a dedicated blog post.

Confluent Freight (Serverless Kafka)

Confluent has implemented this architecture within its serverless Confluent Cloud. By separating compute and storage, customers get near-infinite scalability and pay only for what they use. In some cases, this has led to up to 90% cost reduction compared to traditional clusters.

Many more: Aiven, Buf, AutoMQ, et al.

The ecosystem is growing fast, with differentiation emerging through architecture, cost models, and security approaches.

Meanwhile, more startups are entering this space. Some, like Buf or AutoMQ, already offer Kafka-compatible services built entirely on object storage, while others are just beginning to explore diskless Kafka implementations.

Aiven created KIP-1150: Diskless Topics to bring brokerless Kafka into the open source framework, following the same collaborative approach seen with Tiered Storage.
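
As currently drafted, KIP-1150 proposes enabling diskless behavior per topic through a topic-level configuration. The sketch below assumes the draft’s diskless.enable flag; this flag is not part of Apache Kafka today, and its name and semantics may well change before (or if) the proposal is adopted, so treat it purely as an illustration.

```java
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.NewTopic;

import java.util.List;
import java.util.Map;
import java.util.Properties;

public class DisklessTopicSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address

        try (Admin admin = Admin.create(props)) {
            // "diskless.enable" is the per-topic flag from the KIP-1150 draft.
            // It is NOT available in Apache Kafka today and is shown only as an illustration.
            NewTopic clickstream = new NewTopic("clickstream", 12, (short) 1)
                    .configs(Map.of("diskless.enable", "true"));
            admin.createTopics(List.of(clickstream)).all().get();
        }
    }
}
```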

The Business Value of Diskless Kafka

Object Store-Only Kafka without the need for brokers brings tangible benefits:

  • Cost Savings: Brokers demand expensive compute and storage. Object storage is cheaper, resilient, and scales effortlessly.
  • Elastic Scaling: There’s no need to manually size and rebalance clusters. Storage scales automatically with usage.
  • Operational Simplicity: Without brokers, there’s no ZooKeeper, no KRaft, and no rebalancing. Metadata services are managed or abstracted, reducing the need for internal expertise.

When to Use Diskless Kafka?

This architecture is NOT for every use case. It’s most suitable when latency requirements are moderate and workloads are centered around analytics or historical processing.

Diskless Kafka is ideal for:

  • Streaming use cases with latency needs above a few hundred milliseconds, like observability and log aggregation.
  • Event-driven near real-time and batch data ingestion pipelines for analytics and AI/ML training.
  • Use cases involving long-term retention, compliance, or auditability.
  • Multi-region data storage and disaster recovery.

The last point is particularly noteworthy. Diskless Kafka is not limited to analytical workloads. Because object storage services replicate data across availability zones and provide very high durability, this model can support strict durability and consistency guarantees, making it a strong fit even for critical operational and transactional applications. My article “Multi-Region Kafka using Synchronous Replication for Disaster Recovery with Zero Data Loss (RPO=0)” explores the WarpStream implementation for this scenario.

Diskless Kafka is NOT ideal for:

  • Environments where object storage is not available (e.g., at the edge).
  • Low-latency applications where rapid end-to-end processing is critical. In such cases, in-memory or edge processing architectures are better suited than diskless Kafka.

Latency is the main trade-off. In summary, if you don’t need very low latency and have access to object storage, Diskless Kafka might be the better choice from a value and TCO perspective.

When talking about low latency, also keep in mind that Kafka and similar competing technologies were never built for hard real-time, deterministic, safety-critical systems. Use cases such as robotics or autonomous systems are built as embedded systems with programming languages like C or Rust. Kafka is great for connecting these systems to the rest of the IT infrastructure with low, millisecond-level latency.

Always define what ‘real-time’ means for your use case. From a latency perspective, Diskless Kafka is sufficient for most scenarios.
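
When a few hundred milliseconds of latency are acceptable, it usually pays to tune producers for throughput rather than immediacy, since diskless implementations batch writes into object storage anyway. Here is a hedged sketch of the typical knobs; the values are illustrative starting points, not recommendations.

```java
import org.apache.kafka.clients.producer.KafkaProducer;

import java.util.Properties;

public class ThroughputTunedProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Throughput-oriented settings for latency-tolerant pipelines (illustrative values):
        props.put("linger.ms", "100");         // wait up to 100 ms to form larger batches
        props.put("batch.size", "1048576");    // 1 MiB batches amortize per-request overhead
        props.put("compression.type", "zstd"); // fewer bytes over the network and into object storage
        props.put("acks", "all");              // prioritize durability over the last few milliseconds

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // ... send records as usual; the client API does not change.
        }
    }
}
```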

Optimizing Kafka Workloads with a Multi-Cluster Strategy

Most organizations won’t replace Kafka brokers entirely. Instead, they will adopt a multi-cluster strategy to align architecture with workload requirements:

  • Kafka with Brokers: Ideal for real-time applications and edge deployments where object stores aren’t available.
  • Tiered Storage Kafka: Balances performance and cost for general-purpose workloads.
  • Object Store-Only Kafka: Best for cost-efficient scalability, durability, and long-term storage use cases.

Enterprise architectures with multiple Kafka clusters are becoming the standard, not the exception! Organizations run multiple clusters optimized for specific use cases, all unified by the Kafka protocol. This enables seamless integration and consistent tooling. In my blog “Apache Kafka Cluster Type Deployment Strategies”, I explored various deployment scenarios such as multi-cloud, hybrid, disaster recovery, aggregation, edge, and more.

Whether using fully managed offerings like Confluent Cloud, brokerless alternatives like WarpStream, or hybrid deployments, teams can align infrastructure choices with their latency, cost, and scalability goals.

Kafka’s Future: Protocol First

The shift to diskless Kafka is more than a technical evolution. It’s a strategic transformation. Kafka’s core value is moving away from broker infrastructure toward protocol standardization. The protocol has become the foundation that unifies real-time and historical processing, regardless of the underlying storage or compute architecture.

Kafka brokers and Object Store-Only Kafka deployments will coexist. This flexibility in storage backends allows organizations to support a wide range of workloads – operational, analytical, real-time, and historical – while maintaining one consistent protocol. Managed services will continue to dominate due to their ability to reduce operational complexity, and hybrid or edge deployments will become more common in industries like manufacturing, automotive, and energy.

Startups are pushing the boundaries with Kafka-compatible solutions that bypass traditional brokers entirely. At the same time, Kafka contributors are advancing efforts to modernize storage through multiple competing KIPs for diskless Apache Kafka. KIP-1150 from Aiven proposes diskless Kafka, KIP-1176 from Slack introduces fast-tiering via cloud-based Write-Ahead Log (WAL), and KIP-1183 from AutoMQ outlines a vendor-specific approach to shared storage. While each proposal targets similar goals – decoupling Kafka from local disks – they take different technical paths, adding to the complexity and extending the timeline for consensus and adoption.

Still, this diversity of approaches highlights a broader shift: Kafka is evolving from a tightly coupled broker-based system toward a protocol-centric architecture. Recognizing all three proposals offers a more balanced view of this transition, even if the community ultimately consolidates around one direction.

Companies that embrace this shift to Diskless Kafka will benefit from lower infrastructure costs, easier operations, and highly scalable streaming platforms. All of this comes without sacrificing compatibility or vendor neutrality – thanks to the Kafka protocol-first approach.
