The State of Data Streaming for Retail in 2023

This blog post explores the state of data streaming for the retail industry in 2023. The evolution of omnichannel customer experiences, hybrid shopping models, and hyper-personalized recommendations requires an optimized end-to-end supply chain, fancy mobile apps, and integration with pioneering technologies like social commerce or metaverse. Data streaming allows integrating and correlating data in real time at any scale. I look at retail trends to explore how data streaming helps as a business enabler, including customer stories from Walmart, Albertsons, Otto, and more. A complete slide deck and on-demand video recording are included.

Several disruptive trends drive innovation in the retail industry, with the goals of reducing costs, improving the customer experience, and keeping customer retention and revenue high:

Disruptive Trends in Retail for Data Streaming

Researchers, analysts, startups, and last but not least, labs and the first real-world rollouts of traditional players show a few upcoming trends in the retail industry:

  • Hybrid shopping models with digitalization and omnichannel (see a recent Gartner webinar)
  • Generative AI and automation to improve existing business processes and innovation (as discussed in an article by McKinsey)
  • Live commerce with social platforms changes the shopping experience (even beyond China and Asia, as an analysis by Grand View Research shows)

Let’s explore the goals and impact of these trends.

Hybrid shopping models with digitalization and omnichannel

Capabilities for omnichannel retail change and improve the customer experience significantly. Mobile apps enable seamless hybrid and location-based in-store experiences. Customers leverage options like “buy online and pick up in the store” more and more:

Gartner - Hybrid Shopping Models
Source: Gartner

Generative AI and automation for innovation

Generative AI and automation help retailers become more productive, get to market faster, and serve customers better. The McKinsey article explores various use cases for technologies like NLP (Natural Language Processing) with Machine Learning and Large Language Models (LLM) like ChatGPT, including:

  • Merchandising and Product: Customizing
  • Supply Chain and Logistics: Support negotiations with suppliers
  • Marketing: Generate personalized offers
  • Digital commerce: Tailor virtual product try-on
  • Store operations: Optimize store layout through simulations
  • Organization: Enable self-serve and automate support tasks

Live commerce changing the shopping experience

Live commerce with social platforms changes the shopping experience by combining instant purchasing of a featured product and audience participation. The COVID-19 pandemic sped up this trend. Live commerce emerged in China but has arrived in the West across industries, whether you sell fashion, toys, cars, digital features, or anything else. A chart by Grand View Research shows the growth of social commerce in North America:

Grand View Research - Social Commerce Market North America
Source: Grand View Research

I explored some time ago how Apache Kafka transforms the retail and shopping metaverse. Let’s look at how data streaming with technologies like Kafka and Flink relates to the retail industry.

Data streaming in the retail industry

Adopting trends like hybrid shopping models, location-based services or advanced loyalty platforms is only possible if enterprises in the retail industry can provide and correlate information at the right time in the proper context. Real-time, which means using the information in milliseconds, seconds, or minutes, is almost always better than processing data later (whatever later means):

Real Time Data Streaming in Retail

Data streaming combines the power of real-time messaging at any scale with storage for true decoupling, data integration, and data correlation capabilities. Apache Kafka is the de facto standard for data streaming.
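To make the idea of "messaging plus storage for true decoupling" more concrete, here is a minimal in-memory sketch. It is not the Kafka API: the `EventLog` and `Consumer` classes, the service names, and the order events are hypothetical and only illustrate how an append-only log lets each consumer read at its own pace.

```python
from dataclasses import dataclass, field


@dataclass
class EventLog:
    """Toy in-memory stand-in for one topic partition (not the Kafka API)."""
    _events: list = field(default_factory=list)

    def append(self, event: dict) -> int:
        """Store the event and return its offset; stored events are never modified."""
        self._events.append(event)
        return len(self._events) - 1

    def read(self, offset: int, max_events: int = 100) -> list:
        """Return up to max_events events starting at the given offset."""
        return self._events[offset:offset + max_events]


@dataclass
class Consumer:
    """Each consumer tracks its own offset, so consumers never block each other."""
    name: str
    log: EventLog
    offset: int = 0

    def poll(self, max_events: int = 100) -> list:
        batch = self.log.read(self.offset, max_events)
        self.offset += len(batch)
        return batch


log = EventLog()
for sku, qty in [("shirt-1", 2), ("shoe-7", 1), ("shirt-1", 1)]:
    log.append({"type": "order", "sku": sku, "qty": qty})

inventory = Consumer("inventory-service", log)  # hypothetical real-time consumer
analytics = Consumer("analytics-job", log)      # hypothetical slower consumer

realtime_batch = inventory.poll()           # reads all three events right away
slow_batch = analytics.poll(max_events=2)   # reads only two events for now
late_batch = analytics.poll()               # catches up later without data loss
```

Because the log keeps the events after they are read, the slower consumer catches up later without the producer or the faster consumer having to wait for it.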

“Use Cases for Apache Kafka in Retail” is a good article for starting with an industry-specific point of view on data streaming.

This is just one example. Data streaming with the Apache Kafka ecosystem and cloud services is used throughout the supply chain of the retail industry. Search my blog for various articles related to this topic.

Cloud adoption in retail as the foundation for innovation with data streaming

Forrester analyzed the cloud adoption in the retail industry in their research about The State Of Cloud In Retail, 2023. The results are impressive:

Forrester - The State of Cloud in Retail 2023
Source: Forrester

The cloud provides elastic scalability and shorter time-to-market cycles for innovation. Building new real-time applications is much easier in the cloud because the data streaming infrastructure is available as fully managed SaaS with critical SLAs.

Nuuly’s innovative clothing rental subscription service is a great example. It differs greatly from a typical e-commerce model and needs a real-time event-driven architecture. Nuuly uses Confluent Cloud and Kafka as the central nervous system of its business, spanning everything from customer-facing applications to distribution center operations. The entire business case was developed and brought to production in just six months because the fully managed data streaming SaaS let the team focus on business logic.

Software is eating retail, and real-time data enables innovation

CB Insights explored various use cases that optimize the retail supply chain or improve the customer experience and retention:

CB Insights - Software is Eating Retail

If you look at the architecture trends and customer stories for data streaming in the next section, you realize that real-time data integration and processing at scale is required to provide most modern retail use cases.

The retail industry applies various trends for enterprise architectures for cost, flexibility, security, and latency reasons. The three major topics I see these days at customers are:

  • Edge data synchronization to the cloud in real-time
  • Omnichannel up-/cross-selling
  • New retail concepts and strategies like augmented reality, live commerce, or metaverse

Let’s look deeper into some enterprise architectures that leverage data streaming for retail use cases.

Hybrid architecture with data streaming at the edge in retail store and cloud

Most retailers have a cloud-first strategy to set up modern e-commerce, CRM, marketing, loyalty, and payment platforms. However, edge computing gets more relevant for use cases like location-based services, hybrid shopping models, and other real-time analytics scenarios:

Hybrid Edge to Global Retail Architecture with Apache Kafka

Learn about architecture patterns for Apache Kafka that may require multi-cluster solutions and see real-world examples with their specific requirements and trade-offs. That blog explores scenarios such as disaster recovery, aggregation for analytics, cloud migration, mission-critical stretched deployments, and global Kafka.

Edge deployments for data streaming come with their own challenges. In separate blog posts, I covered use cases for Kafka at the edge and provided an infrastructure checklist for edge data streaming.
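One recurring requirement behind the hybrid edge-to-cloud pattern is that a store keeps working when its internet link is down and synchronizes once the link returns. The following minimal sketch shows that store-and-forward idea. The `EdgeBuffer` and `FakeCloud` classes, the store ID, and the event fields are hypothetical; a real deployment would typically rely on a Kafka cluster at the edge plus a replication tool instead of hand-written buffering.

```python
from collections import deque


class EdgeBuffer:
    """Buffers events in the store and forwards them when the cloud link is up."""

    def __init__(self, send_to_cloud):
        self._pending = deque()
        self._send = send_to_cloud  # hypothetical callable; returns True on success

    def record(self, event: dict) -> None:
        self._pending.append(event)

    def flush(self) -> int:
        """Forward pending events in order; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break  # link is down: keep the event and retry on the next flush
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def backlog(self) -> int:
        return len(self._pending)


class FakeCloud:
    """Stand-in for the cloud endpoint so the sketch runs without a network."""

    def __init__(self):
        self.online = False
        self.received = []

    def send(self, event: dict) -> bool:
        if not self.online:
            return False
        self.received.append(event)
        return True


cloud = FakeCloud()
store = EdgeBuffer(cloud.send)
store.record({"store": "berlin-01", "type": "pos_sale", "sku": "shoe-7"})
store.record({"store": "berlin-01", "type": "pos_sale", "sku": "shirt-1"})

sent_offline = store.flush()  # link down: nothing leaves the store
cloud.online = True
sent_online = store.flush()   # link restored: the backlog drains in order
```

An event is removed from the buffer only after a successful send, so nothing is lost during an outage. The trade-off is that a lost acknowledgment could lead to a duplicate delivery, which the cloud side would need to tolerate.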

Hyper-personalized customer experience

Customers expect a great customer experience across devices (like a web browser or mobile app) and human interactions (e.g., in a retail store). Data streaming enables a context-specific omnichannel retail experience by correlating real-time and historical data at the right time in the proper context:

Context-specific Omnichannel Retail Experience with Data Streaming

“Omnichannel Retail and Customer 360 in Real Time with Apache Kafka” goes into more detail. But one thing is clear: Most innovative use cases require both historical and real-time data. Data streaming makes this correlation possible out-of-the-box because of the underlying append-only commit log and the replayability of events. A cloud-native Kafka infrastructure with Tiered Storage, which separates compute from storage, makes such an enterprise architecture more scalable and cost-efficient.
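As a rough illustration of correlating historical and real-time data, the sketch below replays a small purchase history to rebuild a customer profile and then enriches a live event with it. All event fields, customer IDs, and the `replay` and `enrich` helpers are hypothetical. In practice, a stream processing engine such as Kafka Streams or Flink would maintain this kind of state.

```python
from collections import defaultdict

# Hypothetical purchase history, in the order it was appended to the log.
history = [
    {"customer": "c1", "type": "purchase", "category": "running"},
    {"customer": "c2", "type": "purchase", "category": "kitchen"},
    {"customer": "c1", "type": "purchase", "category": "running"},
    {"customer": "c1", "type": "purchase", "category": "outdoor"},
]


def replay(events):
    """Rebuild per-customer category counts by replaying the log from the start."""
    profile = defaultdict(lambda: defaultdict(int))
    for event in events:
        if event["type"] == "purchase":
            profile[event["customer"]][event["category"]] += 1
    return profile


def enrich(live_event, profile):
    """Attach the customer's most frequently bought category to a live event."""
    counts = profile.get(live_event["customer"], {})
    favorite = max(counts, key=counts.get) if counts else None
    return {**live_event, "favorite_category": favorite}


profile = replay(history)

# A live event arrives while the customer is in the store.
live = {"customer": "c1", "type": "store_entered", "store": "berlin-01"}
enriched = enrich(live, profile)

# A customer without history still gets a well-formed result.
unknown = enrich({"customer": "c9", "type": "store_entered", "store": "berlin-01"}, profile)
```

Because the profile is derived purely from the log, it can be rebuilt at any time, or rebuilt with different logic, by replaying the same events again.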

The article “Fraud Detection with Apache Kafka, KSQL and Apache Flink” explores stream processing for real-time analytics in more detail, shows an example with embedded machine learning, and covers several real-world case studies.

Live commerce with social platforms and data streaming

Live commerce requires a great customer experience end to end. Most actions and data correlations should or even have to happen in real time. Data correlation requires connectivity to the social platforms, the live commerce sales platform, and many other backend processes and applications.

Social commerce requires the right action at the right time. Requirements include:

  • Interact with the customer during the show.
  • Recommend products that need to be sold.
  • Provide context-specific pricing.
  • All automated. In real-time. At scale.
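The requirements above can be sketched as a single decision function that reacts to one viewer event during a show. This is a simplified illustration under stated assumptions: the `pick_offer` function, the stock and price tables, the loyalty discount, and the rule "recommend the product with the most stock left" are all hypothetical and not taken from a real live commerce platform.

```python
def pick_offer(viewer_event, stock, base_prices, loyalty_discount=0.10):
    """Pick the in-category product with the most stock and price it for this viewer."""
    candidates = {
        sku: units
        for sku, (category, units) in stock.items()
        if category == viewer_event["category"] and units > 0
    }
    if not candidates:
        return None  # nothing suitable to recommend for this viewer

    # Assumption: the product with the most stock left is the one that "needs to be sold".
    sku = max(candidates, key=candidates.get)

    # Context-specific pricing: loyalty members get a discount on the base price.
    price = base_prices[sku]
    if viewer_event.get("loyalty_member"):
        price = round(price * (1 - loyalty_discount), 2)
    return {"viewer": viewer_event["viewer"], "sku": sku, "price": price}


# Hypothetical stock (category, units left) and base prices.
stock = {
    "shoe-7": ("running", 120),
    "shoe-9": ("running", 15),
    "pan-2": ("kitchen", 40),
}
base_prices = {"shoe-7": 80.0, "shoe-9": 95.0, "pan-2": 30.0}

member_offer = pick_offer(
    {"viewer": "v1", "category": "running", "loyalty_member": True}, stock, base_prices
)
guest_offer = pick_offer({"viewer": "v2", "category": "kitchen"}, stock, base_prices)
no_offer = pick_offer({"viewer": "v3", "category": "garden"}, stock, base_prices)
```

In a streaming deployment, a function like this would run once per incoming event, with stock and prices kept up to date from their own event streams.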

Here is an example architecture for a decentralized, scalable, real-time live commerce infrastructure powered by Kafka and its ecosystem:

Live Commerce in Retail with Data Streaming powered by Apache Kafka

The metaverse and new payment and social functionality built on crypto platforms could have an enormous impact on future live commerce platforms. This is its own topic, but most crypto platforms are powered by data streaming with Apache Kafka at their heart.

New customer stories for data streaming in the retail industry

So much innovation is happening in the retail sector. Automation and digitalization change how we search and buy products and services, communicate with partners and customers, provide hybrid shopping models, and more.

Most retail enterprises use a cloud-first approach to improve time-to-market, increase flexibility, and focus on business logic instead of operating IT infrastructure.

Here are a few customer stories from worldwide retail enterprises across industries:

  • Walmart: Supply chain optimization for replenishment from warehouse to retail stores with data consistency across batch and real-time applications
  • Albertsons: Central integration data hub and loyalty platform to keep customers for life with a scalable supply chain, revamped customer experience, and new retail media network
  • Hyper-personalized retail experience with real-time clickstream analytics while the customer is in the (online) store
  • Otto: Data exchange with a domain-driven design for true decoupling, faster time-to-market, and data privacy (GDPR) compliance within a multi-cloud enterprise architecture
  • BigCommerce: Cloud-native eCommerce platform that provides services on the cloud with analytics and advice for merchants
  • WhatNot: A social live auctions platform with interactive selling and metaverse / augmented reality capabilities

Resources to learn more

This blog post is just the starting point. Learn more about data streaming in the retail industry in the following on-demand webinar recording, the related slide deck, and further resources, including pretty cool lightboard videos about use cases.

On-demand video recording

The video recording explores the retail industry’s trends and architectures for data streaming. The primary focus is the data streaming case studies. Check out our on-demand recording:

Video: The State of Data Streaming for Retail in 2023


If you prefer learning from slides, check out the deck used for the above recording:

Case studies and lightboard videos for data streaming in retail

The state of data streaming for retail in 2023 is fascinating. New use cases and case studies come up every month. This includes better data governance across the entire organization, collecting and processing data from location-based services and mobile apps in real-time, data sharing and B2B partnerships with Open APIs for new business models, and many more scenarios.

We recorded lightboard videos showing the value of data streaming simply and effectively. These five-minute videos explore the business value of data streaming, related architectures, and customer stories. Stay tuned; I will update the links in the next few weeks and publish a separate blog post for each story and lightboard video.

And this is just the beginning. Every month, we will talk about the status of data streaming in a different industry. Manufacturing was the first. Financial services second, then retail, telcos, gaming, and so on…

Let’s connect on LinkedIn and discuss it! Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter.
