
Telecom OSS Modernization with Data Streaming: From Legacy Burden to Cloud-Native Agility

Telecom networks are under pressure to deliver more services, faster, and with higher reliability. Yet many operators remain held back by legacy Operational Support Systems (OSS) that were designed for another era. These platforms, once the backbone of service delivery and assurance, now often slow down innovation, block real-time automation, and drive up costs. At the same time, Business Support Systems (BSS) and new Over-the-Top (OTT) services demand seamless integration with OSS to meet customer expectations. This article explores how a data streaming platform powered by Apache Kafka and Flink transforms OSS into a cloud-native, real-time nervous system of the telco to enable modernization step by step while supporting AI, multi-cloud, and event-driven operations.

Join the data streaming community and stay informed about new blog posts by subscribing to my newsletter, and follow me on LinkedIn or X (formerly Twitter) to stay in touch. You can also download my free book about data streaming use cases, including a dedicated chapter about the telecom sector.

OSS as the Nervous System of Telecom Operations

Operational Support Systems (OSS) have always been the backbone of telecom operations. They orchestrate provisioning, assurance, inventory, and workforce management. Without them, no service is delivered, monitored, or billed correctly. OSS ensures that customer promises are fulfilled in the network, often in near real time.

In the modern telco stack, OSS works hand in hand with Business Support Systems (BSS). The BSS defines the “what”: the product catalog, pricing, eligibility, and customer orders. The OSS executes the “how”: activating those services in the network, assuring their performance, and managing the underlying resources. This separation of concerns is essential to avoid overloading one layer with responsibilities that belong to the other.

The rise of OTT (Over-the-Top) services adds another dimension. OTT services are third-party digital offerings such as video, messaging, or cloud applications. Customers expect these to be activated as seamlessly as traditional telco services, so telcos must provision and assure them alongside their own network services. That requires tight integration between BSS, which captures the commercial offer, and OSS, which ensures that both network and partner services are provisioned, assured, and synchronized.

OSS is therefore more than a supporting player. OSS is the nervous system of the telco, bridging customer intent in the BSS with fulfillment in the network and with external OTT ecosystems.

The Pain of Legacy OSS

Yet, while OSS should enable innovation, it often slows it down. Legacy OSS, built for a world of static services and manual processes, is increasingly misaligned with the demands of 5G, edge computing, and cloud-native applications.

As Gustavo Mársico outlined in his article “A Strategic Approach to Modernize Telco OSS”, the legacy challenge is not that OSS was built “wrong.” It was simply built for another era. Over the years, systems piled up like bricks, making the environment fragile and expensive:

  • High OPEX from maintaining monolithic platforms and professional services-heavy integrations.
  • Slow time-to-market, where introducing a new service can take months because provisioning logic is buried in outdated workflows.
  • Rigid, batch-driven architecture that cannot handle real-time telemetry, intent-based orchestration, or closed-loop automation.
  • Vendor lock-in, as proprietary integrations make OSS migration risky and costly.

The result is an OSS estate that acts as an anchor rather than a launchpad. Gustavo’s article inspired this one, which explores in more detail how data streaming helps build a cloud-native OSS infrastructure.

Data Streaming as the Cloud-Native Middleware for OSS Transformation

Data streaming is becoming the backbone of event-driven telecom architectures, although many operators are only starting this journey. This is where data streaming with the de facto standards Apache Kafka and Flink comes into play. Kafka connectors integrate billing systems, compliance platforms, and legacy OSS/BSS. Real-time telemetry from the radio access network, IoT gateways, and 5G core functions flows into Kafka topics. Flink processes these events for fraud detection, quality monitoring, subscriber behavior insights, and real-time alerts.
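To make the quality-monitoring idea concrete, here is a minimal sketch in plain Python of the kind of windowed aggregation a stream processor such as Flink could run over telemetry events. It does not use the Kafka or Flink APIs. The event fields (`cell_id`, `latency_ms`), the 60-second window, and the 100 ms threshold are assumptions chosen for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class TelemetryEvent:
    cell_id: str        # hypothetical radio cell identifier
    timestamp_ms: int   # event time in milliseconds
    latency_ms: float   # hypothetical quality metric


def tumbling_window_avg(
    events: Iterable[TelemetryEvent], window_ms: int
) -> Dict[Tuple[str, int], float]:
    """Average latency per (cell_id, window_start), grouped by event time."""
    acc: Dict[Tuple[str, int], List[float]] = defaultdict(lambda: [0.0, 0.0])
    for e in events:
        # Assign each event to the fixed-size window that contains its timestamp.
        window_start = (e.timestamp_ms // window_ms) * window_ms
        entry = acc[(e.cell_id, window_start)]
        entry[0] += e.latency_ms
        entry[1] += 1
    return {key: total / count for key, (total, count) in acc.items()}


def degraded_windows(
    averages: Dict[Tuple[str, int], float], threshold_ms: float
) -> List[Tuple[str, int]]:
    """Return the (cell_id, window_start) keys whose average exceeds the threshold."""
    return sorted(key for key, avg in averages.items() if avg > threshold_ms)


events = [
    TelemetryEvent("cell-A", 1_000, 20.0),
    TelemetryEvent("cell-A", 30_000, 40.0),
    TelemetryEvent("cell-A", 61_000, 120.0),
    TelemetryEvent("cell-B", 5_000, 15.0),
]
averages = tumbling_window_avg(events, window_ms=60_000)
alerts = degraded_windows(averages, threshold_ms=100.0)
```

In this sample, only the second one-minute window of `cell-A` crosses the threshold. A production job would additionally need to handle out-of-order and late events, which this sketch ignores.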

Data streaming introduces a new integration fabric that is:

  • Cloud-native and open: Kafka and Flink run in hybrid and multi-cloud environments, at the edge, or on-premises, avoiding vendor lock-in.
  • Real-time and scalable: OSS can ingest, process, and act on events from networks, devices, and applications as they happen.
  • Democratized and reusable: Data products can be shared across OSS, BSS, and beyond, creating a common data backbone.
  • Cost-efficient: By decoupling legacy systems with event-driven bridges, modernization can follow the strangler fig pattern, replacing old systems step by step without a “big bang.”

With a data streaming platform, OSS no longer needs to rely on fragile point-to-point integrations. Instead, every system (assurance, inventory, workforce, catalog, orchestration) can publish and consume streams of events. This makes OSS a dynamic hub, ready for continuous change.
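The decoupling described above can be sketched with a toy in-memory model: producers append to a shared log, and each consumer tracks its own read position. This is not the Kafka client API; the class `InMemoryTopic`, the consumer names, and the event fields are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


class InMemoryTopic:
    """Append-only log with an independent read offset per consumer (toy model)."""

    def __init__(self) -> None:
        self._log: List[dict] = []
        self._offsets: Dict[str, int] = defaultdict(int)

    def publish(self, event: dict) -> None:
        # Producers only append; they do not know who will read the event.
        self._log.append(event)

    def poll(self, consumer: str) -> List[dict]:
        # Each consumer reads from its own offset, so consumers never block each other.
        start = self._offsets[consumer]
        batch = self._log[start:]
        self._offsets[consumer] = len(self._log)
        return batch


alarms = InMemoryTopic()
alarms.publish({"type": "link_down", "site": "site-1"})

first_assurance_batch = alarms.poll("assurance")

alarms.publish({"type": "link_up", "site": "site-1"})

# A consumer added later still sees the full history from the beginning.
first_workforce_batch = alarms.poll("workforce")
second_assurance_batch = alarms.poll("assurance")
```

The point of the design is that adding the `workforce` consumer required no change to the producer or to the `assurance` consumer, which is what replaces point-to-point integration.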

Readers seeking an overview of the data streaming platform market should review “The Data Streaming Landscape 2025”.

Data Streaming and Business Process Management: Partners, Not Rivals

In telecom, Business Process Management (BPM) provides workflow orchestration for use cases such as service activation, order management, and assurance. Data streaming and BPM address different needs. Used together, they combine real-time data flow with structured process orchestration:

  • Business process-led orchestration with tools around standards like BPMN gives structure. It defines workflows, decomposes orders, and drives fulfillment.
  • Data streaming with Kafka and Flink provides the event-driven nervous system. Every event, from network telemetry to customer updates, flows reliably across OSS, BSS, and partners.

OSS modernization needs both. BPM tools and workflow engines use BPMN to provide the model by visualizing and structuring business processes such as order decomposition and fulfillment. Kafka and Flink ensure those processes actually run in an event-driven world. This keeps a clear separation of concerns: BSS manages offers and customer intent, while OSS executes activation and assurance with speed and accuracy.

Data Streaming as Workflow Orchestration Engine

When Kafka can be the business process engine: Skip a BPM tool if the workflow has no human steps, is code-first, and fits an event/state machine pattern. Use Kafka topics for persistence and ordering, compacted topics for current state, replay for recovery, and the Saga pattern for multi-step consistency.
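The combination of an ordered event log, a state machine, and a compacted changelog can be sketched in plain Python as follows. This is an illustration under stated assumptions, not Kafka code: the order states, the actions, and the function names are hypothetical. Note that Kafka guarantees ordering within a partition, so a real design would key events by order ID.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Event = Tuple[str, str]  # (order_id, action); both are illustrative

# Hypothetical service-order lifecycle expressed as a state machine.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("NEW", "validate"): "VALIDATED",
    ("VALIDATED", "provision"): "PROVISIONING",
    ("PROVISIONING", "activate"): "ACTIVE",
    ("PROVISIONING", "fail"): "FAILED",
}


def apply_event(state: str, action: str) -> str:
    """Return the next state; unknown transitions leave the state unchanged."""
    return TRANSITIONS.get((state, action), state)


def process(
    log: Iterable[Event], state: Optional[Dict[str, str]] = None
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Apply events in log order and emit one changelog record per state update."""
    current = dict(state or {})
    changelog: List[Tuple[str, str]] = []
    for order_id, action in log:
        new_state = apply_event(current.get(order_id, "NEW"), action)
        current[order_id] = new_state
        changelog.append((order_id, new_state))
    return current, changelog


def compact(changelog: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Keep only the latest value per key, as a compacted topic does over time."""
    latest: Dict[str, str] = {}
    for key, value in changelog:
        latest[key] = value
    return latest


log: List[Event] = [
    ("order-1", "validate"),
    ("order-2", "validate"),
    ("order-1", "provision"),
    ("order-1", "activate"),
    ("order-2", "provision"),
    ("order-2", "fail"),
]
live_state, changelog = process(log)

# Recovery after a crash: rebuild the state store from the compacted changelog.
recovered_state = compact(changelog)
```

The changelog holds six records, but the compacted view keeps only one per order, which is enough to restore the current state without reprocessing the full history.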

One example is the use of data streaming with Kafka and Flink as a stateful workflow engine that provides observability across multiple sites to calculate billing, adjust prices, and incorporate late-arriving information.

More details about data streaming for managing stateful business processes: Apache Kafka as Workflow and Orchestration Engine.

For long-running transactions: Bring in a durable execution engine like Temporal or Restate. They add built-in durability, retries, timers, and compensation for machine-to-machine workflows. Integrate them with Kafka to run reliable, multi-step activations and jeopardy handling at telco scale. More details: The Rise of the Durable Execution Engine (Temporal, Restate) in an Event-driven Architecture (Apache Kafka).
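The retry and compensation behavior mentioned above can be illustrated with a small saga runner in plain Python. This deliberately does not use the Temporal or Restate APIs, and it keeps no durable state, which is exactly what those engines add. The step names and helper functions are hypothetical.

```python
from typing import Callable, List, Tuple

# (step name, action, compensation); all names below are illustrative.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_with_retries(action: Callable[[], None], max_attempts: int) -> bool:
    """Try the action up to max_attempts times; report whether it succeeded."""
    for _ in range(max_attempts):
        try:
            action()
            return True
        except RuntimeError:
            continue
    return False


def run_saga(steps: List[Step], max_attempts: int = 3) -> List[str]:
    """Run steps in order; on a permanent failure, undo completed steps in reverse."""
    journal: List[str] = []
    completed: List[Tuple[str, Callable[[], None]]] = []
    for name, action, compensate in steps:
        if run_with_retries(action, max_attempts):
            journal.append(f"done:{name}")
            completed.append((name, compensate))
        else:
            journal.append(f"failed:{name}")
            for done_name, undo in reversed(completed):
                undo()
                journal.append(f"compensated:{done_name}")
            break
    return journal


def make_flaky(failures_before_success: int) -> Callable[[], None]:
    """Build an action that fails a fixed number of times, then succeeds."""
    remaining = {"n": failures_before_success}

    def action() -> None:
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise RuntimeError("transient failure")

    return action


def always_fail() -> None:
    raise RuntimeError("partner API unavailable")


def noop() -> None:
    pass


steps: List[Step] = [
    ("reserve_number", noop, noop),
    ("configure_core", make_flaky(1), noop),
    ("activate_partner_service", always_fail, noop),
]
journal = run_saga(steps, max_attempts=3)
```

Here the transient failure in `configure_core` is absorbed by a retry, while the permanent failure in `activate_partner_service` triggers compensation of the two completed steps in reverse order.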

Proof Point: EchoStar’s Dish Wireless and the Event-Driven OSS/BSS Merger

As part of EchoStar, Dish Wireless built its greenfield 5G network in the United States with an event-driven architecture at its core. By using Kafka as the central nervous system, Dish merged OSS and BSS into a cloud-native stack, orchestrating everything from provisioning to assurance in real time.

While this is not a brownfield modernization but a greenfield build, it is still highly relevant for the topic of Telecom OSS Modernization with Data Streaming. Dish demonstrates the target state that incumbents can aim for: a streaming-first, cloud-native OSS/BSS stack. Dish’s approach offers a blueprint for operators with legacy estates, showing how data streaming can gradually transform existing OSS from rigid and batch-driven to agile and event-driven.

Source: Dish Network

Instead of siloed legacy workflows, Dish runs on a streaming-first foundation. OSS is no longer a passive afterthought but a proactive engine of customer experience. This demonstrates the power of starting fresh – but also provides a blueprint for incumbents to gradually evolve legacy OSS through the same event-driven model.

More details, including an interview with Dish, in the following article: How Apache Kafka helps Dish Wireless building cloud-native 5G Telco Infrastructure.

AI in Modern Telecom OSS: From Reactive to Predictive and Agentic

OSS modernization is not only about speed; it is about intelligence. With Kafka and Flink feeding real-time events into AI systems, telcos can evolve from reactive fault management to predictive and agent-driven operations:

  • Predictive AI: Detecting anomalies in telemetry data before service degradation occurs.
  • Generative AI: Assisting operations teams by summarizing incidents or suggesting workflows.
  • Agentic AI: Acting autonomously on OSS events, triggering closed-loop automation via the event-driven streaming backbone.
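As a minimal illustration of the predictive case, the following sketch flags telemetry values that deviate strongly from a rolling baseline. A rolling z-score is swapped in here as a simple stand-in for a trained model; the window size, the threshold, and the sample values are assumptions.

```python
from collections import deque
from statistics import mean, pstdev
from typing import Deque, Iterable, List


def detect_anomalies(
    values: Iterable[float], window: int = 5, z_threshold: float = 3.0
) -> List[int]:
    """Return indexes of values far from the mean of the preceding window."""
    history: Deque[float] = deque(maxlen=window)
    anomalies: List[int] = []
    for index, value in enumerate(values):
        # Only score a value once a full window of history is available.
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            if sigma > 0 and abs(value - mu) / sigma > z_threshold:
                anomalies.append(index)
        history.append(value)
    return anomalies


# Hypothetical latency samples in milliseconds with one obvious spike.
latency_ms = [10.0, 11.0, 10.0, 12.0, 11.0, 10.0, 11.0, 45.0, 11.0, 10.0]
anomalous_indexes = detect_anomalies(latency_ms, window=5, z_threshold=3.0)
```

The spike at index 7 is flagged. Because the spike then enters the history, it temporarily inflates the baseline, which is one reason real deployments use more robust statistics or learned models.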

The following chart illustrates Agentic AI powered by data streaming across OSS and BSS layers:

A subscriber agent expresses intent, which the OSS/BSS agent interprets and executes by orchestrating network services, billing, and compliance. Events flow through Kafka Topics, where Flink processes them for assurance, fraud detection, and quality monitoring across the telco stack.

Data streaming ensures AI has the fuel it needs: complete, real-time, contextual data across OSS and BSS.

Business Outcomes: Step-by-Step OSS Modernization with Streaming

Adopting data streaming into OSS modernization unlocks measurable outcomes:

  • Reduced OPEX: By decoupling legacy systems and reducing custom integration costs.
  • Faster time-to-market: New services can be launched in weeks, not months.
  • Agility at scale: Deploy anywhere — on the edge, in private data centers, or across multiple clouds.
  • Data democratization: OSS is no longer a silo, but a platform that shares data products with BSS, CRM, assurance, and analytics.
  • Future-proofing with AI: A foundation ready for predictive and intent-based operations.

The strangler fig pattern makes this achievable. Legacy OSS modules can be replaced incrementally.

Kafka and Flink act as the bridge — keeping old systems alive while new cloud-native components are introduced. Over time, the telco moves from rigid batch to agile streaming, without operational disruption.
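The incremental replacement can be sketched as a router that sends each event type either to the legacy system or to its cloud-native replacement, one capability at a time. This is an illustrative sketch only; the class `StranglerRouter`, the handlers, and the event types are hypothetical.

```python
from typing import Callable, Set

Handler = Callable[[dict], str]


def legacy_oss(event: dict) -> str:
    """Stand-in for a call into the existing monolithic OSS."""
    return f"legacy handled {event['type']}"


def cloud_native_oss(event: dict) -> str:
    """Stand-in for a new cloud-native microservice."""
    return f"cloud-native handled {event['type']}"


class StranglerRouter:
    """Route each event type to the legacy or the new system."""

    def __init__(self, legacy: Handler, modern: Handler) -> None:
        self._legacy = legacy
        self._modern = modern
        self._migrated: Set[str] = set()

    def migrate(self, event_type: str) -> None:
        # Cut one capability over to the new system.
        self._migrated.add(event_type)

    def rollback(self, event_type: str) -> None:
        # Send the capability back to the legacy system if the cutover fails.
        self._migrated.discard(event_type)

    def route(self, event: dict) -> str:
        handler = self._modern if event["type"] in self._migrated else self._legacy
        return handler(event)


router = StranglerRouter(legacy_oss, cloud_native_oss)
before = router.route({"type": "inventory"})

router.migrate("inventory")
after = router.route({"type": "inventory"})
untouched = router.route({"type": "assurance"})
```

Only `inventory` moves to the new system while `assurance` keeps running on the legacy one, and `rollback` gives each cutover a way back.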

Turning OSS into a Growth Engine

Legacy OSS has become a bottleneck in a world where telcos must innovate faster and deliver more reliable services. Data streaming with Apache Kafka and Flink provides the foundation to modernize step by step: reducing OPEX, cutting time-to-market, and enabling real-time automation.

By bridging OSS, BSS, and OTT with an event-driven backbone, telcos transform OSS from a cost center into a growth engine. The result is an agile, cloud-native nervous system that powers predictive operations today and prepares for agent-driven automation tomorrow.


Kai Waehner

bridging the gap between technical innovation and business value for real-time data streaming, processing and analytics
