Demand for stream processing is growing fast. Frameworks (Apache Storm, Spark) and products (e.g. IBM InfoSphere Streams, TIBCO StreamBase, Software AG Apama) for stream processing and streaming analytics are getting a lot of attention. The reason is that processing big volumes of data is often not enough. Data also has to be processed fast, so that a firm can react to changing business conditions in real time. This is required for trading, fraud detection, system monitoring, and many other use cases. A “too late architecture” cannot realize these use cases.
Not much literature about stream processing and streaming analytics is available in 2014. The analyst firm Forrester recently published a report: The Forrester Wave™: Big Data Streaming Analytics Platforms, Q3 2014. The leaders are Software AG (Apama), IBM (InfoSphere Streams), TIBCO (StreamBase), SAP (Event Stream Processor) and Informatica.
I have written an article for InfoQ, which gives some more details about stream processing and streaming analytics:
Real-Time Stream Processing as Game Changer in a Big Data World with Hadoop and Data Warehouse
This article discusses what stream processing is, how it fits into a big data architecture with Hadoop and a data warehouse (DWH), when stream processing makes sense, and what technologies and products you can choose from. The following is just a snippet of the article describing the components of a stream processing solution.
Stream processing can be implemented in three ways: doing it yourself, using a framework, or using a product. Doing it yourself should not be an option in most cases, because good open source frameworks are available for free. However, a stream processing product might solve many of your issues out of the box, while a framework still requires a lot of custom coding, and its Total Cost of Ownership might turn out much higher than expected compared to a product.
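To make the do-it-yourself option a bit more concrete, here is a minimal sketch of a single streaming operator written by hand in Python: a sliding-window count per account, loosely inspired by the fraud detection use case mentioned above. The event shape `(timestamp, account)`, the function name `detect_bursts`, and the window and threshold values are all assumptions made for illustration; they do not come from any framework or product named in this post.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Iterator, Tuple

# Hypothetical event shape: (timestamp in seconds, account id).
Event = Tuple[float, str]


def detect_bursts(
    events: Iterable[Event],
    window_s: float = 60.0,
    threshold: int = 3,
) -> Iterator[Event]:
    """Yield (timestamp, account) each time an account has more than
    `threshold` events inside the last `window_s` seconds.

    Assumes events arrive ordered by timestamp (an illustrative
    simplification; real streams often deliver events out of order).
    """
    recent: Dict[str, Deque[float]] = {}
    for ts, account in events:
        timestamps = recent.setdefault(account, deque())
        timestamps.append(ts)
        # Evict timestamps that have fallen out of the sliding window.
        while timestamps and timestamps[0] <= ts - window_s:
            timestamps.popleft()
        if len(timestamps) > threshold:
            yield ts, account


if __name__ == "__main__":
    sample = [(0, "a"), (10, "a"), (20, "a"), (30, "a"), (35, "b"), (100, "a")]
    for alert in detect_bursts(sample):
        print("suspicious burst:", alert)
```

Even this small operator leaves out what a production system needs, such as out-of-order events, state that survives a restart, scaling across machines, and connectors to real data sources. That missing work is the custom coding and cost the paragraph above refers to.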
From a technical perspective, the following components are required to solve all “streaming challenges” and implement a stream processing use case:
As of the end of 2014, only a few products on the market offer these components. Often, a lot of custom coding is required instead of using a full product for stream processing.
The two most widespread open source frameworks for stream processing are Apache Storm and Apache Spark. IBM InfoSphere Streams, TIBCO StreamBase and Software AG’s Apama are important players from proprietary vendors.
Please read my InfoQ article for more details about these products, and how they relate to a Data Warehouse (DWH) and Apache Hadoop. Here is the link: Real-Time Stream Processing as Game Changer in a Big Data World with Hadoop and Data Warehouse.
As always, I appreciate all feedback and discussions…