Analytics, Big Data, Business Intelligence, Hadoop, In Memory, NoSQL

TIBCO BusinessWorks and StreamBase for Big Data Integration and Streaming Analytics with Apache Hadoop and Impala

April 14, 2015

Apache Hadoop is becoming more and more relevant, not just for Big Data processing (e.g. MapReduce) but also for Fast Data processing (e.g. stream processing). I recently published two posts on the TIBCO blog that show how you can leverage TIBCO BusinessWorks 6 and TIBCO StreamBase to realize Big Data and Fast Data use cases with Hadoop.

TIBCO ActiveMatrix BusinessWorks 6 + Apache Hadoop = Big Data Integration

Apache Hadoop was built to run complex computations on Big Data stores (that is, terabytes to petabytes) using the MapReduce distributed computation model, which runs easily on cheap commodity hardware.
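To make the MapReduce model a bit more tangible, here is a minimal single-process sketch in Python of the classic word count. It is not Hadoop code and does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for illustration. In a real cluster, the map and reduce steps would run in parallel on many nodes and the framework would handle the grouping in between.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Group all emitted values by key (done by the framework between map and reduce)."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce step: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence inside one process (illustration only)."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "fast data"]))
```

Because each map call only sees one line and each reduce call only sees one key, both steps can be distributed across machines without coordination, which is what lets the model scale out on commodity hardware.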

A Hadoop distribution from a vendor such as Hortonworks, Cloudera or MapR packages different projects of the Hadoop ecosystem. This ensures that all bundled versions work together smoothly. On top of the packaging, Hadoop vendors offer tooling for deployment, administration and monitoring of Hadoop clusters. Commercial support completes their offerings.

The key challenge is to integrate the input and results of Hadoop processing into the rest of the enterprise. Using just a Hadoop distribution requires a lot of complex coding for integration services.

Continue here for the full article: TIBCO ActiveMatrix BusinessWorks 6 + Apache Hadoop = Big Data Integration

TIBCO StreamBase + Hadoop + Impala = Fast Data Streaming Analytics

Hadoop is evolving quickly and is no longer used only for batch processing. YARN, Storm, Spark, and several other solutions introduce modern paradigms to Hadoop. However, some problems remain with Hadoop:

No good, easy development tooling for the Hadoop ecosystem components such as Hive, Storm, Spark, etc.
Lack of maturity (a lot of alpha/beta/0.x versions), especially in management and monitoring tools, as well as in security, connectivity, and APIs
No “real time” (seconds, milliseconds, or microseconds), only “near real time” (several seconds or more, and much more when recovering from infrastructure faults)
No operational analytics (human monitoring and proactive actions)

So why not combine the great benefits of Hadoop with TIBCO StreamBase, a Fast Data streaming analytics tool? StreamBase brings mature, mission-critical deployments in several different industries, great graphical tooling, and operational real-time analytics (via TIBCO Live Datamart on top of StreamBase).

This post shows how to quickly and easily realize a Fast Data use case with TIBCO StreamBase and Impala, the analytical database from the Hadoop ecosystem.

Continue here for the full article: TIBCO StreamBase + Hadoop + Impala = Fast Data Streaming Analytics

For a general introduction to Stream Processing and Streaming Analytics, I recommend the InfoQ article: Real-Time Stream Processing as Game Changer in a Big Data World with Hadoop and Data Warehouse.

As always, I appreciate any feedback…

Kai Waehner

bridging the gap between technical innovation and business value for data integration, workflow orchestration, and agentic AI.
