Categories: EAI

Slides from NoSQLmatters: “Big Data beyond Apache Hadoop – How to integrate ALL your data with Apache Camel and Talend”

Slides from my talk “Big Data beyond Apache Hadoop – How to integrate ALL your data” at NoSQLmatters 2013 in Cologne are online.

Here is the abstract:

Big data represents a significant paradigm shift in enterprise technology. Big data radically changes the nature of the data management profession as it introduces new concerns about the volume, velocity and variety of corporate data.
Apache Hadoop is the open source de facto standard for implementing big data solutions on the Java platform. Hadoop consists of its kernel, MapReduce, and the Hadoop Distributed File System (HDFS). Sending all data to Hadoop for processing and storage (and getting it back to your application later) is challenging because, in practice, data comes from many different applications (SAP, Salesforce, Siebel, etc.) and data stores (files, SQL, NoSQL), uses different technologies and concepts for communication (e.g. HTTP, FTP, RMI, JMS), and arrives in different formats such as CSV, XML, or binary data.
This session shows different open source frameworks and tools that solve this challenge. Learn how to use every conceivable kind of data with Hadoop, without lots of complex or redundant boilerplate code.

Here are the slides:

http://www.slideshare.net/KaiWaehner/big-data-beyond-apache-hadoop-how-to-integrate-all-your-data

Kai Waehner

bridging the gap between technical innovation and business value for data integration, workflow orchestration, and agentic AI.


Published by

Kai Waehner

Tags: Apache, Big Data, Cloudera, EAI, Enterprise Integration Pattern, Enterprise Service Bus, ESB, Hadoop, Hortonworks, Integration Framework, Java, MapR, open source, talend

13 years ago
