
Slides online: “Big Data beyond Apache Hadoop – How to Integrate ALL your Data” – JavaOne 2013

Slides from my session “Big Data beyond Apache Hadoop – How to Integrate ALL your Data” at JavaOne 2013 in San Francisco are online.

Abstract

Big data represents a significant paradigm shift in enterprise technology. Big data radically changes the nature of the data management profession as it introduces new concerns about the volume, velocity and variety of corporate data.

Apache Hadoop is the de facto open source standard for implementing big data solutions on the Java platform. Hadoop consists of its kernel, MapReduce, and the Hadoop Distributed File System (HDFS). Getting all data into Hadoop for processing and storage, and getting it back to your application later, is a challenging task. In practice, data comes from many different applications (SAP, Salesforce, Siebel, etc.) and data stores (files, SQL, NoSQL), uses different technologies and concepts for communication (e.g. HTTP, FTP, RMI, JMS), and arrives in different formats such as CSV, XML, or binary data.

This session shows different open source frameworks and products (especially Apache Camel and Talend Open Studio for Big Data) that solve this challenging task. Learn how to use every conceivable kind of data with Hadoop without writing large amounts of complex or redundant boilerplate code.
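To make the format problem from the abstract more concrete, here is a minimal sketch in plain Java. It is not taken from the talk: the class `RecordNormalizer`, its methods, and the sample `customer` record are hypothetical names chosen for illustration. It assumes flat records and a naive CSV layout without quoted fields. The idea is simply that a CSV line and an XML fragment describing the same record are mapped to one tab-separated line, a shape that is easy to store in HDFS and to process with MapReduce.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: normalizes records from different source formats.
public class RecordNormalizer {

    // Parse one CSV line into field name -> value, using the given header.
    // Assumption: no quoted fields and no embedded commas.
    static Map<String, String> fromCsv(String[] header, String line) {
        String[] values = line.split(",", -1);
        Map<String, String> record = new LinkedHashMap<>();
        for (int i = 0; i < header.length && i < values.length; i++) {
            record.put(header[i], values[i].trim());
        }
        return record;
    }

    // Parse a flat XML element such as <customer><id>7</id></customer>.
    // Each direct child element becomes one field of the record.
    static Map<String, String> fromXml(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        Map<String, String> record = new LinkedHashMap<>();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                record.put(child.getNodeName(), child.getTextContent().trim());
            }
        }
        return record;
    }

    // Render a record as one tab-separated line in a fixed column order.
    // Missing fields become empty strings.
    static String toTsv(String[] columns, Map<String, String> record) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) {
                sb.append('\t');
            }
            sb.append(record.getOrDefault(columns[i], ""));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        String[] columns = {"id", "name"};
        String xml = "<customer><id>7</id><name>Alice</name></customer>";
        // Both sources end up as the same normalized line.
        System.out.println(toTsv(columns, fromCsv(columns, "7, Alice")));
        System.out.println(toTsv(columns, fromXml(xml)));
    }
}
```

Even this toy version needs a separate parser per format, and it does not touch transport (HTTP, FTP, JMS) or the actual write to HDFS. That per-source glue code is the boilerplate an integration framework is meant to take off your hands.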

Slides

The slides are hosted on www.slideshare.net.

Kai Waehner

bridging the gap between technical innovation and business value for real-time data streaming, processing and analytics
