Data Preparation: Comparison of Programming Languages, Frameworks and Tools for Data Preprocessing and (Inline) Data Wrangling in Machine Learning / Deep Learning Projects.
Log Analytics is the right approach for monitoring Distributed Microservices. Comparison of Open Source, SaaS and Enterprise Products, plus their relation to big data components such as Apache Hadoop / Spark.
Slide deck from OOP 2016: Comparison of Frameworks and Products for Big Data Log Analytics and ITOA, e.g. Open Source ELK, TIBCO LogLogic / Unity, Splunk, Papertrail. The relation to Hadoop is also discussed.
Data Warehouses have existed for many years in almost every company. While they are still as good and relevant for the same use cases as they were 20 years ago, they cannot solve the new challenges that exist today, or those sure to come, in an ever-changing digital world. The upcoming sections will clarify when to still use a Data Warehouse and when to use a modern Live Datamart instead.
The article discusses what stream processing is, how it fits into a big data architecture with Hadoop and a data warehouse (DWH), when stream processing makes sense, and what technologies and products you can choose from. Comparison of open source and proprietary stream processing / streaming analytics alternatives: Apache Storm, Spark, IBM InfoSphere Streams, TIBCO StreamBase, Software AG’s Apama, etc.
Slides from my talk “Hadoop and Data Warehouse (DWH) – Friends, Enemies or Profiteers? What about Real Time?”…
In this blog post, I will show you how to “ETL” all kinds of data to Amazon’s cloud data warehouse Redshift with Talend’s big data components. You need not be a cloud or DWH expert, or an expert developer, to integrate with Redshift. It is very easy with Talend’s integration solutions: just drag and drop, configure, and do some graphical mappings / transformations (if necessary). That’s it. Code is generated. Job runs. With Talend, you can easily “ETL” all data from different sources to Redshift and store it there for under $1,000 per terabyte per year, even with the open source version!