Apache Kafka is the de facto standard for event streaming to process data in motion. This blog post explores when NOT to use Apache Kafka. What use cases are not a good fit for Kafka? What limitations does Kafka have? How to qualify Kafka out as it is not the right tool for the job?
Data Mesh is a new architecture paradigm that gets a lot of buzzes these days. This blog post looks into this principle deeper to explore why no single technology is the perfect fit to build a Data Mesh. Examples show why an open and scalable decentralized real-time platform like Apache Kafka is often the heart of the Data Mesh infrastructure, complemented by many other data platforms to solve business problems.
Apache Kafka became the de facto standard for processing data in motion. Kafka is open, flexible, and scalable. Unfortunately, the latter makes operations a challenge for many teams. Ideally, teams can use a serverless Kafka SaaS offering to focus on business logic. However, hybrid scenarios require a cloud-native platform that provides automated and elastic tooling to reduce the operations burden. This blog post explores how to leverage cloud-native and serverless Kafka offerings in a hybrid cloud architecture. We start from the perspective of data at rest with a data lake and explore its relation to data in motion with Kafka.
Log Analytics is the right framework or tool to monitor for Distributed Microservices. Comparison of Open source, SaaS and Enteprrise Products. Plus relation to big data components such as Apache Hadoop / Spark.
Slide deck from OOP 2016: Comparison of Frameworks and Products for Big Data Log Analytics and ITOA, e.g. Open Source ELK, TIBCO LogLogic / Unity, Splunk, Papertrail; Relation to Hadoop is also discussed.
Data Warehouses have existed for many years in almost every company. While they are still as good and relevant for the same use cases as they were 20 years ago, they cannot solve new, existing challenges and those sure to come in a ever-changing digital world. The upcoming sections will clarify when to still use a Data Warehouse and when to use a modern Live Datamart instead.
The article discusses what stream processing is, how it fits into a big data architecture with Hadoop and a data warehouse (DWH), when stream processing makes sense, and what technologies and products you can choose from. Comparison of open source and proprietary stream processing / streaming analytics alternatives: Apache Storm, Spark, IBM InfoSphere Streams, TIBCO StreamBase, Software AG’s Apama, etc.
In this blog post, I will show you how to „ETL“ all kinds of data to Amazon’s cloud data warehouse Redshift wit Talend’s big data components. You need not be a cloud or DWH expert, or an expert developer to integrate with Amazon’s cloud data warehouse Redshift. It is very easy with Talend’s integration solutions. Just drag&drop, configure, do some graphical mappings / transformations (if necessary), that’s it. Code is generated. Job runs. With Talend, you can easily „ETL“ all data from different sources to Redshift and store it there for under $1,000 per terabyte per year – even with the open source version!
The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.