Kafka for Cybersecurity (Part 1 of 6) – Data in Motion as Backbone

Apache Kafka - The Backbone for Cybersecurity including SIEM and SOAR

Apache Kafka has become the de facto standard for processing data in motion across enterprises and industries. Cybersecurity is a key success factor across all use cases. Kafka is not just used as a backbone and source of truth for data. It is also used to monitor, correlate, and proactively act on events from various real-time and batch data sources to detect anomalies and respond to incidents. This blog series explores use cases and architectures for Kafka in the cybersecurity space, including situational awareness, threat intelligence, forensics, air-gapped and zero trust environments, and SIEM / SOAR modernization. This post is part one: Data in motion as the cybersecurity backbone.

Blog series: Apache Kafka for Cybersecurity

This blog series explores why security features such as RBAC, encryption, and audit logs are only the foundation of a secure event streaming infrastructure. Learn about use cases, architectures, and reference deployments for Kafka in the cybersecurity space.

Why should you care about Cybersecurity?

Cybersecurity is the protection of computer systems and networks from information disclosure, theft of, or damage to their hardware, software, or electronic data, as well as from the disruption or misdirection of the services they provide.

The field is becoming increasingly significant due to the increased reliance on computer systems, the internet, and wireless network standards such as Bluetooth and Wi-Fi, and due to the growth of “smart” devices, including smartphones, televisions, and the various devices that constitute the “Internet of Things”. Owing to its complexity, cybersecurity is also one of the major challenges in the contemporary world in terms of politics and technology.

Various actors can be involved in cybersecurity attacks. These include web scrapers, hackers, criminals, terrorists, and state-sponsored and state-initiated actors.

Examples of recent successful attacks

Most successful attacks have a financial and brand impact. However, the impact depends on the organization and the kind of attack or data breach. Here are a few recent examples, quoted from news articles, that you have probably heard of on TV, in newspapers, or on the internet:

  • 533 million Facebook users’ phone numbers and personal data have been leaked online
  • 500 Million LinkedIn Users’ Data Were Allegedly Hacked
  • A Tale of Two Hacks: From SolarWinds to Microsoft Exchange Ransomware
  • Ransomware attack shuts down biggest U.S. gasoline pipeline

Privacy, safety, and cost are huge factors in these successful attacks. The craziest part is that most successful cyberattacks never even become public because companies prefer to keep them secret.

Supply Chain Attacks

You are not even safe if your own infrastructure is secure. A supply chain attack is a cyber-attack that seeks to damage an organization by targeting less-secure elements in the supply chain.

A supply chain attack can occur in any industry, from the financial sector to the oil industry or government. For instance, cybercriminals tamper with the manufacturing process by installing a rootkit or hardware-based spying components.

The supply chain involves hardware, software, and humans.

Norton Rose Fulbright shows what a supply chain attack looks like:

Supply Chain Attack

A well-known example of a supply chain attack is the Experian breach, where the data of millions of T-Mobile customers was exposed by the world’s biggest consumer credit monitoring firm.

The SolarWinds breach is another famous example. SolarWinds’ network management system has over 300,000 customers. Many of them are heavy hitters, including much of the US Federal government (such as the Department of Defense), 425 of the US Fortune 500, and many customers worldwide.

Impact of Cybersecurity attacks

“It takes 20 years to build a reputation and few minutes of cyber-incident to ruin it.” (Stephane Nappo)

Security attacks are exploding, and they come with high costs. The average cost of a data breach is $3.86 million. It takes 280 days on average to identify and contain a breach. Additionally, the brand impact is huge but very hard to quantify.

Cybersecurity as a key piece of the security landscape

As you can see in the above examples: The threat is real!

The digital transformation requires IT capabilities, even in the OT world. Internet of Things (IoT), Industrial IoT (IIoT), Industry 4.0 (I40), connected vehicles, smart city, social networks, and similar game-changing trends are only possible with modern technology: Networking, communication, connectivity, open standards, “always-on”, billions of devices, and so on.

The security landscape gets more and more important. And Cybersecurity is a key piece of that:

The security landscape including cybersecurity

The other security components are very relevant and complementary, of course. But let’s also talk about some challenges:

  • Access control: Complex and error-prone
  • Encryption: Very important in most cases, sometimes not needed (e.g., in DMZ / air-gapped environments)
  • Hardware security: No help against insiders
  • OT security: Avoid risk (change operations) vs. transfer some risk (buy insurance)

Therefore, in addition to the above security factors, each organization requires good cybersecurity infrastructure (including SIEM and SOAR components).

Continuous real-time data correlation across various data sources is mandatory for a good cybersecurity strategy to have a holistic view and understanding of all the events and potential abuses that are taking place. This is a combination of data collection of different activities happening on critical networks followed by data correlation in real-time (so-called stream processing/streaming analytics).

Cybersecurity challenges – The threat is real!

Plenty of different attacks exist: Theft of intellectual property (IP), distributed denial-of-service (DDoS) attacks, ransomware, wiperware, and so on. WannaCry, NotPetya, and SolarWinds are a few famous examples of successful and impactful cyberattacks. The damage often runs into billions of dollars.

The key challenge for cybersecurity experts: Find the Needle(s) in the Haystack.

Systems need to detect true positives in real-time automatically. This includes capabilities such as:

  • Threat detection
  • Intrusion prevention
  • Anomaly detection
  • Compliance auditing
  • Proactive response

The haystack is typically huge, i.e., massive volumes of data. Often, it is not just one haystack but many. Hence, a key task is to reduce false positives. This is possible via:

  • Automation
  • Processing big volumes of data in real-time
  • Integration of all sources
  • No ‘ignore’ on certain events
  • Creation of filters and correlated event rules
  • Improvement of the signal-to-noise ratio (SNR)
  • Correlation of a “collection of needles” into a “signature needle”
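To make the idea of correlating a “collection of needles” into a “signature needle” more concrete, here is a minimal sketch in plain Python. It is an illustration only, not how any particular SIEM or Kafka application implements it. The rule (three specific weak signals from the same host within 60 seconds), the signal names, and the function `correlate` are all hypothetical assumptions. In a real deployment, logic like this would typically run in a stream processing application that reads from Kafka topics.

```python
from collections import defaultdict, deque

# Hypothetical rule: these three weak signals, seen for the same host within
# 60 seconds, are promoted to one correlated alert (the "signature needle").
WINDOW_SECONDS = 60
REQUIRED_SIGNALS = {"failed_login", "port_scan", "privilege_escalation"}


def correlate(events):
    """Yield (timestamp, host) when all required signals occur for one host
    inside the sliding window. Events must be ordered by timestamp."""
    recent = defaultdict(deque)  # host -> deque of (timestamp, signal)
    for ts, host, signal in events:
        if signal not in REQUIRED_SIGNALS:
            continue  # this rule does not use the signal; other rules might
        window = recent[host]
        window.append((ts, signal))
        # Evict signals that fell out of the time window.
        while window and ts - window[0][0] > WINDOW_SECONDS:
            window.popleft()
        if {s for _, s in window} == REQUIRED_SIGNALS:
            yield ts, host
            window.clear()  # avoid re-alerting on the same needles


events = [
    (0, "host-a", "failed_login"),
    (5, "host-b", "failed_login"),
    (10, "host-a", "dns_lookup"),
    (20, "host-a", "port_scan"),
    (45, "host-a", "privilege_escalation"),
    (50, "host-b", "port_scan"),
    (200, "host-b", "privilege_escalation"),
]

alerts = list(correlate(events))
print(alerts)
```

Only `host-a` raises an alert here: `host-b` shows the same three signals, but they are spread over more than 60 seconds, so they stay below the threshold. That is the false-positive reduction the list above describes.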

TL;DR: Correlate massive volumes of data and act on threats in real-time. That’s why Kafka comes into play…

Kafka and Cybersecurity

Real-time data beats slow data. That’s true (almost) everywhere and the main reason for Kafka’s success and huge adoption across use cases and industries. Real-time sensor diagnostics and track and trace in transportation, instant payment and trade processing in banking, real-time inventory and personalized offers in retail: these are just a few of the examples.

But: Real-time data beats slow data in the (cyber)security space, too. Actually, it is even more critical here. A few examples:

  • Security: Access control and encryption, regulatory compliance, rules engine, security monitoring, surveillance.
  • Cybersecurity: Risk classification, threat detection, intrusion detection, incident response, fraud detection

The main goal in cybersecurity is to reduce the Mean Time to Detect (MTTD) and Mean Time to Respond (MTTR). Faster detection and response ultimately lead to better prevention. For this reason, modern enterprise architectures adopt Apache Kafka and its ecosystem as the backbone for cybersecurity:

Event Streaming with Apache Kafka is the Backbone for Cybersecurity
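As a small worked example of the two metrics, the following sketch computes MTTD and MTTR from a list of incident timestamps. The exact definitions vary between organizations. This sketch assumes MTTD is the mean time from occurrence to detection and MTTR is the mean time from detection to response. The incident data and the helper `mean_hours` are made up for illustration.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records; the timestamps are illustrative only.
incidents = [
    {
        "occurred": datetime(2021, 5, 1, 8, 0),
        "detected": datetime(2021, 5, 1, 10, 0),
        "responded": datetime(2021, 5, 1, 16, 0),
    },
    {
        "occurred": datetime(2021, 5, 3, 9, 0),
        "detected": datetime(2021, 5, 3, 13, 0),
        "responded": datetime(2021, 5, 3, 15, 0),
    },
]


def mean_hours(records, start_key, end_key):
    """Average duration in hours between two timestamps across all records."""
    return mean(
        (r[end_key] - r[start_key]).total_seconds() / 3600 for r in records
    )


mttd = mean_hours(incidents, "occurred", "detected")    # assumed definition
mttr = mean_hours(incidents, "detected", "responded")   # assumed definition
print(f"MTTD: {mttd} h, MTTR: {mttr} h")
```

Every hour an event waits in a batch pipeline before it is analyzed adds directly to the first number, which is why processing data in motion matters for these metrics.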

Kafka and its ecosystem provide low-latency performance at high throughput in conjunction with data integration and data processing capabilities. That’s exactly what you need for most cybersecurity systems.

Having said this, let’s be clear:

Kafka was NOT built for Cybersecurity

From a technology perspective, cybersecurity includes the following features, products, and services:

  • Situational Awareness
  • Operational Awareness
  • Intrusion Detection
  • Signals and Noise
  • Signature Detection
  • Incident Response
  • Threat Hunting & Intelligence
  • Vulnerability Management
  • Digital Forensics

Kafka is an event streaming platform built to process data in motion, not to solve cybersecurity issues. So, why are we talking about Kafka in this context then?

Most existing cybersecurity platforms share the same characteristics: Batch, proprietary, inflexible, not scalable, expensive. Think about this with your favorite SIEM in mind.

Kafka as the backbone for Cybersecurity

Kafka has different characteristics: Real-time, open, flexible, scalable, cost-efficient. Hence, Kafka is the ideal backbone for a next-generation cybersecurity infrastructure. This enables the following capabilities:

  • Integrate with all legacy and modern interfaces
  • Record, filter, curate a broad set of traffic streams
  • Let analytic sinks consume just the right amount of data
  • Drastically reduce the complexity of the enterprise architectures
  • Drastically reduce the cost of SIEM / SOAR deployments
  • Add new analytics engines
  • Add stream-speed detection and response at scale in real-time
  • Add mission-critical (non-) security-related applications
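The following sketch illustrates the “record, filter, curate” and “just the right amount of data” ideas from the list above. It is a simplified, in-memory illustration and does not use the Kafka client API. The sink names, the severity threshold, the event fields, and the functions `curate` and `route` are hypothetical assumptions. In practice, each sink would be a separate Kafka topic or connector, and the routing would run in a stream processing application.

```python
from collections import defaultdict

# Fields that downstream consumers are assumed to need (hypothetical).
KEPT_FIELDS = ("ts", "src", "type", "severity")


def curate(event):
    """Drop bulky fields so that sinks receive only what they need."""
    return {key: event[key] for key in KEPT_FIELDS}


def route(event):
    """Return the names of the sinks that should receive this event."""
    sinks = ["archive"]  # record everything, e.g., for later forensics
    if event["severity"] >= 7:
        sinks.append("siem")  # only high-severity events reach the SIEM
    if event["type"] == "netflow":
        sinks.append("ml-scoring")  # network flows feed an analytics engine
    return sinks


events = [
    {"ts": 1, "src": "10.0.0.5", "type": "netflow", "severity": 2, "payload": "x" * 500},
    {"ts": 2, "src": "10.0.0.9", "type": "auth", "severity": 8, "payload": "y" * 500},
    {"ts": 3, "src": "10.0.0.5", "type": "netflow", "severity": 9, "payload": "z" * 500},
]

delivered = defaultdict(list)  # sink name -> curated events
for event in events:
    for sink in route(event):
        delivered[sink].append(curate(event))

counts = {sink: len(items) for sink, items in delivered.items()}
print(counts)
```

The archive receives all three events, while the SIEM receives only the two high-severity ones. With per-volume SIEM pricing, this kind of upstream filtering is one way the cost reduction mentioned above could come about.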

Kafka complements SIEM/SOAR and other Cybersecurity and Network Monitoring Tools

Every enterprise is different… Flexibility is key for your cybersecurity initiative! That’s why I see many customers adopting Confluent as an independent but enterprise-grade and hybrid foundation for the cybersecurity enterprise architecture.

Kafka or Confluent do not replace but complement other security products such as IBM QRadar, HP ArcSight, or Splunk. The same is true for other network security monitoring and cybersecurity tools.

High-velocity and (ridiculously) high-volume NetFlow / PCAP data is processed via tools such as Zeek / Corelight. Open-source frameworks such as TensorFlow or AutoML products such as DataRobot provide modern analytics (machine learning / deep learning) that enhance intrusion detection systems (IDS) so that they can respond to incidents proactively.

Kafka provides a flexible and scalable real-time backplane for the cybersecurity platform. Its storage capabilities truly decouple different systems. For instance, Zeek handles the incoming ridiculous volume of PCAP data before Kafka handles the backpressure for slow batch consumers such as Splunk or Elasticsearch:

Kafka as Flexible Scalable Real-Time Backplane for the Cybersecurity Platform

In the real world, there is NOT just one SIEM, SOAR, or IDS in the enterprise architecture. Different applications solve different problems. Kafka shines due to its true decoupling while still providing real-time consumption for consumers that can handle it.
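The decoupling described above comes from the fact that Kafka stores events in an append-only log and each consumer group tracks its own read position (offset). The sketch below is a tiny in-memory stand-in for a single topic partition, not the Kafka API. The class `Log`, its methods, and the group names are hypothetical and only illustrate why a slow batch consumer does not hold back a real-time consumer or the producer.

```python
class Log:
    """Tiny in-memory stand-in for one Kafka topic partition (append-only)."""

    def __init__(self):
        self.records = []
        self.offsets = {}  # consumer group -> next offset to read

    def append(self, record):
        # The producer never waits for any consumer.
        self.records.append(record)

    def poll(self, group, max_records):
        """Return up to max_records for this group and advance its offset."""
        start = self.offsets.get(group, 0)
        batch = self.records[start:start + max_records]
        self.offsets[group] = start + len(batch)
        return batch

    def lag(self, group):
        """Number of records this group has not read yet."""
        return len(self.records) - self.offsets.get(group, 0)


log = Log()
for i in range(10):
    log.append(f"event-{i}")

realtime_batch = log.poll("realtime-detector", max_records=100)  # keeps up
siem_batch = log.poll("batch-siem", max_records=3)               # slow consumer

lag_realtime = log.lag("realtime-detector")
lag_siem = log.lag("batch-siem")
print(lag_realtime, lag_siem)
```

The slow consumer simply accumulates lag and catches up later at its own pace, as long as the records are still within the retention configured for the log.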

Cybersecurity is required everywhere

Running a cybersecurity suite in one location is not sufficient. Of course, it depends on the enterprise architecture, use cases, and many other factors. But most companies have hybrid deployments, multi-cloud, and/or edge scenarios.

Therefore, many companies choose Confluent to deploy event streaming everywhere, including uni- or bi-directional integration, edge aggregation setups, and air-gapped environments. This way, one open and flexible template architecture enables streaming applications and real-time cybersecurity everywhere.

Here is an example of an end-to-end cybersecurity infrastructure. It combines serverless Kafka in the cloud (Confluent Cloud), self-managed, cloud-native Kafka on-premises (Confluent Platform), connected Kafka at the edge (a Confluent cluster on the ships), and a disconnected edge (a single broker in the drone):

End-to-End Cybersecurity with the Kafka Ecosystem

Let’s now take a look at a real-world example for a Kafka-powered cybersecurity platform.

CrowdStrike’s Kafka backbone for cybersecurity

CrowdStrike is a cybersecurity cloud solution for endpoint security, threat intelligence, and cyberattack response services. Kafka is the backbone of their infrastructure. They ingest ~5 trillion events per week into the cloud platform.

A cybersecurity platform needs to be up and responsive 24/7. It must be available, operational, reliable, and maintainable all the time. CrowdStrike defined four critical roles for operating their streaming data infrastructure: Observability, Availability, Operability, Data Quality.

CrowdStrike Kafka Cybersecurity Cloud Platform

Check out CrowdStrike’s tech blog to learn more about their Kafka infrastructure.

Kafka is (not) all you need for cybersecurity

This introductory post explored the basics of cybersecurity, how it relates to data in motion, and why it requires data in motion powered by Apache Kafka. The rest of the series will go deeper into specific topics that partly rely on each other.

Threat intelligence is only possible with situational awareness. Forensics is complementary. Deployments differ depending on security, safety, and compliance requirements.

I will also give a few more concrete Kafka-powered examples and discuss a few success stories for some of these topics. Last but not least, I will show different reference architectures where Kafka complements existing tools such as Zeek or Splunk within the enterprise architecture.

How do you solve cybersecurity risks? What technologies and architectures do you use? Let’s connect on LinkedIn and discuss it! Stay informed about new blog posts by subscribing to my newsletter.
