February 3, 2023

Using a Kafka Broker to Build Real-time Streaming Applications

Fred Patton

Many organizations that practice stream processing use Kafka brokers, the storage-layer servers that host Kafka topics. They may also use Kafka Streams to build applications and microservices that unlock insights based on logical or mathematical relationships between data sources and their states.

The challenge with this approach is that stateful operations such as aggregations and joins, when built from stream processing operators, can quickly become expensive. Businesses need a way to surface valuable insights from all sources in real-time without the overhead of fetching state from external streams each time an event is processed.

That’s exactly what SwimOS aims to do. Swim allows organizations to seamlessly run business logic on top of streaming data so they can understand what the information means and facilitate immediate action based on that meaning. Below we will explain:

The value of using real-time streaming applications with brokers
How Swim apps utilize business context from all relevant streams in real time
An example of a real-time streaming application in action

Why are real-time streaming applications needed to surface valuable insights?

Your streaming data contains a never-ending flow of events that are updates to the states of data sources. When designing an architecture to process streaming data, it’s important to recognize that an event broker (such as a Kafka broker) is a powerful tool that plays a central role but has limitations.

Brokers excel at unifying access to streaming data, but they are not an all-in-one solution for making sense of massive amounts of information. This is because effective analysis depends on the meaning of your events (the state changes the data represents) rather than on the data itself.

Real-time streaming applications are able to make sense of streaming data and provide valuable insights because they are always stateful, which means they always have all relevant state ready without needing to ask a database. Accessing a database to gather state is significantly slower than stateful, in-memory analysis.
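To make the contrast concrete, here is a minimal sketch of stateful, in-memory processing, written in plain Python rather than with any Swim or Kafka API. The names `RunningStats`, `state`, and `on_event` are hypothetical and chosen only for illustration. Each event updates state that is already in memory, so no database lookup is needed to produce an up-to-date answer.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RunningStats:
    """In-memory state for one data source, updated on every event."""
    count: int = 0
    total: float = 0.0

    def update(self, value: float) -> float:
        # Fold the new value into the existing state and return the mean.
        self.count += 1
        self.total += value
        return self.total / self.count


# Hypothetical state store: one RunningStats per source id, held in memory.
state = defaultdict(RunningStats)


def on_event(source_id: str, value: float) -> float:
    """Handle one event using only in-memory state (no database round trip)."""
    return state[source_id].update(value)


if __name__ == "__main__":
    events = [("tower-1", 10.0), ("tower-2", 4.0), ("tower-1", 20.0)]
    for source_id, value in events:
        print(source_id, on_event(source_id, value))
```

The running mean stands in for whatever analysis an application needs. The point is only that the state lives next to the computation, so the answer is refreshed as each event arrives.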

While in-memory databases and data grids can help enhance performance in some cases, none address the need to compute on the fly as data arrives. Real-time streaming applications allow organizations to continuously find and compute in the context of the event source (e.g., “within 100m” or “correlated to”). They illuminate the dynamic, fluid relationships between data sources that are crucial for accurate, meaningful insights.

How do Swim applications maximize the value of streaming data?

Unlocking the full value of streaming data requires more than analytics. The inability to use streaming data to fuel deep decision automation is a serious roadblock for many organizations: any moment can be an expiring opportunity, and if humans are not available to take action, that opportunity is lost. The window to act on time-sensitive matters can be minutes, seconds, or even milliseconds. Even when the window for a human response or an automated decision is longer, the window for processing each event may be much smaller, because of the real-time data rates involved and the number of sources that must be evaluated to provide full context.

Automating business decisions to meet these brief windows of opportunity requires applications to always have an answer from the latest data. Each new event must be statefully analyzed in real-time as soon as it arrives so that application outputs are also real-time. And, as mentioned above, the relationships among data sources (e.g., containment, proximity, and correlation) are critical for effective insights and action that result from the joint meaning of events over time.

SwimOS gives you an easy way to utilize business context from all relevant streams in real-time so you can build real-time streaming applications that analyze, learn, and predict on the fly. With SwimOS, every data source is represented by a concurrent, stateful actor called a Web Agent. Each Web Agent continuously receives events from its real-world source and statefully evolves like a smart digital twin.

Web Agents continuously process their own event data and link to other agents based on their real-world context, dynamically building an in-memory graph whose links indicate relationships such as proximity, containment, or correlation. Linked Web Agents see each other’s state changes in memory, in real-time. Web Agents continuously analyze, learn, and predict using their own state and the states of their linked neighbors in the graph. They then stream the resulting insights to brokers, data lakes, and enterprise applications.
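The idea of agents that link to one another by real-world context can be sketched in a few lines of plain Python. This is not the SwimOS Web Agent API. The `Agent` class, the `link_by_proximity` helper, the `load` field, and the 100-metre radius are all assumptions made for illustration. The sketch shows each agent holding its own state, linking to neighbors within a radius, and computing an insight from its own state plus its neighbors' state, all in memory.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical stand-in for a stateful agent (not the SwimOS API)."""
    name: str
    x: float  # position in metres
    y: float
    load: float = 0.0  # latest value reported by the real-world source
    neighbors: list = field(default_factory=list, repr=False, compare=False)

    def on_event(self, load: float) -> None:
        # Evolve this agent's own state from its latest event.
        self.load = load

    def neighborhood_load(self) -> float:
        # Combine own state with the current state of linked neighbors.
        loads = [self.load] + [n.load for n in self.neighbors]
        return sum(loads) / len(loads)


def link_by_proximity(agents, radius_m=100.0):
    """Link every pair of agents whose distance is within radius_m."""
    for a in agents:
        a.neighbors = [
            b for b in agents
            if b is not a and math.hypot(a.x - b.x, a.y - b.y) <= radius_m
        ]


if __name__ == "__main__":
    a = Agent("a", 0.0, 0.0)
    b = Agent("b", 60.0, 80.0)   # 100 m from a, so the two are linked
    c = Agent("c", 500.0, 0.0)   # too far away to link to a or b
    link_by_proximity([a, b, c])
    a.on_event(80.0)
    b.on_event(40.0)
    c.on_event(10.0)
    print(a.neighborhood_load(), c.neighborhood_load())
```

In a running system the links would be rebuilt as sources move or as correlations change. Here they are computed once to keep the sketch short.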

With Swim’s end-to-end streaming, the real-time experience extends all the way to the real-time user interface or decision automation function. There is no need to poll or subscribe to a separate streaming technology that induces lag. Real-time UIs and client streaming end-points are part of the same continuous data flow that runs in half-ping latency at the speed of the network.

Example of a real-time streaming application with Swim

Swim applications allow you to run business logic on top of your streaming data, which powers true real-time observability, live modeling of complex business operations, and responsive decision automation at scale. Here is a cellular network example application you can explore to understand how Swim connects to a Kafka broker and feeds messages to Web Agents. You can also see a hosted version of the application running here.
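The example application is the place to see how this is actually done with Swim. As a rough illustration of the general pattern only, the sketch below routes records to one stateful object per key, so that data arriving on different topics for the same cell tower ends up in the same in-memory state. It uses neither a Kafka client nor the Swim API: the `TowerAgent` and `Router` classes, the topic names, and the record fields are hypothetical, and a plain list of `(topic, key, value)` tuples stands in for records read from a broker.

```python
from dataclasses import dataclass, field


@dataclass
class TowerAgent:
    """Hypothetical per-tower agent; holds the latest state for one key."""
    tower_id: str
    latest: dict = field(default_factory=dict)
    events_seen: int = 0

    def on_message(self, topic: str, value: dict) -> None:
        # Merge the newest fields from any topic into this tower's state.
        self.latest.update(value)
        self.events_seen += 1


class Router:
    """Routes each record to the agent for its key, creating it on first use."""

    def __init__(self):
        self.agents = {}

    def dispatch(self, topic: str, key: str, value: dict) -> TowerAgent:
        agent = self.agents.get(key)
        if agent is None:
            agent = self.agents[key] = TowerAgent(key)
        agent.on_message(topic, value)
        return agent


if __name__ == "__main__":
    # Stand-in for records consumed from two (hypothetical) topics.
    records = [
        ("tower-kpis", "tower-7", {"dropped_calls": 3}),
        ("tower-capacity", "tower-7", {"utilization": 0.92}),
        ("tower-kpis", "tower-9", {"dropped_calls": 0}),
    ]
    router = Router()
    for topic, key, value in records:
        router.dispatch(topic, key, value)
    print(router.agents["tower-7"].latest)
```

Because both topics land in the same per-tower state, logic that needs fields from each can read them directly instead of performing a separate join.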

For an additional real-world example, let’s consider one of our customers who was able to seamlessly execute a sophisticated use case for real-time streaming applications using Swim. This customer needed to monitor more than 100 million mobile devices owned by subscribers and massive amounts of telephony data from cell towers (gigabits per second from multiple datasets).

Previous attempts to design applications for their needs (e.g., trigger automated expansion in the event of saturated capacity on a wireless network) suffered from both latency and lack of context. Insights weren’t arriving soon enough to be useful, and aggregate metrics gave an overall picture of network activity but still lacked complete context.

Swim enabled the company to solve these challenges by ensuring stateful, low-latency processing and maintaining continuous streaming context. This allowed the company to execute business logic using data from multiple Kafka topics in real-time. These computations included performance KPIs to identify subscriber issues, as well as user-scoring algorithms to measure customer experience. The company was able to use Swim to unlock:

A significant reduction in latency (down to ~500 milliseconds)
Elimination of redundant data transmission
Real-time visibility into subscriber performance
Real-time visibility into cell tower performance
A real-time interactive user interface for live data

Take a dip into Swim

Kafka brokers act as a useful buffer between the real world and applications, but unlocking the full potential of streaming data requires something more. That’s where SwimOS comes in to solve the “last mile” challenges, so businesses can build real-time applications on top of streaming data while maintaining full context and ultra-low latency.

Interested in joining the organizations that use Swim to build streaming applications for use cases such as network monitoring, user experience scoring, real-time situational awareness, continuous fault detection, and much more? Dive into our project-based tutorials and guides to quickly get up to speed on how to use Swim to build real-time streaming applications.