What Is Contextual Data, and How Is It Evolving?
Imagine you’re a credit card company and a charge comes in. All you know is that this charge is larger than the customer’s previous charge. Is it fraud or not? If it is and you allow it, you’ll lose money, and the fraud may continue; if it’s not and you deny it, you’ll frustrate or even lose a genuine customer.
In this scenario, you can’t make an informed decision because you only have one piece of data: the size of the charge. That’s why credit card companies gather and fuse multiple streams of data – building what’s known as contextual data – to inform decisions.
The trouble is that in this case, and many others, the decision needs to be instant and automatic. As a result, companies are increasingly learning that a single stream of data is not enough to support robust modern applications. Modern applications must be able to process a stream of data in real time alongside all necessary contextual data.
What is contextual data? And why is real-time access to it essential?
Contextual data is data associated with a given entity or object that provides context for that entity or object. Contextual data is essential to the interpretation of data and the making of informed, data-driven decisions.
Contextual data makes data meaningful – but only if it’s fresh.
Contextual data provides information that makes the interpretation of otherwise isolated data possible. Using the credit card fraud example again, you can imagine how a credit card company might notice suspicious activity and pull in contextual data that shows the customer is traveling and approve the charge. But if the contextual data is stale, and it turns out the customer returned home but lost their credit card along the way, then the company will have approved a fraudulent charge.
Contextual data is more useful the closer it is to real-time. The staler the data gets, the more likely it is to be misleading.
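The effect of staleness on the fraud example can be sketched in a few lines. This is a minimal illustration, not a real fraud model: the `TravelContext` record, the `decide_charge` function, and the one-hour `max_age_seconds` threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class TravelContext:
    """Hypothetical contextual record: where the customer is believed to be."""
    country: str
    observed_at: float  # seconds since epoch when this context was recorded


def decide_charge(charge_country: str, context: TravelContext,
                  now: float, max_age_seconds: float = 3600.0) -> str:
    """Return 'approve' or 'review' for a charge that looked suspicious."""
    age = now - context.observed_at
    if age > max_age_seconds:
        # Stale context may be misleading, so it is not trusted.
        return "review"
    if context.country == charge_country:
        # Fresh context explains the charge: the customer is travelling there.
        return "approve"
    return "review"


context = TravelContext(country="FR", observed_at=1_000_000.0)
decision_fresh = decide_charge("FR", context, now=1_000_600.0)  # context is 10 minutes old
decision_stale = decide_charge("FR", context, now=1_090_000.0)  # context is 25 hours old
```

With the same contextual record, the charge is approved while the record is ten minutes old and sent for review once it is 25 hours old.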
Example: A telco company can only diagnose poor network performance with fresh contextual data.
Imagine a telco company with cell towers around the country. If the company notices bad network performance from one of the towers, how can it diagnose the source of the problem? The diagnosis requires contextual data.
The diagnosis process starts with an initial signal: A cell tower sends a message to the telco company that indicates poor network performance. To inform the diagnosis, the telco company will want to know the following metrics, among others:
- How many subscribers are attached to the cell tower?
- How are the cell towers around the poorly performing cell tower performing?
- What are the current aggregate performance metrics for this market?
Each metric relies on data from different sources:
- The first requires access to the subscriber stream
- The second and third require access to multiple messages from the network performance stream, as well as access to topology data in an associated database
To make an effective diagnosis, the telco company needs to fuse streaming data and static data in real-time. Only then will the company have the contextual data necessary to make sense of the initial data.
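As a rough sketch of that fusion, the snippet below keeps the latest state from two streams in memory and joins it with static topology data on demand. The tower names, the `NEIGHBOURS` and `MARKET` tables, the 0 to 100 performance score, and the `TowerContext` class are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from statistics import mean

# Static topology data (hypothetical): neighbouring towers and market per tower.
NEIGHBOURS = {"tower-1": ["tower-2", "tower-3"],
              "tower-2": ["tower-1"],
              "tower-3": ["tower-1"]}
MARKET = {"tower-1": "north", "tower-2": "north", "tower-3": "north"}


class TowerContext:
    """Fuses the subscriber stream, the performance stream, and topology data."""

    def __init__(self):
        self.subscribers = defaultdict(set)  # tower -> attached subscriber ids
        self.latest_score = {}               # tower -> latest score (0 to 100)

    def on_subscriber_event(self, tower, subscriber_id, attached):
        if attached:
            self.subscribers[tower].add(subscriber_id)
        else:
            self.subscribers[tower].discard(subscriber_id)

    def on_performance_event(self, tower, score):
        self.latest_score[tower] = score

    def context_for(self, tower):
        """Answer the three diagnostic questions for one tower."""
        neighbour_scores = {n: self.latest_score[n]
                            for n in NEIGHBOURS.get(tower, [])
                            if n in self.latest_score}
        market_scores = [score for t, score in self.latest_score.items()
                         if MARKET.get(t) == MARKET[tower]]
        return {
            "subscriber_count": len(self.subscribers[tower]),
            "neighbour_scores": neighbour_scores,
            "market_average": mean(market_scores) if market_scores else None,
        }


ctx = TowerContext()
ctx.on_subscriber_event("tower-1", "sub-a", attached=True)
ctx.on_subscriber_event("tower-1", "sub-b", attached=True)
ctx.on_performance_event("tower-1", 20)
ctx.on_performance_event("tower-2", 90)
ctx.on_performance_event("tower-3", 70)
diagnosis_input = ctx.context_for("tower-1")
```

Here `diagnosis_input` shows two attached subscribers, healthy neighbours, and a market average well above the failing tower's score, which points at a problem local to `tower-1`.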
Stateless systems are limiting the use of contextual data, but stateful systems point a way forward.
Contextual data is not novel, but as companies increasingly rely on it to make decisions and rapidly respond to changes, they discover the limitations of technologies built on stateless systems.
Stateless systems will only get you so far.
A stateless microservice is a microservice that doesn’t maintain a state within its service across calls. Stateless microservices ingest, process, and send data without retaining any state information (hence “stateless”).
The same pattern applies to contextual data: A stateless system will receive a message from a data stream for a given entity and write it to a database. From there, a separate process will run queries that span multiple tables in the database to capture the complete context.
Depending on the complexity of the query, this process can take anywhere from seconds to hours.
Once processed, the system writes the result to another table in the database. The result, which is likely already stale given the time needed to process, is still not visible to downstream applications since these applications have to poll an API to get the results.
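The stateless flow can be sketched with an in-memory SQLite database standing in for the real one. The table names, the `ingest`, `run_batch_query`, and `poll` functions, and the sample values are assumptions for illustration only.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE topology (tower TEXT PRIMARY KEY, market TEXT);
    CREATE TABLE performance (tower TEXT, score INTEGER);
    CREATE TABLE market_summary (market TEXT PRIMARY KEY, avg_score REAL);
    INSERT INTO topology VALUES ('tower-1', 'north'), ('tower-2', 'north');
""")


def ingest(tower, score):
    """Stateless handler: persist the message and keep nothing in memory."""
    db.execute("INSERT INTO performance VALUES (?, ?)", (tower, score))
    db.commit()


def run_batch_query():
    """Separate job: join tables to build the context, then store the result."""
    db.execute("DELETE FROM market_summary")
    db.execute("""
        INSERT INTO market_summary
        SELECT t.market, AVG(p.score)
        FROM performance p JOIN topology t ON t.tower = p.tower
        GROUP BY t.market
    """)
    db.commit()


def poll(market):
    """Downstream application: polls for whatever result was last written."""
    row = db.execute("SELECT avg_score FROM market_summary WHERE market = ?",
                     (market,)).fetchone()
    return row[0] if row else None


ingest("tower-1", 20)
ingest("tower-2", 80)
before_batch = poll("north")  # nothing is visible until the batch job runs
run_batch_query()
after_batch = poll("north")
ingest("tower-1", 0)          # a new message arrives
stale_result = poll("north")  # pollers still see the previous average
```

Even in this toy version, a new message changes nothing downstream until the batch query runs again and the consumer polls again.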
Stateful systems provide the foundation for real-time contextual data.
A stateful microservice maintains a persistent state as it functions. Processes across stateful microservices run within the context of previous transactions. Previous transactions can inform new transactions.
Swim uses stateful entities (which we call Web Agents) to provide users with real-time contextual data – data processed and provided at a speed even the best stateless systems can’t emulate.
The process starts when you create a Web Agent for each entity in your system. You load the static data associated with that entity, and it becomes part of the state of the associated Web Agent. From there, you can send messages associated with a given entity from different streams to the Web Agent.
With this process running, you can perform business logic any time you receive a message for that Web Agent. Your business logic will have access to the state of the Web Agent, meaning it will have the contextual data it needs. The result of the business logic will then be added to the state of the Web Agent.
The business logic has access to the state of the Web Agent in a matter of nanoseconds because the state exists in memory (highlighting the necessity of using stateful microservices).
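The general shape of this pattern can be sketched in plain Python. The sketch does not use the Swim API: `EntityAgent`, its `on_message` method, the `capacity` field, and the score threshold of 50 are hypothetical names and values chosen for the example.

```python
class EntityAgent:
    """Hypothetical in-memory stateful entity; not the Swim Web Agent API."""

    def __init__(self, entity_id, static_data):
        self.entity_id = entity_id
        # The static data loaded for the entity becomes part of its state.
        self.state = {"static": dict(static_data), "latest": {}, "diagnosis": None}

    def on_message(self, stream, payload):
        """Handle one message from any stream that concerns this entity."""
        self.state["latest"][stream] = payload
        self._run_business_logic()

    def _run_business_logic(self):
        # All context is read from memory; no database query is needed.
        capacity = self.state["static"]["capacity"]
        subscribers = self.state["latest"].get("subscribers", 0)
        score = self.state["latest"].get("performance")
        if score is None:
            return
        if score >= 50:
            result = "healthy"
        elif subscribers > capacity:
            result = "overloaded"
        else:
            result = "unknown"
        # The result of the business logic is added back to the state.
        self.state["diagnosis"] = result


agent = EntityAgent("tower-1", {"capacity": 100})
agent.on_message("subscribers", 140)
first_diagnosis = agent.state["diagnosis"]   # no performance data yet
agent.on_message("performance", 30)
second_diagnosis = agent.state["diagnosis"]
```

After the second message, the entity has combined its static capacity, the subscriber count, and the performance score without leaving its own process.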
In other words, Web Agents enable Swim applications to perform contextual data analysis over streaming data. Web Agents are similar to digital twins, but whereas digital twins are passive, Web Agents give you the ability to take action on an entity.
Web Agents expose their state via streaming APIs, meaning that the newly computed state will be available, in real-time, to all downstream applications.
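The difference between pushing state and polling for it can be illustrated with a small observer. This is only a sketch of the idea under simplifying assumptions: `StreamedValue` and its methods are hypothetical, run in a single process, and are not Swim's streaming API.

```python
class StreamedValue:
    """Hypothetical observable value that pushes every new state to subscribers."""

    def __init__(self, initial=None):
        self._value = initial
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)
        if self._value is not None:
            callback(self._value)  # new subscribers receive the current state

    def set(self, value):
        self._value = value
        for callback in self._subscribers:
            callback(value)        # pushed immediately, no polling required


received = []
diagnosis = StreamedValue()
diagnosis.subscribe(received.append)  # a downstream application
diagnosis.set("healthy")
diagnosis.set("overloaded")
```

Each call to `set` reaches the downstream list as soon as the state changes, so `received` holds both updates in order.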
Contextual data requires a new paradigm.
Applications using Swim are orders of magnitude faster than applications using stateless microservices because the latter must either make multiple network calls or read from disk to query data from a database, whereas Swim keeps the state it needs in memory.
Swim is the only streaming application platform that combines Web Agents, streaming APIs, and real-time UIs, a combination that fully unlocks the potential of contextual data. With robust contextual data, companies can make time-critical decisions faster and more confidently.
Interested in seeing how real-time contextual data can impact your operations? Check out SwimOS to see how you can integrate real-time insights into your decision-making process today.