Eager vs. Lazy Architectures

Choice: Eager or Lazy (a.k.a. Push vs. Pull)

Every data-driven software architecture must make a fundamental choice: be eager or be lazy.

Streaming data is a movement to build less lazy data-driven applications. This debate is sometimes called push vs. pull.
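The difference can be sketched in a few lines of Python. This is only an illustration, assuming a running average as the derived value; the class names `LazyAverage` and `EagerAverage` are hypothetical and do not come from any particular framework.

```python
class LazyAverage:
    """Pull: store raw readings and compute only when someone asks."""

    def __init__(self):
        self.readings = []

    def ingest(self, value):
        # No work is done on arrival beyond storing the reading.
        self.readings.append(value)

    def query(self):
        # All computation happens at query time.
        if not self.readings:
            return None
        return sum(self.readings) / len(self.readings)


class EagerAverage:
    """Push: update the answer on every reading and notify subscribers."""

    def __init__(self):
        self.count = 0
        self.total = 0.0
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def ingest(self, value):
        # The derived value is recomputed incrementally as data arrives.
        self.count += 1
        self.total += value
        average = self.total / self.count
        for callback in self.subscribers:
            callback(average)
```

In the lazy version nothing happens until `query()` is called; in the eager version every subscriber sees the new average as soon as a reading arrives.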

Not a black-and-white decision (how eager or lazy?)

Possible to be eager about some things, and lazy about others.

How eager is the state of the art?

The status quo is very lazy. Most applications do nothing until asked (queried).

Streaming data is mostly used to eagerly truck data between lazy warehouses.

Stream processors will eagerly transform and count data as it’s moved.

What you get are data pipelines

Data pipelines recreate the monoliths the industry spent the last two decades dismantling.

The oil and gas industry is not a good reference architecture for streaming data

Data may be the new oil economically.

But data is not oil.

The value of OIL is that it’s made up of CARBON atoms. The value of DATA is NOT that it’s made up of BITS.

How do you control what you receive? (pub-sub)
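As a minimal sketch of how pub-sub lets a consumer control what it receives, the hypothetical in-memory `Broker` below delivers a message only to callbacks that subscribed to its topic. It is an assumption-laden toy, not the API of any real message broker.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory pub-sub broker (illustrative only)."""

    def __init__(self):
        # topic -> list of subscriber callbacks
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        # A consumer opts in to exactly the topics it cares about.
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Messages are pushed only to subscribers of this topic.
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)  # number of deliveries made
```

A consumer that subscribes to `"cpu"` never sees what is published on `"disk"`; the subscription is the control knob.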

Streaming data is about reducing latency

Use Case: Observability

Situational awareness
Monitoring & alerting
Experience scoring
Early fault warning

Use Case: Automation

Root cause analysis
Logistics routing
Self-healing systems
Performance optimization
Cost optimization

Message brokers vs. distributed transaction logs

Stream processors are good at counting and conversions

None of the above use cases are counting or conversion problems.

kSQL can transform streams and enrich with static context

kSQL struggles to join multiple streams together. Most use cases involve more than one stream.
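To make concrete what joining streams requires, here is a minimal in-memory sketch that keeps the latest value per key from two streams and emits a joined record once both sides are known. The class `LatestValueJoin` and its method names are hypothetical; this is not how kSQL implements joins, only an illustration of the per-key state a join needs.

```python
class LatestValueJoin:
    """Join two keyed streams on their most recent values (illustrative)."""

    def __init__(self):
        self.left = {}   # key -> latest value seen on the left stream
        self.right = {}  # key -> latest value seen on the right stream

    def on_left(self, key, value):
        self.left[key] = value
        return self._emit(key)

    def on_right(self, key, value):
        self.right[key] = value
        return self._emit(key)

    def _emit(self, key):
        # A joined record exists only once both streams have reported the key.
        if key in self.left and key in self.right:
            return (key, self.left[key], self.right[key])
        return None
```

Every event on either stream can produce a fresh joined record, so the state for each key has to be held close to where the events are processed.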

Global state doesn’t scale. Flink Stateful Functions are just a facade of statefulness: not optimized for locality, just a hidden query.

Challenge: Real-Time Enrichment

Can’t get rich without context. Stream processors are not built for context.

Challenge: Streaming Aggregation

Need to up-level streaming data without adding latency. Stream processors are not built for this either.
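One way to picture up-leveling without added latency is an incremental roll-up: each device reading adjusts its site's total by the delta, so the aggregate is current the moment the reading arrives. The sketch below is an assumption for illustration; `SiteRollup`, the site/device hierarchy, and the rule that a device belongs to exactly one site are all hypothetical.

```python
from collections import defaultdict


class SiteRollup:
    """Maintain per-site totals incrementally from per-device readings."""

    def __init__(self):
        self.device_value = {}                # device -> latest reading
        self.site_total = defaultdict(float)  # site -> sum of latest readings

    def on_reading(self, site, device, value):
        # Assumes each device always reports under the same site.
        previous = self.device_value.get(device, 0.0)
        self.device_value[device] = value
        # Apply only the change, so no batch recomputation is needed.
        self.site_total[site] += value - previous
        return self.site_total[site]
```

Because only the delta is applied, the cost per event is constant and no window has to close before the aggregate is usable.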

Challenge: Real-time business logic

Business logic drives enterprise software. Business logic takes action, has side effects.

Business logic needs lots of context as input. Stream processors are not designed to run business logic.

Challenge: Streaming APIs

Need to disseminate granular, derivative data without breaking the stream.

Without streaming APIs, you can’t build low-latency service-oriented architectures.

Challenge: Real-Time UIs

Need to keep humans in the loop. Humans need to validate and oversee automation.