On Streaming Processing and Data Platforms. Q&A with David Rolfe
Q1. How is stream processing different from traditional data processing?
“In the beginning”, there was only one computer, one CPU, and no network. The invention of networking and the PC led to client-server computing. In client-server, the client sends requests to a server, which responds with the results of the request. But if the request goes horribly wrong for a non-business reason, the client is left holding a cryptic internal error message, and has to figure out the next business steps itself.
What was always needed was a client-server troubleshooting architecture where non-business errors could be sent to a process that might be able to fix the issue, but it never came to pass.
The shift to streaming makes this worse, as our client-server metaphor breaks badly. In the streaming universe you get told of an event that is an unchangeable fact, and whoever is processing the stream has to handle all contingencies for success and failure, be they business or technical.
Q2. In particular, what are the challenges in streaming?
Aside from the issues with error handling I mention above, industrial IoT (IIoT) is a very different world from the kind of overly simplistic streaming scenarios you see powerpointed at conferences.
Industrial IoT gives us a live connection to real-world, mission-critical processes or systems. This is very different from the synthetic, managed environment software normally works in. In a software-only pipeline, if I don’t like the messages I’m getting from a component, I can patch it and make it behave. That isn’t so easy if the messages are coming from a piece of leased factory machinery whose behavior we have no way to change.
Real-world devices have life spans measured in decades. Manufacturers may be using old and boring mechanisms that work, and will not be interested in changing them. We are just going to have to find a way to cope with the near-infinite variety of messages we’re going to get.
Then we come to the next issue: We’re presumably trying to influence events in real time, which means latency becomes a challenge. It’s easy to point to scenarios where responding late could be worse than not responding at all. Our experience in this space has shown us that translating message formats between devices is a major part of the actual work, and this is what tends to slow things down in the world of industrial IoT.
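To make the translation and latency points concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the two vendor message formats, the field names, the 50 ms budget, and the 90 °C threshold are hypothetical and do not describe any real device or product. It shows per-device adapters normalizing messages into one common shape, followed by a deadline check that declines to act on an event that has gone stale.

```python
from typing import Optional

DEADLINE_MS = 50  # hypothetical latency budget for reacting to an event


def from_vendor_a(msg: dict) -> dict:
    # Hypothetical format: Fahrenheit under "tempF", timestamp in ms under "ts".
    return {
        "device": msg["id"],
        "temp_c": (msg["tempF"] - 32) * 5 / 9,
        "ts_ms": msg["ts"],
    }


def from_vendor_b(msg: dict) -> dict:
    # Hypothetical format: "sensor;celsius;epoch_seconds" packed in one string.
    sensor, celsius, epoch_s = msg["payload"].split(";")
    return {"device": sensor, "temp_c": float(celsius), "ts_ms": int(epoch_s) * 1000}


# One adapter per message source; new device types add an entry here.
ADAPTERS = {"vendor_a": from_vendor_a, "vendor_b": from_vendor_b}


def react(source: str, msg: dict, now_ms: int) -> Optional[str]:
    """Translate a device message, then decide whether it is still worth acting on."""
    event = ADAPTERS[source](msg)
    if now_ms - event["ts_ms"] > DEADLINE_MS:
        return None  # too late: responding now could be worse than not responding
    return "shutdown" if event["temp_c"] > 90 else "ok"
```

In this sketch the translation step runs before the deadline check, so time spent in the adapters counts against the latency budget.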
The last major issue is the complexity and eccentricity of real-world systems, which creates use cases that can’t be solved with a simple mapping or SQL statement.
Q3. Are traditional bilateral client-server interfaces fit for purpose in a streaming IIoT world?
No. If you look at how every traditional client-server API works, there are two parties involved, and the client is sent all the information about each event, even if it’s an error the client has a limited ability to cope with. This made sense in client-server computing but arguably became obsolete when the application server was invented and is downright problematic in a streaming context.
In many streaming scenarios, the incoming event is immutable, meaning it has definitively happened and can’t change, so sending an error message back can’t alter what the source of the events has already emitted. In which case, why do we send messages back to components that won’t read them and then find ourselves spending hours trawling through log files when things go wrong?
Q4. So are you saying that streaming requires a different approach?
Absolutely! And not only that: in a streaming context, we need some kind of ‘command and control’ mechanism or an ‘error bus’. Suppose a non-business error happens, such as a company name that is too long to store. The record that triggered the error, along with details of the failure, needs to be sent to someone who both cares and is capable of addressing the issue.
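As a rough illustration of the ‘error bus’ idea, the following Python sketch routes a non-business failure to a separate channel instead of back to the sender. It is a sketch under stated assumptions, not a description of how Volt or Kafka does this: `ErrorBus`, `ErrorReport`, `NonBusinessError`, and the 40-character limit are all hypothetical names and values, and the bus is an in-memory list standing in for a real topic or queue.

```python
from dataclasses import dataclass, field

MAX_COMPANY_NAME = 40  # hypothetical storage limit, chosen for illustration


class NonBusinessError(Exception):
    """A technical failure that the event's originator cannot act on."""


@dataclass
class ErrorReport:
    record: dict  # the record that triggered the failure
    reason: str   # details of what went wrong


@dataclass
class ErrorBus:
    """In-memory stand-in for a dedicated error channel."""
    reports: list = field(default_factory=list)

    def publish(self, report: ErrorReport) -> None:
        self.reports.append(report)


def store_company(record: dict, table: list) -> None:
    name = record["company_name"]
    if len(name) > MAX_COMPANY_NAME:
        raise NonBusinessError(
            f"company_name is {len(name)} chars, limit is {MAX_COMPANY_NAME}"
        )
    table.append(record)


def process_stream(records, table: list, bus: ErrorBus) -> None:
    for record in records:
        try:
            store_company(record, table)
        except NonBusinessError as exc:
            # Send the record and the failure details to whoever owns the fix,
            # rather than replying to a source that will never read the reply.
            bus.publish(ErrorReport(record=record, reason=str(exc)))


table = []
bus = ErrorBus()
process_stream(
    [{"company_name": "Acme"}, {"company_name": "X" * 60}],
    table,
    bus,
)
```

After this runs, the valid record is in `table` and the oversized one sits on the bus with its failure reason attached, ready for a person or process that can address it.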
This opens up a broader issue, which is that streaming processes aren’t always “source -> sink” flows and could involve multiple possible outputs and even recursion. Given that all of this is asynchronous and not directly visible to a user, we’re a long way from traditional client-server territory.
Another issue is that in the streaming universe, you don’t have the same syntax guarantees you’d have in a traditional API-based system. Your next Kafka record could be anything, not just what you were hoping for.
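One way to cope with records that could be anything is to validate each one defensively before acting on it. The Python sketch below assumes, purely for illustration, that records arrive as raw bytes expected to hold UTF-8 JSON with a `device_id` string and a numeric `reading`; that schema and the `parse_record` helper are hypothetical.

```python
import json

# Hypothetical schema: field name -> accepted Python type(s).
REQUIRED_FIELDS = {"device_id": str, "reading": (int, float)}


def parse_record(raw: bytes):
    """Return (record, None) on success or (None, reason) on failure."""
    try:
        doc = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        return None, f"not valid UTF-8 JSON: {exc}"
    if not isinstance(doc, dict):
        return None, "top-level value is not an object"
    for name, expected in REQUIRED_FIELDS.items():
        if name not in doc:
            return None, f"missing field {name!r}"
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(doc[name], bool) or not isinstance(doc[name], expected):
            return None, f"field {name!r} has unexpected type"
    return doc, None
```

Returning a reason instead of raising keeps the consuming loop running, and the reason can be forwarded to whatever channel handles non-business errors.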
Q5. So do you see two separate and distinct mechanisms emerging for managing traditional client-server and streaming applications?
I think we’re definitely going to see people revisiting how streaming interfaces work, and I can foresee the emergence of a sort of “client-server-complaints dept” mode for both streaming and traditional client-server, with a separate channel for reporting non-business errors.
But we could also see hybrid applications emerging. Streaming is so useful, especially in the context of industrial IoT, that it’s going to become part of almost every application. But we’ll still need synchronous client-server stuff for OLTP applications, and I can foresee applications that do both, while working with the same data.
Q6. Doesn’t that imply a whole new kind of application stack?
What we’re seeing is people trying to extend what they have to cope with the new reality. Over the last few years, we’ve already seen Kafka try to reposition itself as a database, with mixed results. But we’ve also seen a lot of NoSQL platforms suddenly sprout streaming capabilities.
Volt has had streaming connectivity for years, and last year we got to the point where a Volt server could sit on the network and successfully masquerade as a Kafka server. So you could send messages to ‘Kafka’ and they would in fact go to Volt, or subscribe to a Kafka topic that is, in fact, coming from a Volt server. We are continuing our R&D in this area and will be announcing something at Big Data LDN in September. Bear in mind, this is early days for this new hybrid model, and I think we’ll see a lot of developments, especially in the on-prem space.
Q7. Why do you think on-prem is going to be so important for hybrid client-server/stream applications?
Two reasons.
First, because these applications will have a significant client-server component, it’s hard to see how a cloud provider could meet all their needs with a one-stop shop, unless you’re willing to go ‘all in’ on that cloud provider’s stack.
Second, cost is starting to become a big issue in cloud-hosted streaming. The issue is that while most companies have small quantities of streaming data, a minority have huge tsunamis of it. We’ve had multiple conversations with customers who are getting sticker shock from hosted streaming solutions and are looking to repatriate workloads.
Q8. Anything else you wish to add?
We’re at what some people might call an ‘inflection point’. Direct human activity is no longer driving growth. Industrial IoT traffic is, which means streaming data becomes more and more important. In response, the streaming data companies and streaming data platforms are frantically adding database-like capabilities to their products, and the database companies are equally frantically adding streaming data capabilities. I would argue that, in both cases, this is being done in fairly clumsy ways.
Volt is an in-memory data platform with database capabilities that has had a streaming data-specific worldview since its birth. In September, we’ll be releasing the next iteration of our real-time stream processing capability, and I think people will be impressed.
……………………………
David Rolfe brings 20+ years of experience managing data in the telecom industry. David helps telecom software vendors meet the scale and latency requirements imposed by 5G data utilizing Volt Active Data. He helps companies take the steps they need to deploy mass-scale, ultra-low latency transactional applications in cloud-native environments. He has over 25 years of experience with high-performance databases and telco systems and demonstrated expertise with charging and policy systems. He has authored multiple patents relating to geo-replicated conflict resolution.
Sponsored by Volt Active Data.