Reasoning first from principles: Our journey towards building the most advanced edge intelligence platform for Industrial IoT
By Abhi Sharma, Head of Analytics and Machine Learning, FogHorn Systems
In part one of this four-part series, we discussed the value and the absolute necessity of edge computing as a paradigm for the modern world, especially for industrial IoT (IIoT). Bandwidth, storage costs and latency requirements simply won't allow a centralized cloud computing model for IoT. To that end, FogHorn architected its core data ingestion and data multiplexing stack for the edge with extremely low latency and high throughput in mind. From an engineering standpoint, it was really exciting for me to discuss the specifics of our systems engineering and high-performance programming techniques in the last blog.
In this post, we evaluate another inevitable first principle about IoT: that nearly all of this real-time, real-world information needs to be analyzed right at the edge, closer to the source of data. I also showcase how, reasoning from this particular first principle, we engineered real-time complex event processing (CEP) semantics and advanced analytic capabilities “right at the edge.” These capabilities are instantly accessible, usable and deployable by anyone needing an IoT solution without requiring IT expertise.
For the first time in computing history, we are collecting massive amounts of data from our environment. It is real-world, real-time, heterogeneous data from various sophisticated endpoint devices and sensors (e.g. vision, location, temperature, pressure, acceleration, luminosity, etc.). This is fundamentally different from the traditional internet data of text and static images (e.g. e-commerce transactions, Instagram likes, search queries, clicks or Twitter posts) that is most familiar to us. In IoT, the physical world is getting instrumented, and the scale of this data is mind-boggling. Much of it will need to be processed in real time at the edge itself.
The next-generation machine learning (ML) and artificial intelligence (AI) applications that truly add business value in real-time will all happen at the edge, with the cloud providing a larger learning and orchestration mechanism. Why is this the case?
Well, as an analogy, think about the human brain. You are reading this right now. You are already continuously summarizing this information and making sense of it. Although you may not remember precisely what the third line of the second paragraph said, you already have the core context of this post in your head. Now, imagine if you had to read the entire post first, and then your brain had to run a batch job to process and understand it. That would be inefficient and would significantly slow down your learning and doing. Making sense now? FogHorn acts just like the fast-processing part of your brain, doing complex, real-time analysis to extract events, meaning and insights.
Principle: Massive heterogeneous IoT data needs to be processed mostly at the edge. Advanced non-trivial analytics, coupled with localized domain expertise, need to be performed in real time right at the edge, even on low-footprint form factors, to produce business outcomes.
Reasoning from the above first principle
Drawing from the discussion in part one, unlike a data center, devices at the edge are extremely heterogeneous and are often compute- and memory-constrained. Imagine a Raspberry Pi in an elevator, a small embedded device in a parking lot or a camera at a traffic light: all of them need to analyze continuous signals. The form factor of IoT gateways (the size of a cable modem) in manufacturing plants, oil and gas facilities or transport vehicles is much more palatable at the edge than massive data center racks. Infrastructure at the edge is distributed, relatively cheap and easily replaceable. Nobody wants to deal with data center-like racks of infrastructure at the edge; that's the main reason we move data to the cloud in the first place. In addition, in many physical environments, heavy infrastructure is simply not possible.
To make matters more complicated, although the physical environments are heterogeneous and the devices constrained, the analytics complexity and data processing needs are anything but trivial. In fact, the more asynchronous data sources you have, the more you must look at them all simultaneously, combining data in interesting ways from continuous, real-time streams to derive insights.
Another question you might be asking is: why "real-time"? Because, going back to my brain analogy above, latency matters. And even where latency is not a concern, we simply don't have the bandwidth and storage at the edge that a data center has. Data and situations have to be analyzed in-flight, just as the human brain does with all of our sensory inputs. Basic filtering, aggregation or an if-then-else-alert rules engine is not enough. To move the needle, we must detect complex patterns over multiple data sources, express complex analytic intent, and run these programs right at the edge. There is no way you can put a cloud platform like Apache Spark, Beam or Hadoop on an embedded device or an IoT gateway and still have the resources to do non-trivial data analysis.
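To make "in-flight" concrete, here is a minimal sketch (in Python, purely for illustration; FogHorn's actual engine is not written this way) of the streaming style described above: each reading is examined exactly once as it arrives, and memory stays bounded regardless of how long the stream runs, unlike a batch job that must first store everything. The class name, window size and threshold are all hypothetical.

```python
from collections import deque

class SlidingWindowMonitor:
    """Illustrative in-flight processor: each reading is seen once,
    and memory is bounded by the window size, not the stream length."""

    def __init__(self, window_size, threshold):
        self.window = deque(maxlen=window_size)  # bounded buffer
        self.threshold = threshold

    def ingest(self, value):
        self.window.append(value)
        rolling_avg = sum(self.window) / len(self.window)
        # Emit an event only when the rolling average crosses the threshold
        return rolling_avg > self.threshold

# Hypothetical temperature readings arriving one at a time
monitor = SlidingWindowMonitor(window_size=4, threshold=50.0)
readings = [20, 30, 80, 90, 95, 10]
events = [monitor.ingest(r) for r in readings]
print(events)  # [False, False, False, True, True, True]
```

The key property is that the stream is never buffered in full: a batch system would need all six readings (or six million) before producing anything, while this loop produces a decision per reading with constant memory.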
Engineering the platform
Obviously, solving the problem outlined above is quite the engineering challenge and, to no surprise, this is where most of FogHorn's technical innovation stems from. We needed to offer full-blown, cloud-like analytic semantics right at the edge and make them available on really low-footprint devices (both Intel/ARM, 32- and 64-bit), while also completing most processing in near real time. To that end, we created a CEP engine that exposes a simple, reactive, functional programming language called VEL, written from the ground up specifically for IoT use cases. VEL offers streaming semantics and expressibility, with low-latency, low-footprint analytics right at the edge; VEL programs can run in just a few megabytes of memory. VEL provides complex pattern matching, asynchronous stream processing and cross-stream event correlation semantics, a first-class notion of time, numerous data slicing/dicing/aggregation/filtering capabilities, 200+ mathematical and statistical functions, various machine learning functions, data windowing, and first-class support for all standard physical units, all built in.
So why is our CEP engine so special, tiny and fast, and how can it still offer all these advanced analytic capabilities? Analytics authored in FogHorn using VEL are compiled and run natively on the target hardware platform, giving optimized runtime performance and memory characteristics. You can import existing, trained ML models into FogHorn via PMML; these are code-generated into efficient VEL scripts that run at the edge, so the engine supports both complex analytics and pre-trained ML models on constrained devices. This is where we really innovated: we created efficient machine representations while still offering hyper-composable pattern-matching techniques. In fact, we never have to buffer or use backtracking techniques, because we never need to evaluate incoming tokens more than once for semantic extraction unless explicitly required. This allows for extremely tight runtime representations. To summarize, we rigorously evaluate all of our design choices, and we always reason from first principles. This enables our CEP engine to be an order of magnitude smaller than Python or Java runtimes while still offering the same richness in functionality and fine-grained control over memory characteristics.
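The "evaluate each token once, no buffering, no backtracking" property described above is characteristic of state-machine-based matching. The sketch below (my own illustrative Python, not FogHorn's implementation) detects a hypothetical "two rises followed by a spike" pattern: each incoming value updates a small fixed amount of state and is then discarded, so memory stays constant and no value is ever revisited.

```python
def make_detector(spike=100.0):
    """Single-pass pattern detector: fires when a value above `spike`
    arrives after at least two consecutive rises. Each token is
    consumed exactly once; state is a counter and the last value."""
    state = {"rises": 0, "prev": None}

    def step(value):
        fired = False
        if state["prev"] is not None and value > state["prev"]:
            state["rises"] += 1       # extend the rising run
        else:
            state["rises"] = 0        # run broken, reset
        if state["rises"] >= 2 and value > spike:
            fired = True
            state["rises"] = 0        # report once, then re-arm
        state["prev"] = value         # previous token can now be dropped
        return fired

    return step

detect = make_detector(spike=100.0)
print([detect(v) for v in [10, 20, 30, 150, 90, 120]])
# [False, False, False, True, False, False]
```

Contrast this with regex-style backtracking matchers, which may re-examine earlier input when a partial match fails; on an unbounded sensor stream that would require buffering history, which is exactly what a few-megabyte edge footprint cannot afford.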
In part three of this series, we will talk about how offering a great technical platform is still not enough to capture the IIoT market. We'll discuss why and how we focused on keeping the product very OT (operational technology) centric. Creating real business value in IIoT meant the platform had to be easy enough for domain experts across various industries to capture their tribal knowledge by describing their analytic intents and goals directly.