CEO Interview: David King on the Future of IIoT Edge Intelligence

Edge computing has emerged as a key pillar of digital transformation for industrial organizations. FogHorn CEO, David C. King, sat down for a conversation with Jeff Frick at theCUBE Studio to discuss the current and future state of Edge Computing, and the tremendous potential it holds for improving industrial operations.


David C King, FogHorn Systems CUBEConversation, November 2018



Interview Transcript:


Jeff Frick:  [00:12] Welcome back, everybody. Jeff Frick here with The CUBE. We're at our Palo Alto studios having a CUBE Conversation, a little break in the action of the conference season before things heat up, before we come to the close of 2018. It's been quite a year. It's nice to be back in the studio.

[00:24] Things are a little less crazy and we're excited to talk about one of the really hot topics right now, which is edge computing, fog computing, cloud computing. What do all these things mean? How do they all intersect? We've got with us today David King. He's the CEO of FogHorn Systems. David, first off, welcome.

David King:  [00:39] Thank you, Jeff.

Jeff:  [00:40] FogHorn Systems. I guess by the fog, you guys are all about the fog. For those who don't know, fog is this intersection between cloud and on prem. First off, give us a little background of the company and let's jump into what this fog thing is all about.

David:  [00:54] Sure. It all dovetails together. Yeah, you're right. FogHorn, the name itself, came from Cisco, which invented a term called fog computing almost a decade ago. It connoted this idea of computing at the edge, but it didn't really have a lot of definition early on.

[01:09] FogHorn was started actually by a Palo Alto incubator just nearby here that had the idea that we ought to put some real meaning and some real meat on the bones with fog computing.

[01:19] What we think FogHorn has become over the last three and a half years since we took it out of the incubator, since I joined, was to put some real purpose, meaning, and value in that term.

[01:29] It's more than just edge computing. Edge computing is a related term. In the industrial world people have said, "I've had this compute for 30, 40, 50 years with my production line controllers and my distributed control systems. I've got hard-wired compute. I run what they call industrial PCs in the factory." That's edge compute.

[01:47] The IT world came along and said, "No, no. Fog computing is a more advanced form of it." The real purpose of fog computing and edge computing in our view in the modern world is to apply what has traditionally been thought of as cloud computing functions, big, big data but running in an industrial environment, running on a machine.

[02:04] We call it really big data operating on the world's smallest footprint. The real point of this for industrial customers, which is our primary focus, industrial IoT, is to deliver as much analytics, machine learning, deep learning, AI capability as possible on live streaming sensor data.

[02:23] What that means is rather than persisting a lot of data either on prem and then sending it to the cloud or trying to stream all this to the cloud to make sense of terabytes or petabytes a day per machine sometimes.

[02:33] Think of jet engines, a petabyte every flight. You want to do the compute as close to the source as possible. If possible on the live streaming data, not after you persisted it on a big storage system. That's the idea.

Jeff:  [02:45] You touched on all kinds of stuff there, so let's break it down and unpack it. The first one is just the OT, IT thing. I think that's really important. We talked before turning the cameras on about Dr. Tom from HP loves to make a big symbolic handshake of operations technology and the marriage of these two things.

[03:02] Whereas before as you said, the OT guys, guys who've been around factories, they've been doing this for a long time and now suddenly the IT folks are butting in and want to get access to that data to provide more control. As you see the marriage of those two things coming together, what are the biggest points of friction? What's the biggest opportunity?

David:  [03:20] Great set of questions. Quite right. The OT folks are inherently suspicious of IT. If you don't know the history, 40 plus years ago there was a fork in the road where in factory operations were they going to embrace things like Ethernet, the Internet, connected systems.

[03:38] They purposely air gapped or islanded those systems because that was all about machine control, real time for safety, productivity, and up time of the machine.

[03:47] You can't use standard Ethernet. It has to be industrial Ethernet. It has to have timing that's deterministic. It can't be a retry kind of a system. It's a different MAC layer for Ethernet, for example. What does the physical wiring look like? It's also different cabling because you can't have cuts in the cable. It's a different environment entirely that OT grew up in.

[04:08] FogHorn is trying to really bring the value of what people are delivering for AI essentially into that environment in a way that's nonthreatening to, supplemental to and adds value in the OT world. Dr. Tom is right.

[04:22] This idea of bringing IT and OT together is inherently challenging because these were fork in the road, islanded networks if you will, different systems, different nomenclature, different protocols. There's a real education curve that IT companies are going through.

[04:40] The idea of taking all this OT data that's already being produced in tremendous volumes already before you add new kinds of sensing and sending it across a LAN which it never talked to before, then across a WAN to go to a cloud to get some insight doesn't make any sense.

[04:57] You want to leverage the cloud. You want to leverage data centers. You want to leverage the LAN. You want to leverage 5G. You want to leverage all the new IT technologies, but you have to do it in a way that makes sense for and adds value in the OT context.

Jeff:  [05:08] I'm just curious. You talked about air gapping the two systems, which means they are not connected, right?

David:  [05:14] No, they're connected to themselves.

Jeff:  [05:16] But before the OT system was air gapped from the IT system, so thinking about security and those types of threats. If those things are connected, that security measure has gone away. What is the excitement, adoption, scare when now suddenly these things that were separate especially in the age of breaches that we know happen all the time, as you bring those things together.

David:  [05:42] Absolutely. In fact, there have been cyber breaches in the OT context. Think about Stuxnet. Think about things that have happened. Think about the utilities back east that were found to have malware implanted in them. This idea of industrial IoT is very exciting. The ability to get real time game changing insights about your production. That's a huge amount of economic activity.

[06:04] The world could be dramatically improved. You can talk about trillions of dollars of value, which is what McKinsey, BCG, and Bain talk about. By bringing AI, ML into the plant environment.

[06:15] The inherent problem is that by connecting these systems, you introduce security problems. You're talking about a huge amount of cost to move this data around, persist it, and then add value, and it's not real time.

[06:27] It's not that cloud is not relevant. It's not that it's not used. It's that you want to do the compute where it makes sense. For industrial, the more industrialized the environment, the more high‑frequency, high‑volume data, the closer to the system that you can do the compute, the better.

[06:44] Again, it's multilayer of compute. You probably have something on the machine, something in the plant, and something in the cloud.

[06:50] Rather than send raw OT data to the cloud, you're going to send processed intelligent metadata insights that have already been derived at the edge to update what they call the fleet wide digital twin for the whole fleet of assets sitting in the cloud. The digital twin of the specific asset should probably be on the asset.

Jeff:  [07:07] Let's break that down a little bit. There's so much good stuff here. We talked about OT IT and that marriage. Next I just want to touch on cloud because a lot of people know cloud and it's very hot right now. The ultimate promise of cloud is you have infinite capacity available on demand and you have infinite compute and hopefully you have some big fat pipes to get your stuff in and out.

[07:29] The OT challenges, as you said the device challenge is very, very different. They've got proprietary operating systems they've been running for a very, very long time. As you said, they've put off boatloads and boatloads and boatloads of data.

[07:40] That was never really designed to feed a machine learning algorithm or an artificial intelligence algorithm when these things were designed. It wasn't really part of the equation.

[07:51] We talk all the time about do you move the compute to the data? Do you move the data to the compute? Really what you're talking about in this fog computing world is kind of a hybrid if you will of trying to figure out which data you want to process locally and which data you have time, relevance, other factors that just go ahead and pump it up stream.

David:  [08:12] That's a great way to describe it. Actually, we're trying to move as much of the compute as possible to the data. That's why we say fog computing is a nebulous term about edge compute. It doesn't have any value until you actually decide what you're trying to do with it.

[08:26] What we're trying to do is take as much of the harder compute challenges like analytics, machine learning, deep learning, AI, and bring it down to the source. As close to the source as you can because you can essentially streamline or make more efficient every layer of the stack. Your models will get much better. You might have built them in the cloud initially.

[08:43] Think about a deep learning model, but it may only be about 60, 70 percent accurate. How do you do the improvement of the model to get it closer to perfect? I can't go send all the data up to keep trying to improve it.

[08:52] Well, typically what happens is I downsample the data, I average it, and I send it up, and I don't see any changes in the averaged data. Guess what? We do inference all the time on all the data, run in our stack, and then send the metadata up.

[09:05] Then have the cloud look across all the assets of a similar type and say, "Oh, the global fleet‑wide model needs to be updated," and then to push it down. With Google, just about a month ago in Barcelona at the IoT show, what we demonstrated was the world's first instance of AI for industrial, which is closed‑loop machine learning.

[09:21] We were taking in a model, a TensorFlow model trained in the cloud in the data center, brought into our stack. We're running 100 percent inferencing on all the live data, pushing the insights back up into Google Cloud, and then automatically updating the model without a human data scientist having to look at it because it's essentially ML on ML. That to us, ML on ML, is the foundation of AI for industrial.
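
The down-sampling point in this answer can be illustrated with a minimal Python sketch. It is not FogHorn's stack: the readings, the threshold, and the function names (`window_average`, `infer_on_stream`) are assumptions made up for illustration. It shows a brief spike nearly vanishing in a window average, while a per-sample check at the edge catches it and produces a small metadata record instead of the raw stream.

```python
from statistics import mean

# Hypothetical vibration readings: steady around 1.0 with one brief spike.
readings = [1.0] * 50 + [9.0] + [1.0] * 49

THRESHOLD = 5.0  # assumed alert level, chosen only for this example


def window_average(samples):
    """What a down-sampled upload would carry: one number per window."""
    return mean(samples)


def infer_on_stream(samples, threshold):
    """Check every sample at the edge and return compact event metadata."""
    events = []
    for index, value in enumerate(samples):
        if value > threshold:
            events.append({"index": index, "value": value, "event": "spike"})
    return events


average = window_average(readings)
events = infer_on_stream(readings, THRESHOLD)

print(f"window average: {average:.2f}")   # the spike barely moves the average
print(f"events sent upstream: {events}")  # one small record, not 100 samples
```

In this toy case the averaged value stays far below the alert level, so a cloud-side check on the average alone would see nothing unusual, while the edge-side check reports the single event.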

Jeff:  [09:43] I just love it. Something comes up all the time. We used to make decisions based on the sampling of historical data after the fact.

David:  [09:50] That's right. That's how the world's been doing it.

Jeff:  [09:51] Right now the promise of streaming is you can make it based on all the data all the time in real time. It's a very, very different thing. As you talk about running some complex models and running ML and retraining these things, when you think of edge you think of some little hockey puck that's out on the edge of a field with limited power, limited connectivity.

[10:15] What is the reality? How much power do you have at some of these more remote edges? We always talk about the field of turbines, oil platforms. How much power do you need? How much compute that actually starts to be meaningful in terms of the platform for the software?

David:  [10:31] There's definitely use cases. Thinking about smart meters in the home. Older generation of those meters may have had very limited compute. Talking about single megabyte of memory or less, kilobytes of memory. Very hard to run a stack in that kind of footprint.

[10:48] The latest generation of smart meters have about 250 megabytes of memory. A Raspberry Pi today is anywhere from half a gig to a gig of memory. We're fundamentally memory bound. Obviously CPU matters if you're trying to run really fast compute like vibration analysis or acoustic or video.

[11:02] If you're just trying to take digital sensing data like temperature, pressure, velocity, torque, humidity, we can take all of that, believe it or not, run literally dozens and dozens of models. Even train the models in something as small as a Raspberry Pi or a low-end x86. Our stack can run on any hardware. We're completely OS independent. It's a full up software layer.

[11:23] The whole stack is about 100 megabytes of memory with all the components, including Docker containerization, which compares to about 10 gigs of running a stream‑processing stack like Spark in the cloud. It's that order of magnitude of footprint reduction and speed of execution improvement.

[11:41] As I said, the world's smallest, fastest compute engine. You need to do that if you're going to talk about like a wind turbine. It's generating data every millisecond. You have high‑frequency data like turbine pitch and you have other contextual data you're trying to bring in like wind conditions, reference information about how the turbine is supposed to operate.

[12:00] You're bringing in a torrential amount of data to do this computation on the fly. The challenge for a lot of the companies that have really started to move into this space, the cloud companies like our partners at Google, Amazon, and Microsoft is they have great cloud capabilities for AI ML. They're trying to move down to the edge by just transporting the whole stack.

[12:18] In a plant environment, that might work if you have a massive data center that can run it. Now I've got to stream all my assets, all the data from all my assets to that central point.

[12:27] What we're trying to do is come out the opposite way which is by having the world's smallest, fastest engine we can run it in a small compute, very limited compute on the asset or near the asset, or you can run us in a big compute and we can take on lots and lots of use cases or models simultaneously.

Jeff:  [12:44] I'm just curious in the small compute case. Again, you want all the data to analyze it. Does it eventually go back? Are there a lot of cases where you can get the information you need off the stream and you don't necessarily have to save or send that up stream?

David:  [13:00] Fundamentally today in the OT world, the data usually gets...If the PLC, the production line controller, that has simple KPIs, if temperature goes to X or pressure goes to Y, do this. Those simple KPIs, if nothing is executed, it gets dumped into a local protocol server.

[13:15] Then about every 30, 60, 90 days it gets rewritten over. Nobody ever looks at it. That's why they say 99 percent of the brownfield data in OT has never really been mined for insight. If you're doing inferencing and doing real time decision making, real time action with our stack, what you would then persist is metadata insights.

[13:35] Here is an event or here is an outcome. Oh, by the way, if you're doing deep learning or machine learning and you're seeing deviation or drift from the model's prediction, you probably want to keep that and some of the raw data packets from that moment in time and send that to the cloud or data center to say, "Oh, our fleet‑wide model may not be accurate or may be drifting."

[13:56] What you want to do is different horses for different courses. Use our stack to do the lion's share of the heavy duty real time compute, produce metadata that you can send to either a data center or a cloud environment for further learning.
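
The pattern described in this answer (persist metadata by default, and keep some raw data only when the model appears to drift) can be sketched in Python. This is a minimal illustration under stated assumptions, not FogHorn's implementation: the `DriftMonitor` class, the tolerance, the buffer size, and the fixed `predicted` value are all hypothetical.

```python
from collections import deque


class DriftMonitor:
    """Track prediction error at the edge and flag raw data worth keeping."""

    def __init__(self, tolerance, buffer_size=5):
        self.tolerance = tolerance                # assumed acceptable |error|
        self.recent = deque(maxlen=buffer_size)   # rolling buffer of raw samples
        self.uploads = []                         # what would go to the cloud

    def observe(self, raw_value, predicted):
        """Record one sample; return True if it deviates from the prediction."""
        self.recent.append(raw_value)
        error = abs(raw_value - predicted)
        if error > self.tolerance:
            # Drift: keep the insight plus the raw samples around this moment.
            self.uploads.append({
                "event": "drift",
                "error": error,
                "raw_window": list(self.recent),
            })
            return True
        return False


monitor = DriftMonitor(tolerance=2.0, buffer_size=3)
predicted = 10.0  # stand-in for a model's prediction
for value in [10.1, 9.8, 10.3, 14.0, 10.0]:
    monitor.observe(value, predicted)

print(monitor.uploads)
```

Only the one deviating reading, together with the few raw samples buffered around it, is queued for upload; the readings that match the prediction produce nothing to send.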

Jeff:  [14:11] Your piece is really the gathering and the ML. Then, if it needs to go back up for more heavy lifting, you'll send it back up? Or do you have the cloud application as well that connects if you need it?

David:  [14:21] We've built connectors to Google Cloud Platform, Google IoT Core, to AWS, S3, to Microsoft Azure, virtually any, Kafka, Hadoop, we can send the data wherever you want. Either on plant right back into the existing control systems.

[14:35] We can send it to OSIsoft PI, which is a great time series database that a lot of process industries use. You can also, of course, send it to any public cloud or a Hadoop data lake private cloud.

[14:44] You can send the data wherever you want. One of our components is a time series database. You can also persist in memory in our stack just for buffering or if you have high value data that you want to take a measurement or value from a previous calculation and bring it in to another calculation we're doing later. It's a very flexible system.

Jeff:  [15:01] We were at OSIsoft PI World earlier this year. Some fascinating stories that came out. The building maintenance and all kinds of stuff. I'm just curious. Some of the easy to understand applications that you've seen in the field and maybe some of the ones that were a surprise on the OT side. Obviously, preventive maintenance was toward the top of the list.

David:  [15:24] I call it the layer cake. Especially when you get to remote assets that are either not monitored or are lightly monitored, they call it drive by monitoring. Someone shows up and listens, or looks at a valve or gauge and leaves. Condition‑based monitoring, that is actually a big breakthrough for some. Think about fracking sites or remote oil fields or mining sites.

[15:46] The second layer is predictive maintenance, which the next generation is predictive, prescriptive, even preventative maintenance. You're making predictions or you're helping to avoid downtime.

[15:55] The third layer, which is really where our stack is fairly unique today in delivering is in asset performance optimization. How do I increase throughput? How do I reduce scrap? How do I improve worker safety? How do I get better processing of the data that my PLC can't give me so I can actually improve the performance of the machine?

[16:12] Ultimately, what we're finding is a couple things. One is you can look at individual asset optimization, process optimization, but there's another layer. Often we're deployed at two layers on premise. There's also the plant-wide optimization.

[16:24] We talked about wind farms before off camera. You've got the wind turbines. You can do a lot of things about turbine health, the blade pitch, the condition of the blade. You can do things on the battery, all the systems on the turbine.

[16:37] You also need a stack running like ours at that concentration point where there's 200 plus turbines that come together. The optimization of the whole farm, every turbine affects every other turbine. A single turbine can't tell you speed, rotation, things that need to change if you want to adjust the speed of one turbine versus the one next to it.

[16:58] There's also a plant‑wide optimization. Talking about autonomous driving. There's going to be five layers of compute. There's what's called the ECU level, the individual subsystem in the car. The engine, how it's performing.

[17:10] You're going to have the gateway in the car to talk about things that are happening across the systems in the car. You're going to have the peer‑to‑peer connection over 5G to talk about optimization between vehicles. You're going to have the base station algorithms looking at a microcell or macrocell within a geographic area.

[17:27] Of course, you'll also have the cloud because you want to have the data on all the assets, but you don't want to send all that data to the cloud. You want to send the right metadata to the cloud.

Jeff:  [17:35] That's why they have big trunks full of compute now.

David:  [17:38] By the way, you mentioned one thing that I should really touch on which is we've talked a lot about what I call traditional brownfield automation and control-type analytics, machine learning. That's where we started in discrete manufacturing a few years ago.

[17:51] What we found is in that domain and in oil and gas, in mining, agriculture and transportation, in all those places the most exciting and new development this year is the movement toward video, 3D imaging, and audio sensing. Those sensors are becoming very economical.

[18:06] People have never thought about, "Well, if I put a camera and apply it to a certain application, what can I learn? What can I do that I never did before?" Often they even have cameras today that are not making use of any of the data.

[18:21] There's a very large customer of ours who has literally video‑inspection data on every product they produce every day around the world. This is in hundreds of plants. That data never gets looked at other than training operators, hey, you missed the defects this day. They write over that data after 30 days.

[18:39] Guess what? You can apply deep‑learning TensorFlow algorithms by building a convolutional neural‑network model and essentially do the human visioning, rather than an operator staring at a camera or trying to look at training tapes three days later. I'm doing inferencing of the video image on the fly.

Jeff:  [18:56] Do your systems close loop back to the control systems now or is it more of a tuning mechanism for someone to go back and do it later?

David:  [19:04] Great question. I just got asked that this morning by a large oil and gas super major that Intel just introduced us to. The short answer is our stack can absolutely go right back into the control loop.

[19:13] In fact, one of our investors...I should mention our investors for series A was GE, Bosch, Yokogawa, Dell, and EMC. Our series B we did a year ago was Intel, Saudi Aramco, and Honeywell. We have one foot in tech and one foot in industrial. We're really trying to bring, as you said, IT and OT together.

[19:31] The short answer is you can do that, but typically in the industrial environment there's a conservatism about, "Hey, I don't want to affect the machine until I've proven it out." Initially, people tend to start with alerting.

[19:46] We send an automatic alert back into the control system to say, "Hey, the machine needs to be retuned." Very quickly though, certainly for things that are not so time sensitive they will just have...

[19:56] Yokogawa, one of our investors, is actually putting us in PLCs. Rather than sending the data off the PLC to another gateway running our stack, like an x86 or ARM gateway, we're actually...those PLCs have Raspberry Pi-plus capabilities.

Jeff:  [20:14] Doing what type of mechanism?

David:  [20:16] Typically, they're doing the IO and the control of the machine, but they have enough compute now that you can run us in a separate module like the little brain sitting next to the controller and do the AI on the fly. There you don't actually need to send the data off the PLC. We just reprogram the actuator. That's where it's heading.

[20:33] It could take years before people get comfortable doing this automatically, but what you'll see is that what AI represents in industrial is the self‑healing machine, the self‑improving process. This is where it starts.

Jeff:  [20:47] The other thing I think is so interesting is what are you optimizing for. There is no right answer. It could be you're optimizing for, like you said, a machine. You could be optimizing for the field.

[20:58] You could be optimizing for maintenance, but if there's a spike in pricing you may say, "Eh, we're not optimizing now for maintenance. We're actually optimizing for output because we have this temporary condition and it's worth the trade off." There are so many ways that you can reskin the cat when you have a lot more information and a lot more data.

David:  [21:16] That's right. What we typically like to do is start out with what's the business value. We don't want to go do a science project. "Oh, I can make that machine work 50 percent better."

[21:25] Yeah, but if it doesn't make any difference to your business operations, so what? We always start the investigation with what is a high‑value business problem where you have sufficient data where applying this kind of AI in the edge concept will actually make a difference. That's the proof of concept we like to start with.

Jeff:  [21:42] Again, to come full circle, what's the craziest thing an OT guy said, "Oh my goodness, you IT guys actually brought some value here that I didn't know."

David:  [21:51] I touched on video. Without going into the whole details of the story, one of our big investors is a very large oil and gas company who said, "Look, you guys have done some great work with, I call it, software-defined SCADA." SCADA is the network environment for OT. SCADA is what the PLCs and DCSs connect over, these SCADA networks. That's the control and automation world.

[22:12] This investor said, "Look, you've already shown us that you've gone into brownfield SCADA environments, done deep mining of existing data, and shown value by reducing scrap, improving output, improving worker safety, all the great business outcomes for industrial.

[22:27] "You come into our operation, our plant people are going to say no, you're not touching my PLC. You're not touching my SCADA network. Come in and do something that's not invasive to that world." That's where we actually got started with video about 18 months ago.

[22:40] They said, "Hey, we've got all these video cameras and we're not doing anything. We just have human operators writing down I had a bad event." It's a totally non‑automated system.

[22:49] We went in and did a video use case around what we call flare monitoring. Hundreds of stacks of burning off oil and gas in a production plant, you've got 24-by-7 teams of operators just staring at it writing down, "Oh, I think I had a bad flare."

[23:04] It's a very interesting overall process. By automating that and giving them an AI dashboard essentially of, "Oh, I've got a permanent record of exactly how high the flare was, how smoky was it, what was the angle."

[23:14] Then you can fuse that data back into plant data. What causes that? Also OSIsoft data, what was the gas composition? Was it in fact a safety violation? Was it in fact an environmental violation?

[23:25] By starting with video and doing that use case, we've now got dozens of use cases all around video. Oh, I could put a camera on this. I could put a camera on the rig. I could put a camera down a hole. I could put a camera on the pipeline, on a drone. There's just a million places a video can show up or audio sensing, acoustic.

[23:43] Video is great if you can see the event. I'm flying over the pipe. I can see corrosion. Sometimes with a burner or an oven, I can't look inside the oven with a camera. There's no camera that could survive 600 degrees.

[23:57] What do you do? That's probably where you could do either vibration or acoustic. Inside the pipe you've got to go with sound. Outside the pipe you go video. These are the kind of things that people traditionally, how do they inspect pipe? Drive by.

Jeff:  [24:13] Fascinating story. I think at the end of the day you can make real decisions based on all the data in real time versus some of the data after the fact. Great conversation and look forward to watching the continued success of FogHorn.

David:  [24:29] Thank you very much.

Jeff:  [24:31] He's David King. I'm Jeff Frick. You're watching The CUBE. We're having a CUBE Conversation at our Palo Alto studio. Thanks for watching. We'll see you next time.

[24:37] [music]

Transcription by CastingWords