There’s no question that publicity can drive investment. When generative AI burst on the scene, we saw a rush of major vendors like Amazon, Google, Meta, and Microsoft, all wanting to at least hold a place for themselves in the space. We also saw a flood of startups, all hoping to be bought if AI proved out. AI is, in fact, sort-of-proving itself, though it’s far from certain (and even farther from likely) that it will sweep the industry into a new place. But in the long run, hype can’t drive investment; only ROI can, and that’s why the news that AI data centers are being deployed in a centralized rather than edge-centric way shouldn’t be a surprise.
What we have with AI and edge computing is the intersection of two fairly vague missions. Yes, we know there’s a lot of potential in both areas. My own modeling says that the largest growth in data center deployments of any sort is certain to be associated with edge computing. The problem is that we still don’t have the kind of killer edge application that could drive a major deployment. We obviously don’t know how much AI might help in those as-yet-undefined edge applications, and we don’t really know what kind of AI applications could be deployed at the edge, because we don’t know what technology would be needed to support those undefined AI apps there.
In the early days of cloud computing, when hope was strong and real revenue was only starting to develop, we didn’t see cloud providers deploying dozens of regions’ worth of data centers; we saw one or two. With AI, what we’re seeing from cloud providers is a modest augmentation of GPU deployments, justified in part by hopes for AI success but also by the fact that there were GPU missions before AI burst on the scene. That’s an almost-perfect replication of the early cloud situation.
Edge computing is a different story, though. The biggest difference between “edge” and “core” (in a computing or networking sense) is topological. The edge is the distributed piece, and so it follows that there has to be a reason to do the distributing in the first place. The normal reason in a network is that you always aggregate traffic starting where users connect, so if you have distributed users you have distributed network edges. In computing, the only accepted justification for edge deployment is control over latency. Since there are more edge points, a given compute demand has to be divided across more places, which means less density per place, which means lower economic efficiency. To offset that loss, you need a capability that people will pay extra for.
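To make the latency argument concrete, here’s a minimal back-of-the-envelope sketch in Python. The distances, the 5 ms of processing time, and the 20 ms control-loop budget are purely illustrative assumptions on my part, not figures from any enterprise, and real networks add queuing and routing delay on top of the propagation shown here.

```python
# Illustrative only: why topology (distance to the hosting point) is the
# lever that edge computing pulls. Assumed numbers, not measurements.

FIBER_KM_PER_MS = 200.0  # light in fiber covers roughly 200 km per millisecond

def round_trip_ms(distance_km: float, processing_ms: float) -> float:
    """Propagation out and back, plus time spent processing the request."""
    return 2.0 * distance_km / FIBER_KM_PER_MS + processing_ms

BUDGET_MS = 20.0  # assumed delay budget for a real-time IoT control loop

for label, km in [("on-prem edge", 1), ("metro edge", 50), ("regional cloud", 2000)]:
    rtt = round_trip_ms(km, processing_ms=5.0)
    verdict = "fits" if rtt <= BUDGET_MS else "misses"
    print(f"{label:>14}: {rtt:5.1f} ms, {verdict} a {BUDGET_MS:.0f} ms budget")
```

The specific numbers don’t matter; the point is that once the delay budget gets tight enough, where the compute sits stops being a detail and starts being the design.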
That’s the challenge of the edge in a nutshell. There’s too much talk about edge computing that treats the cloud as a kind of spreading fungus, starting in a few deep and dark places and spreading out naturally. Baloney. It spreads out only if there’s an economic justification, which means a mission that can justify a higher compute cost than traditional cloud computing would present. That mission seems to be tied explicitly to applications that can’t tolerate process latency, which I believe means real-time stuff, meaning IoT.
Today, of the 181 enterprises who told me about their “edge computing” missions, 173 said those missions were related to IoT (the others didn’t characterize them, so statistically it’s likely all were connected to IoT in some way). All of that group hosted their edge applications on-premises, 158 on a server or appliance in the facility where the IoT activity took place and the remainder in their data center. In all but one of those cases, the data center was close to the point of use, so it’s clear that the current edge missions validate the point that the applications justifying edge computing are those with a strict latency budget.
It would seem logical that all the early edge opportunity would be drawn from this existing IoT activity, and that any edge/AI connection would already be visible there if it had any potential to drive public edge facilities for AI. Only 18 of the IoT enterprises said they used AI in their current applications, and 15 of those said they used it as a backstop for the processes using edge IoT, meaning for things like just-in-time (JIT) inventory management in a factory. None said they were using AI in an actual process-control mission with a short latency budget.
It’s always hard to get accurate information from enterprises on why they aren’t doing something. The best I can offer is commentary from 39 of the enterprises, offered without prompting. Of that group, 15 said they had no need for AI in their application, 14 said they could see potential value at some future point, and 10 said they didn’t believe AI technology was designed for edge deployment. However, 88 of the enterprises who currently used on-prem edge/IoT said their applications were deployed by an integrator or industrial specialist, and 48 said they were deployed by the organization that installed the industrial or other gear, gear that included any IoT elements. That means only about 45 enterprises actually drove their own IoT application and infrastructure decisions, which is very close to the number who offered comments on AI.
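For anyone checking the arithmetic behind those figures, the tallies work out like this; the sketch below uses only the numbers already quoted above.

```python
# Tallying the survey figures quoted in this post.
total_enterprises = 181      # enterprises describing edge computing missions
integrator_deployed = 88     # deployed by an integrator or industrial specialist
equipment_deployed = 48      # deployed by the organization that installed the gear

self_driven = total_enterprises - integrator_deployed - equipment_deployed
print(self_driven)   # 45: enterprises that drove their own IoT decisions

# The unprompted AI commentary came from three groups of respondents.
print(15 + 14 + 10)  # 39: no need, future value, and "not designed for the edge"
```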
What’s very interesting here is that a check of some of the suppliers of this enterprise edge/IoT gear shows that well over half actually include “simple” AI, in the form of machine learning or neural networks, in their applications. That suggests that enterprises (like many of us these days) conflate “AI” with “generative AI”, dismissing the already-pervasive earlier forms of AI.
That justifies a closer look at the ten enterprises who didn’t think AI was designed for edge deployment. I suspect (but can’t prove without asking pointed questions, which in my experience contaminate the responses) that this group represents the enterprises who believe that large-language-model generative AI might have a role, and who have actually looked into it. The same suppliers whose applications already use a form of AI/ML told me they were looking into LLMs, but that so far they couldn’t define a mission whose benefits would offset the incremental cost of providing LLM support on-premises.
OK, then, what we may have here is a determination that edge AI, in the form of a public edge LLM service, could actually be helpful for roughly six percent of IoT users, the ten of the 173 who doubted AI fit the edge, if (and it’s a very big “if” at present) we could identify a mission for it that would justify its (so-far-unknown) cost. For this to work, though, we’d need not only a form of LLM processing that could meet control-loop latency requirements, but also a connection mechanism that, combined with that processing, still delivered a response within the delay budget. We are, friends, a long way from the thorough analysis of IoT application latency requirements that we’d need to size the opportunity. Since that sort of analysis is needed to prove that the ROI from AI/edge deployment can justify investment, we’re still in a holding pattern.
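Here’s a minimal sketch of what that delay-budget test would look like; the 20 ms budget, the 5 ms round trip, and the inference times are assumptions for illustration, since nobody has published real figures for control-loop LLM inference at the edge.

```python
# Hypothetical delay-budget check for an edge-hosted LLM service.
# All numbers are illustrative assumptions, not measurements.

def fits_budget(network_rtt_ms: float, llm_inference_ms: float, budget_ms: float) -> bool:
    """The connection delay plus the model's processing time must together
    land inside the control loop's overall delay budget."""
    return network_rtt_ms + llm_inference_ms <= budget_ms

# A 20 ms control loop with a 5 ms round trip to an edge site leaves only
# 15 ms for inference, far less than large models typically deliver today.
print(fits_budget(network_rtt_ms=5, llm_inference_ms=150, budget_ms=20))  # False
print(fits_budget(network_rtt_ms=5, llm_inference_ms=12, budget_ms=20))   # True
```

Until both terms of that inequality can be filled in with real numbers, it can’t be evaluated, which is exactly why the opportunity can’t yet be sized.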