Enterprises have said all along that the AI network traffic that mattered would not come from carrying queries and replies, but from information flows within the model, or data flows to the model from the databases used in analysis or training. They’ve also said, more recently, that they expected any meaningful AI application to be self-hosted, and that at least some would involve distributed models linked in some way. We know that the former data flows would almost surely stay within a data center, but what about the latter ones? We have limited enterprise comment on this (27 comments that were clearly based on first-hand knowledge and another 20 that seemed to come from qualified people), but it’s possible to see some interesting patterns in it.
The notion that AI might be distributed, with models/elements deployed at multiple points but engaged in a cooperative mission, arises from four factors. First, some missions for AI demand that some logic be local to a point of activity. You can’t have self-driving vehicles that depend on remote AI for things like collision avoidance. Second, some missions involve a larger data store for some functionality, and that functionality should logically be local to the data storage point. Third, simply looking at latency, it may be possible to group functionality by latency requirements, then host each group at the point that optimizes hosting economy of scale. Finally, issues of data governance or privacy may demand that AI missions that could otherwise be partially cloud-hosted have their governed components pulled out and placed under local control to meet compliance requirements.
When looking at these applications overall, enterprises have tended to look at “event flows”, which suggests that most distributed AI is expected to be used to handle real-time systems. There are a few examples that fit primarily into the fourth factor of the last paragraph, but both types of distributed AI envision two models linked in a work or event flow, and in fact some enterprises note that it’s useful to think about the network needs of distributed AI by thinking of the AI elements simply as application components.
The enterprises dismiss the notion that a local/personal AI element would draw significant data from another location, noting that this relationship violates the second factor noted above. They also point out that if significant data is needed, the result is very likely not needed immediately, because the data could not be analyzed in a short time. That would mean there is no latency or reliability reason why any portion of the application needs to be co-located with, or carried by, the user.
The implication of this is that there seem to be two models of distributed AI, each with its own rules for traffic generation. In one model, an event-flow model, the presumption enterprises make (and some say they’ve already experienced) is that a “local” model handles some events that are time-critical, and requests help from a deeper model for other events. This help is returned in the form of an answer or data that may then be further used by the local model. The deeper AI is, in nearly all cases, digesting information rather than forwarding everything, so there isn’t an expectation of (or experience with) a lot of traffic.
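To make the event-flow pattern concrete, here is a minimal Python sketch of how such a split might look; the class names, the confidence threshold, and the digest format are all invented for illustration and aren’t drawn from any enterprise’s actual implementation.

```python
from dataclasses import dataclass

# Hypothetical sketch of the event-flow pattern: a local model handles
# time-critical events itself and escalates only a compact digest of the
# hard cases to a deeper, remote model. Names and thresholds are invented.

@dataclass
class LocalResult:
    action: str
    confidence: float

class LocalModel:
    def infer(self, event: dict) -> LocalResult:
        # Stand-in for fast on-premises inference (e.g., collision avoidance).
        score = 0.9 if event.get("kind") == "routine" else 0.4
        return LocalResult(action="handle-locally", confidence=score)

class DeepModelClient:
    def request(self, digest: dict) -> str:
        # Stand-in for a WAN call to a deeper model; it carries a digest,
        # not the raw event stream, so traffic stays request/reply sized.
        return f"advice-for-{digest['kind']}"

def handle_event(event: dict, local: LocalModel, deep: DeepModelClient) -> str:
    result = local.infer(event)
    if result.confidence >= 0.8:            # time-critical path, no WAN traffic
        return result.action
    digest = {"kind": event.get("kind"), "summary": "compact-event-digest"}
    return deep.request(digest)             # latency-tolerant escalation

print(handle_event({"kind": "routine"}, LocalModel(), DeepModelClient()))
print(handle_event({"kind": "anomaly"}, LocalModel(), DeepModelClient()))
```

The design point the sketch tries to capture is the one enterprises describe: the deep model sees a digest, so the wide-area link carries a request/reply pair rather than the event stream itself.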
In the second model, which is more like a transactional model, the local model passes something off for deeper processing. That something may be a transaction passed to a normal software application, or it may go to AI. In either case a confirmation is returned, perhaps with some data associated with it, but no more than would accompany a software application’s handling of a transaction.
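The transactional pattern is even simpler to sketch; the function and field names below are again hypothetical, and the only point being illustrated is that the return payload is confirmation-sized.

```python
import json

# Hypothetical sketch of the transactional pattern: the local model hands a
# finished transaction to a back-end (AI or conventional software) and gets
# back a small confirmation, about what a normal application would return.

def submit_transaction(txn: dict) -> dict:
    # Stand-in for the back-end; in practice this would be an RPC or a queue.
    return {"status": "accepted", "txn_id": txn["id"]}

txn = {"id": "T1001", "type": "reorder", "sku": "ABC-123", "qty": 40}
confirmation = submit_transaction(txn)
print(json.dumps(confirmation))   # a few hundred bytes, not a data stream
```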
What about the notion of “token services”, some sort of low-latency service aimed at transporting AI model tokens? Enterprise types who are involved in budgeting for network services think this is, to quote one, “outlandish”. In AI, a token is a unit of data within an AI model, which means that unless the model hosting elements are distributed geographically, there’s no WAN service involved. Enterprises see no reason to distribute a model. They also don’t see a reason to have a model run somewhere distant from the model’s data sources. The theory seems linked to the idea that enterprises would use cloud-hosting of AI with their own data, which none say is a practical notion for performance, reliability, cost, and governance reasons. In any event, if they did need to push tokens over a WAN, they would not do so with a usage-priced service, especially since they have no control over how many tokens a model might want to send.
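A back-of-the-envelope sketch shows why per-token pricing worries the budgeting people; it assumes the common rough rule of about four characters per token, which real, model-specific tokenizers will not match exactly.

```python
# Rough illustration of why usage-priced "token transport" looks unattractive:
# the number of tokens a model generates isn't under the enterprise's control.
# The ~4-characters-per-token figure is only a rule-of-thumb approximation.

def approx_tokens(text: str) -> int:
    return max(1, len(text) // 4)

prompt = "Summarize the last 30 days of warehouse sensor alerts."
short_reply = "No anomalies detected."
long_reply = "Detailed per-sensor breakdown follows. " * 200  # the model chose to be verbose

for label, text in [("prompt", prompt), ("short reply", short_reply), ("long reply", long_reply)]:
    print(f"{label}: ~{approx_tokens(text)} tokens")

# Same request, wildly different token counts, so a per-token WAN charge
# would be as unpredictable as the model's verbosity.
```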
The notion of premium AI handling for pay is especially implausible for mobile services, they say. There is value in mobile connections for IoT, when/if it expands offsite, but that value comes from IoT application latency constraints, not from AI, and it’s probably tied to edge computing services. However, it’s interesting that some enterprises are thinking about supporting at least what might be called “metro-mobile IoT” by hauling the events to their own data center. Their theory is that if low-latency mobile services are available, they’d be as good connecting to their data center as they would be to a metro hosting point for edge computing.
Why, given all of this, are we hearing so much about AI-specific services? I think the answer is that a lot of people, vendors, and telcos are trying to validate opportunities for their own products/services/interests by linking them to the current dominant hype wave, which is AI. AI is a way of creating functionality, whether whole applications or components of them. It’s the mission that sets the connectivity requirements, not the technology used to meet it. If AI demanded a whole new set of premium services, enterprises would find it even harder to make a business case for it. Why would you toss application software that worked fine with inexpensive best-efforts connectivity, and toss all the hosting resources behind it, to embrace something that would raise both your communications and hosting costs?
We’ve all heard that telcos had a “field-of-dreams, build-it-and-they-will-come” mindset, which I think is clearly the case, but we may be underestimating the impact of this. When you think of markets from the supply side, you frame out technologies you could offer, then try to find things they can be used for. I remember, back in the 1980s, a meeting on ISDN (Integrated Services Digital Network), the first planned successor to plain old telephone service (POTS). One vendor came in, excited, and said “We have a new application for ISDN! It’s called ‘file transfer’!” Well, there’s a difference between what a service can be used for and what justifies paying for it. ISDN learned that the hard way. AI-specific services are probably doomed to learn the same lesson.
