Enterprises, asked what the relevant 6G features could be, will cite “low latency” almost 3:1 over all other options combined. But why is latency relevant? Interestingly, the answer from almost half of those who cite it reduces to “I don’t really know”. The rest say it would promote the availability of hosted edge computing. OK, but would it? Let’s take a look.
The long-expressed 6G latency goal is a microsecond, so let’s start by looking at what that means. One one-millionth of a second, obviously, but in practical terms its significance depends on how fast the things that are important to you are happening. A vehicle at 60 mph would travel about a thousandth of an inch in that time. A person walking would travel five percent of that. A jet, ten times that far. So, it seems, we could argue that a microsecond latency goal is pretty hard to justify as a hard requirement.
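As a sanity check on that arithmetic, here’s a minimal sketch; the speeds are just the round figures above, not measurements:

```python
# How far do things move during a one-microsecond latency window?
MPH_TO_FEET_PER_SEC = 5280 / 3600  # 1 mph is about 1.47 ft/s

def distance_inches(speed_mph: float, latency_sec: float) -> float:
    """Distance covered, in inches, at a given speed during a latency window."""
    return speed_mph * MPH_TO_FEET_PER_SEC * latency_sec * 12

ONE_MICROSECOND = 1e-6
for label, mph in [("vehicle at 60 mph", 60), ("walking person", 3), ("jet at 600 mph", 600)]:
    print(f"{label}: {distance_inches(mph, ONE_MICROSECOND):.5f} inches")
```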
It would be at least as hard to achieve. In a microsecond, 6G radio waves in air would travel roughly 300 meters (about a thousand feet), and data over fiber or copper roughly 680 feet. With apologies to edge enthusiasts, it is simply not economically reasonable to assume we could make cloud-like shared hosting resources practical if they had to be within a thousand feet of where they’re needed.
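The propagation side works out the same way; a quick sketch, where the fiber/copper velocity factor of roughly 0.68 is a common rule of thumb rather than a measured value:

```python
# How far can a signal physically travel in one microsecond?
C_METERS_PER_SEC = 299_792_458      # speed of light in a vacuum; air is nearly the same
FIBER_VELOCITY_FACTOR = 0.68        # typical for optical fiber, similar for copper

def one_way_reach_meters(latency_sec: float, velocity_factor: float = 1.0) -> float:
    """One-way distance a signal covers within the given latency budget."""
    return C_METERS_PER_SEC * velocity_factor * latency_sec

microsecond = 1e-6
print(f"Radio in air: {one_way_reach_meters(microsecond):.0f} m")
print(f"Fiber/copper: {one_way_reach_meters(microsecond, FIBER_VELOCITY_FACTOR):.0f} m")
# A round-trip control loop cuts these reach figures in half again.
```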
Suppose, though, we tie these two points together. What is a practical level of latency? I don’t have totally convincing or consistent views on this from enterprises and experts, but here are some. All these numbers represent estimates of reasonable round-trip delays for a control loop, including message propagation and processing.
For autonomous vehicle operation, the average answer is 50 milliseconds. In that time, a vehicle would move about 4 feet, but the loop is still at least five times shorter than typical driver reaction time.
For industrial process control, the average answer is 20 milliseconds, but some high-speed process control can require round-trip delays of a tenth of that, and others can tolerate up to five times that much.
Enterprises that offered actual local-edge round-trip latency numbers report total figures, propagation plus processing, of between 200 and 300 milliseconds for their applications. Only 11 enterprises said they achieved 50 ms, and only 2 said they had any applications that delivered 20 ms. In all cases, these local edge applications were connected to real-world systems using wired sensors, and there was no significant network handling. The edge hosts were, on average, less than 150 meters from the processes they controlled, which means propagation delay could be ignored. The primary source of latency today, then, is the compute processes themselves.
Think on that a moment. In the average 250 ms of latency users achieve today in an almost-processing-only situation, a vehicle would move 22 feet. An assembly line, say enterprises, would move an inch. Given this, it would be easy to conclude that edge computing is hopeless for any new distributed applications, no matter what network latency was achieved, but that doesn’t consider two important factors: the distribution of event processing, and the nature of the real-time system overall.
If an autonomous vehicle could activate its brakes only after 22 feet of movement, you could visualize a lot of accidents, but the truth depends on what event triggers that activation. Today, autonomous actions in vehicles are usually triggered in anticipation of a threatening condition. In my own car, I’ll get an alert when backing up if any motion is detected, and braking will activate if the motion is within a warning zone outside the actual vehicle path. In other words, the overall system builds in latency tolerance.
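To make that latency tolerance concrete, here’s a rough, hypothetical sizing of a warning zone; none of these numbers come from an actual vehicle system:

```python
def warning_zone_feet(object_speed_mph: float, loop_latency_sec: float,
                      safety_margin_feet: float) -> float:
    """Size the warning zone around the vehicle path so the full control loop
    (detection through braking) finishes before something moving at the given
    speed can cross from the zone boundary into the path."""
    feet_per_sec = object_speed_mph * 5280 / 3600
    return safety_margin_feet + feet_per_sec * loop_latency_sec

# Hypothetical: a pedestrian at 3 mph, a 250 ms total loop, and a 3-foot safety margin.
print(f"Warning zone extends {warning_zone_feet(3, 0.250, 3.0):.1f} feet beyond the path")
```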
Distribution of event processing is often combined with this. If you want a bolt to be inserted in a hole, do you wait until a sensor sees the hole in front of the bolt-inserter? Hardly. You assume a consistent speed of hole motion and synchronize the insertion to the predicted position of the hole, considering all the latency factors. It’s like shooting at a moving target. But you could also use either a mechanical system or a simple local processor (think something like a Raspberry Pi) to synchronize anything that requires extreme precision. This is what those who actually achieve very short control loops (low round-trip delays) do.
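A minimal sketch of that kind of synchronization, with made-up numbers for the conveyor speed and the loop delays:

```python
def insert_command_delay_sec(distance_to_inserter_m: float,
                             conveyor_speed_m_per_s: float,
                             total_loop_latency_sec: float) -> float:
    """How long to wait after sighting the hole before issuing the insert command:
    the time until the hole reaches the inserter, minus every latency in the loop
    (sensing, processing, network, actuation)."""
    time_until_aligned = distance_to_inserter_m / conveyor_speed_m_per_s
    return time_until_aligned - total_loop_latency_sec

# Hypothetical line: hole sighted 0.5 m upstream, conveyor at 0.4 m/s, 250 ms total loop.
delay = insert_command_delay_sec(0.5, 0.4, 0.250)
print(f"Issue the insert command {delay:.3f} s after the sighting")  # 1.000 s
```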
User comments suggest to me that it’s worthless to look at network latency in IoT or real-time process control unless you can trace the totality of a control loop and assign latency to each piece. When you do that, or try to, it’s clear that the real issue in edge computing is less likely to be latency, and more likely to be processing delays and economic factors inherent in shared hosting of anything.
While a network device isn’t a computer, it’s a computer-driven element. As such, it has its own delay contribution, ranging from roughly 15 microseconds per packet for simple switching to roughly 1.5 milliseconds for more complex processing. To this you have to add any queuing delay. Currently reported average latencies for an Internet-level amount of handling are approximately 2 ms per hundred miles, excluding access serialization delay. LTE achieves total latencies reported at around 12 ms, and 5G has some reports as low as 2 ms, but these access latencies have to be added to the Internet’s contribution where the Internet is used.
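Pulling those rough per-element figures into a single budget looks something like the sketch below; the hop count and path length are invented for illustration, and the per-element values are just the approximate figures cited above:

```python
def one_way_network_delay_ms(hops: int, per_hop_ms: float,
                             path_miles: float, ms_per_100_miles: float,
                             access_ms: float) -> float:
    """Sum of device handling, distance-related delay, and access latency (queuing excluded)."""
    return hops * per_hop_ms + (path_miles / 100) * ms_per_100_miles + access_ms

# Hypothetical path: 8 simple-switching hops at roughly 15 us each, 200 miles of
# Internet-level handling at roughly 2 ms per hundred miles, over 5G access at 2 ms.
one_way = one_way_network_delay_ms(hops=8, per_hop_ms=0.015,
                                   path_miles=200, ms_per_100_miles=2.0, access_ms=2.0)
print(f"One-way network delay: {one_way:.2f} ms")
print(f"Round trip (x2):       {2 * one_way:.2f} ms, before any processing time")
```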
Overall, what seems important is to establish a kind of “layered” approach to event-process hosting, and in particular to lay out standardized APIs for threading the control loop through the layered elements. The presumption would be that where the loop length is less than roughly 10 ms, you’d attempt to support the loop using a locally hosted (on-board) element.
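A sketch of what that layered placement decision could look like; the tier names and the 10 ms and 50 ms thresholds are taken from this discussion, not from any standard:

```python
def pick_hosting_tier(loop_budget_ms: float) -> str:
    """Choose where to host a control loop's event processing from its round-trip budget."""
    if loop_budget_ms < 10:
        return "local/on-board element"        # tight loops stay with the process
    if loop_budget_ms <= 50:
        return "metro edge host"               # only if the economics support one
    return "regional cloud or data center"     # latency-tolerant loops

for budget_ms in (5, 20, 50, 200):
    print(f"{budget_ms:>3} ms loop -> {pick_hosting_tier(budget_ms)}")
```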
Beyond this local element there are a lot of questions, and there have been plenty of proposed answers but little consensus. About half of enterprises see relatively little need for anything in the way of a shared hosting service short of the cloud or data center, meaning that all the critical timing is handled by the local processing resource. The half who do see value see it where control-loop limits are no more than 50 ms. That seems achievable in a network sense even assuming you had an edge host per major metro area (over 100k population), but it’s not clear that processing economy of scale could be achieved in all those locations, and the first cost of positioning resources at that level could be problematic.
That, friends, is the real issue in real-time services. It’s not network latency but pool efficiency, and that also relates to the applications’ structure and performance. To share a resource is to wait for something to become available, which means you have to add to the control-loop length. To price a shared resource optimally, you have to run at high utilization while sustaining a low resource queuing time, which means you have to have a lot of capacity and a lot of use. The total round-trip control-loop length is really set by the distance you have to haul things to achieve pool optimality, factoring in a willingness to pay more for premium latency. That distance determines whether the network really contributes, positively or negatively, to the overall latency picture and to the application’s value proposition and business case for buyer and seller.
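The tension between utilization and queuing can be shown with a textbook M/M/1 queue; this is a deliberate oversimplification of a real resource pool, and the 5 ms service time is hypothetical, but the shape of the curve is the point:

```python
def mm1_queue_wait_ms(service_time_ms: float, utilization: float) -> float:
    """Average queuing delay for an M/M/1 server: Wq = (rho / (1 - rho)) * service time."""
    return utilization / (1.0 - utilization) * service_time_ms

# A hypothetical 5 ms event-processing task at increasing pool utilization:
for rho in (0.5, 0.7, 0.9, 0.95):
    print(f"utilization {rho:.0%}: adds {mm1_queue_wait_ms(5.0, rho):.1f} ms of queuing delay")
```

Even at 90 percent utilization, queuing alone adds 45 ms in this toy case, which is most of the 50 ms loop budget cited earlier, before any network or processing time is counted.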
The important point to make about 6G here is hopefully clear by now. Absent a solid and optimal model for process distribution, we can’t know what the actual requirements for 6G latency would be, and we can’t know whether there are other issues that are more important. One example enterprises offer is packet loss. Typically, QoS parameters like packet loss and latency are traded against each other, but in real-time applications packet loss is a potential disaster. 6G, then, may actually be aiming at something very different from what the most important class of applications will need.