Most telcos, and even telco vendors, agree that there’s an urgent need for telcos to transform their infrastructure. The “legacy” way of building networks rests on paradigms that have long been challenged, but to displace it, telcos and vendors have to embrace a different model and recognize different (and new) paradigms. That the latter requirement has to drive the former is clear; what those paradigms are is far less clear. I’ve had comments from 57 telcos on the nature of their future network infrastructure, and two models have dominated, each postulating a different path forward.
Legacy telecom networks, so telcos themselves say, were built with too many boxes. The reason is simple: in the past, there was a sharp “bandwidth economy of scale” factor to consider. A fat pipe cost a lot less per bit than a skinny one, so aggregating traffic was essential. You couldn’t do that by any means other than a nodal element, an electrical device, and this created layers of network technology that were more expensive to operate because 1) there were more elements, and 2) the technology of each layer tended to be optimized for its local mission, making it different from layer to layer.
Evolving away from this approach obviously demands having something to evolve toward. Of the 57 telcos, 39 say their goal is a combination of capacity and “flattening,” and it rests on a simple truth: a big part of the cost of a network trunk is the transmission medium, including the cost of running it. Fiber has taken over most transmission other than at the network edge, so their plan is to run fiber to a logical point of edge concentration and then push as many bits through it as possible. The edge points would then be linked to the core network through at most one additional layer, and the core would be built with as much meshing and capacity as possible. One operator said this approach would cut the number of devices in their network by 40% and, in doing so, reduce the “process opex” associated with network management by 63%.
The underlying goal here is to cut costs in general, and opex in particular, while also helping to reduce churn, which is why most of the 39 supporting operators are mobile players or are targeting mobile infrastructure. Implicit in this goal is the presumption that their earnings, meaning their return on infrastructure, can’t be raised much at the top line, so costs have to be reduced to show Wall Street that progress is being made.
The other 18 telcos take a different view. They believe that cost-cutting isn’t going to help in the long run, that it has already been taken about as far as it can go. That means improving their financials and stock price requires raising the top line, revenue, and that means offering something new. To them, there’s still a need to improve capacity and reduce layers and box counts, but the effort is specifically aligned toward “service injection”.
It’s difficult to define credible new services that don’t involve some form of specialization or personalization, and it’s difficult to specialize or personalize if traffic is too aggregated. That means you need a hosting point somewhat close to the edge, but not so close that you end up distributing resources too widely and losing capex and opex economies of scale. This group thinks in terms of where to do service injection, which largely turns out to be “in each metro area”. In the US, this generally aligns with the standard metropolitan statistical area (SMSA) or the old telco concept of the local access and transport area (LATA). The strategy thus focuses on metro deployment.
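The aggregation-versus-proximity argument is, at bottom, arithmetic. The toy model below, with entirely hypothetical numbers, sketches the economy-of-scale half of it: spreading a fixed workload across more, smaller hosting sites raises the cost per unit of work, which is the pressure that pulls hosting back from the tower toward the metro. The proximity half (latency, personalization) isn’t modeled.

```python
# Toy model of hosting-placement economics; every number here is hypothetical
# and exists only to show the direction of the economy-of-scale effect.

def cost_per_workload_unit(sites: int,
                           fixed_cost_per_site: float = 100_000.0,
                           total_workload_units: float = 50_000.0,
                           variable_cost_per_unit: float = 1.0) -> float:
    """Cost per unit of work when a fixed total workload is split across
    `sites` hosting locations, each carrying its own fixed-cost footprint."""
    workload_per_site = total_workload_units / sites
    site_cost = fixed_cost_per_site + workload_per_site * variable_cost_per_unit
    return site_cost / workload_per_site

if __name__ == "__main__":
    # Three placement depths: a few regional cores, a few hundred metros,
    # or tens of thousands of edge/tower sites.
    for label, sites in [("regional core", 10), ("metro", 300), ("edge/tower", 30_000)]:
        print(f"{label:>13}: {cost_per_workload_unit(sites):10.2f} per workload unit")
```

Run as written, the per-unit cost climbs by orders of magnitude as hosting is pushed from a handful of regional sites out toward tens of thousands of edge sites; the metro sits in the middle, aggregated enough to be economical but close enough to the user to support personalization.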
If service injection is the primary goal, then you need to think about service creation. There are two options to consider: the service may be created by a third party like an OTT, or it may be created by the operator (including the possibility that some features of the service come from a third party and some from the operator). Service orchestration, feature creation, and interconnect then become the specific requirements. Taken together, this suggests that in this paradigm for the future telco, you’d need metro infrastructure that could include servers and LAN switches as well as routers.
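To picture what “service orchestration, feature creation, and interconnect” might mean in software terms, here is a minimal sketch. The data model and the orchestration step are purely illustrative inventions, not drawn from any standard or product; they just show a service assembled from operator-created and third-party features and deployed into a metro.

```python
# Illustrative-only model of "service injection": a retail service composed
# of features from the operator and/or a third party (e.g. an OTT), pushed
# onto metro hosting by a trivial orchestrator.

from dataclasses import dataclass
from enum import Enum

class FeatureSource(Enum):
    OPERATOR = "operator"
    THIRD_PARTY = "third_party"

@dataclass
class Feature:
    name: str
    source: FeatureSource
    image: str                  # container/VM image to deploy (hypothetical)

@dataclass
class Service:
    name: str
    features: list[Feature]

def orchestrate(service: Service, metro: str) -> list[str]:
    """Build a deployment plan: host every feature in the metro, and add an
    interconnect step for features supplied by a third party."""
    plan = []
    for f in service.features:
        plan.append(f"deploy {f.image} for '{f.name}' in {metro}")
        if f.source is FeatureSource.THIRD_PARTY:
            plan.append(f"provision partner interconnect for '{f.name}'")
    return plan

if __name__ == "__main__":
    svc = Service("personalized-gaming", [
        Feature("low-latency-transport", FeatureSource.OPERATOR, "op/transport:1.0"),
        Feature("session-cache", FeatureSource.THIRD_PARTY, "ott/cache:2.3"),
    ])
    print("\n".join(orchestrate(svc, "metro-nyc-01")))
```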
It’s my view, given the direction that mobile standards have taken and given the intent (if not the realization) of NFV, that specialized appliances involved in anything other than data-plane handling of traffic will be replaced by hosted software running on servers. Given that, and given the likely dependence of metro (under the service-injection paradigm) on the same technologies, I suggest that metro-level concentration of almost everything other than basic data handling (routing, aggregation) is the logical goal. Put fairly dumb and cheap devices out in the edge network, put bigger but still dumb and cheap devices in the core, and put everything else in the metro.
If we were to assume that a lot of mobile-infrastructure functionality were housed in the metro area, we might see actual value in Open RAN in general, and in the RAN Intelligent Controller (RIC) and its near-real-time and non-real-time applications in particular. Pushing the RIC domain toward the tower means losing much of the economy of scale that modern hosting demands. Pushing it into the metro opens the door to RIC control over feature hosting, both within the Open RAN model and beyond it.
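As an illustration of the kind of non-real-time logic a metro-hosted RIC could apply to feature hosting, consider the sketch below. It uses no real O-RAN interface or SDK; the scaling rule, the threshold, and the metric names are all hypothetical, and in a real deployment this sort of policy would ride over the RIC’s own APIs.

```python
# Deliberately simplified, hypothetical policy: scale a metro-hosted feature
# based on how many cells in the metro are running hot. Not a real RIC app.

from dataclasses import dataclass

@dataclass
class CellLoad:
    cell_id: str
    prb_utilization: float      # fraction of radio resources in use, 0.0-1.0

def feature_replicas_for_metro(cells: list[CellLoad],
                               base_replicas: int = 2,
                               per_busy_cell: int = 1,
                               busy_threshold: float = 0.7) -> int:
    """Return how many instances of a metro-hosted feature to run, adding
    capacity for each cell whose load crosses the (hypothetical) threshold."""
    busy_cells = sum(1 for c in cells if c.prb_utilization >= busy_threshold)
    return base_replicas + busy_cells * per_busy_cell

if __name__ == "__main__":
    metro_cells = [CellLoad("cell-a", 0.82), CellLoad("cell-b", 0.35),
                   CellLoad("cell-c", 0.91)]
    print("replicas:", feature_replicas_for_metro(metro_cells))   # -> 4
```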
This, I think, requires some additional thinking about edge features, thinking that frankly should have been done long ago as part of mobile standards. There we find the origin of the notion of control/data-plane separation, but the separation isn’t fully framed, because some data-plane elements have interfaces specialized to mobile. What terminates them? We need to think about a more modular model of the features of white-box and proprietary data devices, so that we can define standard interfaces between data-plane and control-plane elements. In doing so, we would allow the control plane to be metro-hosted.
I would also argue that we should be defining those specialized interfaces as APIs rather than as presumptive physical interfaces. It doesn’t make sense to accentuate the positives of software and hosting while treating connections and exchanges among elements as though those elements were still appliances. Architecture diagrams that represent a “hosted” system as boxes connected by links don’t help either. NFV went off the rails in part because a diagram like that was interpreted literally, which should never have been done for a mission that demanded cloud-centric thinking.
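To make the API point concrete, here is a sketch of what the boundary between a dumb data-plane element and a metro-hosted control plane might look like when it’s defined as a software interface rather than a physical one. The method names and the toy in-memory implementation are for illustration only; they don’t correspond to any existing specification.

```python
# Hypothetical data-plane API exposed to hosted control-plane software.
# The point is the shape of the boundary, not the specific methods.

from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class ForwardingEntry:
    prefix: str                 # e.g. "203.0.113.0/24"
    next_hop: str               # e.g. "198.51.100.1"

class DataPlaneElement(ABC):
    """What a white-box (or proprietary) forwarding device would expose
    northbound to metro-hosted control-plane software."""

    @abstractmethod
    def install(self, entry: ForwardingEntry) -> None: ...

    @abstractmethod
    def withdraw(self, prefix: str) -> None: ...

    @abstractmethod
    def counters(self) -> dict[str, int]:
        """Telemetry the hosted control plane uses to make decisions."""

class InMemorySwitch(DataPlaneElement):
    """Trivial stand-in implementation so the sketch runs end to end."""

    def __init__(self) -> None:
        self.table: dict[str, str] = {}

    def install(self, entry: ForwardingEntry) -> None:
        self.table[entry.prefix] = entry.next_hop

    def withdraw(self, prefix: str) -> None:
        self.table.pop(prefix, None)

    def counters(self) -> dict[str, int]:
        return {"entries": len(self.table)}

if __name__ == "__main__":
    sw = InMemorySwitch()
    sw.install(ForwardingEntry("203.0.113.0/24", "198.51.100.1"))
    print(sw.counters())        # {'entries': 1}
```

The same exchange could obviously be carried over gRPC, REST, or anything else; the argument is only that the definition should start as an API, with the physical realization treated as a detail.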
How will this turn out? I think picking between these options is futile; the second is obviously the right one in the long term, but I’d argue that telcos haven’t made a single right choice in all the years I’ve worked with them. The first option will lead to change, but not enough to stave off commoditization of telecom services and, in some markets, subsidization. Not ideal, but we are all the sum of the decisions we make.