Before the US Labor Day holiday, I promised to look at a question operators have been asking for a couple of decades: is there a way of building networks that would reduce opex? By that I mean "Is it possible to design infrastructure in a way that's less opex-intensive?", and that question has two answers, depending on whether you're asking what's possible or what's practical.
What, in the view of network planners themselves, would an optimum network look like? I got answers to that, with varying levels of detail, from 52 operators, and you could fairly argue that 41 of them came up with essentially the same answer. You have access infrastructure that feeds a series of "network metroplexes" with a minimum of aggregation layers, by which most said they meant "one intermediate point". In the US, for example, you'd likely have a bit under ten thousand feed points pushing traffic to between 200 and 300 metro points. Those metro points would host all intelligence, provide all content caching, and connect to an optical core of essentially limitless capacity. One planner put it this way: "Ten thousand aggregation routers, 250 giant edge facilities, and an optical pool with no electrical handling at all."
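To put that topology in perspective, here's a back-of-the-envelope sketch of the fan-in it implies. The feed-point and metro counts are the planners' round numbers; the per-feed load is purely my own illustrative assumption.

```python
# Fan-in arithmetic for the planners' "optimum" US topology.
# Counts are the round numbers quoted above; the per-feed load
# is an illustrative assumption, not measured data.
feed_points = 10_000     # access aggregation routers
metro_points = 250       # giant edge facilities
avg_feed_gbps = 400      # assumed average aggregated load per feed point

fan_in = feed_points / metro_points
metro_ingress_tbps = fan_in * avg_feed_gbps / 1_000

print(f"Feed points per metro: {fan_in:.0f}")               # 40
print(f"Ingress per metro: {metro_ingress_tbps:.0f} Tbps")  # 16
```

Even at that modest assumed load per feed point, each metro facility is terminating double-digit terabits, which is why the planners talk about "giant" edge facilities.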
The challenge with this is multifaceted. First, it would almost certainly require a complete rebuilding of the network, displacing most or even all existing equipment; obviously, that's an enormous capex hit. Second, you'd have to define a way to evolve to this, since fork-lifting an entire network, even if you could bear the displacement cost, would alienate all your customers. Third, you'd need both an accepted model for your new infrastructure and gear to populate it. I think the only viable way to approach that optimum-network goal is to address the second of those points, the evolution, and use it to devise a solution to the first and third.
The most logical place to start an opex-optimized network infrastructure evolution is the core. The goal would be to create a DWDM agile-optics (ROADM) framework throughout. To do that without a forklift of core infrastructure, it would be logical to adopt optical routing interfaces with DWDM capability, such as are available from many router vendors today. This optical-routing or packet-optics layer would be responsible for grooming traffic on and off wavelengths.
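As a concrete (if deliberately simplified) picture of what "grooming" means here, the sketch below packs packet flows onto wavelengths first-fit-decreasing. The per-wavelength capacity and the flow sizes are illustrative assumptions, not any vendor's specs.

```python
# Minimal first-fit-decreasing grooming sketch: pack packet flows onto
# DWDM wavelengths so the optical layer can carry them without
# electrical handling. Capacity and flows are illustrative assumptions.
WAVELENGTH_GBPS = 400  # assumed per-lambda capacity

def groom(flows_gbps):
    """Assign each flow to the first wavelength with room, adding
    wavelengths as needed. Returns a list of per-lambda flow lists."""
    lambdas = []  # each entry: [remaining_gbps, [flows]]
    for flow in sorted(flows_gbps, reverse=True):
        for lam in lambdas:
            if lam[0] >= flow:
                lam[0] -= flow
                lam[1].append(flow)
                break
        else:
            lambdas.append([WAVELENGTH_GBPS - flow, [flow]])
    return [lam[1] for lam in lambdas]

print(groom([120, 300, 80, 250, 40]))
# -> [[300, 80], [250, 120], [40]]: three wavelengths for five flows
```

Real grooming engines obviously weigh latency, protection, and churn, not just fill; the point is only that this function sits at the packet/optical boundary.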
The evolutionary goal would be to push this on-ramp function outward to the metro level, but to do that we'd have to be able to create an optical mesh of the metroplex points. The wavelength capacity of a fiber strand is increasing, but today a maximum of about 128 wavelengths could be expected. Two fiber rings could, in theory, allow full connectivity among a US-market-sized set of metroplex sites. It would also be possible to use a mesh configuration optimized to account for the likelihood that some metroplexes would communicate mostly with a few others. Of course, the specific number of fiber strands involved in a ring would depend on the capacity of each wavelength.
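A quick sanity check of the ring idea, under deliberately crude assumptions (250 sites, one dedicated lightpath per site pair, shortest-arc routing, uniform spacing), shows why that strand count matters:

```python
# Rough sizing of a wavelength ring for a full logical mesh, under
# simplifying assumptions: one dedicated lightpath per site pair,
# routed on the shorter arc, sites uniformly spaced. Not a design tool.
import math

sites = 250                # metroplex count from the planners above
lambdas_per_strand = 128   # today's rough per-strand maximum

lightpaths = sites * (sites - 1) // 2     # all-to-all pairs: 31125
peak_span_load = math.ceil(sites**2 / 8)  # rough uniform-traffic ring bound
strands_per_span = math.ceil(peak_span_load / lambdas_per_strand)

print(f"Lightpaths: {lightpaths}")                 # 31125
print(f"Wavelengths on busiest span: {peak_span_load}")  # 7813
print(f"Strands per ring span: {strands_per_span}")      # 62
```

Higher-capacity wavelengths would let multiple site pairs share a lightpath and cut those numbers sharply, which is exactly why the strand count depends on per-wavelength capacity.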
The fact that there would likely be clusters of metroplexes more likely to communicate among themselves means there's an evolutionary step toward an optical core: parallel current core facilities within the clusters, then evolve out the current gear as the new core picks up the traffic. This means that currently available equipment (ROADMs, optical routers, packet optics, and maybe even photonic switches) could almost surely be used, and the equipment could be expected to improve over time.
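To illustrate how traffic affinity might identify those clusters, here's a toy grouping of sites whose mutual traffic crosses a threshold. The matrix, units, and threshold are invented for illustration only.

```python
# Toy affinity clustering: group metroplexes whose mutual traffic
# exceeds a threshold, as candidate clusters for early optical build-out.
def clusters(traffic, threshold):
    """Union-find over site pairs with traffic >= threshold."""
    parent = list(range(len(traffic)))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for i, row in enumerate(traffic):
        for j, t in enumerate(row):
            if i < j and t >= threshold:
                parent[find(i)] = find(j)
    groups = {}
    for site in range(len(traffic)):
        groups.setdefault(find(site), []).append(site)
    return list(groups.values())

# Five sites; values are invented averages in arbitrary Gbps.
matrix = [
    [0, 90, 80, 5, 4],
    [90, 0, 70, 6, 3],
    [80, 70, 0, 2, 1],
    [5, 6, 2, 0, 85],
    [4, 3, 1, 85, 0],
]
print(clusters(matrix, threshold=50))  # -> [[0, 1, 2], [3, 4]]
```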
The metroplexes are an integral piece of the optical core strategy, but it would be easier to get to an optical core if we could get to the metroplexes via an independent route. That means we'd need to view metro hosting of features and content as the driver of edge computing and of the placement of powerful resource pools near the edge, perhaps to play a role in a social metaverse or a metaverse-of-things digital-twinning strategy. This would create a new set of giant network devices (routers) in metro areas, and would generate more router opportunity than current core networks do. These metro-edge points would then be logical candidates for optical meshing, and the result would be a lower-latency network overall.
While all of this is potentially good news, it still leaves some questions to be answered. Operators tell me that the average network core router has 4.3 years of remaining useful life, which leaves one of three paths to the future. First, you could simply displace the existing devices, which would be expensive in terms of write-down. Second, you could parallel them as I noted above, which would be expensive in terms of creating a redundant path set. Finally, you could displace gear as it was written down, which would mean not realizing any simplification benefits for some time.
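The write-down exposure of the first path is easy to frame, assuming straight-line depreciation. The seven-year schedule below is my assumption; the 4.3-year figure is the operators'.

```python
# Illustrative write-down arithmetic for the displacement path.
# Straight-line depreciation assumed; the schedule is hypothetical.
useful_life_years = 7.0   # assumed depreciation schedule
remaining_years = 4.3     # operators' average quoted above
purchase_cost = 1.0       # normalized per-router cost

writedown_now = purchase_cost * remaining_years / useful_life_years
print(f"Immediate displacement writes off {writedown_now:.0%} of cost")
# -> about 61% of original cost per router, under these assumptions
```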
Operators have hoped that things like 5G, which had a budget and at least a target for new service revenues, would create a justification for a core modernization, but 5G has not produced the hoped-for revenue boost. Some still hope that IoT, edge computing, gaming, and other “new services” could justify building out at least a cluster-of-metroplex model. However, only 19% of operators who offered a view on all of this believed that new service revenues could be made to justify core evolution. The remainder think that a new core model would have to evolve over time, or be justified by potential opex savings.
There are obvious operations challenges associated with an evolving set of core resources. Optical networks and router networks are very different, and are managed differently. A possible solution to this reflects back to my earlier blog on intent models. If we assumed that we could create a set of abstract objects that represented an evolving core infrastructure, we could transform the whole to an optical network with limited management impact by changing the implementations beneath the intent models. Management practices and higher-level tools that reflected the abstract structure wouldn't have to change.
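Here's a minimal sketch of that idea, with illustrative class and method names (nothing here comes from any real management system): management code binds to the abstract intent, and the implementation beneath it can shift from routers to optics without touching the tools above.

```python
# Intent-model sketch: management talks to an abstract "CoreTransport"
# object; swapping its implementation from a router core to an optical
# core doesn't change the management surface. Names are illustrative.
from abc import ABC, abstractmethod

class CoreTransport(ABC):
    """Intent: deliver capacity between two metro endpoints."""
    @abstractmethod
    def provision(self, a: str, b: str, gbps: int) -> str: ...
    @abstractmethod
    def health(self) -> dict: ...

class RouterCore(CoreTransport):
    def provision(self, a, b, gbps):
        return f"MPLS LSP {a}->{b} at {gbps} Gbps"
    def health(self):
        return {"impl": "router", "status": "ok"}

class OpticalCore(CoreTransport):
    def provision(self, a, b, gbps):
        return f"Lightpath {a}->{b} at {gbps} Gbps"
    def health(self):
        return {"impl": "optical", "status": "ok"}

def management_view(core: CoreTransport):
    # Higher-level tools see only the intent interface.
    print(core.provision("NYC", "CHI", 400), core.health())

management_view(RouterCore())   # today's core
management_view(OpticalCore())  # after evolution; same tooling
```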
While all this seems at least within the realm of reason, there's a problem: we've not managed to extend the evolution out of the core/metroplex into the access network, and that's where the greatest opex challenges can be found. Capacity and redundancy can be added easily deeper in the network, where there are fewer devices and pathways; out toward the user, things eventually funnel down to a one-connection-per-user map. The user end of the network is also harder to protect from outside problems (the classic cable-seeking backhoe), because the cost of an installation that isn't prone to outside interruption is high where a lot of connections are involved.
Operators tell me that they are working to harden everything except the proverbial last mile, meaning to them any part of the access network that’s carrying aggregated traffic. There is likely little that can be done in a network infrastructure sense beyond that, at least nothing that would likely be cost-effective. However, I think that extending the intent modeling concept would open another avenue to reducing opex and even to improving customer care.
One of the things I did in my ExperiaSphere project was to try the concept of "derived operations" interfaces, which combined a central database that collected raw operations data with a custom proxy that queried the database to produce helpful management views. This approach could be used, in combination with the intent-model notion, to create a customer-specific view of service behaviors and infrastructure conditions. In fact, you could create multiple views of the same data: for the customer, for customer-care personnel called in via an escalation process, for NOC personnel, and so forth. This builds management abstractions on top of infrastructure functional abstractions, and the result is both a tuning of management views and a compartmentalized management framework in which functional elements self-manage to the greatest extent possible.
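A minimal sketch of that derived-operations pattern, with invented field names and records (the real ExperiaSphere work was considerably richer): one raw store, multiple role-specific views derived from it.

```python
# Derived-operations sketch: raw telemetry in one store, with
# role-specific proxies deriving different views of the same data.
# Records, fields, and roles are illustrative assumptions.
RAW = [  # stand-in for the central operations database
    {"svc": "vpn-123", "element": "metro-CHI", "alarm": "laser-degrade",
     "impact": "customer", "detail": "Tx power 2 dB below spec"},
    {"svc": "vpn-123", "element": "core-07", "alarm": "none",
     "impact": "none", "detail": ""},
]

def customer_view(svc):
    # Customers see service impact, never internal element names.
    return [f"Service {r['svc']}: degraded" for r in RAW
            if r["svc"] == svc and r["impact"] == "customer"]

def noc_view():
    # NOC personnel see every element and full alarm detail.
    return [f"{r['element']}: {r['alarm']} ({r['detail']})"
            for r in RAW if r["alarm"] != "none"]

print(customer_view("vpn-123"))  # ['Service vpn-123: degraded']
print(noc_view())  # ['metro-CHI: laser-degrade (Tx power 2 dB below spec)']
```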
I think this is the ultimate in tuning infrastructure and management to minimize opex. It might also be the salvation of optical vendors. While Ciena's current quarter was strong, it's attributable more to sales to cloud providers than to network operators. Every market is based on an adoption model, every adoption model is driven by cost/benefit analysis, and every cost/benefit analysis process plucks low apples first and stops when the apples, in the form of returns, get beyond tolerance. Hockey sticks are found in hockey, not in business. The current network model is already showing signs of depending on apples increasingly out of reach. 5G shows that, and so do optical vendor earnings.
Something needs to give networks a boost, some new benefit to augment the cost/benefit picture. However, the question of just how much opex could be saved with an optical transformation, versus savings by intent-modeled abstractions and their management, looms large. So does the question of whether there are any edge applications, like the metaverse or metaverse-of-things, that could drive metro transformation and create a value for low-latency national connectivity. These are the issues we’ll need to be watching in 2024, because the growth of networking depends on the answers.