If telcos and other network operators are facing major profit challenges, cost management is clearly one option to address them. Cost management includes operations costs, and historically operators have spent more on opex than on capex. Even though the gap between the two has narrowed considerably over the last decade, every telco I’ve talked with says they believe that more could be wrung out of their opex budgets. The obvious question is “How?”
Nobody really likes to talk about it, but the primary way of reducing opex is to reduce headcount. Think back over the last 30 or 40 years, and you’ll remember (if you’re old enough) a time when humans were involved in your network services. When I got my first broadband connection, it was installed by a real person who handled even interior setup. When I got my most recent connection, I got a tech who created an interface for me to plug a router into, and left me to do the rest. I’ve not made a support call to a telco in a decade that resulted in me talking to a human, and my most recent support experiences were via an app. Obviously a lot of humans have been wrung out of operations, and every telco believes that more can be done there. That gets us back to the “How?” question. The operators themselves see three potential pathways to reducing opex.
The first path in terms of stated support, with 88% of operators mentioning it, is broader use of artificial intelligence in support missions. This group’s view is that past work on opex reduction has focused on “craft” areas, meaning field personnel, and on customer service and tech support call centers. The former area has been largely mined for reduction, and the latter really needs some new ingredient to have further potential impact on costs. AI, operators believe, could be that new ingredient.
Some companies already use AI chatbots for primary handling of online and text support needs, and the new generative AI technology shows great promise in improving automated systems’ ability to be truly helpful. However, “promise” is a vague characterization, and the reason for vagueness is that only about a third of the operators who favor this approach have taken any meaningful steps to evaluate it, and only 18% have actually done testing. That group is universally concerned with the error rate of public generative AI models, and only 4 operators have even looked at anything beyond that so far. When could this path to opex reduction create meaningful results? Only 11% of operators think it could happen this year and only 21% think next year. The rest say 2025.
The second path to reducing opex is improving infrastructure management systems, which 79% of operators say is a viable approach. Interestingly, almost all of that 79% say that artificial intelligence is likely to be an element of any improvement. What stands out about this path is that it’s really focused on “routine” tasks in the classic FCAPS (fault, configuration, accounting, performance, and security) management model. The operators see this as focused on the network operations center (NOC), where their key management personnel work. Outside-the-NOC activities, things involving actually touching facilities, are a secondary target at this point, though I am hearing more interest in combining all infrastructure support tasks into a common target.
The problem here is the same vagueness I noted in the first path to opex reduction, but for a different reason. Here the issue is that operators are hostages to the vendors who supply the management systems, and they’re not generally happy with their vendors’ responses. In fact, 65% of operators say their management systems vendors are “not fully responsive”. The remainder aren’t so much letting vendors off the hook as admitting to the impact of other factors. Most say that current management systems (OSS/BSS in particular) are “antiquated”, and I’m seeing a resurgence of the old “modernize-or-start-over” discussion in the OSS/BSS area. But remember that OSS/BSS systems aren’t doing infrastructure management, and the software that does doesn’t come under the CIO who handles OSS/BSS anyway. It’s part of network operations, COO-handled, and it interfaces with OSS/BSS.
For specific network management, operators are struggling with overall vision. For 32% of operators, intent modeling is the over-arching concept that they want. For 25%, it’s artificial intelligence, and for the remainder it’s “defining a unified NOC platform” without further technology specified. I think this division reflects in large part the influence of the network equipment vendors, who see management software as a major differentiator and also a way of sustaining their account incumbency. Anyone who’s read my blogs knows how strongly I feel about intent-based management systems, so obviously I’m disappointed that there’s such a low level of interest in it here.
Where AI gets a nod is in the ever-growing issue of configuration errors. When a NOC responds to a problem, the response is in most cases a change in network/device parameters, meaning a reconfiguration. The number one cause of outages, according to operators, is the errors that are made at this point. What operators would like is to have AI “suggest” causes and solutions, and “review” changes before they’re made. Anywhere BGP is involved there’s great specific interest in managing its configuration.
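To make the “review” idea a bit more concrete, here is a minimal sketch of a pre-change check on a proposed BGP neighbor table. Everything in it is an assumption made for illustration: the `BgpNeighbor` record, the `review_change` function, and the three guardrail rules are hypothetical, not any vendor's API, and the rules are hand-written stand-ins for whatever an AI-based reviewer would actually apply. The addresses and AS numbers come from the ranges reserved for documentation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class BgpNeighbor:
    """Hypothetical, simplified view of one BGP neighbor entry."""
    address: str
    remote_as: int
    max_prefixes: Optional[int]  # None means no limit is configured


def review_change(current: Dict[str, BgpNeighbor],
                  proposed: Dict[str, BgpNeighbor]) -> List[str]:
    """Compare current and proposed neighbor tables and return warnings.

    An empty list means none of the illustrative rules were triggered.
    """
    warnings = []
    # Rule 1: flag any neighbor the change would delete.
    for addr in current.keys() - proposed.keys():
        warnings.append(f"neighbor {addr} would be removed")
    for addr, new in proposed.items():
        old = current.get(addr)
        # Rule 2: flag a change to an existing neighbor's remote AS.
        if old is not None and old.remote_as != new.remote_as:
            warnings.append(
                f"neighbor {addr} remote AS changes "
                f"{old.remote_as} -> {new.remote_as}"
            )
        # Rule 3: flag any neighbor left without a prefix limit.
        if new.max_prefixes is None:
            warnings.append(f"neighbor {addr} has no max-prefix limit")
    return sorted(warnings)


if __name__ == "__main__":
    current = {
        "192.0.2.1": BgpNeighbor("192.0.2.1", 64500, 1000),
        "192.0.2.2": BgpNeighbor("192.0.2.2", 64501, 1000),
    }
    proposed = {
        "192.0.2.1": BgpNeighbor("192.0.2.1", 64502, None),
    }
    for warning in review_change(current, proposed):
        print(warning)
```

The point of the sketch is the workflow rather than the rules: the proposed change is compared against the running state and a human sees the flagged items before anything is committed, which is where operators say the outage-causing errors creep in.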
The final pathway to opex reduction that operators cite (64% do) is architecting networks to minimize operations challenges. Generally, this means trying to reduce the number of devices, the number of different device types, and the number of choke points or capacity limits. The direction almost all of this group takes is toward increased use of optics to generate more capacity. Don’t manage bits, oversupply them. We clearly have the ability to transport terabits these days, so use that to prevent congestion and create more alternate paths, which would then simplify “traffic management” and reduce the risk that an error would actually cause a persistent outage.
If anything, this concept’s support is fuzzier than the others already cited, for two main reasons. First, it’s far from clear what the architecture of an opex-optimized network would look like. That surely makes deploying one difficult. Second, realistic plans to deploy such a network would have to address the financial challenge of installed equipment. If you’ve not fully depreciated a piece of gear, you have to take a write-off when you replace it, and that cost adds to the cost of your infrastructure modernization. Optical routing, meaning the use of DWDM interfaces on routers to allow them to couple directly to the optical network, offers a pathway here, but it’s one that likely has to build out from current optical core locations toward the edge to have a meaningful impact on device counts and management complexity. That will likely take some time, given that operator budgets are losing the 5G component that’s given operators some flexibility in spending.
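The write-off point is simple arithmetic, and a short sketch shows why it bites. This assumes straight-line depreciation with no salvage value, which is a simplification rather than a statement of how any particular operator keeps its books. The function names and the dollar figures are hypothetical.

```python
def remaining_book_value(purchase_cost: float,
                         useful_life_years: float,
                         age_years: float) -> float:
    """Undepreciated value of a device under straight-line depreciation.

    Assumes no salvage value; gear older than its useful life is worth zero.
    """
    if useful_life_years <= 0:
        raise ValueError("useful_life_years must be positive")
    remaining_years = max(0.0, useful_life_years - age_years)
    return purchase_cost * remaining_years / useful_life_years


def effective_replacement_cost(new_gear_cost: float,
                               purchase_cost: float,
                               useful_life_years: float,
                               age_years: float) -> float:
    """New gear cost plus the write-off taken on the gear it displaces."""
    write_off = remaining_book_value(purchase_cost, useful_life_years, age_years)
    return new_gear_cost + write_off


if __name__ == "__main__":
    # Hypothetical case: a 100,000 router on a five-year schedule,
    # replaced after three years by an 80,000 device.
    print(remaining_book_value(100_000, 5, 3))
    print(effective_replacement_cost(80_000, 100_000, 5, 3))
```

In that hypothetical case two of the five years remain undepreciated, so two-fifths of the original purchase price is added to the cost of the replacement. Waiting until the gear is fully depreciated removes the write-off, but it also delays whatever opex benefit the new architecture was supposed to deliver.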
The current challenge for telcos isn’t that they have no options to manage costs, but that they have no options that allow for quick adoption. Technology constraints aren’t the problem; financial ones are. There’s always a challenge in deploying new gear because doing too much at once raises costs by displacing assets not yet depreciated. Doing too little dilutes the impact of the change. Operations costs are probably the only area where this problem is minimal, if we leave out the idea of reconfiguring the actual network. Thus, I think we can expect operators to push more on opex, likely focusing more on AI, and at the same time think about infrastructure remodeling steps that could let them go even further.