What happens when a price leader cannot lead, or maybe even follow? In the world of carrier networking, we may be about to find out. Whatever you think about the war between the US Government and Huawei, the pressure on Huawei seems to be increasing, and that could have a major impact on telecom, not only on capex budgets but on network technology itself. In fact, the only answer to the end of the price-leader paradigm may be a new architecture.
Operators today spend between 18 and 25 cents per revenue dollar on capital equipment, something that’s been an increasing burden as their revenue stagnates and costs seem to be either static or even increasing. Of the 77 operators I’ve interacted with in the last 6 months, 73 have said that they are in a “critical profit-per-bit” squeeze. For most, capital budgets have been an attractive target.
The problem, of course, is how you attack them. A decade ago, some operators were experimenting with hosting router instances on servers, and about seven years ago they launched the Network Functions Virtualization initiative. Neither of these has proved out as a means of significant capex reduction. Only 19 of those 77 operators think that either of these initiatives will ever lower their capex significantly.
It’s obvious where Huawei comes into this picture. They’ve consistently been the price leader in the network equipment space. Back in 2013 when I was attending a networking conference in Europe, I met with a dozen operator experts on NFV and transformation, and one made the comment that the 25% capex improvement that some NFV proponents were promising wasn’t enough. “If we were satisfied with 25%, we’d just beat Huawei up on price” was the key comment. Technology change has failed; let’s go for discount pricing.
That’s the problem in a nutshell. The best we’ve come up with using new technology so far hasn’t measured up in terms of capex reduction. It can’t match what operators could hope to get in the way of extra discounts from Huawei. If Huawei is off the table as a supplier, then even if competitors like Ericsson or Nokia were willing to cut the same 25% discount, their starting price is often at least that 25% higher. Operators are feeling the stress of dwindling financial options, so they need new ones to develop.
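The arithmetic behind that squeeze is worth making explicit. A quick sketch, using the 25% figures from the paragraph above; the list prices are illustrative, normalized so Huawei’s list price is 1.00:

```python
# Illustrative pricing, normalized so Huawei's list price is 1.00.
huawei_list = 1.00
competitor_list = 1.25 * huawei_list                # competitor starts ~25% higher
discount = 0.25                                     # the discount either vendor might offer

huawei_net = huawei_list * (1 - discount)           # 0.75
competitor_net = competitor_list * (1 - discount)   # 0.9375

# Even with identical percentage discounts, the competitor's net price
# stays 25% above Huawei's, because the gap is in the starting price.
print(competitor_net / huawei_net)                  # 1.25
```

In other words, matching the discount percentage doesn’t close the gap; only matching the starting price would.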
Bringing Huawei back is an option none of them can really control, so there’s no point talking about that. Our easiest response would then be to resurrect either router instances or NFV, so we have to ask why these two failed and whether we could address those issues.
Router instances running on commercial servers, and virtual-function versions of routers, have the same issues, as it turns out. First, commercial servers aren’t the right platform to host the data plane of high-capacity network devices. The current market’s shift of focus to white-box technology is a result of this truth. You need specialized data-plane chips to do efficient packet-handling in multi-terabit devices. Second, all this hosting-of-routers-and-functions stuff has been a miserable failure in an operations sense.
The NFV realists among the telcos tell me that they’re finding NFV to be so operationally complex that they believe they’d lose more in opex than they’d save in capex. Think about it; in the router-in-a-box days, you managed a network by managing the routers. With NFV, you still need to manage the routers, but you also have to manage the servers, the server platform software, the NFV management and orchestration tools, and the virtual-network resources that connect all the pieces of functionality.
White boxes could fix some of these problems, but not all of them. If we were to look at the hosted-router model, assuming white-box deployments, we would expect to save the price difference between a vendor router platform and a white box (about 60-70%). We still need the software, and many router-software vendors want to license on a per-box basis, so that eats up about a quarter or more of the capex savings. We can get some savings, in the net, but we’ve now become an integrator, or we have to hire one, which further reduces the savings. We might also have to customize the router software for our white-box choice. This still leaves some savings, so we have to ask why it’s not being adopted, and the answer lies in the limitations of those white boxes.
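To put rough numbers on that, here’s the arithmetic using the 60-70% hardware gap and the quarter-of-savings license figure from the paragraph above; the 10% integration cost is my own hypothetical placeholder, not an operator-reported number:

```python
vendor_router = 100.0                              # vendor router platform, arbitrary units
hw_discount = 0.65                                 # midpoint of the 60-70% white-box gap
whitebox_hw = vendor_router * (1 - hw_discount)    # 35.0

hw_saving = vendor_router - whitebox_hw            # 65.0 of raw hardware savings
sw_license = 0.25 * hw_saving                      # per-box license eats ~a quarter of it
integration = 0.10 * vendor_router                 # hypothetical integrator/customization cost

net_saving = hw_saving - sw_license - integration
print(net_saving)                                  # 38.75 -- real savings, but well short of the raw 65
```

The savings survive the haircuts, which is why the interesting question is why adoption hasn’t followed.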
There’s a router for every mission these days. Home, office, edge, aggregation, core, whatever. Most router vendors have at least three models, each based on a different architecture. There are many different white-box platforms too, and most of them are also architecturally distinct. Operators aren’t excited about trying to match software-routing licenses to white-box architectures themselves.
There’s a different problem at the high end, which is lack of an available platform. Cisco and Juniper offer models that deliver over 200 Tbps of capacity. Try to find a 200 Tbps white box. In fact, the big router vendors don’t build their high-capacity routers as monolithic boxes anyway; they use a chassis design that combines fabric switching and routing elements. Operators could in theory build a chassis router from the right white boxes, but could they do the software?
Then there’s the big question; who operationalizes this whole mess? One advantage of the single-vendor approach is that they make everything fit. Just getting all the pieces to run within the same operational framework is a challenge (called “onboarding” in NFV with virtual functions) that’s defeated a whole industry. A big part of the problem is that open platforms tend to develop in microcosmic pieces, and operators have to deploy ecosystems. Nothing fits in the open community, because nothing is designed to fit. There’s no over-arching vision.
An over-arching vision is what we need, a vision of what cloud-native network function deployment would really look like, and how it would create superior network functionality and operational efficiency. What the heck does the network of the future, the great alternative to low-priced vendors, even look like? Even before we had NFV, we had SDN, and it articulated or implied some principles about the network of the future. The net of them all was that you don’t build that network of the future by redoing the same software and hardware model that you’re trying to replace. Cisco and Juniper both talk about “cloud principles” in their stuff, but they’re mostly focusing on interfaces to cloud tools for orchestration and management, not on the way that the devices themselves are built and deployed.
You can’t easily apply cloud principles to the data plane. That means that you can’t apply them to networks built from software instances that don’t separate the data and control planes. It also means that once you separate the data and control planes to apply cloud principles, you have to somehow bring the two together to coordinate the overall functionality of the network. You also have to decide how a cloud-native control plane would look, and surely it would be a collection of microservices that implement individual control-plane features. That’s not how we build routers today; they’re monolithic and not microservice-ized. I did a whole blog on this, as applied to 5G, HERE.
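To make the separation concrete, here’s a toy Python sketch — all names are hypothetical, and nothing here resembles any real router implementation. The data plane is a dumb forwarding table, and the control plane is a separately deployable service that reacts to routing events and programs it:

```python
class DataPlane:
    """Dumb, fast forwarding. In a real device this would be a specialized
    data-plane chip doing longest-prefix match, not a Python dict lookup."""
    def __init__(self):
        self.fib = {}                        # forwarding table: prefix -> next hop

    def forward(self, prefix):
        return self.fib.get(prefix)

class RouteService:
    """A control-plane 'microservice': it owns routing state and reacts to
    events by programming the data plane, rather than living in the same box."""
    def __init__(self, data_plane):
        self.rib = {}                        # routing information base
        self.data_plane = data_plane

    def on_route_update(self, prefix, next_hop):
        # In a cloud-native design this would arrive as a message or event.
        self.rib[prefix] = next_hop
        self.data_plane.fib[prefix] = next_hop   # push the result down

dp = DataPlane()
ctrl = RouteService(dp)
ctrl.on_route_update("10.0.0.0/8", "port-3")
print(dp.forward("10.0.0.0/8"))              # port-3
```

The point of the split is that the control-plane service could be scaled, redeployed, or replaced independently of the forwarding hardware, which is exactly what a monolithic router can’t do.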
This problem was recognized early on. At their very first meeting, the NFV ISG considered the question of whether you had to disaggregate the features of physical devices before you virtualized them. The idea was almost literally shouted down, and we were left with the notion that a virtual network function was a hostable form of a physical network function, meaning an existing network device. That decision has hobbled our attempts to rebuild networking by forcing us to accept current devices as our building-blocks. If you build a network of router instances, it’s still a router network.
The telco world, in the 5G architecture, is admitting that you have to build on virtual principles and not on box principles, but the 5G work not only inherits NFV and all its limitations, its answer to avoiding box principles is to create virtual boxes. Gang, a box is a box. But 5G does separate control and data planes, and it does have at least some recognition that signaling is, in the end, an event-driven application. It would be possible to take the current 5G specs and convert them into something that’s truly a virtual powerhouse. That’s a good start, but the cloud-centric design is not being carried forward enough in our implementations, which have largely come from (you guessed it) box vendors.
Price leaders, as a concept, will always exist, but how much price leadership we’ll see is surely going to diminish as long as we stay within the old network model. Is there truly a more economical networking paradigm out there? If there is, I contend that it’s not created by stamping out “router device cookies” from different dough. We need a whole new recipe. If there’s a hope that new network architecture will create an alternative to a network-vendor price leader, then that new recipe is our only path forward. Otherwise, the overused phrase “Huawei or the highway” may be true.