I get a lot of comments and feedback from vendors, enterprises, and network operators. Recently, there’s been an uptick on topics related to Network Functions Virtualization, NFV. Some of this has apparently come out of the increased activity around Google’s Nephio initiative, which aims to create a Kubernetes-based management and orchestration framework. Some has come out of the realization that 5G deployments will increasingly commit operators to a specific approach to function hosting, and many realize that the work of the NFV ISG, while seminal in conceptualizing the value proposition and approach, isn’t the technical answer.
It’s important to get a “technical answer” to network function hosting because, whatever the mechanism, it’s clear that the industry is moving toward being able to compose services from functional elements. That means that it will be necessary to define service models and then orchestrate their deployment. If we’re going to go through that admittedly complex process, we don’t want to do it more than once. We need a model of service orchestration and management that fits all.
What, then, is “all?” One interesting thread that runs through the comments is that we aren’t entering the debates on function hosting with a consistent set of definitions and goals. It’s difficult to get to the right answers without asking the right questions, and more difficult if you can’t agree on basic terminology, so let’s try to unravel some of the knots that these comments have uncovered.
The first point is that while the great majority of operators seem to accept the need for “cloud-native” function hosting, they don’t have a solid definition for the term. That’s not surprising given that the same could be said for the industry at large. In fact, the “purist” definition and the popular meaning of the term seem to be diverging.
When I queried the operators who used the term, most said that “cloud-native” meant “based on native cloud computing technology”. Containers, Kubernetes, and the like were their examples of what went into a cloud-native function hosting framework. To cloud people, the term usually means “stateless and microservice-based”. In an earlier blog, I noted that many of the “network functions” that needed to be hosted didn’t fit with that purist definition. They shouldn’t be required to, because to do so would likely result in service failures.
Stateless microservices are a great strategy for human-interactive applications and some other event-driven applications, including the control-plane part of networking. They’re not suitable, IMHO, for data-plane applications because splitting functionality and introducing network latency between functional elements will impact performance and actually reduce reliability, as the rough calculation below illustrates.
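To make that concrete, here’s a back-of-the-envelope sketch (the hop count, per-hop latency, and availability figures are my own illustrative assumptions, not measurements): chaining a data-plane function through serial microservices adds latency per hop and compounds failure probability.

```go
// Illustrative only: assumed numbers, not measurements from any real deployment.
package main

import (
	"fmt"
	"math"
)

func main() {
	const (
		hops            = 5      // hypothetical number of serial microservice hops
		perHopLatencyMs = 0.5    // assumed network latency added per hop, in ms
		perHopAvail     = 0.9999 // assumed availability of each element
	)

	// Latency adds up across the chain; availability multiplies down.
	totalLatency := float64(hops) * perHopLatencyMs
	chainAvail := math.Pow(perHopAvail, hops)

	fmt.Printf("added latency: %.1f ms per packet\n", totalLatency)
	fmt.Printf("chain availability: %.4f%% (vs %.4f%% for a single element)\n",
		chainAvail*100, perHopAvail*100)
}
```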
Containers are probably a good, general way of hosting network functions. The big advantage of containers is that they’re a kind of self-describing load unit, which means that it’s possible to configure the deployed elements using generalized tools rather than requiring tweaking by human operations staff. However, containers aren’t as valuable where something is going to be deployed in a static way on specific devices rather than on a pool of resources. Containers also carry higher overhead, which means they may not be ideal for flow-through traffic.
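A minimal sketch of that “self-describing load unit” idea: the function ships with its own deployment descriptor, and a generic tool can parse and act on it without per-function human tweaking. All the field names and the descriptor format here are hypothetical illustrations, not taken from any real container or Nephio specification.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FunctionDescriptor is the self-description a load unit carries with it.
type FunctionDescriptor struct {
	Name      string            `json:"name"`
	Image     string            `json:"image"`
	CPUMillis int               `json:"cpuMillis"` // requested CPU, in millicores
	MemoryMB  int               `json:"memoryMB"`  // requested memory
	Ports     []int             `json:"ports"`     // ports the function exposes
	Config    map[string]string `json:"config"`    // environment-style settings
}

func main() {
	// In practice the descriptor would be packaged alongside the image; here
	// it arrives as JSON that a generalized deployment tool can parse.
	raw := `{
		"name": "firewall-vnf",
		"image": "registry.example.com/firewall:1.2",
		"cpuMillis": 500,
		"memoryMB": 256,
		"ports": [8080],
		"config": {"LOG_LEVEL": "info"}
	}`

	var d FunctionDescriptor
	if err := json.Unmarshal([]byte(raw), &d); err != nil {
		panic(err)
	}

	// The generic tool needs no function-specific knowledge: everything it
	// must know to place and configure the unit is in the descriptor itself.
	fmt.Printf("deploying %s (%s): %dm CPU, %dMB RAM, ports %v, config %v\n",
		d.Name, d.Image, d.CPUMillis, d.MemoryMB, d.Ports, d.Config)
}
```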
What we really need for function hosting is a universal deployment and management framework that works for containers, virtual machines, bare metal servers, and white boxes. That framework has to be able to deploy network functions and also non-network functions, because even the Internet, when considered as a service, has both. I’d argue that initiatives satisfying that already exist for Kubernetes, and that’s one reason I am so interested in the Nephio project and its Kubernetes-centricity. What Nephio proposes to do is to create hosting generalization without sacrificing common orchestration/management, and that’s critical.
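Here’s a sketch of what “hosting generalization without sacrificing common orchestration/management” means in structural terms: one orchestration layer dispatching to different hosting substrates through a single contract. The substrate types and method names below are my own hypothetical illustrations, not Nephio or Kubernetes APIs.

```go
package main

import "fmt"

// Deployer is the common contract every hosting substrate satisfies, so the
// orchestration and management logic above it stays the same.
type Deployer interface {
	Deploy(functionName string) error
}

type ContainerDeployer struct{}
type VMDeployer struct{}
type BareMetalDeployer struct{}
type WhiteBoxDeployer struct{}

func (ContainerDeployer) Deploy(fn string) error {
	fmt.Println("scheduling", fn, "as a container on a shared resource pool")
	return nil
}
func (VMDeployer) Deploy(fn string) error {
	fmt.Println("booting", fn, "in a virtual machine")
	return nil
}
func (BareMetalDeployer) Deploy(fn string) error {
	fmt.Println("installing", fn, "on a dedicated bare-metal server")
	return nil
}
func (WhiteBoxDeployer) Deploy(fn string) error {
	fmt.Println("pushing", fn, "to a white-box device")
	return nil
}

// Orchestrator holds one management workflow regardless of substrate.
type Orchestrator struct {
	substrates map[string]Deployer
}

func (o Orchestrator) Place(fn, substrate string) error {
	d, ok := o.substrates[substrate]
	if !ok {
		return fmt.Errorf("unknown substrate %q", substrate)
	}
	return d.Deploy(fn)
}

func main() {
	o := Orchestrator{substrates: map[string]Deployer{
		"container": ContainerDeployer{},
		"vm":        VMDeployer{},
		"baremetal": BareMetalDeployer{},
		"whitebox":  WhiteBoxDeployer{},
	}}
	// One orchestration call works for any hosting target.
	_ = o.Place("5g-upf", "container")
	_ = o.Place("cell-site-router", "whitebox")
}
```

The point of the single interface is that the service model and the orchestration logic never have to know, or care, where a given function ends up being hosted.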
This point is important when addressing another theme of the comments, which is that achieving “cloud-native” (or whatever you’d like to call it) for function hosting is something that has to come out of the work of the NFV ISG. I disagree with that, strongly, and the reason is that I believe that the early work of the ISG has made it difficult (if not impossible) for it to embrace the Kubernetes-centricity that function hosting needs.
If future network services are ever going to be differentiable, truly different, then that’s going to have to be achieved through a unification of network functions and the functions that create and sustain experiences. Networking today, from the perspective of the user, is all about experience delivery. The front-end of those experiences is already largely hosted in the cloud, where cloud-centric practices (including/especially Kubernetes) prevail.
Despite the fact that operators believe that function hosting will enrich their services and make them more differentiable (raising revenues), they have spent little time trying to identify what specific functions would accomplish those goals. As I pointed out in my blog yesterday, they seem to believe that abstract function hosting will do the job, which is essentially a claim that providing a mechanism to do something is equivalent to actually doing it. They haven’t confronted specific service examples that mingle network functions and other hosting. In fact, they’ve really not considered network functions themselves on any broad scale, other than that outlined in places like the 5G standards. That’s likely why they don’t see Kubernetes-centricity as the pivotal issue it is.
Operator standards make glaciers seem to move at breathtaking speeds by comparison. The cloud has done the opposite, compressing application development schedules and creating an explosion of tools and techniques to support the ever-changing needs of businesses. If Internet-centric experience delivery is what’s really driving the cloud and networking, then can telcos afford to have their own function hosting framework lagging so far behind what’s driving services? I don’t think so.
The pace of cloud progress can be traced to two things. First, there is a compelling mission for the cloud, even though it’s not the mission most think of when they hear “cloud computing”. It’s not about replacing the data center, but about extending applications’ user relationships out of the data center and closer to “the Internet”. Second, the cloud has produced application development, hosting, deployment, and management tools and techniques that favor rapid progress. They’ve done that by replacing “standards” with open-source projects.
A few operators, and more vendors, tell me that the “standards” activity itself is creating a problem. There’s an established operator reliance on standards and on their work in defining them. For mobile networking, the 3GPP has almost literally ruled in defining how equipment works and interoperates. This flies in the face of the fact that IP networking is utterly dominant today, and that the IETF, not any carrier-centric standards body, drives IP specifications. It’s not unreasonable to see the rise of the Open RAN (O-RAN) initiatives as an indicator that even the 3GPP may be losing influence. However, “losing” doesn’t mean “lost”, and so we can’t hope for quick progress in turning operators away from their own standards initiatives, especially when there are a lot of people in operator organizations whose careers are based on those initiatives.
Vendors, I think, can and must be the solution here. I’ve advocated an approach of blowing kisses at ETSI NFV while working hard on developing a true cloud-centric function hosting framework. I think (I hope) that Nephio does just that, but what will decide the fate of Nephio, and perhaps that of function hosting and even telecom, is how vendors support the initiative, and that’s not something that’s easy to assess.
Nephio is a hybrid of cloud and telecom in more than just a technical sense. Most of the truly seminal elements of the cloud were created because a primary vendor did something and then open-sourced it. Rarely has a collection of vendors come together successfully on a concept. The old “horse-designed-by-committee” analogy sure seems to apply. Vendors not only make up the majority of participants in open network-industry bodies, they tend to supply the majority of the resources. What happens next for “cloud-native” in telecom depends on how those vendor participants work to advance something useful, fast enough to matter to a rapidly changing market.