Recent discussions on LinkedIn about how NFV and O-RAN fit together suggest that there’s still a wide range of views on how hosted features relate to network services. Since that may well be the most important topic in all of networking, and surely is for network infrastructure, I want to weigh in on the issue.
NFV, or “network functions virtualization”, came about because of a 2012 Call for Action paper put out by 13 network operators. If you look at the paper, and at the early end-to-end architecture model of NFV that’s still largely in place today, you find that the high-level goal of NFV was to substitute “virtual network functions” or VNFs for network appliances, or “physical network functions”. The paper focuses in particular on “appliances” rather than on the broader “network devices” category. It talks about “many network equipment types” as the target, it concentrates on equipment other than switches and routers (see Figure 1 of the paper if you have access to it), and the functional software elements that replace “appliances” are called “virtual appliances” in that figure. All this is why I contend that NFV was targeted at replacing discrete devices with virtual devices.
The architectural focus of the white paper and the end-to-end model meant, in part, that the NFV management strategy was aligned with the goal of meshing VNFs with the existing element management systems (EMSs) used for the appliances the VNFs would replace, and with OSS/BSS tools. Most of the early proof-of-concept projects approved for NFV also targeted “universal CPE”, which of course is almost exclusively relevant in the business service space. It’s that combination of alignments that I contend focuses NFV on customer-specific business service deployments.
This sort of stuff is understandable in light of the original NFV mission, but it collides not only with the way that hosting is used in later service standards (notably 5G) but even, in some ways, with the original Call for Action white paper. Some of the appliance types included in the Figure 1 I referenced earlier are community- rather than customer-focused, and I contend those demand a different management model. That was in fact one of the things I addressed in the PoC I submitted (on behalf of a collection of vendors). I called services directed at community elements like 5G components “Infrastructure Services”, meaning that they were deployed as part of overall infrastructure, for community use, rather than per-user.
A traditional element management model implies that a single management element controls a specific network element, and where that network element is a shared resource, that cannot be the case. In fact, any management interface to a shared element poses the risk of a kind of “management denial of service” attack, where so many users/services reference a management API that the workload causes the element to collapse. The notion of “derived operations”, which was based on the (sadly abandoned) IETF draft called “Infrastructure to Application Exposure” or i2aex, addressed this by maintaining a management database that users queried and updated through any number of software/virtual MIB proxies, while the database alone referenced the actual MIBs.
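To make the “derived operations” pattern concrete, here’s a minimal Python sketch of the read path as I’ve described it; the class and method names are my own illustration, not anything defined in the i2aex draft, and the device access is stubbed out where a real implementation would do an SNMP read. The point is structural: one poller touches the real device on a fixed schedule, and any number of tenants query the cached copy, so shared use can never overload the element’s management interface.

```python
import threading
import time

class DerivedOperationsRepo:
    """Caches device management data so a shared element sees a fixed poll load.

    Only the poller thread touches the real device; every user/service query
    is answered from the cached snapshot, which prevents the "management
    denial of service" problem described above.
    """

    def __init__(self, devices, poll_interval=30.0):
        self._devices = devices          # device_id -> callable returning a MIB snapshot
        self._interval = poll_interval
        self._cache = {}                 # device_id -> (timestamp, snapshot)
        self._lock = threading.Lock()

    def start(self):
        threading.Thread(target=self._poll_loop, daemon=True).start()

    def _poll_loop(self):
        while True:
            for device_id, fetch in self._devices.items():
                snapshot = fetch()       # the ONE place the real MIB is read
                with self._lock:
                    self._cache[device_id] = (time.time(), snapshot)
            time.sleep(self._interval)

    def query(self, device_id, oid):
        """Any number of tenants can call this without touching the device."""
        with self._lock:
            ts, snapshot = self._cache.get(device_id, (0, {}))
        return {"age": time.time() - ts, "value": snapshot.get(oid)}

# A hypothetical shared element; in practice the lambda would be an SNMP read.
repo = DerivedOperationsRepo({"core-router-1": lambda: {"ifInOctets.1": 123456}})
repo.start()
time.sleep(0.2)                          # let the first poll complete
print(repo.query("core-router-1", "ifInOctets.1"))
```

The write path would work the same way in reverse: users update the database through a proxy, and a single writer applies the changes to the real element.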
The other issue presented by the NFV architecture is one I’ve mentioned before, which is a deviation from the cloud’s own evolutionary path. Cloud computing at the time NFV launched was largely based on virtual machines and “infrastructure as a service” or IaaS. Today it’s much broader, but a major part of NFV was explicitly tied to the IaaS/VM approach. The notion of creating a functional unit (a virtual device or VNF) by composing features was actually raised in the spring 2013 NFV ISG meeting, but it would have required vendors to “decompose” their current appliance software, since applications weren’t built that way then. Today, of course, cloud applications are regularly implemented as compositions of feature components, and standards like 5G and O-RAN presume that same sort of feature-to-service relationship. To get terminological on you, NFV was about functions, which are collections of features, and the cloud is about features and how to collect them.
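If you prefer code to terminology, here’s a hedged sketch of the distinction; all the feature names are hypothetical, and real features would be separately deployed components rather than in-process functions. In the cloud model, the “function” is nothing more than a composition of features, assembled per service.

```python
from functools import reduce

# Each "feature" is an independent, separately deployable component.
def classify(packet):
    packet["class"] = "video" if packet["port"] == 554 else "data"
    return packet

def police(packet):
    packet["allowed"] = packet["class"] != "blocked"
    return packet

def meter(packet):
    packet["billed_bytes"] = len(packet.get("payload", b""))
    return packet

def compose(*features):
    """Builds a "function" from "features" -- the cloud's direction of travel."""
    return lambda packet: reduce(lambda p, f: f(p), features, packet)

# NFV's framing treats the composed result as an indivisible virtual device;
# the cloud's framing keeps the features separate and composes per service.
edge_function = compose(classify, police, meter)
print(edge_function({"port": 554, "payload": b"\x00" * 1400}))
```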
My references to 5G and O-RAN here should demonstrate why I’m bothering with this topic now. The fact is that when NFV launched, we had network devices that we were trying to virtualize. Now we have network features that we’re trying to integrate, and that sort of integration is exactly what the cloud is doing better and better every year. We have orchestration tools and concepts for the cloud that make NFV’s Management and Orchestration (MANO) look like Star Trek’s “stone knives and bearskins”. Projects like Nephio and Sylva are advancing cloud principles in the network space, and Sylva is explicitly targeting edge computing applications, in which group I’d place O-RAN.
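As one example of what I mean by cloud orchestration concepts, here’s a minimal sketch of the declarative, desired-state reconciliation pattern that Kubernetes popularized and that projects like Nephio build on. Everything here is illustrative rather than anyone’s actual API, and the feature names are invented. MANO-style orchestration scripts a sequence of deployment steps; the cloud pattern declares the end state and lets a control loop converge on it, which is also how it recovers from failures.

```python
import time

# Desired state is declared, not scripted: "run 3 instances of the upf feature."
desired = {"upf": 3, "amf": 2}

# Actual state starts empty; a controller continuously reconciles toward desired.
actual = {"upf": 0, "amf": 0}

def reconcile(desired, actual):
    """One pass of the Kubernetes-style control loop: diff, then converge."""
    for feature, want in desired.items():
        have = actual.get(feature, 0)
        if have < want:
            actual[feature] = have + 1   # stand-in for launching a pod/VM
            print(f"scale up {feature}: {have} -> {have + 1}")
        elif have > want:
            actual[feature] = have - 1   # stand-in for tearing one down
            print(f"scale down {feature}: {have} -> {have - 1}")

# A real controller runs this loop forever; a few passes suffice to converge here.
for _ in range(3):
    reconcile(desired, actual)
    time.sleep(0.1)
```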
As one of my LinkedIn contacts noted recently, we do have successful applications of NFV, and examples of how it can be modernized. That doesn’t mean we’re justified in continuing to adapt it to current feature-hosting missions when we have cloud technology that’s not only addressing those missions already but evolving as fast as new missions emerge.
Standards groups are the wrong place to try to introduce software, and open-source projects are the right place. That doesn’t mean that simply morphing NFV into an open-source project would have done better, though. In my view, both Nephio and Sylva are current examples of open-source projects directed at a telco mission, and neither of them is proceeding in what I believe to be the optimum way. It’s always difficult to understand why that sort of thing happens, but my personal view is that telco participation in the projects is essential because telcos are the target, yet telcos don’t have software architects driving their participation. As a result, they push sub-optimal decisions without realizing it.
So is there no hope? I think there is, but the “hope” we have is that the processes we need will evolve without the telcos’ involvement, and that, as a result, those processes will be focused more on the public cloud than on the telco. There will be renewed cries of “disintermediation” from the telco side, of course, but in this evolving edge and function-hosting model, as in past service initiatives, the telcos have succeeded in disintermediating themselves. If they want to stabilize their profit per bit, they need to get with the software program in earnest.