Another question that my series of blogs on telecom standards raised, this time from the telcos themselves, is “Why couldn’t NFV have fixed these problems?” On the surface, the concept of virtualizing network functions, which is what I advocated in THIS blog, would seem to be exactly what NFV aimed to do. So why didn’t it work?
The easy answer is that 3GPP box-centricity had gone too far by the time NFV came along. You might object that 5G didn’t arrive, at least in a deployment sense, until 2019, while NFV launched in earnest in 2012. True, but LTE had already solidified much of the structure of 5G, because it was essential that 5G retain backward compatibility with LTE, just as 6G is expected to be compatible with 5G. In any case, if the goal of an initiative is to virtualize functions, the presumption (explicit in NFV) has to be that the functions aren’t already virtual but are embodied in boxes, what NFV called “physical network functions”. We need to dig deeper.
The original goal of NFV was to virtualize appliances, as laid out in its seminal paper “Network Functions Virtualization: An Introduction, Benefits, Enablers, Challenges & Call For Action”. The key figure in the paper, identifying the targets of the initiative, did indeed include many of the elements of wireless infrastructure, so it’s entirely reasonable to think that the initiative could have targeted some of the key things that went wrong in mobile standards in particular.
As a very early participant in NFV, I believe the culprit was the “proof of concept” or PoC process itself. The problem with PoCs is that they’re implementations, which should probably have come along only after an overall framework for NFV had been defined. There were PoCs that were useful in defining that framework, and even in addressing some of the elements of mobile infrastructure that needed fixing to escape box-centricity, but the early focus ended up on “virtual CPE” or “universal CPE”, the use of a generic “white box” combined with loadable software to replace customer-edge appliances in particular.
The reason this was a problem is that it box-bound the whole notion of NFV, whose goal should have been to unbind functions from boxes. It’s tough to say why things got so uCPE/vCPE-focused, but one factor was surely the telcos’ hope that by creating an easy way to deploy edge services like firewalls, they could build some incremental revenue on top of their comfortable connectivity-based services. Again, this isn’t a bad mission, but it’s a mission that should have been an application of a generalized model that was never really solidified.
But wasn’t there an overall NFV model released? There was, in an end-to-end architecture that, as is often the case, was depicted as a series of functional blocks linked by interfaces. It looked like boxes linked by connections, and the whole initiative started talking about NFV elements in terms of specific monolithic pieces of functionality. When the Linux Foundation released OPNFV, an “open platform to accelerate NFV”, it retained, in fact adopted and implemented, the function-monolith model.
A function monolith is like a box-less box: functionality that should be fully virtual, forced into a single-component implementation. MANO, NFV’s management and orchestration function, should have been kept as a logical function that could be made up of many things, including existing cloud orchestration tools like Kubernetes.
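As a sketch of what “MANO as a logical function” might have looked like, consider an orchestration interface that any backend could satisfy, with a Kubernetes-based adapter as just one of many possible implementations. Everything here is hypothetical and stubbed for illustration; none of these names come from the NFV specifications:

```python
from typing import Protocol

class Orchestrator(Protocol):
    """The logical orchestration function: anything that can deploy
    and tear down a virtual function satisfies it."""
    def deploy(self, vnf_name: str, spec: dict) -> str: ...
    def teardown(self, deployment_id: str) -> None: ...

class KubernetesOrchestrator:
    """Hypothetical adapter that would delegate to a cloud tool such
    as Kubernetes; stubbed here to keep the sketch self-contained."""
    def deploy(self, vnf_name: str, spec: dict) -> str:
        # A real adapter would create the corresponding workload;
        # here we just simulate the call.
        print(f"deploying {vnf_name} with spec {spec}")
        return f"k8s-{vnf_name}"

    def teardown(self, deployment_id: str) -> None:
        print(f"tearing down {deployment_id}")

def instantiate_service(orch: Orchestrator, vnfs: dict) -> list:
    """Service logic written against the logical function, not
    against any one implementation of it."""
    return [orch.deploy(name, spec) for name, spec in vnfs.items()]

if __name__ == "__main__":
    ids = instantiate_service(KubernetesOrchestrator(),
                              {"firewall": {"replicas": 2}})
    print(ids)
```

The point isn’t the code itself; it’s that nothing in the service logic commits MANO to a single monolithic component.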
Decades ago, I was managing a project to automate a vast and complex enterprise activity, working with a dozen programmers and a set of end-user experts. One part of the task was to automate the contract piece of an information service, and the end-user methods analyst flow-charted it as a series of steps that mirrored the manual processes. The programmer assigned to the work came to me to say that this wasn’t the way to do it and probably wouldn’t even work, so I did a redesign that took those functions and re-framed them in a way that optimized IT efficiency and agility, and that version did work. That’s the problem with visualizing processes as a series of functional steps: the steps you visualize constrain the implementation.
I’m not surprised that the NFV process went off-track like this; it’s how almost all telecom initiatives go off-track as they attempt the essential transition from box-think to software-think. I’m a little more surprised that the OPNFV project didn’t address the problem, but its main contributors came from the NFV community and so likely thought of the “right” implementation in the way the initial architecture document described it. The current OPNFV project seems to have worked to correct that bias, but again it’s hard to step away from work already done and approved by the base body, the NFV ISG itself.
Services, as in any sort of as-a-service, are the cooperative outcomes of complex systems. To create them, you should be able to define the service as such, meaning through a template that calls on intent-model components. Services are also event-driven, both toward the service user and the service manager, which means the whole ecosystem needs to be a set of state/event processes controlled by tables or graphs. This is the only way to reach the goal of function virtualization, because it makes every service feature, whether in-service or management, a task run in response to an event. Where the task gets run is a matter of operations efficiency, so there’s no requirement to stick it in a monolithic piece of software or a hard-wired appliance.
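Here’s a minimal sketch of that state/event idea, with invented states, events, and handlers (nothing below is drawn from an actual NFV or ONAP artifact). The table maps a state/event intersection to a task, and the dispatcher is indifferent to where each task actually runs:

```python
from enum import Enum, auto

class State(Enum):
    ORDERED = auto()
    ACTIVE = auto()
    DEGRADED = auto()

class Event(Enum):
    ACTIVATE = auto()
    FAULT = auto()
    RESTORED = auto()

# Handlers are ordinary tasks; where they execute is an operations
# decision, not something baked into a monolith.
def start_service(svc):
    print(f"{svc}: activating")
    return State.ACTIVE

def remediate(svc):
    print(f"{svc}: remediating fault")
    return State.DEGRADED

def restore(svc):
    print(f"{svc}: restored to normal operation")
    return State.ACTIVE

# The state/event table: (current state, event) -> task to run.
TABLE = {
    (State.ORDERED, Event.ACTIVATE): start_service,
    (State.ACTIVE, Event.FAULT): remediate,
    (State.DEGRADED, Event.RESTORED): restore,
}

def dispatch(svc, state, event):
    """Run the task the table selects; ignore events that aren't
    meaningful in the current state."""
    task = TABLE.get((state, event))
    return task(svc) if task else state

if __name__ == "__main__":
    s = dispatch("vpn-42", State.ORDERED, Event.ACTIVATE)
    s = dispatch("vpn-42", s, Event.FAULT)
    s = dispatch("vpn-42", s, Event.RESTORED)
```

Because the handlers are just tasks selected by the table, they could be scheduled anywhere, scaled independently, or replaced without touching the rest of the system, which is exactly the agility a monolith forecloses.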
The Open Network Automation Platform (ONAP) is another project that fell prey to monolithic-ism. It should have started with a service template or model and built itself around it, using the intersection of element states and events to define action sets. Instead, it was built on a monolithic event-handler, and that made the whole system nothing more than a virtual box, which we now realize has all the impediments to agility and scalability that an actual box presents.
It’s also true that these points are equally applicable to any real-time system, which means that we should visualize IoT systems as digital twins defined by templates and processing (and generating) events. That makes it all the more important to start dealing with things from a software perspective, given software’s role in providing functionality and making business cases.
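The same pattern carries into the IoT case: a digital twin is just a template-defined state holder that consumes device events and can generate events of its own. A toy sketch, with an invented device, property, and threshold:

```python
from dataclasses import dataclass, field

@dataclass
class PumpTwin:
    """Toy digital twin of a pump: a template of properties plus a
    simple rule, consuming telemetry events and generating its own."""
    device_id: str
    temperature: float = 20.0
    overheat_limit: float = 80.0   # invented threshold
    emitted: list = field(default_factory=list)

    def on_event(self, reading: float) -> None:
        """Consume a telemetry event and update the twin's state."""
        self.temperature = reading
        if reading > self.overheat_limit:
            # The twin generates an event of its own for downstream
            # processes (alarms, analytics, control loops).
            self.emitted.append(f"{self.device_id}: OVERHEAT at {reading}C")

twin = PumpTwin("pump-7")
for reading in (65.0, 72.0, 85.5):
    twin.on_event(reading)
print(twin.emitted)  # ['pump-7: OVERHEAT at 85.5C']
```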
OK, can we summarize the reasons NFV didn’t fix the box-centric problem telcos have? Two things, I think. First, too many of those involved in NFV were box-centric, so their notion of a software-driven future was a future where boxes were replaced by virtual boxes, not by abstract functions and features. Second, the nature of the PoC process biased the work toward things that got a lot of vendor participation, and vendors are rarely willing to disrupt their own markets. It’s not hard to see that these same factors created the standards framework that broke things to start with, so unless they’re dealt with in future initiatives, we’re carrying the seeds of defeat forward with us.