When I was a newly minted programmer I saw a cartoon in the Saturday Evening Post (yes, it was a long time ago!). A programmer came home from the office, tossed his briefcase on the sofa, and said to his wife, “I made a mistake today that would have taken a thousand mathematicians a hundred years to make!” This goes to show that automation doesn’t always make things better when it makes them faster. Sometimes it doesn’t make them better at all.
Automation is a necessary part of virtualization, and we can certainly envision how a little service automation done wrong could create chaos. One area stands out in terms of risk, though, and that’s security/compliance. We learned with the cloud that security is different in a virtual world. Will it be different for SDN and NFV, and how? What will we do about it? There are no solid answers to these questions yet, but we can see some signposts worth reading.
Virtualization separates function from its realization, from the resources that host it. When you create services with virtualization, you are freed from some of the barriers of service-building that have hampered us in the past, most notably the financial and human cost of making changes to accommodate needs or problems. Adding a firewall to a virtual network is easier than adding one to a device network. However, you also create issues to go with your opportunities. The potential for malware in the virtual world is much higher because you’ve converted physical devices with embedded code into something not very different from cloud components, something easier to get at.
I would propose that security in the virtual world is a combination of what we could call “service security” and “resource security”. The former would involve changes to the “black-box” or from-the-outside-in experience of the service and how those changes would either improve or reduce security. The latter would involve the same plus-or-minus, but relating to the specific processes of realizing the resource commitments associated with each function. Service security relates to the data plane and any control or management traffic that co-resides with it, and resource security relates to the plane where virtualization’s resources are connected.
Service security differences would have to arise from one of three problems—spoofing an endpoint, joining a limited-connectivity service, or intercepting traffic. To the extent that these are risks exposed within the data plane of the service (or co-resident control/management planes) you would have the same exposure or less with SDN or NFV. SDN, for example, could replace adaptive discovery with explicit connectivity. Thus, at the basic service level, I think you could argue that virtualization doesn’t have any negative impact and could have a positive one.
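To make the contrast between adaptive discovery and explicit connectivity concrete, here is a minimal sketch of a default-deny connection policy. It is not OpenFlow or any controller's API; the class name `ExplicitConnectivity`, its methods, and the endpoint names are assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ExplicitConnectivity:
    """Hypothetical default-deny policy: only provisioned pairs may talk."""

    allowed: set = field(default_factory=set)

    def provision(self, a: str, b: str) -> None:
        # Store each pair as a frozenset so that a<->b is symmetric.
        self.allowed.add(frozenset((a, b)))

    def permits(self, src: str, dst: str) -> bool:
        # Nothing is learned or discovered; unknown pairs are refused.
        return frozenset((src, dst)) in self.allowed


policy = ExplicitConnectivity()
policy.provision("branch-1", "hq")

print(policy.permits("hq", "branch-1"))   # provisioned pair
print(policy.permits("intruder", "hq"))   # never provisioned
```

Under this model a spoofed or uninvited endpoint has no path unless someone explicitly provisions one, which is the sense in which exposure could be the same or less than in an adaptive network.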
Virtualization’s agility does permit added security features. NFV explicitly aims at being able to chain in security services, and this approach has been advocated for enterprises in a pure SDN world by Fortinet. You could augment network security by chaining in a firewall, encryption, virus-scanning on emails, DNS-based access control, or other functions without sending a tech out or asking the customer to make a device connection on premises. However, remember that you can have all these features today using traditional hardware, and businesses and even consumers usually do. It might be easier to add something with virtualization, but we end up in much the same place.
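The idea of chaining in security services can be sketched as an ordered list of functions that each packet passes through. This is only an illustration under simplifying assumptions: the packet is a plain dictionary, and the function names, blocked ports, and blocked domain are hypothetical rather than taken from any NFV specification or vendor product.

```python
from typing import Callable, List, Optional

Packet = dict  # simplified stand-in for a real packet or flow record
ServiceFunction = Callable[[Packet], Optional[Packet]]

BLOCKED_PORTS = {23, 445}          # hypothetical firewall policy
BLOCKED_DOMAINS = {"bad.example"}  # hypothetical DNS access-control list


def firewall(packet: Packet) -> Optional[Packet]:
    """Drop traffic aimed at blocked destination ports."""
    return None if packet.get("dst_port") in BLOCKED_PORTS else packet


def dns_access_control(packet: Packet) -> Optional[Packet]:
    """Drop DNS lookups for names on the block list."""
    return None if packet.get("dns_query") in BLOCKED_DOMAINS else packet


def apply_chain(chain: List[ServiceFunction], packet: Packet) -> Optional[Packet]:
    """Pass the packet through each function in order; None means dropped."""
    for function in chain:
        result = function(packet)
        if result is None:
            return None
        packet = result
    return packet


chain = [firewall]
chain.append(dns_access_control)  # adding a feature is a list operation
print(apply_chain(chain, {"dst_port": 53, "dns_query": "bad.example"}))
```

Adding a service here means appending to a list instead of dispatching a technician, which is the agility gain. The set of features available is no different from what hardware appliances offer.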
If we carried virtualization to its logical end, which is application- and service-specific networks built using OpenFlow and augmented with NFV, you could see the end to open connectivity and the dawn of very controlled connectivity, almost closed-user-group-like in its capabilities. Given this, I believe that at the service level at least, virtualization is likely to make security better over time, and perhaps so much better in the long term that security as we know it ceases to be an issue. I want to stress the point that this isn’t a near-term outcome, but it does offer hope for the future.
If service security could be better with virtualization, the resource side of virtualization is another story. At some point in the setup of a virtualization-based service, you have to commit real resources and connect them. This all happens in multi-tenant pools, where the isolation of tenants from each other and management/control pathways from tenant data planes can’t be taken for granted.
The first fundamental question for either SDN or NFV is the extent to which the resource domain’s control traffic is isolated from the service data plane. If you can address or attack resource-domain components, then you add a security risk of monumental proportions when you add virtual infrastructure, and nothing much you do at the service level is going to mitigate it. I think you have to treat the resource domain as the “signaling network” of virtualization and presume it is absolutely isolated from the services. My suggestion was to use a combination of physical-layer partitioning, virtual routing, and private IP addresses. Other approaches would work too.
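The private-addressing part of that suggestion is the easiest to check mechanically. The sketch below uses Python's standard `ipaddress` module to confirm that a signaling range is private and does not overlap any service range. The function name `check_signaling_isolation` and the example ranges are assumptions for illustration, and a clean result says nothing about physical partitioning or virtual routing, which have to be verified by other means.

```python
import ipaddress
from typing import List


def check_signaling_isolation(signaling_cidr: str, service_cidrs: List[str]) -> List[str]:
    """Return a list of addressing problems; an empty list means none found."""
    signaling = ipaddress.ip_network(signaling_cidr)
    problems = []
    # The signaling (resource-domain) range should not be publicly routable.
    if not signaling.is_private:
        problems.append(f"{signaling} is not a private range")
    # It should also share no addresses with any service address space.
    for cidr in service_cidrs:
        service = ipaddress.ip_network(cidr)
        if signaling.overlaps(service):
            problems.append(f"{signaling} overlaps service range {service}")
    return problems


# Hypothetical ranges: one signaling network, two tenant service networks.
print(check_signaling_isolation("10.200.0.0/16", ["192.168.1.0/24", "10.200.5.0/24"]))
```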
If you isolate the signaling associated with virtualization, then your incremental resource risks probably come from either malware or “maloperators.” There is always a risk of both these things in any operations center, but with virtualization you have the problem of needing a secure signaling network and then needing to let things onto it. The VNFs in NFV and perhaps even management elements (the VNF Managers) would be supplied by third parties. They might, either through malice or through error, provide a pathway through which security problems could flow into the virtualization signaling network. From there, there could be a lot of damage done depending on how isolated individual VNF or service subnets of virtual resources were from each other and from central control.
To me, there are three different “control/management domains” in an SDN/NFV network. One is the service data plane, which is visible to the user. The second is the service virtual resource domain, which is not visible to the user and is used to mediate service-specific resource connections. The third is the global management plane, which is separate from both of the above. Some software elements might have visibility in more than one such plane, but only under careful control. See Google’s Andromeda virtual network control for a good example of how I think this part has to work.
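One way to picture the three domains is as a visibility table in which an element can only reach elements that share a domain with it. The sketch below is an illustration only: the element names and their domain assignments are assumptions, not a description of Andromeda or of any NFV specification.

```python
from enum import Enum


class Domain(Enum):
    SERVICE_DATA = "service data plane"
    SERVICE_RESOURCE = "service virtual resource domain"
    GLOBAL_MANAGEMENT = "global management plane"


# Hypothetical elements and the domains each one is allowed to see.
VISIBILITY = {
    "customer-endpoint": {Domain.SERVICE_DATA},
    "vnf-firewall": {Domain.SERVICE_DATA, Domain.SERVICE_RESOURCE},
    "vnf-manager": {Domain.SERVICE_RESOURCE, Domain.GLOBAL_MANAGEMENT},
    "orchestrator": {Domain.GLOBAL_MANAGEMENT},
}


def can_reach(src: str, dst: str) -> bool:
    """Two elements can talk only if they share at least one domain."""
    return bool(VISIBILITY.get(src, set()) & VISIBILITY.get(dst, set()))


def multi_domain_elements() -> list:
    """Elements that straddle domains and so need the most careful control."""
    return sorted(name for name, domains in VISIBILITY.items() if len(domains) > 1)


print(can_reach("customer-endpoint", "orchestrator"))
print(multi_domain_elements())
```

The elements returned by `multi_domain_elements` are the crossover points. In a real design, those are where the audit effort would concentrate, since a compromised straddling element is the only bridge between planes.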
Andromeda illustrates a basic truth about virtualization, which is that there’s a network inside the “black box” abstractions that virtualization depends on. That network could boost flexibility, agility, resilience, and just about everything else good about networking, but it could also generate its own set of vulnerabilities. Yet, despite the fact that both Amazon and Google have shown us true black-box virtual networking with their cloud offerings, we’re still ignoring the issue.
The critical point here is that most virtualization security is going to come down to proper management of the address spaces behind the services, in the resource domain. Get this part right and you have the basis for specifications on how each management process has to address its resources and partner processes, and where and how it crosses over into other domains. Get it wrong and I think you have no satisfactory framework for design and development of the core applications of SDN or NFV and you can’t ever hope to get security right.
We need an architecture for SDN and NFV, in the fullest sense of the word. We should have started our process with one, and we aren’t going to get one now by simply cobbling products or ideas together. The question is whether current work, done without a suitable framework to fit in, can be retrofitted into one. If not, then will the current work create so much inertia and confusion that we can’t remedy the issues? That would be a shame, but we have to accept it’s possible and work to prevent it from happening.