There’s been a lot of news recently about “security” or “governance” in the cloud, SDN, or NFV. It’s certainly fair to ask how new technologies are going to support long-standing requirements in these areas, but I wonder whether we’re imposing not only old practices but also old rules on a set of very new technologies, all built on a principle of “virtualization” that breaks so many boundaries.
In a virtual world, services or capabilities are represented by abstractions (a “virtual machine”, for example) that can be instantiated on demand by assigning real resources from a shared pool. Breaking the barrier of fixed, dedicated resources can solve the twin problems of under-utilization and higher costs. We’ve done this with services for ages: “virtual circuits” or “virtual private networks” appear to be circuits or private networks but are created on a pool of shared facilities.
When we think about security or governance in our new technologies, we should start right there, with virtualization. Virtualization demands a pattern, an abstraction. It demands a pool of suitable resources, and a mechanism for committing those resources and managing them on an ongoing basis. Every one of these things is an element of virtualization, and every one is a risk.
To start with, who created these abstractions? They’re recipes in a sense, so we’d want them to come from a trusted cook. The same goes for the ingredients, the tableware, and so on. To mix metaphors, think of a virtual environment as a bank vault; you’ve got to be sure about what’s getting in.
Authenticity is the first requirement of virtualization security, and that means that everything in a virtual world has to be on-boarded explicitly, and only by trusted personnel through trusted processes. The logical assumption here is that when something is on-boarded it’s also credentialed, so that each time it’s presented for use it’s collaterally recertified. We tend to think of on-boarding applications or virtual functions or even devices as a process of wrapping them in the proper technical package, but it has to be more than that.
Credentialing something isn’t necessarily easy; if you take a device like a router or a software load, you can’t just slap a badge on it. In most cases, the thing you’re certifying is the functional aspect of the device or software, which means that you have to compute a cryptographic hash that can tell you whether the thing you’re adding is the version it represents itself to be and hasn’t been changed or diddled somewhere along the way.
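As a rough sketch of what that check might look like (the catalog format, artifact names, and digest values here are purely illustrative, not drawn from any particular orchestration product):

```python
import hashlib
import hmac

# Hypothetical on-boarding catalog: each trusted artifact is recorded with
# the SHA-256 digest computed when trusted personnel certified it.
TRUSTED_CATALOG = {
    "vFirewall-2.1.3": "placeholder-digest-recorded-at-onboarding",
}

def compute_digest(image_path: str) -> str:
    """Hash the artifact exactly as it will be deployed."""
    sha = hashlib.sha256()
    with open(image_path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha.update(chunk)
    return sha.hexdigest()

def is_certified(name: str, image_path: str) -> bool:
    """True only if the artifact matches the version recorded at on-boarding."""
    expected = TRUSTED_CATALOG.get(name)
    if expected is None:
        return False  # never on-boarded: refuse to deploy
    return hmac.compare_digest(expected, compute_digest(image_path))
```

The point isn’t the specific hash function; it’s that every deployment re-derives the credential from the artifact itself rather than trusting whatever label the artifact carries.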
The second issue we face relates to the processes that manage virtual environments. Not only do you have to credential the resources, the recipes, and the components, you have to credential the software elements that drive virtual management and deployment. And you have to certify the things you’re running as “payload,” like cloud components or VNFs. A flawed or compromised process can circumvent the authentication of every other element of your infrastructure.
A corollary to this is that you have to be very wary of application components or virtual functions having direct control over resources, or the ability to load other components directly. You can check the credentials of something if it’s introduced through your deployment processes or spun up by a management system, but how do you ensure that every application that loads another copy of itself, or of a neighbor component, will properly check?
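One way to frame the answer is to force every scale-out or re-instantiation request back through the deployment process, which re-checks credentials before anything runs. The sketch below is a minimal illustration of that principle; the class and method names are hypothetical, and a real orchestrator would do far more.

```python
import secrets
from typing import Callable, Optional

class Orchestrator:
    """Hypothetical gatekeeper: the only path to new instances.

    Components never load copies of themselves or of neighbors directly;
    they ask the orchestrator, which re-verifies the artifact (for example,
    with the is_certified() check sketched earlier) before issuing a
    one-time launch authorization.
    """

    def __init__(self, catalog_check: Callable[[str, str], bool]):
        self._catalog_check = catalog_check
        self._issued_tokens = set()

    def request_scale_out(self, name: str, image_path: str) -> Optional[str]:
        if not self._catalog_check(name, image_path):
            return None                    # refuse: credentials don't verify
        token = secrets.token_hex(16)      # one-time launch authorization
        self._issued_tokens.add(token)
        return token

    def launch(self, token: str, name: str) -> bool:
        if token not in self._issued_tokens:
            return False                   # direct, unauthorized launch attempt
        self._issued_tokens.discard(token)
        print(f"deploying verified instance of {name}")
        return True
```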
One of the things this raises is the need to protect what I’ll call interior networks, the network connections among an application’s components or a service’s features. If I put the addresses of all the elements of a virtual application or service into the user’s address space, what stops users from addressing those elements directly and, through that, compromising not only their own application but perhaps even the infrastructure? We don’t hear much about the idea that a multi-component application or a set of virtual network functions should be isolated completely from the service network except at the points where those components/functions actually connect to users. We should.
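In policy terms, the rule is simple: only published, user-facing endpoints are reachable from the service network, and everything else lives on an interior network users can’t address at all. A minimal sketch of that rule, with entirely made-up addresses:

```python
from ipaddress import ip_address, ip_network

# Hypothetical service layout: interior components talk only on an isolated
# subnet; only the published endpoint is exposed to the user address space.
INTERIOR_NET = ip_network("10.200.0.0/24")
USER_FACING_ENDPOINTS = {ip_address("203.0.113.10")}

def allow_user_traffic(dest: str) -> bool:
    """Admit traffic from the user network only to published endpoints."""
    addr = ip_address(dest)
    if addr in USER_FACING_ENDPOINTS:
        return True
    if addr in INTERIOR_NET:
        return False   # interior components are never addressable by users
    return False
```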
Virtual environments also require tighter control over the resources that exchange topology and reachability data. All too many network devices trust whatever is adjacent to them, and we already know that false route advertising is a problem for the Internet. We have to accept that in virtual structures we can’t trust anything without verification, or the whole trust and certification process will crumble around us.
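“Verification” here means that reachability data is accepted only from peers credentialed at on-boarding, not merely from whoever happens to be adjacent. The sketch below uses a shared-secret HMAC purely for brevity; the peer names and keys are invented, and a production design would presumably use proper public-key signing.

```python
import hashlib
import hmac
import json

# Hypothetical per-peer keys established when each device was on-boarded.
PEER_KEYS = {"edge-router-7": b"shared-secret-issued-at-onboarding"}

def sign_advertisement(peer: str, routes: dict) -> str:
    payload = json.dumps(routes, sort_keys=True).encode()
    return hmac.new(PEER_KEYS[peer], payload, hashlib.sha256).hexdigest()

def accept_advertisement(peer: str, routes: dict, signature: str) -> bool:
    """Install reachability data only from verified, on-boarded peers."""
    key = PEER_KEYS.get(peer)
    if key is None:
        return False              # unknown peer: adjacency alone is not trust
    payload = json.dumps(routes, sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```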
You’ll note that I’ve not mentioned things like secure links. Encryption and security in the traditional sense of protecting payloads from interception, or firewalls that limit which addresses can access something, are all fine, but they’re also things we understand at a high level. What we need to be doing is focusing not on how to make today’s end-to-end security measures work in an SDN or NFV or cloud world, but on why the differences in those worlds might make them NOT work. For example, can you trash central SDN control by presenting a flood of packets for which there are no forwarding rules, creating a denial-of-service attack on the controller? How do you adapt traditional security measures to prevent that, particularly if you expect that some packets will normally be presented “outside the rules” and require central handling authorization?
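One plausible mitigation (and it is only one; this isn’t drawn from any specific controller’s feature set) is to rate-limit the table-miss traffic each ingress point is allowed to push to the controller, so legitimate “outside the rules” packets still get central handling while a flood does not. A token-bucket sketch:

```python
import time
from collections import defaultdict

class PacketInGuard:
    """Per-port token bucket for table-miss packets sent to the controller.

    A flood of packets with no matching forwarding rules would otherwise
    translate directly into a controller denial-of-service.
    """

    def __init__(self, rate_per_sec: float = 50.0, burst: float = 100.0):
        self.rate = rate_per_sec
        self.burst = burst
        self.tokens = defaultdict(lambda: burst)
        self.last_seen = defaultdict(time.monotonic)

    def admit(self, ingress_port: str) -> bool:
        now = time.monotonic()
        elapsed = now - self.last_seen[ingress_port]
        self.last_seen[ingress_port] = now
        self.tokens[ingress_port] = min(
            self.burst, self.tokens[ingress_port] + elapsed * self.rate
        )
        if self.tokens[ingress_port] >= 1.0:
            self.tokens[ingress_port] -= 1.0
            return True    # forward the miss to the controller
        return False       # drop or defer: protect the control plane
```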
Even basic things like SLAs and MIBs, which can be part of a compliance program because they define application QoE and therefore worker productivity and response times, can be difficult in a virtual world. A real firewall has explicit performance and availability numbers associated with it, and repair or replacement defines an explicit MTTR. If we make that firewall virtual, we don’t even know whether its components at a given moment are the same as they were the moment before. In fact, we might have three or four firewall components behind a load-balancer. Calculating MTBF and MTTR in these situations is non-trivial, and knowing when something has changed may fall outside the range of expected management responses. Does your real firewall even have to tell you when it reroutes flows among internal elements? How, then, will you know when a virtual one does?
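To see why the arithmetic changes, here’s a small worked sketch. The numbers are purely illustrative, and the pooled calculation assumes independent failures and a perfect load-balancer, both assumptions a real compliance program would have to challenge.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability of a single component."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def parallel_availability(per_instance: float, n: int) -> float:
    """Availability of n redundant instances, assuming independent failures
    and a perfect load balancer in front of them (both big assumptions)."""
    return 1.0 - (1.0 - per_instance) ** n

# A physical firewall rated at 50,000 h MTBF with a 4 h MTTR:
single = availability(50_000, 4)                              # ~0.99992
# Three virtual firewall instances, each far less reliable individually:
pooled = parallel_availability(availability(2_000, 0.5), 3)   # ~0.99999999998
print(f"{single:.6f} vs {pooled:.11f}")
```

The virtual pool can look better on paper, but the MIB that proves it is far harder to define, because the thing being measured keeps changing shape.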
We’re exploring these revolutionary strategies in a very limited way right now, and that’s letting a lot of users and providers dodge general issues that will inevitably arise as we broaden our use of this new stuff. If we don’t want to stall everything short of the goal, we have to look a bit more closely at how we’re getting to that goal.