You’re familiar by now with my rants on how we’ve decomposed ecosystemic shifts like the cloud, SDN, and (eventually perhaps but hopefully not) NFV. What has to be considered as a whole new cooperative model for IT and networking is instead being looked at as a bunch of disconnected product changes that might lead to a future but quite likely won’t lead to the optimum one.
There’s probably no better example of this need/risk profile than the security of the future “fusion” infrastructure: network and IT as one layer and not two. Today we have security issues that arise largely from two sources: contamination of content (malware on websites) and contamination of access (hacking). With the cloud we have a set of new issues as well as an exacerbation of many of the old ones.
Generally, the cloud introduces a new risk of resource contamination, and a new source of risk in signaling hacking. The IT/networking fusion that’s coming is a distributed and cooperative community, and cooperation in a distributed anything doesn’t come easy. When we talk about resource pools we assume that we know where everything in the pool is and what its status is, for example. Could somebody enroll themselves in a pool and become in effect a part of our applications? Why not, if we presume that cloud resources are dynamically allocated? The dynamism means there’s a process for resource enrollment. But even easier, why couldn’t you attack a resource pool by spoofing status messages that made the resource appear to be down, or make a single resource look “available” to everyone no matter what its state?
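To make the status-spoofing risk concrete, here is a minimal sketch of a pool controller that only believes status messages it can authenticate. Everything in it is an assumption for illustration: the names `sign_status` and `PoolController`, the message fields, and the idea that each resource receives a shared key when it is enrolled. It is not a description of any real cloud platform's protocol.

```python
import hashlib
import hmac
import json

FIELDS = ("resource_id", "seq", "state")

def _tag(key: bytes, body: dict) -> str:
    """HMAC-SHA256 over a canonical JSON encoding of the message fields."""
    payload = json.dumps(body, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def sign_status(key: bytes, resource_id: str, seq: int, state: str) -> dict:
    """Build a status message that carries an authentication tag."""
    body = {"resource_id": resource_id, "seq": seq, "state": state}
    return {**body, "tag": _tag(key, body)}

class PoolController:
    """Hypothetical controller: trusts status only from enrolled resources."""

    def __init__(self):
        self.keys = {}      # resource_id -> shared key issued at enrollment
        self.last_seq = {}  # resource_id -> highest sequence number accepted
        self.state = {}     # resource_id -> last verified state

    def enroll(self, resource_id: str, key: bytes) -> None:
        self.keys[resource_id] = key
        self.last_seq[resource_id] = -1

    def accept(self, msg: dict) -> bool:
        rid = msg.get("resource_id")
        key = self.keys.get(rid)
        if key is None:
            return False  # sender was never enrolled in the pool
        body = {name: msg.get(name) for name in FIELDS}
        claimed = str(msg.get("tag", "")).encode()
        if not hmac.compare_digest(_tag(key, body).encode(), claimed):
            return False  # forged or altered message
        seq = msg.get("seq")
        if not isinstance(seq, int) or seq <= self.last_seq[rid]:
            return False  # replayed or out-of-order message
        self.last_seq[rid] = seq
        self.state[rid] = msg.get("state")
        return True
```

A resource that was never enrolled, a message signed with the wrong key, and a replay of an old “available” message are all rejected here. The sketch deliberately leaves out the harder part, which is how keys are issued and protected during enrollment.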
And it gets worse. If we have centralized network processes, we can hack them and attack them with denial of service. We can spoof them too. One little change in a forwarding table can send packets down a totally different path. One fork in the table can monitor everything we do. A virtualized network process can be the firewall logic we just offloaded from our gadget, or it could be the firewall logic planted by a hacker.
It’s not that the cloud, or SDN, or NFV creates a problem as much as the combination of those things exposes problems we’ve had all along. There are some fundamental truths about the network of the future that we should be able to see today, and truth number one is you cannot have user traffic sharing a network with service signaling. “The Internet” is a user network so the Internet can’t be how we control services, intra-service (between components) or inter-service (between services, or between services and users).
Truth number two of the new fused IT/network world is that telemetry from resource/network monitoring has to be managed in volume and also separated from user networks. Operators are already saying that they face applications where monitoring traffic exceeds service traffic. You can’t have that if you want to control costs, but you also can’t have that if you want to have available and secure networks. Any monitoring problem in a pooled-resource, centrally controlled environment has the effect of telling a lie to the controller, and your cloud can’t be built on a web of resource lies.
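One way to picture “managed in volume” is a local collector that reduces raw samples to a compact summary before anything is sent toward the controller. The sketch below assumes a hypothetical sample format of `(resource_id, metric, value)` tuples and a hypothetical `summarize` helper; it shows only the reduction step, not the separate signaling network the summaries would travel on.

```python
from collections import defaultdict
from statistics import mean

def summarize(samples):
    """Collapse raw samples into one summary per (resource_id, metric) pair."""
    grouped = defaultdict(list)
    for resource_id, metric, value in samples:
        grouped[(resource_id, metric)].append(value)
    return {
        key: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "mean": mean(values),
        }
        for key, values in grouped.items()
    }
```

However many raw samples arrive in a reporting interval, the controller sees one record per resource and metric, so monitoring traffic grows with the number of resources rather than with the sampling rate.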
Truth number three is that you can’t add security to an insecure environment. We are focused too much today on “making the network secure” by adding components that are designed to identify and block bad things. We need to focus on networks that can be driven to do good things as their standard behavior. One thing that means for things like the cloud or SDN or NFV is that there is no such thing as inherited trust. Any contributor to anything has to be validated by the federating process that joins them to the community.
You can draw a picture of where this has to go, which is the notion of a network at two layers, the structure I’ve called “Cloudnet”. With Cloudnet every user accesses the cloud through agent processes that the user can explicitly trust. These agent processes in turn trust resources and make the connection. There is no access without a trust transfer, and the process of mediating resources happens inside the inner cloud layer where it’s never accessible to user traffic. SDN central control happens inside here, and so does cloud resource commissioning and network functions hosting.
There’s still more, I think. All the furor over how the cloud or SDN is going to demand all these network probes collecting intelligence for delivery to a central point creates a commercial problem with profitability, even if we can solve the security issue of spoofed monitoring points leading to bad control decisions. With SDN, for example, we could create domains of manageable size, each controlled by a central element and presenting itself as a virtual device to a higher-level collecting domain, in much the way that internal routing areas/domains can be aggregated into ASs with border gateways. We need to be looking, in our review of SDNi (SDN interconnect), at how we communicate state/status between the domains and how we make global routing decisions by aggregating local ones. There’s been some work done on this in the past in protocols as old as SNA and ATM, and since we’re now looking at a future where we re-create some of the problems of the past, we should look at how they were solved and adapt the strategies.
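The domain idea can be sketched in a few lines. In this illustration, which is an assumption of mine and not anything defined by SDNi, each domain controller runs a shortest-path computation over its own topology and exports only border-to-border costs, so the domain looks like a single virtual device. A higher-level controller then routes over those summaries plus the inter-domain links. The function names and the dictionary-based graph format are hypothetical.

```python
import heapq

def dijkstra(graph, source):
    """Return the lowest cost from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return dist

def abstract_domain(graph, borders):
    """Summarize a domain as border-to-border costs, hiding internal nodes."""
    summary = {}
    for b in borders:
        dist = dijkstra(graph, b)
        summary[b] = {o: dist[o] for o in borders if o != b and o in dist}
    return summary

def global_graph(summaries, inter_domain_links):
    """Merge domain summaries with the links that join the domains."""
    merged = {}
    for summary in summaries:
        for node, edges in summary.items():
            merged.setdefault(node, {}).update(edges)
    for a, b, cost in inter_domain_links:
        merged.setdefault(a, {})[b] = cost
        merged.setdefault(b, {})[a] = cost
    return merged

# Two small domains; a2 is internal to domain A and never leaves it.
domain_a = {"a1": {"a2": 1}, "a2": {"a1": 1, "a3": 2}, "a3": {"a2": 2}}
domain_b = {"b1": {"b2": 4}, "b2": {"b1": 4}}

top = global_graph(
    [abstract_domain(domain_a, ["a1", "a3"]),
     abstract_domain(domain_b, ["b1", "b2"])],
    [("a3", "b1", 1)],
)
end_to_end = dijkstra(top, "a1")["b2"]  # global decision built from local ones
```

The higher-level controller never sees node `a2`; it sees only that crossing domain A from `a1` to `a3` costs 3. That is the trade: less state and less telemetry crossing domain boundaries, in exchange for the top level relying on each domain's summary being honest and current.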
We are never going to realize the cloud’s potential, or SDN’s potential, or NFV’s potential, if we create a future infrastructure that can’t be commercially viable, available, and secure. We can’t assure that with bottom-up design, and so we need to be looking to vendors and standards bodies to address these high-level concerns, to be sure that our steps to the future don’t take us over a cliff along the way.