We’re hearing again about the goal of applying OpenFlow to manage optical networks, and the interest surely reflects the value that “converging” network layers might offer network operators. I’ve commented before on the fact that a packet-match-and-forward architecture like OpenFlow is hardly suited to applications where packet examination isn’t possible because the data is contained inside an opaque envelope. Could you make it visible? Sure, but it would radically increase the cost of the equipment, and the question is whether there’s any value there to justify that.
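To make that concrete, here’s a minimal sketch in plain Python (with made-up field names, not real OpenFlow structures) of why match-and-forward presupposes visible headers; an optical wavelength simply doesn’t expose anything for a rule to match on.

```python
# A toy illustration, not any vendor's API: match-and-forward needs parseable headers.

def match_and_forward(frame, flow_table):
    """Forward a frame by matching its visible header fields against flow rules."""
    headers = frame.get("headers")  # packet devices can parse these
    if headers is None:
        # An optical lambda is an opaque envelope: no fields, nothing to match.
        raise ValueError("no visible headers; match-based forwarding can't apply")
    for rule in flow_table:
        if all(headers.get(k) == v for k, v in rule["match"].items()):
            return rule["action"]
    return "drop"

flow_table = [{"match": {"dst_ip": "10.0.0.5"}, "action": "port_3"}]
packet = {"headers": {"dst_ip": "10.0.0.5"}}
lambda_envelope = {"headers": None}  # optics carries bits, not parseable fields

print(match_and_forward(packet, flow_table))      # -> "port_3"
# match_and_forward(lambda_envelope, flow_table)  # -> raises: nothing to match on
```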
If you look at the network of the future as a circus act, then the application network layer is the clowns, running here and there in magnificent apparent disorder. It’s highly agile, which means you need very low-inertia processes to manage it. But as you go down, you move through the jugglers and the people on stilts, and you eventually end up with the elephants. They’re powerful, magnificent, but not particularly fast-moving. So it is with networks. Move from the top toward the physical layer and you lose the need for a lot of agility. Yes, “agile optics” has to be agile in optical terms, but hardly as agile as application networks that have to respond to every connectivity change.
Another factor is that lower-layer facilities almost always carry aggregations of those at the higher layer. A virtual switch network is aggregated onto Ethernet and then downward onto optics. At each layer the traffic joins similar traffic from other sources to create efficient trunking. By the time you get to the bottom where optics lives, you have a collection of many different network missions on the same trunk. So which of these missions controls the trunk itself? The only answer can be “none of the above”.
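A toy Python sketch of that aggregation chain (the names are hypothetical, of course) makes the ownership problem obvious:

```python
# A toy illustration (hypothetical names, not a real protocol stack) of how
# higher-layer traffic aggregates downward until many missions share one trunk.

vswitch_flows = {
    "app-A": ["flowA1", "flowA2"],
    "app-B": ["flowB1"],
    "app-C": ["flowC1", "flowC2", "flowC3"],
}

# Each layer merges traffic from the layer above into fewer, fatter pipes.
ethernet_segments = {"segment-1": ["app-A", "app-B"], "segment-2": ["app-C"]}
optical_trunks = {"lambda-1": ["segment-1", "segment-2"]}

# The trunk carries every mission at once, so no single application's
# controller can reasonably claim the right to reconfigure it.
missions_on_trunk = [
    app
    for segment in optical_trunks["lambda-1"]
    for app in ethernet_segments[segment]
]
total_flows = sum(len(vswitch_flows[app]) for app in missions_on_trunk)
print(f"{len(missions_on_trunk)} missions, {total_flows} flows share one lambda")
# -> 3 missions, 6 flows share one lambda: "none of the above" owns the trunk
```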
Lower-layer flows and paths have to be managed for the collective good, which means that what happens there becomes less a matter of application connectivity and more one of traffic engineering. Logically you’d want to establish grades of service at lower layers and traffic-manage to meet their SLAs. The higher layers would consume those grades of service, and changes in allocation at the top would impact the policies at the bottom only to the extent that changes in load might impact QoS. If that happens, it happens collectively and not by application.
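Here’s a rough sketch of what I mean, with illustrative class names and thresholds of my own invention: the lower layer exposes grades of service and reacts only to aggregate load against each grade’s SLA, not to any individual application.

```python
# A hedged sketch of the grade-of-service idea: higher layers consume a grade,
# and the lower layer acts only when the aggregate load threatens the SLA.
# The class, thresholds, and numbers are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class GradeOfService:
    name: str
    sla_max_utilization: float   # e.g. keep a "gold" trunk under 70% utilization
    capacity_gbps: float
    offered_load_gbps: float = 0.0

    def admit(self, load_gbps: float) -> None:
        """A higher-layer network adds load; which application it is doesn't matter."""
        self.offered_load_gbps += load_gbps

    def needs_reengineering(self) -> bool:
        """The lower layer reacts only to the aggregate, never per application."""
        return self.offered_load_gbps / self.capacity_gbps > self.sla_max_utilization

gold = GradeOfService("gold", sla_max_utilization=0.7, capacity_gbps=100.0)
gold.admit(40.0)   # application networks come and go...
gold.admit(35.0)   # ...the lower layer only ever sees the aggregate
if gold.needs_reengineering():
    print("adjust trunk capacity or routing policy")  # collective action, not per-app
```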
The reason this is important is that SDN principles of central control, if applied to lower network layers, would necessarily have a different mission than when applied to higher layers. Do we want to manage traffic in an OpenFlow network by changing all the forwarding rules around? I doubt it. It’s looking more and more like there’s going to be a fairly agile top-of-network connectivity service and a more efficiency-driven bottom. That suggests that, far from collapsing everything into a single layer (which would force us to address all the issues there), we might actually find multiple layers valuable, because the lower layers could be managed differently, based on aggregate traffic policies.
Others have pointed out that the application of SDN principles to networks might be easier at the virtual layer, in the form of an overlay virtual network. Since this layer wouldn’t be visible to or managed by the real network equipment, you couldn’t do traffic engineering there anyway. The question, then, is whether we have created a two-layer virtual-network model, where the top layer is a software virtual network of vSwitches and the bottom layer is a big traffic-management pool whose internal structure is disconnected from the connectivity of the layer above it (by reason of OSI necessity), and which isn’t providing connections at all but policy-managed routes.
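If that’s the model, a speculative sketch of the two layers might look like the following, where the overlay makes all the connectivity decisions and the underlay exposes nothing but grades of service (the classes and names are mine, not anyone’s product):

```python
# A speculative sketch of the two-layer model: an overlay of vSwitch tunnels on
# top, an underlay that offers only policy-managed grades of service underneath.

class Underlay:
    """Traffic-management pool: exposes grades of service, hides its routes."""
    def __init__(self, grades):
        self.grades = set(grades)

    def carry(self, tunnel_id: str, grade: str) -> None:
        if grade not in self.grades:
            raise ValueError(f"unknown grade of service: {grade}")
        # Internally the underlay picks paths by policy; the overlay never sees them.
        print(f"tunnel {tunnel_id} mapped onto grade '{grade}'")

class Overlay:
    """Software virtual network of vSwitches; all connectivity decisions live here."""
    def __init__(self, underlay: Underlay):
        self.underlay = underlay
        self.tunnels = {}

    def connect(self, src_vm: str, dst_vm: str, grade: str = "best-effort") -> None:
        tunnel_id = f"{src_vm}->{dst_vm}"
        self.tunnels[tunnel_id] = grade
        self.underlay.carry(tunnel_id, grade)  # consumes a grade, not a route

net = Overlay(Underlay(grades=["best-effort", "gold"]))
net.connect("vm-web", "vm-db", grade="gold")
```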
This raises a major question on SDN design, IMHO. First, it reinforces my view that we really don’t need to control lower layers with SDN technology. Second, it casts the role of OpenFlow into a smaller opportunity space. If all the agility is at the software virtual network layer, we could still manage forwarding and connectivity there, but would we? If we can’t control network routing, only vSwitch routing, do we need to control anything at all? I think there’s still a very strong argument for OpenFlow SDN utility in the data center, but this whole argument calls its end-to-end value more into question. I’m not ready to throw out OpenFlow and SDN principles, but I am more and more convinced that we’re making OpenFlow and SDN a goal and not a route. We don’t have to do this; it will be done to the extent that it can offer tangible network benefits. Just making it possible to do more with OpenFlow or SDN doesn’t make it justifiable. We need to look at those justifications harder before we start writing standards that assume we’ve made the business case and we’re arguing over the details.
This is the problem with a bottom-up approach in a hype-driven market. If you say “OpenFlow” or “SDN” to the press you get a nice positioning in the article they write about you. Thus, you say it, whether the association is valuable or not, or even whether you mean it or not. That sort of thing can distort a market, but the cardinal sin here isn’t that distortion, but the fact that we’re still assuming our objective is to consume technology. Maybe that works for yuppies with iPhones, but not in data center or carrier networks. Being cool isn’t enough there; you have to be valuable.