Over a year ago, I wrote a blog for Network World called “Why it makes sense for Broadcom to buy VMware”. I postulated that VMware might combine with Broadcom-based data center switches to build a bridge between cloud and data center. Suppose, just suppose, they did. What might that mean for the cloud? A lot, but I want to do another layer of supposing here. Just suppose there’s more to it than that; suppose a bigger and more important bridge is on the table. Suppose it’s the future of the telco network. A recent conversation with a Tier One technology lead helped me organize my thoughts on this.
What we have today is a world of telcos and OTTs, of features and connections. The bottom level, the world of telcos and connections, is based on routers and fiber and other long-lived assets, on standards created over decades, all owned and deployed by companies that used to be public utilities and still have more than a whiff of that mission in their DNA. Failures here are not an option, and even major changes could demand writing off billions in gear. The top level is built by startups and venture capitalists. The OTTs, the users, the features, are based on technology that is really software, often open-source, and things can be created, funded, and deployed in six months or so. If it doesn’t work, people walk away and try something different. Two different worlds doesn’t cover it. Try two different universes…or maybe dimensions.
What links these universes or dimensions is the reality of IP networking. Whether we’re talking about the Internet or a VPN or a combination of the two, or perhaps something living inside the cloud and invisible, it’s still an IP network. The services created by the stuff at the bottom are IP services, consumed by the stuff that’s over the top. The OTTs assume IP, and anything other than IP can’t even be consumed, so IP is what the telcos have to generate. It’s also what’s been commoditizing, generating ever-declining profit per bit.
The obvious response to a profit-per-bit problem is to try to sell something other than naked bits, but what? In our current world, what isn’t in the network is on it. What’s on it is content, cloud, and stuff that, first, isn’t what telcos know about and, second, is already being provided by a bunch of other players whose names are household words. Not a great space to try to break into. Anything new in the on-the-network space would be at least as available for exploitation by those OTTs and cloud providers. What may be needed is a new vision of the network, one that would allow a more software-intensive and even software-centric view of what’s in the network, and so is the rightful domain of the telcos. However, any such initiative would generate a significant transformation cost, and that would demand a fairly certain and rapid payback.
The logical way for telcos to approach this would be to identify a driver application set that would represent a major future opportunity and then frame a service model to address it with as much agility as possible. As I said in an earlier blog, I think that the next step in services is related to integrating them with real-time user/worker behavior, an IoT or digital-twin approach that would resemble the industrial/social metaverse model. To support that, the features would have to be highly scalable and distributable, making it likely they’d be like functions or microservices and would also likely be linked in a service mesh. That could mean that a “service-mesh service” would be attractive.
A service mesh is a logical connection web created to link dynamic software elements. Istio, which Google helped create, is the best-known, but there are other options out there. The typical service mesh uses a sidecar software element that’s a proxy for the network and that provides connection management and (often) load balancing. All this is built, of course, on an IP network, and there’s a necessary mapping between IP addresses and logical component identities.
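To make the sidecar idea concrete, here’s a minimal sketch in Python using only the standard library. The service name, addresses, and ports are invented for illustration; a real sidecar (Envoy, in Istio’s case) adds TLS, retries, telemetry, and dynamic service discovery.

```python
# A toy sidecar: it sits beside the application, accepts its outbound
# connections, and forwards them to a backend chosen from a registry.
import itertools
import socket
import threading

# Logical component identity -> IP endpoints: the mapping the mesh maintains.
REGISTRY = {
    "orders-service": [("10.0.1.11", 8080), ("10.0.1.12", 8080)],
}
_round_robin = {name: itertools.cycle(eps) for name, eps in REGISTRY.items()}

def _pump(src, dst):
    """Copy bytes one way until either side closes."""
    try:
        while data := src.recv(4096):
            dst.sendall(data)
    except OSError:
        pass          # the other direction closed first; that's fine
    finally:
        src.close()
        dst.close()

def _handle(client, service_name):
    # Connection management plus simple load balancing: next backend, round-robin.
    backend = socket.create_connection(next(_round_robin[service_name]))
    threading.Thread(target=_pump, args=(client, backend), daemon=True).start()
    threading.Thread(target=_pump, args=(backend, client), daemon=True).start()

def sidecar(listen_port, service_name):
    """Listen beside the app; the app talks to localhost, the mesh does the rest."""
    with socket.create_server(("127.0.0.1", listen_port)) as srv:
        while True:
            client, _ = srv.accept()
            _handle(client, service_name)
```

The thing to notice is the registry: the mesh’s whole job, stripped down, is maintaining and applying that mapping between logical identity and IP endpoints.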
But here’s the interesting thing. Broadcom and other switching chip vendors build chips that are essentially flow machines, programmable to move any identifiable set of packets based on any defined properties. You could flow-program a chip to move packets with no IP addresses or IP network features at all. So, suppose you created a software-process-centric identifier in place of an IP address. Suppose your orchestration tool, Kubernetes for example, generated the necessary identifier as a part of deployment, and suppose it could then (remember that Kubernetes supports virtual-network plugins) communicate the necessary flow programming to the specific devices that made up the mesh. What you’d be doing is replacing the traditional IP control plane with a control plane based on software deployment. If your goal is to connect software components, that’s very logical, and it has potentially significant benefits.
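Here’s a purely speculative sketch of that idea in Python. Every name in it (FlowRule, Switch.program, the 64-bit identifier, the deploy-time path) is my invention for illustration; no existing Kubernetes plugin or Broadcom API is implied.

```python
# Speculative: deployment mints a software-process identifier instead of an
# IP address, and the orchestrator pushes matching flow rules straight to
# the mesh switches as part of placing the software.
import secrets
from dataclasses import dataclass

@dataclass(frozen=True)
class FlowRule:
    match_id: int      # software-process identifier carried in the packet
    out_port: int      # switch port toward the component instance

class Switch:
    """Stand-in for a flow-programmable white-box switch."""
    def __init__(self, name):
        self.name, self.table = name, []
    def program(self, rule: FlowRule):
        self.table.append(rule)   # real hardware would update its flow tables

def deploy(component_name, path):
    """Deploy a component; programming the mesh is a side effect of placement.

    `path` is the list of (switch, egress_port) hops the orchestrator chose
    when it placed the component.
    """
    process_id = secrets.randbits(64)     # identity minted at deployment,
    for switch, port in path:             # not drawn from an IP address plan
        switch.program(FlowRule(process_id, port))
    print(f"{component_name} reachable as flow {process_id:#x}")
    return process_id

# Usage: placing "inventory" behind a spine and leaf hop programs both
# switches in the same step that deploys the software.
spine, leaf = Switch("spine-1"), Switch("leaf-3")
deploy("inventory", [(spine, 7), (leaf, 2)])
```

The point is where the identity comes from: deployment mints it, and programming the network becomes a side effect of placing the software rather than a separate process to coordinate.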
The first benefit is that you eliminate a classic example of the dog/tail paradox. The goal of a mesh is to connect software, but today the process of placing software and the process of connecting it are disconnected. What you really want is the software dog to wag the network tail, which in the current setup requires complicated coordination that takes time and resources. A direct software-centric connection control process would eliminate all of that.
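In sketch form, with the event shape assumed (it’s loosely modeled on the watch pattern Kubernetes controllers use), the dog wagging the tail looks like this: placement events drive connectivity directly, with no separate registration-and-discovery round to coordinate.

```python
# Placement drives connection: one loop, one step, no coordination layer.
import queue

placement_events = queue.Queue()

class Network:
    """Stand-in for the flow-programmable fabric from the earlier sketch."""
    def connect(self, component, node):
        print(f"flow rules installed: {component} @ {node}")

def control_loop(network):
    while (event := placement_events.get()) is not None:
        # Placing the software *is* connecting it; nothing to keep in sync.
        network.connect(event["component"], event["node"])

placement_events.put({"component": "orders", "node": "rack2-host5"})
placement_events.put(None)  # sentinel to end the demonstration
control_loop(Network())
```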
This approach would work within a cloud/cluster configuration, with white-box devices based on switching chips, and a control-plane link to Kubernetes. It would work within and between racks, and even between data centers. It would also work in mobile networks, substituting the 5G control plane for the Kubernetes control plane, or using Kubernetes (supplemented perhaps by Nephio) to deploy the 5G elements.
The second benefit is the reduction in complexity and latency. Where something is, what it is, and how to reach it are typically different things in a mesh of components, each requiring its own decoding step and all of them having to be associated to create a unified structure. In a flow-centric approach, everything is flattened into a single structure that’s automatically consistent.
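The flattening claim is easy to illustrate. Today those three questions are answered by three separate systems that must be kept in sync; in the flow-centric model they collapse into one record written at deployment time. The structures below are illustrative only:

```python
# Today: three lookups, three systems, three chances to be stale or
# inconsistent with one another.
dns      = {"orders.svc": "10.0.1.11"}          # what -> where (an IP)
registry = {"orders.svc": {"version": "v7"}}    # what -> what, exactly
routing  = {"10.0.1.0/24": "leaf-3"}            # where -> how to reach it

# Flow-centric: one record, keyed by the deployment-minted identifier,
# consistent by construction because a single process wrote it.
flow_table = {
    0x5A17: {"component": "orders", "version": "v7", "egress": "leaf-3:2"},
}
```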
One argument that would almost surely be raised against this is that the things the network connects, whether software or CPE, would expect to see an IP interface. But all that means is that the expected interface would have to be emulated at the points of attachment where IP was expected. The same would be true for IP services like DNS or DHCP.
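A sketch of what that emulation point might look like, again with everything hypothetical: the attached software keeps calling what looks like DNS, and the shim hands back a synthetic address (drawn from 198.18.0.0/15, a range reserved for benchmarking, so it won’t collide with real hosts) that maps to a flow identifier underneath.

```python
# Emulating the IP/DNS facade at the point of attachment. The tables and
# the flow-attachment step are assumptions, not any real mesh's API.
import ipaddress

name_to_flow = {"orders.svc": 0x5A17}        # filled in at deployment time
synthetic_ip_to_flow = {}
_pool = ipaddress.ip_network("198.18.0.0/15").hosts()

def emulated_dns_lookup(name):
    """Answer like DNS would, but the 'address' is just a flow handle."""
    ip = str(next(_pool))
    synthetic_ip_to_flow[ip] = name_to_flow[name]
    return ip

def emulated_connect(ip, port):
    flow_id = synthetic_ip_to_flow[ip]
    # ...here the shim would attach the caller's socket to that flow...
    return f"attached to flow {flow_id:#x} (port {port} preserved)"

print(emulated_connect(emulated_dns_lookup("orders.svc"), 8080))
```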
So what does this have to do with Broadcom and VMware? Obviously Broadcom is a giant in the switching chip space, and their silicon is the basis of much of (and IMHO the best of) the white-box device market. If anyone could handle the flow programming needed here, they’d be the ones. VMware is a giant in the virtualization space, one of the major providers (if not the major provider) of enterprise software tools that support things like microservice deployments. If anyone could create a flow-control-plane adjunct to Kubernetes, they’d be the ones. Put the two together and you have a potentially game-changing player.
Potential isn’t realization, though. First, there are still questions on whether the deal with VMware will go through by the deadline; China has been holding it up. Second, there’s no guarantee that Broadcom sees things as I’ve described, and finally there’s no certainty that even if they do, they’ll be able to execute. The whole concept, and perhaps the opportunity for telcos to re-imagine their future, could be at risk if any of those questions aren’t answered properly. That’s because it’s hard to see how any other player could assemble enough of the pieces to be assured a big win if they take the risk of promoting the idea. It may literally depend on Broadcom.