Amazon’s New Model for Data Center Switching

All the talk about the need to upgrade data center networks, like most talk these days, seems focused on AI. That has just changed with what might be a very important announcement from Amazon, one that describes a major potential change in data center architecture and isn’t linked to AI at all.

Traditional data center LANs have been built as a tree, with layers of switches, each connected by multiple links to the switches at the next layer down. This provides a resilient way of creating what’s effectively a mesh, but it imposes a growing latency and cost burden as the size of the data center increases. A long-standing result in graph theory says that the best approach would be to create a “flat” network in which the switches connect among themselves at random. Such a network requires fewer switches and presents a nice, predictable, linear relationship between switch failures and network capacity loss. The problem is that attempts to realize this happy outcome have been unsuccessful.

Amazon came up with an approach that mixes true randomness and deterministic behavior with what it calls “spraypoint”. In this approach, a source switch picks a neighbor at random and sends a packet to it. That’s the random part. The receiving neighbor uses traditional shortest-path rules to send the packet onward, which is the deterministic part. The connectivity is structured in “rings” connected by “ShuffleBoxes”, which on one side connect to switches and on the other to other ShuffleBoxes. This means that when a new server or rack is added, you simply connect it to the local ShuffleBox, and no other cabling is needed. The design of the rings and ShuffleBoxes is based on a model Amazon has built through simulation, so the data center operator can input the number of servers and the performance required, and the result is a ring/ShuffleBox configuration.
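To make the random-then-deterministic idea concrete, here is a minimal Python sketch. It is not Amazon’s implementation. The topology (a ring plus random extra links), the function names, and the use of breadth-first search for the shortest-path step are all my own assumptions for illustration, and the ShuffleBox layer is not modeled at all.

```python
import random
from collections import deque

def build_ring_with_chords(n, extra_links, rng):
    """Hypothetical flat topology: a ring (so the graph is connected)
    plus randomly placed extra links between switches."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        j = (i + 1) % n
        adj[i].add(j)
        adj[j].add(i)
    added = 0
    while added < extra_links:
        a, b = rng.sample(range(n), 2)
        if b not in adj[a]:
            adj[a].add(b)
            adj[b].add(a)
            added += 1
    return adj

def shortest_path(adj, src, dst):
    """Breadth-first search; returns the switches from src to dst inclusive."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in sorted(adj[node]):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def spray_route(adj, src, dst, rng):
    """Random first hop (the 'spray'), then a deterministic shortest path.
    Assumption: this sketch does not stop the path from passing back
    through src, which a real design would have to consider."""
    if src == dst:
        return [src]
    first = rng.choice(sorted(adj[src]))
    return [src] + shortest_path(adj, first, dst)

if __name__ == "__main__":
    rng = random.Random(0)
    adj = build_ring_with_chords(16, 16, rng)
    print(spray_route(adj, 0, 9, rng))
```

Since the random neighbor is one link away from the source, the sprayed route in this sketch is never more than two hops longer than the direct shortest path. That is the price paid for spreading traffic across the source’s links.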

I’ll mention this again, because it’s important, but this is not an actual fabric. The ring-and-shuffle model means that the traffic will still take hops, and the number of rings and shuffles will influence latency. A true fabric would deliver better latency performance, but of course many “fabric” switches aren’t really any-to-any non-blocking. Just keep this in mind.

Amazon started proving this out at the end of 2024 in one data center, and in April of this year it was adopted as the default architecture for all new AWS data centers. It reduces cabling complexity, operational errors during updates to the data center, failures, and the number of switches needed versus the tree-hierarchy approach. That last point, of course, may be why a data center user like Amazon came up with this rather than a network equipment vendor.

I think Amazon’s move is a proof point for something I’ve said in a past blog: data center traffic is driven by more than just AI, and in fact “horizontalization” of application component traffic may be the greater driver for most users. It also, I believe, demonstrates that it’s inside the data center where AI models are hosted that the greatest network impact of AI is likely to be found, at least for the moment. The new strategy seems to answer some network and traffic questions, but not all of them.

First, it appears that this approach could be used for traditional and AI data centers, as long as you understand the traffic loads the network has to carry. That’s something that’s possible through simulation but easier for those who already have a tree hierarchy in place supporting an application/server mix, and want to expand or improve it. Some of the enterprises who mentioned this approach to me had concerns that the dynamism of application configuration and usage might drive changes that would impact the design, but admit that’s true for any data center network model.

Second, some enterprises wonder whether the Amazon model might also reduce the number of servers needed to meet QoE objectives, by reducing horizontal latency. Would the traditional approach to a scaling problem perhaps involve adding servers to reduce response times? Amazon has not commented on this so far, but if they’d like to do so on my LinkedIn post or to me via email, I’m all ears.

Third, given the relentless focus of network vendors on AI traffic, could Amazon’s approach help or hurt vendors? It does look like you could buy less network gear with this model, and since data center switches are increasingly a target for revenue-hopeful vendors, might this derail some confidence in predicting sales growth due to AI? It’s clear that Amazon is already aiming to reduce its switch spending, and I hear both Google and Microsoft are doing the same.

Fourth, does this all mean that vendors who offer both servers/platforms and network switches will have an easier time? If there is a move to address AI or other horizontal traffic growth by remodeling networks to reduce switch count, the vendor who can also supply other gear like the servers generating the traffic will have greater influence and incentives than the one who has only network gear in their inventory. This could help HPE/Juniper, and potentially hurt Cisco, whose server business isn’t all that active. Or, perhaps, make Cisco get more server-aggressive?

Fifth, is this a full and best-of-the-lot solution? There are still hops, still potential variations in latency, versus a true any-to-any fabric. There also appears to be a risk that improper setup or careless changes in the configuration, including adding and removing things or even accidental unplugging of a cable, might set up a chain reaction. Still, as I’ve said in many customer tutorials in the past, “There’s no substitute for knowing what you’re doing”. It’s just that this is a different sort of “doing” than most will be used to. For large-scale data centers that want to avoid full-fabric costs or that simply can’t get a true fabric with enough capacity, this seems to be a great idea. For smaller ones, I think it may offer minimal advantage over traditional layered switch models, and for AI model hosting that spreads across servers, the latency could be more of an issue.

The final, and perhaps most important, question is whether the Amazon move suggests we’re extending the more-for-less thinking that’s dominated enterprise IT planning for the last couple of decades. Enterprises have already shifted to the cost-savings model of planning; are the hyperscalers now doing the same? Could it be that AI hype is coming home to a lot of roosts?

Wall Street believes there’s a major risk that AI is a bubble. I’m of the view that it’s a PR-and-media bubble, and that it may well be approaching being over-invested based on current opportunities for revenue. If that’s true, one of the responses could be to try to cut costs without dissing the AI hype that’s keeping so many stocks afloat. A better response would be to work to find future opportunities for revenue, like those in real-world-real-time services.
