Thursday, October 2, 2014

Is it the end of the TOR as we know it?

There are two driving forces that may break the TOR (Top of Rack) switch market.

The first one is the spiraling fall of switch prices. Competition among white box vendors like Cumulus and Pica8, incumbents such as Juniper and Cisco, and newer entrants like Arista is pushing margins ever slimmer. Pretty much all the vendors sell the exact same TOR switch: currently a 40x10GbE port switch with 4x40GbE uplinks built on a Broadcom chipset. The only real difference is the operating system. All this competition basically means a race to the bottom. I see a few outcomes for this market.

If you’re an enterprise like Facebook or Google, you have the technical savvy and economies of scale to OEM just the physical hardware components and put a stripped-down version of Linux on them that does exactly what you need. Facebook already has a blueprint for one, called the Wedge. This means none of the vendors can sell to these customers.

Next you have customers that are DevOps savvy but don’t want to build their own switches. These can be startups or large companies with a good engineering team. They probably buy a TOR switch such as Pica8 or Cumulus and automate on top of these boxes, along the lines of the sketch below. They don’t need much support.
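
To make “automating on top of the box” concrete, here’s a minimal Python sketch that pushes an interface stanza to a Linux-based switch over SSH. The hostname, credentials, and the Cumulus-style workflow (editing /etc/network/interfaces, then running ifreload) are my own assumptions for illustration, not details from any vendor’s documented API.

```python
import paramiko

# Hypothetical switch details -- replace with your own inventory.
SWITCH_HOST = "tor-sw01.example.net"
SWITCH_USER = "cumulus"

# A Cumulus-Linux-style interface stanza (assumption: the switch
# manages ports through /etc/network/interfaces).
INTERFACES_SNIPPET = """\
auto swp1
iface swp1
    address 10.0.0.1/31
"""

def push_config(host: str, user: str, password: str) -> None:
    """SSH in, append the interface stanza, and reload networking."""
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=user, password=password)
    try:
        cmds = [
            # Append the stanza, then apply it without a reboot.
            f"echo '{INTERFACES_SNIPPET}' | sudo tee -a /etc/network/interfaces",
            "sudo ifreload -a",
        ]
        for cmd in cmds:
            _, stdout, stderr = client.exec_command(cmd)
            if stdout.channel.recv_exit_status() != 0:
                raise RuntimeError(stderr.read().decode())
    finally:
        client.close()

if __name__ == "__main__":
    push_config(SWITCH_HOST, SWITCH_USER, password="changeme")
```

Loop this over a rack’s worth of switches from an inventory file and you have the kind of lightweight automation that makes vendor hand-holding unnecessary.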

Last you have customers who just need support, so they buy a brand-name switch. These can be large enterprises that are not technical enough to program their switches. These are the Coca-Colas and traditional companies where technology is not in their wheelhouse. I mean, does Coca-Cola need SDN?

The second driving force is on the physical hardware side. Intel has announced its roadmap for Rack Scale Architecture (RSA). Intel wants to dominate cloud computing infrastructure by giving mega-scale data centers a highly scalable building block. To understand what this means, we have to look at the trends in the current data center.

Currently, in a typical rack you have several servers, each usually kept in a server tray or server blade. If you break a server down, it has the following components: a CPU, memory, storage (HDD or SSD), and a network card. You might have some other peripherals, but you don’t really need them. All of these components interconnect over PCIe, and that PCIe connection is the key component.
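
You can see this for yourself on any Linux server. Here’s a small Python sketch (my own illustration, nothing Intel-specific) that walks /sys/bus/pci/devices and buckets devices by PCI class code, showing that the storage controllers and NICs all hang off the same PCIe tree:

```python
from pathlib import Path

# Standard PCI class codes (the top byte of the 24-bit class value).
PCI_CLASSES = {
    0x01: "storage controller",
    0x02: "network controller",
    0x03: "display controller",
    0x06: "bridge",
    0x0C: "serial bus controller",
}

def list_pci_devices(sysfs: str = "/sys/bus/pci/devices") -> None:
    """Print each PCIe device address with a human-readable class."""
    for dev in sorted(Path(sysfs).iterdir()):
        # 'class' holds a hex string like 0x020000; the top byte
        # identifies the device category.
        class_code = int((dev / "class").read_text(), 16) >> 16
        kind = PCI_CLASSES.get(class_code, f"class 0x{class_code:02x}")
        print(f"{dev.name}: {kind}")

if __name__ == "__main__":
    list_pci_devices()
```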

If you disaggregate servers, you can create pools of resources: a pool or tray of CPUs, a pool of memory, and a pool of storage. When you need to upgrade, you pull out a tray and swap in a new tray of resources. To connect all of this together, Intel created silicon photonics (SiPh) links carrying PCIe, with speeds of 40-50 Gbps over a single optical fiber. If you think about it, the majority of traffic in a data center runs east-west. With SiPh you can move that intra-rack traffic at 50 Gbps.
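
A toy model helps show what “composing” a logical server out of pooled trays might look like. Everything here (class names, tray sizes, the compose function) is my own sketch of the concept, not Intel’s actual RSA API:

```python
from dataclasses import dataclass

@dataclass
class ResourcePool:
    """A tray of one resource type, e.g. 512 GB of pooled memory."""
    kind: str
    capacity: int          # units: cores, GB, etc.
    allocated: int = 0

    def take(self, amount: int) -> int:
        if self.allocated + amount > self.capacity:
            raise RuntimeError(f"{self.kind} pool exhausted")
        self.allocated += amount
        return amount

@dataclass
class LogicalServer:
    """A server 'composed' from rack-level pools instead of a fixed box."""
    cores: int
    mem_gb: int
    disk_gb: int

def compose(cpu: ResourcePool, mem: ResourcePool, disk: ResourcePool,
            cores: int, mem_gb: int, disk_gb: int) -> LogicalServer:
    # Each resource comes from its own tray; upgrading one tray
    # never requires touching the others.
    return LogicalServer(cpu.take(cores), mem.take(mem_gb), disk.take(disk_gb))

if __name__ == "__main__":
    cpu_tray = ResourcePool("cpu", capacity=64)        # 64 pooled cores
    mem_tray = ResourcePool("memory", capacity=512)    # 512 GB pooled RAM
    ssd_tray = ResourcePool("storage", capacity=8000)  # 8 TB pooled flash
    vm = compose(cpu_tray, mem_tray, ssd_tray, cores=8, mem_gb=64, disk_gb=500)
    print(vm)
```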

You don’t need a TOR switch, because you can do the switching with a virtual switch and SiPh. You can bypass Ethernet entirely by connecting directly to another logical server over the PCIe connection. Why waste time going out an Ethernet port, up to a ToR switch, and back down to another NIC to reach a server when you can connect directly? You can shave even more nanoseconds off the network path.
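
Here’s the back-of-envelope version of that argument in Python. All the latency numbers are illustrative assumptions I’m plugging in to show the shape of the math, not measurements from RSA hardware:

```python
# Rough comparison of the two intra-rack paths.
# All numbers are illustrative assumptions, not measurements.

NIC_NS = 1_000        # assumed per-NIC processing latency (ns)
TOR_HOP_NS = 500      # assumed cut-through ToR switch latency (ns)
PCIE_FABRIC_NS = 600  # assumed PCIe/SiPh fabric traversal (ns)

def ethernet_path() -> int:
    """Source NIC -> ToR switch -> destination NIC."""
    return NIC_NS + TOR_HOP_NS + NIC_NS

def pcie_path() -> int:
    """Source server -> shared PCIe/SiPh fabric -> destination server."""
    return PCIE_FABRIC_NS

if __name__ == "__main__":
    saved = ethernet_path() - pcie_path()
    print(f"Ethernet via ToR: {ethernet_path()} ns")
    print(f"Direct over PCIe: {pcie_path()} ns")
    print(f"Shaved per hop:  ~{saved} ns")
```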

Now you could argue that traffic between logical servers would eat up CPU, and therefore you still need a ToR switch to speed things up. But if you need hardware acceleration, Intel could just place a Fulcrum switch chip in the rack as another pooled resource. North-south (inter-rack) traffic would egress the RSA rack via uplink Ethernet connections, and again with RSA you could carve out a separate pool of Ethernet NICs for those interconnects.

Some people may wonder why bother; I mean, having separate servers does the job just nicely. And with all the resources pooled together, you really need a DevOps team that can program the rack. Well, if I were Intel, I’d target the financial industry. They would pay for the nanoseconds RSA shaves off, and they have an army of DevOps engineers who can do this.
