We are witnessing a fundamental shift in the computing landscape as many of the world's application workloads move to massive "cloud" data centers. These data centers enable the services we rely on daily, such as web search, social networking, and Internet commerce. They also require huge investments to deploy and operate. This has spurred significant interest, in both industry and the research community, in innovation for data centers and, in particular, for data center networks.
A crucial feature of a data center network is its transport mechanism: the method by which data is transferred from one server to another. Ideally, this should occur at the highest possible rate and with the lowest possible latency. But there is an inherent tension between these requirements, especially with today's state-of-the-art Transmission Control Protocol (TCP) in data centers. In this talk, I will present three data center transport designs with increasing levels of sophistication. First, I will describe Data Center TCP (DCTCP), a new congestion control algorithm for data centers that is now shipping with Windows Server 2012. DCTCP uses a simple modification to the TCP algorithm to achieve full throughput while reducing queueing latency in the switches by ~10x compared to TCP. I will present a control-theoretic analysis of DCTCP's control loop. I will then discuss HULL, an architecture that builds on DCTCP to deliver near-zero fabric latency: only propagation and switching latency, no queueing. Finally, I will describe pFabric, a minimalistic design that leverages application-level hints and very simple switch mechanisms to achieve near theoretically optimal scheduling of flows in the data center fabric.
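The abstract does not spell out the "simple modification" DCTCP makes to TCP. As a rough illustration, the sketch below follows the publicly documented DCTCP behavior: switches set an ECN mark on packets when the queue exceeds a threshold, the sender keeps a running estimate (alpha) of the fraction of marked packets, and it reduces its window in proportion to that estimate instead of always halving it. The class and method names are hypothetical, the gain g = 1/16 is the commonly cited default, and the one-packet-per-window growth is a simplified stand-in for standard TCP congestion avoidance; this is not the Windows Server 2012 implementation.

```python
from dataclasses import dataclass


@dataclass
class DctcpSender:
    """Toy model of a DCTCP sender's congestion window (hypothetical names)."""

    cwnd: float = 10.0      # congestion window, in packets
    alpha: float = 0.0      # running estimate of the fraction of marked packets
    g: float = 1.0 / 16.0   # estimation gain (assumed default)

    def on_window_acked(self, acked: int, marked: int) -> None:
        """Update state once per window of ACKs.

        acked  -- packets acknowledged in the last window
        marked -- how many of those carried an ECN congestion mark
        """
        if acked <= 0:
            return
        frac = marked / acked
        # Exponentially weighted moving average of the marked fraction.
        self.alpha = (1.0 - self.g) * self.alpha + self.g * frac
        if marked > 0:
            # Cut the window in proportion to the extent of congestion:
            # alpha near 0 gives a gentle cut, alpha = 1 halves it like TCP.
            self.cwnd = max(1.0, self.cwnd * (1.0 - self.alpha / 2.0))
        else:
            # Simplified additive increase: one packet per window.
            self.cwnd += 1.0
```

Under these assumptions, a sender that sees only a few marks per window backs off slightly, which is what lets the switch queue stay short without sacrificing throughput; under persistent marking alpha approaches 1 and the cut approaches the usual TCP halving.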
Mohammad Alizadeh is a Researcher at Insieme Networks (recently acquired by Cisco Systems). He received his Ph.D. in Electrical Engineering from Stanford University, where he was advised by Balaji Prabhakar. Before that, he completed his undergraduate degree in Electrical Engineering at Sharif University of Technology. His research interests are broadly in networked systems, data center networking, and cloud computing. His dissertation work focused on designing high-performance data center transport mechanisms. His research has garnered significant industry interest: the DCTCP algorithm has been implemented in Windows Server 2012; the QCN congestion control algorithm has been standardized as the IEEE 802.1Qau standard; and most recently, the CONGA adaptive load balancing mechanism has been implemented in Cisco's new flagship Application Centric Infrastructure products. Mohammad is a recipient of a Stanford Electrical Engineering Departmental Fellowship, the Caroline and Fabian Pease Stanford Graduate Fellowship, and the Numerical Technologies Inc. Prize and Fellowship.