Abstract: Cloud computing promises flexibility, high performance and low cost. However, despite its prevalence, most datacenters hosting cloud services still operate at very low utilization, posing serious scalability concerns.
The goal of my work is to improve the scalability of these systems by increasing their utilization while guaranteeing high performance for each submitted application. A crucial system component for achieving this goal is the cluster manager, the system that orchestrates where applications are placed and how many resources they receive. In this talk, I will describe a new approach to cluster management that relies on two main insights. First, it automates resource management by leveraging practical data mining techniques. Second, it simplifies the responsibility of the user through a high-level, declarative interface that centers around performance, not resource reservations. Using these insights, I designed and built a datacenter scheduler (Paragon), a cluster manager (Quasar), and scalable provisioning systems for public clouds. In settings with several hundred servers, I demonstrated that this approach achieves high application performance and improves system utilization by over 2x. Several production systems, including those at Twitter and AT&T, have since adopted similar cluster management approaches.
Bio: Christina Delimitrou is a PhD candidate in the Electrical Engineering Department at Stanford University, working in computer architecture and systems. As part of her PhD work, she built practical systems for cluster management and scheduling in large-scale datacenters. She is the recipient of a Facebook Research Fellowship and a Stanford Graduate Fellowship. Christina earned an MS from Stanford and a diploma in Electrical and Computer Engineering from the National Technical University of Athens.