This thesis addresses fundamental tradeoffs in the design of dispatching policies in large-scale distributed service systems, motivated by examples such as cloud computing facilities and multi-core processors. A canonical framework for modeling such systems is provided by a parallel queueing model with N servers, where service requests arrive stochastically over time as a single stream of jobs of rate proportional to N, and where a central controller is responsible for all decisions.
The central controller makes decisions based on limited information about the state of the queues, which is conveyed through messages from servers to the dispatcher, and stored in a limited local memory. Our objective is to understand the best possible performance of such systems (in terms of stability region and delay) and to propose optimal policies, with emphasis on the asymptotic regime when both the number of servers and the arrival rate are large.
We study the tradeoffs between the resources available to the controller (memory size and message rate) and the achievable queueing delay performance and stability region of resource constrained dispatching policies. Our main findings are as follows.
1) Queueing delay vs. resources tradeoff: We propose a family of dispatching policies, indexed by the size of their memories and by the average message rate, and show that the expected queueing delay vanishes as N goes to infinity when either (i) the number of memory bits is of the order of log(N) and the message rate grows superlinearly with N, or (ii) the number of memory bits grows superlogarithmically with N and the message rate is at least as large as the arrival rate. Moreover, we develop a novel approach to show that, within a certain broad class of "symmetric" policies, every dispatching policy with a message rate of the order of N, and with a memory of the order of log(N) bits, results in an expected queueing delay which is bounded away from zero, uniformly as N goes to infinity.
2) Stability region vs. resources tradeoff: We propose a dispatching policy that requires a memory size (in bits) of the order of log(N) and an arbitrarily small (but positive) message rate, and show that it is stable for all possible server rates for which the entire system is underloaded. Moreover, we show that within a certain broad class of "weakly symmetric" policies, every dispatching policy with a message rate of the order of N, and with a memory size that grows sublogarithmically with N, results in a reduced stability region.
Thesis Supervisors: Profs. David Gamarnik and John N. Tsitsiklis