We study a series of topics involving approximation algorithms and optimization under data uncertainty. Under the first theme of approximation, we derive performance bounds for rollout algorithms. Interpreted as an approximate dynamic programming algorithm, a rollout algorithm estimates the value-to-go at each decision stage by simulating future events while following a heuristic policy, referred to as the base policy. While in many cases rollout algorithms are guaranteed to perform as well as their base policies, there have been few theoretical results showing strict improvements in performance. We provide a probabilistic analysis of knapsack problems, proving that rollout algorithms perform significantly better than their base policies.
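As a minimal illustration of the mechanism (our own construction, not a procedure taken from the thesis), the following sketch implements a one-step rollout for the 0/1 knapsack problem. The base policy is a naive greedy rule, and the rollout algorithm scores each candidate action by completing the remaining decisions with the base policy; the function names are hypothetical.

```python
# Sketch of a one-step rollout algorithm for the 0/1 knapsack problem.
# This is an illustrative toy, not the analysis carried out in the thesis.

def base_policy_value(items, capacity, start):
    """Base policy: scan the remaining items in order, take each one that fits."""
    value = 0
    for weight, profit in items[start:]:
        if weight <= capacity:
            capacity -= weight
            value += profit
    return value

def rollout_value(items, capacity):
    """Rollout: at each stage, estimate the value-to-go of 'take' and 'skip'
    by simulating the base policy, then commit to the better action."""
    total = 0
    for k, (weight, profit) in enumerate(items):
        skip = base_policy_value(items, capacity, k + 1)
        take = -1  # 'take' is infeasible unless the item fits
        if weight <= capacity:
            take = profit + base_policy_value(items, capacity - weight, k + 1)
        if take >= skip:
            capacity -= weight
            total += profit
    return total
```

On instances where the base policy is myopic the rollout can strictly improve: for items [(1, 2), (5, 10)] with capacity 5, the greedy base policy obtains value 2, while the rollout obtains 10 by simulating ahead and skipping the first item.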
Next, we study the average performance of online greedy matching algorithms on random bipartite graphs. In the online model, vertices on one side of the graph are given up front while vertices on the other side arrive sequentially. When a vertex arrives, its neighboring edges are revealed and it must be immediately matched or dropped. We determine the asymptotic matching sizes obtained by greedy algorithms on both Erdős–Rényi and random regular graphs. Under the G(n,n,p) model, we show that the basic greedy algorithm, which assigns each vertex to a random unmatched neighbor, has a performance ratio of at least 0.837 for all monotonic functions p = p(n). We show that the Ranking algorithm of Karp, Vazirani, and Vazirani, which has a worst-case competitive ratio of 0.632, performs equivalently to the greedy algorithm under G(n,n,p), and thus has an average-case performance ratio of at least 0.837. Greedy matching on ordinary (non-bipartite) graphs is also considered.
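The basic greedy algorithm described above can be sketched as follows; this is our own illustrative implementation of the G(n,n,p) arrival model, with hypothetical names, rather than code from the thesis.

```python
import random

# Sketch of the basic greedy algorithm for online bipartite matching under
# G(n, n, p): the n left vertices are known up front; right vertices arrive
# one at a time, each revealing its edges only on arrival, and each is
# matched to a uniformly random unmatched neighbor or dropped.

def greedy_online_matching(n, p, seed=0):
    rng = random.Random(seed)
    matched = [False] * n              # match status of the left vertices
    size = 0
    for _ in range(n):                 # right vertices arrive sequentially
        # each left vertex is a neighbor independently with probability p
        neighbors = [u for u in range(n) if rng.random() < p]
        free = [u for u in neighbors if not matched[u]]
        if free:
            matched[rng.choice(free)] = True   # match immediately
            size += 1                          # (otherwise the vertex is dropped)
    return size
```

The Ranking algorithm differs only in the tie-breaking rule: instead of a uniformly random free neighbor, each arrival is matched to its free neighbor of highest priority under a fixed random permutation of the left vertices.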
Moving to the second theme of uncertainty, we analyze losses resulting from uncertain transition probabilities in Markov decision processes. We assume that policies are precomputed using exact dynamic programming with the estimated transition probabilities, but the system evolves according to different, true transition probabilities. Given a bound on the total variation error of estimated transition probability distributions, we derive a general tight upper bound on the loss of expected total reward.
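The quantities involved can be made concrete with a toy finite-horizon example (entirely our own construction, with hypothetical names): a policy is computed by exact dynamic programming under estimated transition probabilities, evaluated under the true probabilities, and the resulting loss is compared against the total variation error of the estimate.

```python
import numpy as np

# Toy illustration of loss from uncertain transition probabilities in an MDP.
# P has shape (states, actions, states); R has shape (states, actions).

def dp_policy(P, R, T):
    """Finite-horizon exact DP under model P; returns a stage-indexed policy."""
    n_states = P.shape[0]
    V = np.zeros(n_states)
    policy = np.empty((T, n_states), dtype=int)
    for t in reversed(range(T)):
        Q = R + P @ V                 # Q[s, a] = R[s, a] + E_P[V(s') | s, a]
        policy[t] = Q.argmax(axis=1)
        V = Q.max(axis=1)
    return policy

def evaluate(policy, P, R, T, mu0):
    """Expected total reward of a stage-indexed policy when the system
    actually evolves according to model P, from initial distribution mu0."""
    idx = np.arange(P.shape[0])
    V = np.zeros(P.shape[0])
    for t in reversed(range(T)):
        a = policy[t]
        V = R[idx, a] + P[idx, a] @ V
    return float(mu0 @ V)

# True model, and an estimate that errs in one transition distribution.
P_true = np.array([[[0.9, 0.1], [0.2, 0.8]],
                   [[0.5, 0.5], [0.1, 0.9]]])
P_hat = P_true.copy()
P_hat[0, 0] = [0.6, 0.4]              # estimation error at (s=0, a=0)
R = np.array([[1.0, 0.0], [0.0, 2.0]])
mu0 = np.array([1.0, 0.0])
T = 5

# Loss: true-optimal value minus the value (under the true model) of the
# policy that was computed from the estimated model.
loss = (evaluate(dp_policy(P_true, R, T), P_true, R, T, mu0)
        - evaluate(dp_policy(P_hat, R, T), P_true, R, T, mu0))
tv_error = 0.5 * np.abs(P_true - P_hat).sum(axis=2).max()
```

The thesis bound relates exactly these two quantities: the loss is always nonnegative, and it is upper-bounded tightly in terms of the total variation error of the estimated transition distributions.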
Finally, we consider a randomized model for minmax regret in combinatorial optimization under uncertainty. The minmax regret problem can be viewed as a zero-sum game played between an optimizing player and an adversary, where the optimizing player selects a solution and the adversary selects costs with the intention of maximizing the regret of the player. Existing minmax regret models consider only deterministic solutions (pure strategies), and minmax regret versions of most polynomially solvable problems are NP-hard. We analyze a model where the optimizing player selects a probability distribution (corresponding to a mixed strategy) over solutions and the adversary selects costs with knowledge of the player's distribution, but not its realization. We show that under this randomized model, the minmax regret version of any polynomially solvable combinatorial problem becomes polynomially solvable. This holds true for both interval and discrete scenario representations of uncertainty.
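The zero-sum game structure can be exhibited on a tiny discrete-scenario example (our own toy, solved here by brute-force LP over an enumerated solution set; the thesis's polynomial-time algorithms are of course more refined than this). Suppose the player must choose one of two items, with scenario costs c1 = (A: 1, B: 3) and c2 = (A: 3, B: 1); the regret of a solution in a scenario is its cost minus the scenario's optimal cost.

```python
import numpy as np
from scipy.optimize import linprog

# Regret matrix for the toy instance: rows are scenarios, columns are the
# pure solutions A and B. Each pure solution suffers regret 2 in one scenario.
regret = np.array([[0.0, 2.0],
                   [2.0, 0.0]])
n_scen, n_sol = regret.shape

# LP for the player's mixed strategy q: minimize z subject to
# (expected regret in each scenario) <= z, sum(q) = 1, q >= 0.
c = np.r_[np.zeros(n_sol), 1.0]                   # objective: minimize z
A_ub = np.c_[regret, -np.ones(n_scen)]            # regret @ q - z <= 0
b_ub = np.zeros(n_scen)
A_eq = np.r_[np.ones(n_sol), 0.0].reshape(1, -1)  # sum(q) = 1
b_eq = np.array([1.0])
bounds = [(0, 1)] * n_sol + [(0, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
```

Here the deterministic minmax regret is 2, whereas the mixed strategy q = (1/2, 1/2) guarantees expected regret 1 in every scenario, which is the value the LP recovers; randomization strictly helps the optimizing player.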
Thesis Supervisor: Prof. Patrick Jaillet