Doctoral Thesis: Optimization and Generalization of Minimax Algorithms
Sarath Pattathil
Abstract:
This talk explores robust formulations of machine learning and multi-agent learning problems, focusing on algorithmic optimization and generalization performance. The first part of the talk studies the smooth convex-concave minimax problem and analyzes well-known algorithms such as Extra-Gradient (EG) and Optimistic Gradient Descent Ascent (OGDA). We derive optimal convergence rates for these algorithms in the convex-concave setting and show that they are effective because they approximate the Proximal Point (PP) method, which has markedly better convergence properties but is impractical to implement. We also discuss extensions to structured nonconvex-nonconcave problems.

The second part of the talk addresses generalization. In many settings, such as GANs and adversarial training, the objective whose saddle point we seek is an expectation over the data distribution. Since we often do not have direct access to this distribution, we instead solve the empirical problem obtained by averaging over the available dataset. We aim to evaluate how well solutions of the empirical problem perform on the original population problem. We show that existing metrics for assessing generalization in the minimax setting are inadequate, and we propose a new metric, the primal gap, that overcomes these limitations. We then use this metric to study the generalization performance of popular algorithms such as Gradient Descent Ascent (GDA) and Gradient Descent-Max (GDMax).
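For concreteness, here is a minimal sketch, not part of the original abstract, of the updates being compared. The notation is assumed for illustration: f(x, y) is the minimax objective, z = (x, y) is the joint iterate, F(z) = (\nabla_x f(x, y), -\nabla_y f(x, y)) is its gradient operator, and \eta is a step size. The standard forms of the three updates are:

    z_{k+1}   = z_k - \eta F(z_{k+1})                                      (Proximal Point, implicit)
    z_{k+1/2} = z_k - \eta F(z_k),    z_{k+1} = z_k - \eta F(z_{k+1/2})    (Extra-Gradient)
    z_{k+1}   = z_k - 2\eta F(z_k) + \eta F(z_{k-1})                       (Optimistic GDA)

Read this way, EG and OGDA replace the implicit evaluation F(z_{k+1}) in the PP update with explicit approximations built from already-computed gradients, which is the sense in which the abstract describes them as approximating the Proximal Point method.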
Committee:
Asu Ozdaglar, EECS, MIT (advisor)
Costis Daskalakis, EECS, MIT
Gabriele Farina, EECS, MIT
Kaiqing Zhang, ECE, UMD
Details
- Date: Tuesday, May 23
- Time: 10:00 am - 11:30 am
- Category: Thesis Defense
- Location: 34-401B