There is currently intense interest in nonconvex optimization within machine learning. This interest is fueled by the rise of deep neural networks, as well as by other complex tasks in related areas. Although an understanding of why neural networks work so well remains elusive, there has been impressive progress in algorithms, software, and systems for nonconvex optimization.
In today's talk, however, I want to take a step back from algorithmic advances (fast stochastic gradient methods, escaping saddle points, etc.) and instead draw your attention to a new set of tools that expand our repertoire of nonconvex models. In particular, my focus will be on using geometry to develop a rich subclass of nonconvex problems that can be solved to global optimality (or, failing that, at least solved numerically more efficiently).
This subclass is built on the notion of geodesic convexity, a concept that generalizes the usual vector-space (linear) convexity to nonlinear spaces. I will outline how geometric thinking leads to improved models or insights for fundamental tasks in machine learning and statistics, including large-scale principal component analysis, metric learning, and Gaussian mixture models. I will cover both theoretical and practical aspects, discuss wider connections of our results, and conclude with a broad outlook and open problems.
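
To make the notion of geodesic convexity concrete, here is a minimal sketch built on a toy example that is not taken from the talk. It assumes the positive reals with the metric dx/x, under which the geodesic from x to y is x^(1-t) * y^t, and it uses the illustrative function f(x) = (log x)^2. The helper names (`euclidean_point`, `geodesic_point`, `convex_along`) are hypothetical. The check is a numerical probe at sampled points, not a proof.

```python
import math


def f(x):
    # Toy objective (an assumption for illustration): nonconvex on the
    # real line for x > e, since f''(x) = 2 * (1 - log x) / x**2 < 0 there.
    return math.log(x) ** 2


def euclidean_point(x, y, t):
    # Point on the straight line segment from x to y.
    return (1 - t) * x + t * y


def geodesic_point(x, y, t):
    # Point on the geodesic from x to y on the positive reals with
    # metric dx/x (assumed geometry): a weighted geometric mean.
    return x ** (1 - t) * y ** t


def convex_along(path, x, y, steps=99):
    # Test f(path(t)) <= (1 - t) * f(x) + t * f(y) at interior sample points.
    for k in range(1, steps + 1):
        t = k / (steps + 1)
        chord = (1 - t) * f(x) + t * f(y)
        if f(path(x, y, t)) > chord + 1e-12:
            return False
    return True


x, y = 10.0, 1000.0
print(convex_along(euclidean_point, x, y))  # False: f is not convex on [10, 1000]
print(convex_along(geodesic_point, x, y))   # True: f is convex along the geodesic
```

Along the geodesic, f reduces to the square of an affine function of t, which is why the inequality holds there even though it fails along the straight line.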