Doctoral Thesis: Algorithms and Barriers in Inference in High-Dimensional Statistics and in Random Combinatorial Structures
Eren C. Kizildag
Problems involving randomness and high dimensionality are ubiquitous across many scientific disciplines, spanning from machine learning and statistics to medicine and the social sciences. While such problems rest on solid mathematical foundations, such as high-dimensional probability and high-dimensional statistics, the associated "average-case" computational tasks often appear intractable in practice, even though formal proofs of their algorithmic hardness are lacking.
In this thesis, we focus on certain algorithmic problems arising from the study of random combinatorial structures and high-dimensional statistical inference tasks. We study them from a computational perspective through insights gained from statistical physics. For the purposes of this talk, we present an excerpt from the thesis, focusing on three models.
Our first focus is on the symmetric binary perceptron model (SBP), a toy one-layer neural network model and a random constraint satisfaction problem. The SBP exhibits a dramatic statistical-to-computational gap (SCG) as well as a striking conundrum: the existence of polynomial-time algorithms coincides with an extreme form of clustering, where almost all of its solutions are isolated singletons at all subcritical densities. This conundrum challenges the view that clustering implies algorithmic hardness. In order to explain the true algorithmic tractability of this model, we conduct a different landscape analysis. Guided by insights from statistical physics, we show that the SBP exhibits the multi-Overlap Gap Property (m-OGP), an intricate geometric barrier for large classes of algorithms. Our analysis shows that the m-OGP threshold (a) is well below the satisfiability threshold; and (b) asymptotically matches the performance of the best known polynomial-time algorithms. Next, we establish that the m-OGP is a barrier to stable algorithms, and we conjecture that its onset in fact marks the onset of true algorithmic hardness. The proof of this barrier uses Ramsey theory from extremal combinatorics. We furthermore show that a known algorithm by Kim-Roche for the perceptron model is stable. Unlike in prior models, this algorithm does not appear to be captured by known classes of algorithms, such as approximate message passing and low-degree polynomials.
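To make the SBP concrete, the following is a minimal illustrative sketch (not from the thesis): solutions are sign vectors σ ∈ {-1, +1}^n satisfying the symmetric constraint |⟨g_i, σ⟩| ≤ κ√n for every row g_i of Gaussian disorder. The sizes, the value of κ, and the rejection-sampling search below are toy choices for illustration only; the theory concerns m = αn constraints as n → ∞.

```python
import math
import random

def sbp_satisfies(sigma, disorder, kappa):
    """Check whether sigma in {-1,+1}^n satisfies every symmetric
    constraint |<g_i, sigma>| <= kappa * sqrt(n)."""
    n = len(sigma)
    bound = kappa * math.sqrt(n)
    return all(abs(sum(g * s for g, s in zip(row, sigma))) <= bound
               for row in disorder)

random.seed(0)
n, m, kappa = 20, 10, 1.0  # toy sizes; hypothetical parameter choices
disorder = [[random.gauss(0, 1) for _ in range(n)] for _ in range(m)]

# Rejection sampling is feasible only at such tiny sizes; finding solutions
# efficiently at large n is exactly the algorithmic question at stake.
while True:
    sigma = [random.choice((-1, 1)) for _ in range(n)]
    if sbp_satisfies(sigma, disorder, kappa):
        break
```

The clustering phenomenon mentioned above concerns how such satisfying σ are arranged in the hypercube, not how any particular one is found.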
Our next focus is on the random number partitioning problem (NPP), a problem with many applications, including the design of randomized controlled trials and cryptography. The NPP exhibits yet another SCG. In order to address this gap, we establish the presence of the OGP and subsequently leverage it to show that (a) stable algorithms fail to find near-optimal solutions, and (b) a very natural Markov chain Monte Carlo (MCMC) dynamics mixes slowly. When the overlap parameter m is constant, we show the presence of the m-OGP for linear energy exponents, as well as its absence for sub-linear exponents. Interestingly, however, by considering overlaps with growing values of m, we establish the presence of the OGP for a much broader range of energy exponents.
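The SCG of the NPP can be seen numerically in a small, hedged sketch (illustrative only, not from the thesis): given i.i.d. Gaussian items, the optimal two-way partition achieves an exponentially small discrepancy, while a typical partition is only polynomially small, and exhaustive search is the only way shown here to reach the optimum.

```python
import itertools
import random

def discrepancy(sigma, items):
    """Discrepancy |<sigma, X>| of the signed partition sigma of items X."""
    return abs(sum(s * x for s, x in zip(sigma, items)))

random.seed(1)
n = 16  # toy size so exhaustive search over 2^n partitions is feasible
items = [random.gauss(0, 1) for _ in range(n)]

# Globally optimal discrepancy: typically of order 2^{-Theta(n)}.
best = min(discrepancy(sigma, items)
           for sigma in itertools.product((-1, 1), repeat=n))

# A uniformly random partition: typically of order sqrt(n), exponentially worse.
rand = discrepancy([random.choice((-1, 1)) for _ in range(n)], items)
```

The gap between `best` and what polynomial-time algorithms achieve is the SCG; the energy exponents discussed above parametrize which discrepancy scales 2^{-Θ(n^θ)} exhibit the OGP.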
Our final focus is on the Sherrington-Kirkpatrick (SK) spin glass model, a mean-field model for disordered magnetic alloys. We establish that the algorithmic problem of exactly computing the partition function of the SK model is #P-hard on average, under both a real-valued model of computation and a finite-precision arithmetic model. To the best of our knowledge, this is the first statistical physics model with a rigorous average-case hardness guarantee based on standard complexity-theoretic assumptions.
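For reference, the object whose exact computation is shown to be #P-hard on average is the partition function Z = Σ_σ exp((β/√n) Σ_{i<j} J_ij σ_i σ_j). A brute-force evaluation, sketched below under assumed toy parameters (not from the thesis), takes time exponential in n; the hardness result says this exponential cost cannot be avoided on average under standard assumptions.

```python
import itertools
import math
import random

def sk_partition_function(J, beta):
    """Exact SK partition function by enumerating all 2^n spin
    configurations; cost O(2^n * n^2), exponential in n."""
    n = len(J)
    Z = 0.0
    for sigma in itertools.product((-1, 1), repeat=n):
        # Hamiltonian over unordered pairs i < j with Gaussian couplings J.
        energy = sum(J[i][j] * sigma[i] * sigma[j]
                     for i in range(n) for j in range(i + 1, n))
        Z += math.exp(beta * energy / math.sqrt(n))
    return Z

random.seed(2)
n, beta = 10, 1.0  # toy size; hypothetical inverse temperature
J = [[random.gauss(0, 1) for _ in range(n)] for _ in range(n)]
Z = sk_partition_function(J, beta)
```

Note that approximating Z, or computing related quantities such as the free energy, is a different question; the hardness statement above concerns exact computation.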
- Date: Tuesday, May 3
- Time: 1:30 pm - 3:00 pm
- Location: E60-112