A major challenge in machine learning is to reliably and automatically
discover hidden structure in data with little or no human intervention.
Many of the core statistical estimation problems of this type are, in
general, provably intractable for both computational and
information-theoretic reasons. However, much progress has been made over
the past decade or so to overcome these hardness barriers by focusing on
realistic cases that rule out the intractable instances. In this talk,
I'll describe a general computational approach for correctly estimating a
wide class of statistical models, including Gaussian mixture models, hidden
Markov models, latent Dirichlet allocation, probabilistic context-free
grammars, and several others. The scope of the approach extends beyond the
purview of previous algorithms, and it leads both to new theoretical
guarantees for unsupervised learning and to fast, practical algorithms for
large-scale data analysis.
Daniel Hsu is a postdoc at Microsoft Research New England. From 2010 to
2011, he was a postdoc in the statistics departments at Rutgers University
and the University of Pennsylvania, supervised by Tong Zhang and Sham M.
Kakade. He received his Ph.D.
in Computer Science in 2010 from UC San Diego, where he was advised by
Sanjoy Dasgupta, and his B.S. in Computer Science and Engineering in 2004
from UC Berkeley. His research interests are in algorithmic statistics and