Most applications of machine learning across science and industry rely
on the holdout method for model selection and validation. Unfortunately,
the holdout method often fails in the now common scenario where the
analyst works interactively with the data, iteratively choosing which
methods to use by probing the same holdout data many times.
In this talk, we apply the principle of algorithmic stability to design
reusable holdout methods, which can be used many times without losing the
guarantees of fresh data. Applications include a model benchmarking tool
that detects and prevents overfitting at scale.
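The reusable holdout described here is in the spirit of the Thresholdout mechanism (Dwork et al., Science 2015, coauthored by the speaker). The sketch below is a minimal illustration, not the talk's implementation; the threshold and noise-scale values are illustrative choices:

```python
import numpy as np

def thresholdout(train_vals, holdout_vals, threshold=0.04, scale=0.01, rng=None):
    """Answer one query (the mean of a statistic) via the Thresholdout rule.

    train_vals, holdout_vals: per-example values of the statistic on the
    training and holdout sets. Returns the training-set mean when the two
    estimates agree, and a noisy holdout mean otherwise, limiting how much
    the analyst can learn about the holdout set per query.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    train_mean = float(np.mean(train_vals))
    holdout_mean = float(np.mean(holdout_vals))
    # Compare the two estimates against a randomly perturbed threshold.
    if abs(train_mean - holdout_mean) > threshold + rng.laplace(0, scale):
        # Estimates disagree: answer from the holdout, with Laplace noise.
        return holdout_mean + rng.laplace(0, scale)
    # Estimates agree: answer from the training set alone.
    return train_mean
```

The key design point is that the holdout set only "pays" for a query when the training and holdout estimates disagree, which is what allows the same holdout data to support many adaptive queries without overfitting to it.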
We conclude with a bird's-eye view of what algorithmic stability says
about machine learning at large, including new insights into stochastic
gradient descent, the most popular optimization method in contemporary
machine learning.

Moritz Hardt is a senior research scientist at Google Research. After
obtaining a PhD in computer science from Princeton University in 2011,
he worked at IBM Research Almaden on algorithmic principles of machine
learning. He then moved to Google to join the Foundations of Applied
Machine Learning group, where his mission is to build guiding theory and
scalable algorithms that make the practice of machine learning more
reliable, transparent, and effective.