Transparency has become a key desideratum of machine learning. Properties such as interpretability and robustness are indispensable when model predictions are fed into mission-critical applications or applications dealing with sensitive or controversial topics (e.g., social, legal, financial, medical, or security tasks). While the desired notion of transparency can vary widely across scenarios, modern predictors (such as deep neural networks) often lack any semblance of it, primarily due to their inherent complexity. In this thesis, we focus on a set of formal properties of transparency and design a series of algorithms to build models with these specified properties. In particular, these properties include:
(i) the model class (of oblique decision trees), effectively represented and trained via a new family of neural models,
(ii) local model classes (e.g., locally linear models), induced from and estimated jointly with a black-box predictor, possibly over structured objects, and
(iii) local certificates of robustness, derived for ensembles of any black-box predictors in continuous or discrete spaces.
The contributions of this thesis are mainly methodological and theoretical. We also emphasize scalability to large-scale settings. Compared to a human-centric approach to interpretability, our methods are particularly suited to scenarios that require factual verification or cases where explanations are difficult for humans to judge subjectively (e.g., for superhuman models).
COMMITTEE: Professors Luca Daniel, Stefanie Jegelka, Tommi Jaakkola
To attend this defense, please contact the doctoral candidate at guanghe at mit dot edu or the lab assistant at cataldo at csail dot mit dot edu.