Doctoral Thesis: Discovering, Optimizing, and Designing Novel Proteins
32-G449: Patil/Kiva Seminar Room
By: Itamar Chinn
Details
- Date: Wednesday, April 29
- Time: 2:30 pm - 4:30 pm
- Location: 32-G449: Patil/Kiva Seminar Room
I will discuss machine-learning methods for discovering, optimizing, and designing proteins with biomolecular functions. Recent advances in protein modeling have made it possible to predict structures at scale and learn powerful sequence representations, but many of the questions that matter most for biology and engineering remain fundamentally functional: what a biomolecule does, in a specific biochemical and cellular context, and how that function can be discovered or redesigned under real experimental constraints.
This thesis develops models for protein function across multiple linked settings. First, I describe reaction-conditioned methods for enzyme discovery, including CLIPZyme, which frames enzyme discovery as retrieval in a shared reaction–protein embedding space, and an ML-guided discovery pipeline in which we identified enzymes for nucleobase amino acid biosynthesis. Second, I present MutaGen, a ranking-based discrete flow matching approach to directed evolution that learns from ranked sequence pairs rather than brittle scalar fitness measurements, enabling protein optimization in low-budget and noisy assay settings. Finally, I discuss models of protein function in cellular and systems context, including sequence-based prediction and design of condensate localization and additional work on virtual metabolism under genetic perturbations.
Host
- Itamar Chinn
- Email: itamarc@mit.edu