Eric Mitchell – Making Language Models Useful

Wednesday, March 13
11:00 am - 12:00 pm

Hewlett 32-G882

Abstract
Large pre-trained language models such as GPT-3 are the engines of knowledge and capability underpinning powerful systems such as ChatGPT, Gemini, and Claude. Yet just as building a safe, comfortable vehicle requires more than a powerful engine, building a useful, beneficial language system requires additional techniques to promote key attributes such as controllability, factuality, and updatability. This talk will share my work towards imbuing large language models with these traits. I will first present direct preference optimization, an algorithm that trains language models to follow instructions in accordance with human preferences and is far simpler than prior methods. I will next discuss approaches for improving the factual reliability of language models, which remains challenging even for models that generally follow user instructions well. Finally, I will share my work on methods for updating individual model behaviors or beliefs that have fallen out of date or are otherwise problematic. I will conclude with several important topics for future work toward more useful, dependable AI systems, including unsupervised continual learning, scalable oversight, and robust reasoning.

Bio
Eric Mitchell is a final-year PhD student in Stanford’s Computer Science department, advised by Chelsea Finn and Christopher Manning. His research uses tools from machine learning to improve the usefulness and reliability of language models, in particular by developing techniques that enhance their controllability, factuality, and updatability. His work has appeared at ICML, NeurIPS, ICLR, and EMNLP, and was recognized with an outstanding paper runner-up award at NeurIPS ’23. His work, in particular the direct preference optimization algorithm, has been used widely in state-of-the-art open-source and proprietary language models. He is a former Knight-Hennessy Scholar and received his BS from Princeton University.

Host