Doctoral thesis: On Structure, Parallelism, and Approximation in Modern Neural Sequence Modeling



Presenter: Morris Yau 

Affiliation: CSAIL

Thesis Supervisor: Jacob Andreas

Readers: Yoon Kim, Stefanie Jegelka, Ankur Moitra

Date: Friday, February 6, 2026

Time: 9:30 am - 10:30 am

Location: Star (32-D463), Stata Center

Abstract: Is there an algorithm that learns the best-fit parameters of a Transformer on any dataset? If I trained a neural sequence model and promised you it was equivalent to a program, how would you even be convinced? Modern RNNs are functions that admit parallelizable recurrences; what is the design space of such recurrences? Are there unexplored function families that lie between RNNs and Transformers? We explore these questions from first principles, starting with state, polynomials, and parallelism.
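For context on the "parallelizable recurrence" question in the abstract, here is a minimal illustrative sketch, not taken from the thesis: a linear recurrence h_t = a_t * h_{t-1} + b_t can be evaluated for all T steps in O(log T) parallel depth rather than a sequential O(T) loop, because affine maps compose associatively. The sketch below uses jax.lax.associative_scan; all names other than the JAX API are hypothetical.

import jax
import jax.numpy as jnp

# Each scan element encodes the affine map h -> a*h + b.
# Composing a later map (a_r, b_r) after an earlier one (a_l, b_l)
# yields h -> a_r*a_l*h + (a_r*b_l + b_r); this composition is
# associative, which is exactly what associative_scan requires.
def combine(left, right):
    a_l, b_l = left
    a_r, b_r = right
    return a_r * a_l, a_r * b_l + b_r

def parallel_linear_recurrence(a, b):
    # a, b: (T, d) arrays; returns all states h_1..h_T assuming h_0 = 0.
    # With h_0 = 0, the state is just the additive part of the prefix map.
    _, h = jax.lax.associative_scan(combine, (a, b))
    return h

# Sanity check against the plain sequential recurrence.
T, d = 8, 4
a = jax.random.uniform(jax.random.PRNGKey(0), (T, d))
b = jax.random.normal(jax.random.PRNGKey(1), (T, d))

h_par = parallel_linear_recurrence(a, b)

h, hs = jnp.zeros(d), []
for t in range(T):
    h = a[t] * h + b[t]
    hs.append(h)

assert jnp.allclose(h_par, jnp.stack(hs), atol=1e-5)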
