Thesis Defense: Elena Sergeeva, Building small domain-specific masked language models vs. large generative models for clinical decision support and their effects on users.

Friday, May 2
10:00 am - 11:30 am

32-D463

Presenter: Elena Sergeeva
Presenter’s Affiliation: CSAIL
Thesis Supervisor(s): Peter Szolovits



Abstract:

A frequently adopted definition of knowledge characterizes it as “justified true belief”. As one may notice, this definition presents some issues when applied to AI: it is unclear to what degree it is justified to use “humanizing” vocabulary such as “belief” or “justification” when describing the performance of an AI system. Traditional AI based on explicit knowledge representation involves reasoning over symbolic statements standing for such “justified true beliefs”; the modern connectionist methodology, however, replaces explicit reasoning with predictions computed over weighted continuous representations of the inputs. The continuous representations learned by such systems remain “black-box-like”: the only elements directly understandable by a human user are the model’s inputs and outputs.

In the first part of this thesis, I introduce a set of masked-language-model transformer-based models for a diverse set of medical natural language processing tasks, including named entity recognition, negation extraction, and relation extraction, that perform as well as or better than larger prompt-and-generate transformer-based causal language models. In the second part of the thesis, I discuss the modern “prompt-and-generate” approach to natural language processing, in which both the inputs and the outputs of the model are word-like elements commonly referred to as “tokens”. I explore the nature of token-based representations of the input and examine how token “meaning” is refined at each layer of the successive transformer computation. With respect to the outputs, I explore how people engage with AI-generated sequences of tokens that they perceive as “explained” predictions.
