The availability of large textual data sets has made it possible to employ increasingly sophisticated statistical models to improve performance on language tasks. However, these more complex models often come at the cost of a harder underlying decoding problem. In this talk, I focus on the question of decoding in large-scale, statistical natural language systems, and describe algorithms that extend beyond common heuristic approaches to yield optimality guarantees.
The main tool I use, Lagrangian relaxation, is a classical idea from combinatorial optimization. I will begin with a general introduction to the method and its application to basic models in natural language processing. The body of the talk will focus on two core NLP tasks, dependency parsing and machine translation, and will show how variants of this technique can be used to develop improved decoding algorithms for several widely used versions of these tasks.
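To give a flavor of the method, here is a minimal sketch of dual decomposition, a Lagrangian-relaxation technique of the kind discussed in the talk, applied to a toy problem: two component models must agree on a shared binary structure, the agreement constraint is relaxed with Lagrange multipliers, and each subproblem is then solved exactly and independently. The score vectors, step size, and iteration budget below are illustrative choices, not details from the talk.

```python
def dual_decompose(f, g, step=0.5, iters=100):
    """Maximize f.y + g.z over binary vectors y, z subject to y == z,
    by relaxing the agreement constraint with multipliers u."""
    n = len(f)
    u = [0.0] * n
    for _ in range(iters):
        # Each relaxed subproblem decomposes componentwise here,
        # so it can be solved exactly and independently.
        y = [1 if f[i] + u[i] > 0 else 0 for i in range(n)]
        z = [1 if g[i] - u[i] > 0 else 0 for i in range(n)]
        if y == z:
            return y  # agreement gives a certificate of optimality
        # Subgradient update pushes the subproblems toward agreement.
        u = [u[i] - step * (y[i] - z[i]) for i in range(n)]
    return None  # no certificate within the iteration budget


f = [2.0, -1.0, 0.5]
g = [-0.5, -1.0, 1.0]
print(dual_decompose(f, g))  # matches the exact argmax of (f + g): [1, 0, 1]
```

When the two subproblems agree, the relaxed solution is provably optimal for the original constrained problem; when they do not, heuristic fallbacks are typically used, which is where the optimality guarantees mentioned above become the interesting part of the story.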
Committee: Michael Collins, Regina Barzilay, Tommi Jaakkola