T 4-6, 34-303
Prof. Robert Berwick, NE43-838, x8918 and Visiting Assistant Professor Dekang Lin
3-0-9
Prerequisites: Either previous experience with programming languages and compiler theory or previous experience with syntactic theory at the graduate level (e.g., 24.951). Permission of instructor required.
Enrollment limited to 15.
This course covers the design and use of "principle-based parsers" for computation in natural language, implementing linguistic theories, and developing new computational models that can efficiently process large corpora. In particular, we implement the linguistic theory for universal grammar, develop parsing and compilation methods for universal grammar, and build a high-level research tool for modern linguistic theory. So far, with just 32 principles and 12 parameters, such systems have been able to reproduce many of the differences, some subtle, some not so subtle, between English, Japanese, Dutch, German, French, Spanish, Bangla, Hindi, Korean, and Portuguese (ranging over thousands of sentence types).
How many more languages can be covered this way? How much more of modern linguistic theory can be implemented? Can we retain computational efficiency without compromising linguistic theory or representation? How easy is it to modify and debug linguistic theory?
In this special topics course we explore these questions, at the same time teaching computer scientists how to do the linguistics and teaching linguistic and cognitive scientists how to do the computer work to extend such systems. The course is project-oriented.
Student teams can extend the two implemented systems to cover (portions of) new languages or syntactic phenomena including, but not limited to, Dutch; German (and dialects); Danish; Yiddish; English; Swedish; Italian; Portuguese (Brazilian and other dialects); Arabic (including Lebanese and Palestinian dialects); Russian; Polish and other Slavic languages; Bangla; Hindi; Japanese; Mandarin Chinese; Welsh; Scots Gaelic; and Hungarian. Alternatively, for the computationally oriented, projects may include computational comparisons between alternative algorithms for parsing, including parallel computational designs. Cognitive scientists can formulate projects regarding the psycholinguistic implications of the systems.
The first few weeks of the course will introduce the two implemented systems that we will use in the lab, one based on Prolog, the other implemented in C. No prior knowledge of Prolog is assumed. Next, the computational and linguistic design of the two systems will be described, via the current working implementation as it relates to English and Japanese. (Supplementary instruction will be available as needed to introduce nonlinguists to current linguistic theory.) Following this introduction, students will gain familiarity with the use/debugging facilities of the two systems. Hands-on computer laboratory experience will be used to understand the linguistic theory as it is implemented and to gain experience with the systems, leading to project design and implementation.
An introductory organizational meeting will be held Friday, September 8, 4pm, in NE43-773.
In addition to the lectures, laboratory and tutorial time will be scheduled.
Created: Aug 21, 1995 | Modified: Aug 25, 1995
Your comments and inquiries are welcome.