Barzilay & team devise new software to tackle ancient language decipherment

June 30, 2010

Regina Barzilay, EECS associate professor and principal investigator in the Computer Science and Artificial Intelligence Lab (CSAIL); Ben Snyder, a grad student in her lab; and the University of Southern California’s Kevin Knight will present a paper at the 2010 meeting of the Association for Computational Linguistics in Sweden on a new computer system that, in a matter of hours, deciphered much of the ancient Semitic language Ugaritic. In addition to helping archeologists decipher the eight or so ancient languages that have so far resisted their efforts, the work could also help expand the number of languages that automated translation systems like Google Translate can handle.

As reported by the MIT News Office, June 30, 2010, the new software makes several assumptions. The first is that the language being deciphered is closely related to some other language: In the case of Ugaritic, the researchers chose Hebrew. The next is that there’s a systematic way to map the alphabet of one language onto the alphabet of the other, and that correlated symbols will occur with similar frequencies in the two languages.

The system makes a similar assumption at the level of the word: The languages should have at least some cognates, or words with shared roots, like main and mano in French and Spanish, or homme and hombre. And finally, the system assumes a similar mapping for parts of words.
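To make the symbol-frequency assumption concrete, here is a minimal Python sketch. It is not the researchers' statistical model; it is only a naive illustration that pairs symbols by frequency rank. The function names and the toy strings are invented for this example.

```python
from collections import Counter


def rank_by_frequency(text: str) -> list[str]:
    """Return the alphabetic symbols of `text`, most frequent first."""
    counts = Counter(ch for ch in text if ch.isalpha())
    # Sort by descending count; break ties by symbol so the result is stable.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [symbol for symbol, _ in ordered]


def guess_alphabet_mapping(unknown_text: str, known_text: str) -> dict[str, str]:
    """Pair symbols that hold the same frequency rank in the two texts.

    Hypothetical helper: a crude stand-in for the idea that correlated
    symbols occur with similar frequencies in related languages.
    """
    unknown_ranked = rank_by_frequency(unknown_text)
    known_ranked = rank_by_frequency(known_text)
    # zip stops at the shorter alphabet; leftover symbols stay unmapped.
    return dict(zip(unknown_ranked, known_ranked))


# Toy data (invented): the "unknown" text is the known text with each
# letter replaced by a different symbol.
mapping = guess_alphabet_mapping("xxxxyyyz", "aaaabbbc")
print(mapping)
```

On real texts a rank-only pairing like this would make many mistakes, which is presumably why the system combines the alphabet-level assumption with the word-level and word-part assumptions described above.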

Recognizing the limits of the software, Barzilay noted that it “is a powerful tool that can aid the human decipherment process.” A variation of it could also help expand the versatility of translation software.

Read more:

MIT News Office, June 30, 2010, reported by Larry Hardesty: Computer automatically deciphers ancient language

"A Statistical Model for Lost Language Decipherment" (pdf)