MIT Department of Electrical Engineering & Computer Science

E E C S

Information Retrieval and Linear Algebra

Ravi Kannan
Yale University

Thursday, September 30, 1999
4:15 PM (refreshments 4:00)
Room NE43-518
Theory of Computation Seminar

Abstract

We are concerned with rendering large document collections into human digestible form. Clustering has been successfully used as a tool for this. Most clustering algorithms are of a combinatorial flavor.

The talk will describe a different idea which relates clustering (in this and other contexts) to a Linear Algebra problem - the Singular Value Decomposition (SVD). Traditional SVD algorithms are too time consuming to be used here.

We describe a new randomized Monte Carlo algorithm for SVD where by doing Linear Algebra on a random sub-matrix of a matrix, one can infer an approximate SVD of the whole matrix.

We have implemented a system based on this to classify the documents returned by a key word(s) search by standard search engines. The system is fast enough to be used as a web search tool. Some thoughts on using the randomized Linear Algebra algorithm for Image Compression and other applications will also be described.

Joint Work with : P. Drineas, A. Frieze, S. Vempala, V. Vinay.


URL of this page: http://www-eecs.mit.edu/AY99-00/events/4.html
Created: Sep 24, 1999  | Modified: Sep 24, 1999
This event is from the MIT EECS 1999-00 archive.  | Current events
To MIT EECS home page  | Your comments and inquiries are welcome.