Abstract: Modern technological developments have enabled the acquisition and storage of increasingly large-scale, high-resolution, and high-dimensional data in many fields. Yet in domains such as biomedicine, the complexity of these datasets and the unavailability of ground truth pose significant challenges for data analysis and modeling. In this talk, I present new unsupervised geometric approaches for extracting structure from large-scale high-dimensional data. By looking deep within the spectrum of the graph Laplacian, we define a new robust measure, the Spectral Embedding Norm, to separate clusters from background, and demonstrate its application to both outlier detection and data visualization. This measure further motivates a new greedy approach based on Local Spectral Viewpoints for identifying overlapping high-dimensional clusters while disregarding noisy clutter. We demonstrate our approach on two-photon calcium imaging data, successfully extracting hundreds of individual neurons. Finally, to address the computational complexity of constructing graphs from large-scale data, we present a new randomized near-neighbor graph construction. Compared to the traditional k-nearest-neighbor graph, spectral clustering with our near-neighbor graph on datasets of a few million points is two orders of magnitude faster and more memory-efficient, while achieving similar clustering accuracy.
Bio: Gal Mishne is a Gibbs Assistant Professor in the Applied Mathematics program at Yale University, working with Ronald Coifman. She received her Ph.D. in Electrical Engineering in 2017 from the Technion, under the supervision of Israel Cohen. She holds B.Sc. degrees (summa cum laude) in Electrical Engineering and Physics from the Technion, and upon graduation worked as an image processing engineer for several years. Gal is a 2017 Rising Star in EECS and Emerging Scholar in Science.
Host: Elfar Adalsteinsson