Doctoral Thesis: Efficient Deep Learning with Sparsity: Algorithms, Systems, and Applications

Tuesday, April 30
2:00 pm - 4:00 pm

Grier Room B (34-401B)

By: Zhijian Liu

Supervisor: Song Han

Details

  • Date: Tuesday, April 30
  • Time: 2:00 pm - 4:00 pm
  • Category:
  • Location: Grier Room B (34-401B)
Additional Location Details:
Abstract:

Deep learning is used across a broad spectrum of applications. However, behind its remarkable performance lies an increasing gap between the demand for and supply of computation. On the demand side, the computational costs of deep learning models have surged dramatically, driven by ever-larger input and model sizes. On the supply side, as Moore’s Law slows down, hardware no longer delivers increasing performance within the same power budget.

In this thesis, I will discuss my research efforts to bridge this demand-supply gap through the lens of sparsity. I will begin with my research on input sparsity. First, I will introduce algorithms that systematically eliminate the least important patches/tokens from dense input data, such as images, enabling up to 60% sparsity without any loss in accuracy. Then, I will present the system library that we have developed to effectively translate the theoretical savings from sparsity to practical speedups on hardware. Our system is up to 3 times faster than the leading industry solution from NVIDIA. Following this, I will touch on my research on model sparsity, highlighting a family of automated, hardware-aware model compression frameworks that surpass manual solutions in accuracy and reduce the design cycle from weeks of human efforts to mere hours of GPU computation. Finally, I will demonstrate the use of sparsity to accelerate a wide range of computation-intensive AI applications, such as autonomous driving, language modeling, and high-energy physics.

Bio:

Zhijian Liu is a Ph.D. candidate at MIT, advised by Song Han. His research focuses on efficient machine learning and systems. He has developed efficient ML algorithms and provided them with effective system support. He has also contributed to accelerating computation-intensive AI applications in computer vision, natural language processing, and scientific discovery. His work has been featured as oral and spotlight presentations at conferences such as NeurIPS, ICLR, and CVPR. He was selected as the recipient of the Qualcomm Innovation Fellowship and the NVIDIA Graduate Fellowship. He was also recognized as a Rising Star in ML and Systems by MLCommons and a Rising Star in Data Science by UChicago and UCSD.

Host