Abstract: The field of computer vision is arguably seeing one of its most transformative changes in recent history. Convolutional neural networks (CNNs) have revolutionized the field, reaching super-human performance on some long-standing computer vision tasks, such as image classification. The success of these networks is fueled by massive amounts of human-labeled data. However, this paradigm does not scale to a deeper and more detailed understanding of images, as it is simply too hard to collect enough human-labeled data. The issue is not that we humans fail to understand the images; rather, we often struggle to convey enough information to successfully supervise a vision system.
In this talk I show how computer vision can go beyond massive human supervision. This involves designing better models that cope with fewer labels, exploiting easier and more intuitive annotations, and developing novel optimizations to train deep architectures with far fewer human annotations, or even none at all. I'll focus on three long-standing computer vision problems: semantic segmentation, intrinsic image decomposition, and dense semantic correspondences.
Bio: Philipp Krähenbühl is a postdoctoral researcher at UC Berkeley. He received a B.S. in Computer Science from ETH Zurich in 2009 and a PhD in Computer Science from Stanford University in 2014. Philipp's research interests lie in computer vision, machine learning, and computer graphics. He is particularly interested in deep learning, efficient optimization techniques, and structured output prediction.