“Small computer vision by combining optics and computation”
Wednesday, April 14 at 1:00pm
Email email@example.com for Zoom info
Computational visual sensing combines efficient computation and novel optics to provide computer vision systems that minimize size, weight, and power while maximizing speed, accuracy, and functionality. In these systems, signals are processed both optically and electronically, using the strengths of each to maximally exploit the visual characteristics of an environment. Computational visual sensors are often inspired by nature, where invertebrates and other small animals have evolved optics and neural wiring that synergize to perceive with extremely small size and power.
In this talk, I will present a class of small computational sensors for passively measuring depth. They are inspired by the eyes of jumping spiders, which have specialized, multi-layered retinae and sense depth using optical defocus. In developing these sensors, we make three main contributions. First, we propose a novel, physics-based depth cue that leverages differential changes of image defocus. The depth cue takes the form of a simple analytical expression and is robust to optical non-idealities. Second, we incorporate the physics-based depth cue into the design of a neural network, which yields an efficient computational graph that predicts per-pixel depth and confidence at below 700 floating-point operations (FLOPs) per output pixel. Third, we design two optical setups to pair with the computation. One setup consists of a deformable lens and has an extended working range due to its ability to perform optical accommodation. The other uses a specially designed metasurface, an ultra-thin, flat nanophotonic device with versatile wavefront-shaping capability; it is one of the world's first nanophotonic depth sensors. Both setups are monocular and compact, and both can measure scene depth at 100 frames per second. Our computational depth sensors are one example of how optics and computation will evolve in the future to achieve artificial visual sensing on tiny platforms where vision is currently impossible.
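The abstract does not spell out the analytical depth cue, but the underlying physics of differential defocus is standard: Gaussian blur obeys a diffusion equation, so the intensity change between two images taken at slightly different focus settings is approximately proportional to the image Laplacian times the change in blur variance. The sketch below illustrates only that general relationship on synthetic data; it is an assumption-laden toy, not the speaker's actual sensor pipeline, and the synthetic texture, blur levels, and thresholds are all invented for illustration.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur using a truncated 1-D kernel."""
    r = int(4 * sigma) + 1
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    tmp = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, img)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, tmp)

def laplacian(img):
    """5-point finite-difference Laplacian (unit grid spacing)."""
    return (np.roll(img, 1, 0) + np.roll(img, -1, 0)
            + np.roll(img, 1, 1) + np.roll(img, -1, 1) - 4 * img)

# Synthetic band-limited texture standing in for a scene.
rng = np.random.default_rng(0)
tex = gaussian_blur(rng.standard_normal((128, 128)), 1.0)

# Two observations at slightly different defocus levels.
s1, s2 = 2.0, 2.2
I1, I2 = gaussian_blur(tex, s1), gaussian_blur(tex, s2)

# Diffusion relation: I2 - I1 ~ 0.5 * Laplacian(I) * (s2^2 - s1^2),
# so the blur-variance change can be read off as a per-pixel ratio.
lap = laplacian(0.5 * (I1 + I2))
c = 20  # crop borders to avoid convolution edge effects
diff, lapc = (I2 - I1)[c:-c, c:-c], lap[c:-c, c:-c]

# Only well-textured pixels (large |Laplacian|) give a reliable ratio.
mask = np.abs(lapc) > np.percentile(np.abs(lapc), 70)
est = np.median(2 * diff[mask] / lapc[mask])

print(f"recovered blur-variance change: {est:.3f}")
print(f"true blur-variance change:      {s2**2 - s1**2:.3f}")
```

In a real sensor, the recovered blur-variance change would then be mapped to metric depth through a calibrated lens model, and the confidence masking would be learned rather than a fixed percentile; both of those steps are omitted here.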
Qi Guo is completing his Ph.D. in electrical engineering at Harvard University. His research combines emerging optical technologies with algorithms for visual sensing that are both derived from physics and driven by data. He received his bachelor's degree in automation from Tsinghua University, China, and has interned at Nvidia and Facebook Reality Labs. He co-authored the Best Student Paper at the European Conference on Computer Vision (ECCV) in 2016 and the Best Demo at the International Conference on Computational Photography (ICCP) in 2018.