Plenary Speakers

On Generating Image and Video Hallucinations
Sabine Süsstrunk (EPFL)

“Hallucination” is a term used in the AI community to describe the plausible falsehoods produced by deep generative neural networks. It is often considered a negative, especially in relation to large language models or medical image reconstruction. Yet, in many computational photography applications, we rely on such hallucinations to create pleasing images. It often does not matter whether all (or any) of the information was present in the real world, as long as the produced falsehoods are visually plausible. Starting from that premise, I will present our recent work on hallucinations in image reconstruction, semantic image editing, and novel view synthesis, using different generative models such as diffusion networks, neural radiance fields, and neural cellular automata. With a nod to the dangers some of these hallucinations might pose, I will also discuss our work on deep fake detection.

Bio: Sabine Süsstrunk is Full Professor and Director of the Image and Visual Representation Lab in the School of Computer and Communication Sciences (IC) at the École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. Her main research areas are computational photography and imaging, color computer vision, and computational image quality and aesthetics. She received the IS&T/SPIE 2013 Electronic Imaging Scientist of the Year Award for her contributions to color imaging, computational photography, and image quality, and the 2018 IS&T Raymond C. Bowman Award and the 2020 EPFL AGEPoly IC Polysphere Award for excellence in teaching. Sabine is a Fellow of IEEE and IS&T, and President of the Swiss Science Council (SSC).

Embodied Foundation Models
Vincent Vanhoucke

Google

Large multimodal models increasingly demonstrate state-of-the-art performance on the most exacting computer vision benchmarks. Thanks to their ability to provide learned priors across diverse competencies such as language, logical reasoning, geometry, and visual semantics, they are also increasingly becoming a foundation for many other capabilities, ranging from language grounding to common-sense understanding, planning, and robot control. I’ll discuss our experience leveraging large multimodal models for embodied applications and how this may impact the direction of computer vision, robotics, and embodied AI at large.

Bio: Vincent Vanhoucke is a Distinguished Scientist and Senior Director of Robotics at Google. His research has spanned many areas of artificial intelligence and machine learning, from speech recognition to deep learning, computer vision, and robotics. His Udacity lecture series has introduced over 100,000 students to Deep Learning. He is President of the Robot Learning Foundation, which organizes the Conference on Robot Learning, now in its seventh year. He holds a doctorate from Stanford University and a diplôme d’ingénieur from the École Centrale Paris.

Imaging the Universe: Reflections on the Webb Telescope
Scott Acton

Ball Aerospace, Boulder, Colorado, USA

The James Webb Space Telescope (Webb) is currently operating ~1 million miles from Earth, at the second Lagrangian point. Webb’s infrared imagers are designed to look back almost 13.5 billion years, into the early universe. Since the first observations were released in the summer of 2022, Webb has continued to produce visually stunning images. The analysis of these images is changing humanity’s understanding of the Universe and rewriting textbooks. This talk will present an overview of the Webb telescope and the imagery it has obtained over the past year. As a deployable segmented telescope, Webb had to be aligned after launch via an image-analysis-based “Wavefront Sensing and Controls” (WFSC) process. The talk will also highlight how, starting from positioning uncertainties on the order of a few millimeters, the alignment process correctly positioned Webb’s optics to within a handful of nanometers.

Bio: Scott Acton was the Wavefront Sensing and Controls Scientist for JWST and is currently a staff consultant at Ball Aerospace and Technologies Corp. in Boulder, CO, where he has worked for the past 20 years. Previously, Acton worked in the field of adaptive optics for the W. M. Keck Observatory and for the Lockheed Missiles and Space Co. Acton studied physics at Abilene Christian University, earned a PhD in Physics from Texas Tech, and served as a postdoc at the Kiepenheuer-Institut für Sonnenphysik in Germany. In 2016, Acton took a year off from his job to execute the “James Webb Space Telescope World Bicycle Tour.” He currently resides in Niwot, Colorado.