“Learning to Understand Our Multimodal World with Minimal Supervision,” a Keynote Presentation from Yong Jae Lee

Yong Jae Lee, Associate Professor in the Department of Computer Sciences at the University of Wisconsin-Madison and CEO of GivernyAI, presents the “Learning to Understand Our Multimodal World with Minimal Supervision” tutorial at the May 2024 Embedded Vision Summit.

The field of computer vision is undergoing another profound change. Recently, “generalist” models have emerged that can solve a variety of visual perception tasks. Also known as foundation models, they are trained on huge amounts of internet-scale unlabeled or weakly labeled data and can adapt to new tasks either without any additional supervision or with just a small number of manually labeled samples. Moreover, some are multimodal: they understand both language and images and can support other perceptual modes as well.

In this talk, Lee presents recent groundbreaking research on creating intelligent systems that can learn to understand our multimodal world with minimal human supervision. He focuses on systems that can understand images and text, and also touches upon those that utilize video, audio and LiDAR. Since training foundation models from scratch can be prohibitively expensive, he discusses how to efficiently repurpose existing foundation models for use in application-specific tasks.

Lee also discusses how these models can be used for image generation and, in turn, for detecting AI-generated images. He concludes by highlighting key remaining challenges and promising research directions. You will learn how emerging techniques will address today’s neural network training bottlenecks, facilitate new types of multimodal machine perception and enable countless new applications.

See here for a PDF of the slides.
