“Vision-language Representations for Robotics,” a Presentation from the University of Pennsylvania

Dinesh Jayaraman, Assistant Professor at the University of Pennsylvania, presents the “Vision-language Representations for Robotics” tutorial at the May 2023 Embedded Vision Summit.

In what format can an AI system best present what it “sees” in a visual scene to help robots accomplish tasks? This question has been a long-standing challenge for computer scientists and robotics engineers. In this presentation, Jayaraman provides insights into cutting-edge techniques being used to help robots better understand their surroundings, learn new skills with minimal guidance and become more capable of performing complex tasks.

Jayaraman discusses recent advances in unsupervised representation learning and explains how these approaches can be used to build visual representations suited to a controller that decides how the robot should act. In particular, he presents insights from his research group’s recent work on how to represent the constituent objects and entities in a visual scene, and how to combine vision and language so that language-based task descriptions can be translated effectively into images depicting the robot’s goals.
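
To make the last idea concrete, here is a minimal sketch, not the method from the talk, of one common vision-language building block: using a pretrained CLIP model to score candidate goal images against a language task description and select the best match. The function name pick_goal_image and the candidate-image list are illustrative assumptions, not from the presentation.

```python
# A minimal sketch (not the presenter's method) of vision-language matching:
# score candidate goal images against a language task description with CLIP,
# a common building block for turning instructions into visual goals.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def pick_goal_image(task_description: str,
                    candidate_images: list[Image.Image]) -> int:
    """Return the index of the candidate image best matching the task text."""
    inputs = processor(text=[task_description], images=candidate_images,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # logits_per_text has shape (1, num_images): similarity between the
    # text and each candidate image; the argmax is the best goal image.
    return int(outputs.logits_per_text.argmax(dim=-1).item())
```

In a goal-conditioned pipeline, the selected image could then serve as the visual goal fed to a downstream controller, standing in for the raw language instruction.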

A PDF of the slides is available.
