Steve Teig, Chief Technology Officer at Xperi, presents the "Beyond CNNs for Video: The Chicken vs. the Datacenter" tutorial at the May 2019 Embedded Vision Summit.
The recent revolution in computer vision derives much of its success from neural networks for image processing. These networks run predominantly in datacenters, where the training data consists mostly of photographs. Because of this history, the networks used for image processing fail to exploit temporal information. In fact, convolutional neural networks are unaware that time exists, which leads to overly complex networks with strange artifacts. Remarkably, even the lowly chicken knows better: it bobs its head while walking to integrate information over time as it models the world. Isn’t it time we learned from the chicken? In this presentation, Teig explores how we can.