Praveen Nayak, Tech Lead at PathPartner Technology, presents the "Using Deep Learning for Video Event Detection on a Compute Budget" tutorial at the May 2019 Embedded Vision Summit.
Convolutional neural networks (CNNs) have made tremendous strides in object detection and recognition in recent years. However, extending the CNN approach to video or volumetric data poses tough challenges, including trade-offs between representation quality and computational complexity, which are of particular concern on embedded platforms with tight computational budgets. This presentation explores the use of CNNs for video understanding.
Nayak reviews the evolution of deep representation learning methods involving spatio-temporal fusion, from C3D to Conv-LSTMs, for vision-based human activity detection. He proposes a decoupled alternative to this fusion, describing an approach that combines a low-complexity predictive temporal segment proposal model and a fine-grained (perhaps high-complexity) inference model. PathPartner Technology finds that this hybrid approach, in addition to reducing computational load with minimal loss of accuracy, enables effective solutions to these high-complexity inference tasks.
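The decoupled two-stage idea can be sketched in code. This is a hypothetical illustration, not PathPartner's actual implementation: all function names and the scoring scheme are assumptions. A cheap proposal stage scores fixed-size temporal segments, and the expensive fine-grained model runs only on segments that clear a threshold, so uneventful video costs almost nothing.

```python
# Hypothetical sketch of a decoupled event-detection pipeline (assumed
# design, for illustration only). Stage 1: a low-complexity model scores
# temporal segments; Stage 2: a heavy fine-grained model runs only on
# segments whose score clears a threshold.

def propose_segments(frame_scores, window=4, threshold=0.5):
    """Group consecutive frames into fixed-size segments and keep those
    whose mean activity score exceeds the threshold."""
    segments = []
    for start in range(0, len(frame_scores), window):
        chunk = frame_scores[start:start + window]
        if sum(chunk) / len(chunk) > threshold:
            segments.append((start, start + len(chunk)))
    return segments

def heavy_inference(segment):
    """Stand-in for the fine-grained classifier (e.g. a 3D CNN); here it
    just labels the segment so the control flow is visible."""
    return {"segment": segment, "label": "event"}

def detect_events(frame_scores):
    # Run the expensive model only on the proposed segments.
    return [heavy_inference(seg) for seg in propose_segments(frame_scores)]
```

For example, a 12-frame clip with high activity scores only in frames 4-7 yields a single proposal, so `heavy_inference` is invoked once instead of three times, which is the source of the compute savings the talk describes.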