Tom Michiels, System Architect for ARC Processors at Synopsys, presents the “How Transformers Are Changing the Nature of Deep Learning Models” tutorial at the May 2023 Embedded Vision Summit.
The neural network models used in embedded real-time applications are evolving quickly. Transformer networks are a deep learning approach that has become dominant for natural language processing and other applications involving time-dependent, sequential data. Now, transformer-based deep learning network architectures are also being applied to vision applications, achieving state-of-the-art results compared with CNN-based solutions.
In this presentation, Michiels introduces transformers and contrasts them with the CNNs commonly used for vision tasks today. He examines the key features of transformer model architectures and shows performance comparisons between transformers and CNNs. He concludes with insights on why his company thinks transformers will become increasingly important for visual perception tasks.
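The core mechanism that distinguishes transformers from CNNs is self-attention, which lets every token (or image patch) weigh every other token when computing its output. The following is a minimal NumPy sketch of scaled dot-product self-attention, offered only as background illustration; it is not drawn from the talk, and all names and dimensions are illustrative:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence of embeddings.

    x: (seq_len, d_model) input embeddings (e.g., image patches for vision)
    w_q, w_k, w_v: (d_model, d_k) learned projection matrices
    Returns: (seq_len, d_k) attended outputs.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Pairwise token affinities, scaled to stabilize the softmax
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Softmax over the sequence dimension: each row sums to 1
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output is a weighted mix of all value vectors in the sequence
    return weights @ v

# Illustrative dimensions only
rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8
x = rng.standard_normal((seq_len, d_model))
w_q, w_k, w_v = (rng.standard_normal((d_model, d_k)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (4, 8)
```

Unlike a convolution, whose receptive field is local and fixed, these attention weights are computed from the input itself and span the whole sequence, which is one reason transformers can capture long-range dependencies in both text and images.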
See here for a PDF of the slides.