Dwith Chenna, Member of the Technical Staff and Product Engineer for AI Inference at AMD, presents the “DNN Quantization: Theory to Practice” tutorial at the May 2024 Embedded Vision Summit.
Deep neural networks (DNNs), widely used in computer vision tasks, require substantial computation and memory resources, making it challenging to run these models on resource-constrained devices. Quantization involves modifying DNNs to use smaller data types (e.g., switching from 32-bit floating-point values to 8-bit integer values). It is an effective way to reduce the computation, memory bandwidth and memory footprint of these models, making it easier to run them on edge devices. However, quantization can degrade the accuracy of these models.
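To make the 32-bit-float to 8-bit-integer idea concrete, here is a minimal sketch of one common scheme, affine (asymmetric) per-tensor quantization, in plain Python. It is an illustration under stated assumptions, not the method presented in the talk: the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, and real toolchains add calibration, per-channel scales and other refinements.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float, int]:
    """Affine (asymmetric) quantization of floats to the int8 range.

    Returns the quantized integers plus the scale and zero point
    needed to map them back to approximate float values.
    Hypothetical helper for illustration only.
    """
    qmin, qmax = -128, 127
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    # Fall back to a scale of 1.0 for an all-zero input (hi == lo).
    scale = (hi - lo) / (qmax - qmin) or 1.0
    # The zero point is the integer that represents the float value 0.0.
    zero_point = round(qmin - lo / scale)
    quantized = [
        max(qmin, min(qmax, round(v / scale) + zero_point))
        for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized: List[int], scale: float, zero_point: int) -> List[float]:
    """Map int8 values back to approximate floats."""
    return [scale * (q - zero_point) for q in quantized]


if __name__ == "__main__":
    weights = [-1.0, -0.25, 0.0, 0.4, 1.5]  # made-up example values
    q, scale, zp = quantize_int8(weights)
    restored = dequantize(q, scale, zp)
    worst = max(abs(a - b) for a, b in zip(weights, restored))
    print("int8 values:", q)
    print("scale:", scale, "zero point:", zp)
    print("worst-case round-trip error:", worst)
```

Each stored value shrinks from 4 bytes to 1, which is where the footprint and bandwidth savings come from. The round-trip error printed at the end is a small-scale view of the accuracy loss that quantization introduces.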
In this talk, Chenna surveys practical techniques for DNN quantization and shares best practices, tools and recipes to enable you to get the best results from quantization, including ways to minimize accuracy loss.
See here for a PDF of the slides.