Raghuraman Krishnamoorthi, Software Engineer at Facebook, presents the “Practical DNN Quantization Techniques and Tools” tutorial at the September 2020 Embedded Vision Summit.
Quantization is a key technique to enable the efficient deployment of deep neural networks. In this talk, Krishnamoorthi presents an overview of techniques for quantizing convolutional neural networks for inference with integer weights and activations.
Krishnamoorthi explores simple and advanced quantization approaches and examines their effects on latency and accuracy across various target processors. He also presents best practices for quantization-aware training to obtain high accuracy with quantized weights and activations.
See here for a PDF of the slides.