Arnaud Collard, Technical Leader for Embedded AI at 7 Sensing Software, presents the “Cutting-edge Memory Optimization Method for Embedded AI Accelerators” tutorial at the May 2024 Embedded Vision Summit.
AI hardware accelerators are playing a growing role in enabling AI in embedded systems such as smart devices. In most cases, neural processing units (NPUs) need a dedicated, tightly coupled high-speed memory to run efficiently, and this memory has a major impact on performance, power consumption and cost. In this presentation, Collard dives deep into his company’s state-of-the-art memory optimization method, which significantly decreases the size of the required NPU memory. The method combines processing by stripes and processing by channels to strike the best compromise between memory footprint reduction and additional processing cost.
With this method, the original neural network is split into several pieces that are scheduled on the NPU. Collard shares results showing that the technique yields large memory footprint reductions with only moderate increases in processing time. He also presents his company’s proprietary ONNX-based tool, which automatically finds the optimal network configuration and schedules the resulting subnetworks for execution.
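The stripe idea can be sketched with a toy example. The code below is an illustrative sketch only (it is not 7 Sensing Software’s tool, and all function names are assumptions): a two-layer pipeline is run first with whole-tensor scheduling, where the entire intermediate feature map must fit in memory, and then stripe by stripe with a small halo, so only one stripe’s worth of intermediates is buffered at a time.

```python
def conv3_valid(x):
    """'Valid' 3-tap moving average: out[i] = mean(x[i:i+3]); a stand-in for a conv layer."""
    return [(x[i] + x[i + 1] + x[i + 2]) / 3.0 for i in range(len(x) - 2)]

def square(x):
    """Elementwise second layer (stand-in for an activation or pointwise op)."""
    return [v * v for v in x]

def run_full(signal):
    """Whole-tensor scheduling: the conv output for the entire signal
    (len(signal) values) lives in memory before the next layer runs."""
    padded = [0.0] + signal + [0.0]       # zero padding so output size == input size
    mid = conv3_valid(padded)
    return square(mid), len(mid)          # (result, peak intermediate buffer size)

def run_striped(signal, stripe=4):
    """Stripe scheduling: each stripe is carried through both layers before
    the next stripe starts, so only stripe + halo values are buffered."""
    padded = [0.0] + signal + [0.0]
    out, peak = [], 0
    for start in range(0, len(signal), stripe):
        stop = min(start + stripe, len(signal))
        window = padded[start:stop + 2]   # stripe plus a 1-sample halo per side
        peak = max(peak, len(window))     # extra compute at borders is the cost
        out.extend(square(conv3_valid(window)))
    return out, peak

signal = [float(i) for i in range(16)]
full_out, full_peak = run_full(signal)
stripe_out, stripe_peak = run_striped(signal, stripe=4)
assert stripe_out == full_out             # identical results, smaller buffer
print(full_peak, stripe_peak)             # peak intermediate sizes: 16 vs 6
```

The halo region is recomputed for every stripe, which is the “additional processing cost” the talk mentions; a real scheduler would tune the stripe size (and the channel split) to balance that recomputation against the NPU memory saved.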
See here for a PDF of the slides.