This blog post was originally published at Lattice Semiconductor’s website. It is reprinted here with the permission of Lattice Semiconductor.
As you are no doubt aware, the Embedded Vision Summit is the premier conference and expo devoted to practical, deployable computer vision and visual artificial intelligence (AI) and machine learning (ML). Not surprisingly, Lattice is a regular exhibitor and presenter at the summit, as it gives us a great opportunity to highlight our latest and greatest devices, tools, technologies, and solutions.
I was a speaker at the event this year, presenting on how flexibility in the AI/ML development flow is key to bringing products to market on time, within budget, and with the right value proposition. In particular, I focused on AI/ML pipelines targeted at resource-constrained devices deployed at the Edge, where the internet meets, and interfaces with, the real world.
Most Edge applications feature different types of sensors involving a variety of electrical interfaces and communications protocols. The data acquired from these sensors typically requires pre-processing and aggregation before being fed into the AI/ML engine for inferencing. In many cases, it is also necessary to perform sensor fusion: the process of combining sensor data from disparate sources, because the combined information has less uncertainty than if those sources were analyzed individually.
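For readers who like to see why fused data carries less uncertainty, a minimal sketch of the idea is inverse-variance weighting of two independent readings of the same quantity; the fused estimate always has a lower variance than either sensor alone. (This is a hypothetical Python illustration, not part of the sensAI stack.)

```python
import numpy as np

def fuse(readings, variances):
    """Inverse-variance weighted fusion of independent sensor readings.

    The fused estimate has variance 1 / sum(1/var_i), which is always
    smaller than the variance of any individual sensor.
    """
    readings = np.asarray(readings, dtype=float)
    variances = np.asarray(variances, dtype=float)
    weights = 1.0 / variances
    fused_value = np.sum(weights * readings) / np.sum(weights)
    fused_variance = 1.0 / np.sum(weights)
    return fused_value, fused_variance

# Example: a radar range estimate fused with a camera-derived estimate
value, var = fuse(readings=[10.2, 9.8], variances=[0.5, 0.2])
print(f"fused estimate: {value:.2f} m, variance: {var:.3f}")  # variance < 0.2
```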
Similarly, the neural network AI/ML inference engine itself needs to be flexible. New network architectures are constantly being introduced, and new types of operators are continually arriving on the scene. If a development team’s AI/ML engine isn’t flexible enough to take full advantage of these new developments, their competitors are going to “eat their lunch,” as the old saying goes.
Following inferencing, the results need to be presented to the target audience, which may be humans and/or other machines and systems. In all cases, there will be a need to post-process the inferencing results before presenting them to their intended audience(s).
The bottom line is that there’s a need for flexibility at every step in the development pipeline. This is why we created our Lattice sensAI™ solution stack.
The sensAI solution stack
This stack includes hardware platforms, IP cores, software tools, reference designs and demos, and custom design services. In the case of hardware platforms, developers can choose from a variety of FPGA families, including the Lattice iCE40™ UltraPlus (for ultra-small, ultra-low-power applications), MachXO3D™ (for platform management and security), CrossLink™-NX (for embedded vision applications), and ECP5™ (for general-purpose applications).
FPGAs provide programmable input/outputs (I/Os) that can be configured to support different electrical interface standards, thereby allowing them to interface with a wide variety of sensors. Lattice also offers a cornucopia of hard and soft IP blocks to support different communications protocols.
The next step is to perform preprocessing and data aggregation, for which the sensAI stack offers a suite of IP cores for tasks like cropping, resizing, and scaling. The programmable FPGA fabric allows these computationally intensive data processing tasks to be performed in a massively parallel fashion, thereby offering high performance while consuming little power.
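To make the crop/resize/scale step concrete, here is a hypothetical software equivalent using OpenCV; on a Lattice FPGA, the sensAI pre-processing IP cores perform these same operations in parallel in the programmable fabric.

```python
import cv2
import numpy as np

def preprocess(frame, roi, target_size=(224, 224)):
    """Crop a region of interest, resize it, and scale pixel values.

    A software analogy for the crop/resize/scale operations that the
    sensAI pre-processing IP cores perform in hardware.
    """
    x, y, w, h = roi
    cropped = frame[y:y + h, x:x + w]            # crop the region of interest
    resized = cv2.resize(cropped, target_size)   # resize to the model's input size
    scaled = resized.astype(np.float32) / 255.0  # scale pixel values to [0, 1]
    return scaled
```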
As was previously noted, the neural network itself requires a lot of flexibility to be able to support new network topologies and operators. The sensAI stack includes a suite of soft IP neural network cores and accelerators that can be modified quickly and easily to fully address the evolving AI/ML landscape.
As yet another example of flexibility, among the sensAI stack’s multiple implementation examples are two of particular interest. In one, the AI/ML inferencing engine is implemented in programmable fabric, thereby providing a low-power, high-performance solution suitable for high-speed image processing applications. In the other, the neural network inferencing engine is implemented on a RISC-V processor core, thereby providing an ideal solution for AI/ML tasks that can run quietly in the background, such as predictive maintenance applications.
Introducing sensAI Studio
One of the really “hot news” items Lattice is announcing at the Embedded Vision Summit is that the latest and greatest version of the stack, sensAI 4.0, includes support for the new Lattice sensAI Studio design environment, which facilitates end-to-end AI/ML model training, validation, and compilation.
New to the sensAI solution stack is Lattice sensAI Studio, a GUI-based tool for training, validating, and compiling ML models optimized for Lattice FPGAs. The tool makes it easy to take advantage of transfer learning to deploy ML models.
This web-based framework, which can be hosted in the cloud or on the developers’ own servers, supports multiple simultaneous users working on the same or different projects. sensAI Studio provides an easy-to-use GUI-based environment that allows users to select the target FPGA, configure I/Os, drag and drop AI/ML model IP and pre- and post-processing IP, and connect everything together. This new version of sensAI also supports the Lattice Propel design environment for accelerating embedded RISC-V processor-based development.
In addition to TensorFlow AI/ML models, sensAI 4.0 includes support for TensorFlow Lite to reduce power consumption and increase data co-processing performance in AI/ML inferencing applications (TensorFlow Lite runs anywhere from 2 to 10 times faster on a Lattice FPGA than it does on an ARM® Cortex®-M4-based MCU). Furthermore, by leveraging advances in ML model compression and pruning, sensAI 4.0 can support image processing at 60 FPS with QVGA resolution or 30 FPS with VGA resolution.
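For context, the standard TensorFlow Lite flow for shrinking a trained model with post-training quantization looks something like the sketch below. Note that this is generic TensorFlow code, not the Lattice-specific compilation step that sensAI Studio performs.

```python
import tensorflow as tf

# Assume `model` is a trained tf.keras model; MobileNetV2 is used here
# purely as a stand-in example.
model = tf.keras.applications.MobileNetV2(weights="imagenet")

# Convert to TensorFlow Lite with post-training quantization to shrink
# the model and speed up inferencing on resource-constrained targets.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```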
sensAI Studio also supports the latest and greatest AI/ML design techniques, such as transfer learning, in which a model developed for one task is reused for a related task, or provides the starting point for a model that will perform a new task. Transfer learning is also of interest for taking an AI/ML model you’ve already trained on one processor, like a microcontroller, and transferring that model to something smaller, with higher performance and lower power consumption, like a Lattice FPGA.
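As a generic illustration of transfer learning (independent of any Lattice tooling), the following Keras sketch reuses an ImageNet-trained backbone and trains only a small new classification head for a hypothetical two-class task.

```python
import tensorflow as tf

# Reuse an ImageNet-trained feature extractor and train only a new head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the pretrained backbone

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation="softmax"),  # new task: 2 classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(new_task_dataset, epochs=5)  # train only the new head
```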
Sreepada Hegade
Senior Manager for ML Software and Solutions, Lattice Semiconductor