This blog post was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm.
Qualcomm Technologies sets new standards in AI performance with its latest mobile, automotive and Qualcomm AI Hub advancements
Our annual Snapdragon Summit wrapped up with exciting new announcements centered on the future of on-device artificial intelligence (AI). With decades of experience in research and product development, Qualcomm Technologies continues to set the standard for innovative system solutions.
While much of the attention was on new platforms such as the Snapdragon 8 Elite and Snapdragon Ride Elite, the undercurrent of all the news was how that extra power would go toward enabling AI, and how AI would improve our lives.
Over the course of our three-day event, we welcomed technology media, analysts and industry-leading partners from around the globe to showcase our latest in mobile, automotive and Qualcomm AI Hub advancements. But we wanted to make this experience as accessible as possible, so here’s a glimpse of what we are bringing to the table in the ever-changing world of on-device AI.
Welcoming Snapdragon 8 Elite
With an exciting kick-off to Snapdragon Summit, Alex Katouzian, group general manager of mobile, compute and XR for Qualcomm Technologies, introduced our new Snapdragon 8 Elite Mobile Platform, designed to deliver world-class performance and experiences. Its newly architected Qualcomm Hexagon NPU (neural processing unit) brings significant advancements in on-device AI and support for multi-modality, delivering 45% faster AI performance and 45% better power efficiency compared to the Snapdragon 8 Gen 3.
For the first time, Katouzian demonstrated an on-device multi-modal AI assistant that doesn’t just listen to you but also sees what you see, creating a more intuitive and immersive experience. This sample application was showcased with a live demonstration in which individuals can interact with objects on the camera live preview. The demo also showed someone pointing their camera at a receipt and asking the AI assistant to calculate a tip while splitting the bill between everyone at the table, saving time and hassle.
Going even deeper
Continuing the momentum of the Snapdragon 8 Elite announcement was Siddhika Nevrekar, senior director of product management for Qualcomm Technologies, who took a deep dive into how the mobile platform works under the hood and maintains its leadership in on-device multi-modal generative AI. With our new and improved Qualcomm AI Engine, we're bringing advanced applications and multimodal AI assistants directly into the palm of your hand.
By introducing the Qualcomm Oryon CPU into our AI Engine, we've added multi-tasking capabilities that let the other compute cores focus on their respective AI tasks. The Qualcomm Oryon CPU is great at processing latency-critical, first-inference tasks like your productivity apps, but it also plays a role in initializing and distributing AI workloads to the other compute cores, which is extremely important for on-device AI inferencing.
A supercharged NPU
We also made serious improvements to the Hexagon NPU, the heart of the Qualcomm AI Engine:
- First, we’ve added higher throughput across all accelerator cores, resulting in faster AI inference performance.
- We’ve also added more cores to our scalar and vector accelerators to support the growing demand for generative AI, particularly for Large Language Models (LLM) and Large Vision Models (LVM).
With all this AI performance uplift, we’re seeing up to 100% faster token-rate performance on foundational LLMs, reaching up to 70 tokens per second on specific models.
The AI assistant example that Katouzian demonstrated is powered by a mix of advanced AI models:
- Automatic Speech Recognition (ASR),
- LLMs,
- LVMs and
- the new Large Multimodal Models (LMMs).
Because these models all run on different cores of our AI Engine, the assistant can understand your voice, interpret visual inputs and respond instantly. The Snapdragon 8 Elite also supports a longer input window, measured in the number of tokens you can feed the model.
Acting like building blocks of information, these tokens can be in the form of text, voice or even photos or videos. Providing the AI assistant with more tokens gives it more context, which allows it to perform more complex reasoning.
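To make the input-window idea concrete, here is a minimal sketch (not Qualcomm code) of how a token budget decides how much conversation history an assistant can keep. The whitespace tokenizer and the sample conversation are purely illustrative assumptions; real models use subword tokenizers and multimodal encoders.

```python
def count_tokens(text: str) -> int:
    """Naive whitespace tokenizer, a stand-in for a real subword tokenizer."""
    return len(text.split())

def fits_in_context(turns: list[str], context_window: int) -> bool:
    """Check whether the whole conversation still fits in the input window."""
    return sum(count_tokens(t) for t in turns) <= context_window

conversation = [
    "What is this icon on my dashboard?",
    "It looks like a tire-pressure warning.",
    "How do I reset it after inflating the tires?",
]

# A small window forces the assistant to drop earlier turns (less context);
# a larger window keeps the full history available for reasoning.
print(fits_in_context(conversation, context_window=10))   # False
print(fits_in_context(conversation, context_window=100))  # True
```

The same budgeting applies when images or audio are encoded into tokens: richer inputs consume more of the window, which is why a larger window enables multi-modal use cases.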
Our improvements wouldn’t be complete without advancements to the Qualcomm Sensing Hub. As the gateway to a personalized AI assistant, it’s now 60% faster in AI performance and has 34% more memory for better performance and efficiency.
Snapdragon Cockpit Elite’s dedicated AI Engine
Nakul Duggal, group general manager, automotive, industrial and cloud for Qualcomm Technologies, started off strong on Day 2 by announcing the Snapdragon Cockpit Elite for automotive platforms. Designed for power efficiency and performance, this platform is dedicated to transformer acceleration and end-to-end (E2E) network architectures that support large foundational models.
Our latest NPU, integrated in the Snapdragon Cockpit Elite system on a chip (SoC), is a dedicated AI engine designed to deliver up to a 12x improvement over our previous flagship cockpit SoC. This powerful multimodal AI engine is designed to support applications built on large language models with billions of parameters, like Llama, Gemini, Phi3, Bloom and more.
Utilizing one of these foundational models, use cases such as vehicle preventative maintenance can be handled with retrieval-augmented generation (RAG): the vehicle manual serves as a knowledge source, and the AI assistant retrieves the relevant passage to answer a question about an unknown icon on the screen.
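A minimal sketch (not Qualcomm code) of the retrieval half of that RAG flow: the manual is split into passages, the best match for the driver's question is retrieved, and that passage is placed into the LLM prompt as grounding. The manual excerpts are invented, and word-overlap scoring stands in for the embedding-based retrieval a real system would use.

```python
MANUAL_PASSAGES = [  # hypothetical excerpts from a vehicle manual
    "The horseshoe-shaped icon with an exclamation mark indicates low tire pressure.",
    "The oil-can icon indicates low engine oil pressure; stop the vehicle safely.",
    "The battery icon indicates a charging-system fault.",
]

def retrieve(question: str, passages: list[str]) -> str:
    """Return the passage with the greatest word overlap with the question."""
    q_words = set(question.lower().split())
    return max(passages, key=lambda p: len(q_words & set(p.lower().split())))

def build_prompt(question: str) -> str:
    """Ground the LLM's answer in the retrieved manual passage."""
    context = retrieve(question, MANUAL_PASSAGES)
    return f"Manual excerpt: {context}\nDriver's question: {question}\nAnswer:"

prompt = build_prompt("What does the tire pressure icon mean?")
# The prompt now contains the tire-pressure passage; the on-device LLM
# would complete it to answer the driver's question.
```

Because the knowledge lives in the retrieved passages rather than in the model's weights, the assistant can answer from a manual it was never trained on, which is what makes RAG a good fit for per-vehicle documentation.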
New models and service collaborators on the Qualcomm AI Hub
Lastly, Durga Malladi, senior vice president and general manager of technology planning and edge solutions for Qualcomm Technologies, welcomed advanced models from our collaborators, along with new community collaborators, to the Qualcomm AI Hub. Qualcomm AI Hub is excited to share many new models, specifically LLMs from our newly announced model-maker collaborations. These collaborations also open the door for many more models coming soon to Qualcomm AI Hub. We encourage you to download these models and contact us about them.
- Mistral AI: Qualcomm AI Hub worked with Mistral AI to feature two state-of-the-art models, Mistral v0.3 7B and Mistral 3B (optimized for Qualcomm Technologies). Developers can harness the power of these advanced models to create innovative and efficient AI applications directly on our platforms. Additionally, we’re collaborating to bring the latest Ministral 3B and Ministral 8B models to the Qualcomm AI Hub.
- Tech Mahindra: We’re excited to have worked with Tech Mahindra to optimize its Indus 1.1B model for Snapdragon 8 Elite. Indus-Q is a benchmark-breaking model designed specifically to support many languages and dialects within a highly performant LLM. This collaboration can empower developers to leverage Tech Mahindra’s cutting-edge AI capabilities and build sophisticated, industry-leading solutions optimized for our platforms. We are excited to see what developers will build with access to this unique model.
- Preferred Networks: We are pleased to have worked with Preferred Networks to support the availability of Preferred Networks’ PLaMo 1B model, optimized to run on-device on Snapdragon 8 Elite. This is the first small language model in the PLaMo Lite series, specifically designed for high-quality Japanese and English text generation. Developers will be able to seamlessly integrate this advanced AI technology into their projects, delivering unparalleled performance and efficiency on our platforms.
- IBM watsonx: We’re thrilled to have collaborated with IBM to optimize watsonx’s Granite-3B-Code-Instruct model for Snapdragon 8 Elite. This can enable developers to harness IBM’s renowned AI expertise and build intelligent, reliable applications tailored for our platforms. Additionally, we’re working on integrating the recently announced Granite 2B and 8B ‘workhorse’ small LLMs into our Qualcomm AI Hub.
- G42: JAIS 6.7B is a bilingual LLM for both Arabic and English developed by Inception, a G42 company in partnership with MBZUAI and Cerebras. This 6.7 billion parameter LLM is now on Qualcomm AI Hub. It’s trained on a dataset containing 141 billion Arabic tokens and 339 billion English/code tokens.
- Zhipu: We’re excited to announce a collaboration that brings several models to the hub: the GLM-4-9B, GLM4v-nano and GLM4v-mini models, optimized for running on-device on Snapdragon 8 Elite.
- Upstage AI: Qualcomm AI Hub developers will be able to try out the addition of Solar Mini and other models, optimized for Snapdragon 8 Elite.
- Meta: Introducing Llama 3.2 on Qualcomm AI Hub. Explore now or continue building with Llama 3.1, 3 and 2. The leading open-source AI model you can fine-tune and deploy anywhere is now available in more versions.
- Amazon SageMaker and Bring Your Own Data (BYOD): We are proud to announce a new collaboration that provides developers with the ability to train, fine-tune and deploy AI models specific to their use case, on device. With BYOD, developers can customize and fine-tune their model using Amazon SageMaker’s robust cloud-based service and then bring their model (BYOM) to Qualcomm AI Hub for compilation and optimization, ready to be deployed onto Qualcomm and Snapdragon platforms. This collaboration enables rapid model iteration and equips developers with powerful tools to customize their model for deployment across edge devices.
- Dataloop: Qualcomm AI Hub and Dataloop have joined forces to empower AI developers by streamlining the entire AI lifecycle. Using Dataloop, developers will be able to create an automated pipeline that includes data curation, labeling and model fine-tuning, all optimized for deploying models on-device on Snapdragon 8 Elite.
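The SageMaker BYOD/BYOM flow described above can be sketched as a two-stage pipeline: fine-tune in the cloud, then compile and optimize for an on-device target. Every function name below is an illustrative placeholder, not a real SageMaker or Qualcomm AI Hub API; only the shape of the pipeline comes from the announcement.

```python
def fine_tune_in_cloud(base_model: str, dataset: str) -> str:
    """Placeholder for a cloud fine-tuning job; returns a model artifact name."""
    return f"{base_model}-tuned-on-{dataset}"

def compile_for_device(model_artifact: str, target: str) -> dict:
    """Placeholder for hub-side compilation and optimization for a target device."""
    return {"model": model_artifact, "target": target, "status": "compiled"}

# 1) Bring Your Own Data: fine-tune a base model on domain data in the cloud.
artifact = fine_tune_in_cloud("my-base-llm", "support-tickets")

# 2) Bring Your Own Model: hand the tuned artifact over for on-device builds.
deployable = compile_for_device(artifact, "Snapdragon 8 Elite")
print(deployable["status"])  # compiled
```

Keeping the two stages decoupled is what enables the rapid iteration the collaboration describes: developers can re-run fine-tuning on new data and recompile for any supported device without changing the rest of the pipeline.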
Another thrilling announcement is that we’re making the Snapdragon 8 Elite available for developers on Qualcomm AI Hub via Qualcomm Device Cloud. This means developers can dive right in and start creating their incredible applications today.
There’s more to come in the world of on-device AI. Check back with the OnQ blog for the latest developments in this rapidly evolving area.
Ruby Hagin
Staff Marketing Communications Specialist, Qualcomm Technologies