LLMs and MLLMs
The past decade-plus has seen remarkable progress in practical computer vision. Thanks to deep learning, computer vision is dramatically more robust and accessible, and it has enabled compelling capabilities in thousands of applications, from automotive safety to healthcare. But today’s widely used deep learning techniques suffer from serious limitations. They often struggle when confronted with ambiguity (e.g., are those people fighting or dancing?) or with challenging imaging conditions (e.g., is that shadow in the fog a person or a shrub?). And for many product developers, computer vision remains out of reach due to the cost and complexity of obtaining the necessary training data, or due to a lack of the necessary technical skills.
Recent advances in large language models (LLMs) and their variants, such as vision language models (VLMs), which comprehend both images and text, hold the key to overcoming these challenges. VLMs are an example of multimodal large language models (MLLMs), which integrate multiple data modalities, such as language, images, audio, and video, to enable complex cross-modal understanding and generation tasks. MLLMs represent a significant evolution in AI, combining the capabilities of LLMs with multimodal processing to handle diverse inputs and outputs.
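To make the idea of cross-modal understanding concrete, here is a minimal sketch of visual question answering with a VLM, assuming the Hugging Face transformers library and the publicly available BLIP VQA model; the image URL is a placeholder.

```python
# Minimal VLM sketch: ask a question about an image using BLIP VQA.
# Assumes: pip install transformers pillow requests
import requests
from PIL import Image
from transformers import BlipProcessor, BlipForQuestionAnswering

processor = BlipProcessor.from_pretrained("Salesforce/blip-vqa-base")
model = BlipForQuestionAnswering.from_pretrained("Salesforce/blip-vqa-base")

# Placeholder URL: substitute any RGB image of interest.
image = Image.open(
    requests.get("https://example.com/scene.jpg", stream=True).raw
).convert("RGB")

# The kind of ambiguous question that trips up conventional classifiers.
inputs = processor(image, "Are these people fighting or dancing?", return_tensors="pt")
answer_ids = model.generate(**inputs)
print(processor.decode(answer_ids[0], skip_special_tokens=True))
```

Unlike a fixed-label classifier, the same model answers arbitrary free-form questions about the image, which is what makes VLMs attractive for handling ambiguity.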
The purpose of this portal is to facilitate awareness of, and education regarding, the challenges and opportunities in using LLMs, VLMs, and other types of MLLMs in practical applications — especially applications involving edge AI and machine perception. The content that follows (which is updated regularly) discusses these topics. As a starting point, we encourage you to watch the recording of the symposium “Your Next Computer Vision Model Might be an LLM: Generative AI and the Move From Large Language Models to Vision Language Models”, sponsored by the Edge AI and Vision Alliance. A preview video of the symposium introduction by Jeff Bier, Founder of the Alliance, follows:
If there are topics related to LLMs, VLMs or other types of MLLMs that you’d like to learn about and don’t find covered below, please email us at [email protected] and we’ll consider adding content on these topics in the future.
View all LLM and MLLM Content
What’s Next in On-device Generative AI?
This blog post was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm. Upcoming generative AI trends and Qualcomm Technologies’ role in enabling the next wave of innovation on-device. The generative artificial intelligence (AI) era has begun. Generative AI innovations continue at a rapid pace and are being woven into…
Mission NIMpossible: Decoding the Microservices That Accelerate Generative AI
This blog post was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA. Run NVIDIA NIM generative AI microservices locally on NVIDIA RTX AI workstations and NVIDIA GeForce RTX systems. Editor’s note: This post is part of the AI Decoded series, which demystifies AI by making the technology more accessible…
BrainChip Demonstration of the Power of Temporal Event-based Neural Networks (TENNs)
Todd Vierra, Vice President of Customer Engagement at BrainChip, demonstrates the company’s latest edge AI and vision technologies and products at the 2024 Embedded Vision Summit. Specifically, Vierra demonstrates the efficient processing of generative text using Temporal Event-based Neural Networks (TENNs) compared to ChatGPT. The TENN, an innovative, lightweight neural network architecture, combines convolution in…
AI Developer Workflows, Simplified: Empowering Developers with the Qualcomm AI Hub
This blog post was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm. With over 100 pre-optimized AI and generative AI models, the Qualcomm AI Hub is a developer’s gateway to superior on-device AI performance. Generative AI has been evolving to run on device, in addition to the cloud. It…
AMD to Acquire Silo AI to Expand Enterprise AI Solutions Globally
Europe’s largest private AI lab to accelerate the development and deployment of AMD-powered AI models and software solutions. Enhances open-source AI software capabilities for efficient training and inference on AMD compute platforms. SANTA CLARA, Calif. — July 10, 2024 — AMD (NASDAQ: AMD) today announced the signing of a definitive agreement to acquire Silo AI…
Decoding How the Generative AI Revolution BeGAN
This blog post was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA. NVIDIA Research’s GauGAN demo set the scene for a new wave of generative AI apps supercharging creative workflows. Editor’s note: This post is part of the AI Decoded series, which demystifies AI by making the technology more…
“Deploying Large Language Models on a Raspberry Pi,” a Presentation from Useful Sensors
Pete Warden, CEO of Useful Sensors, presents the “Deploying Large Language Models on a Raspberry Pi” tutorial at the May 2024 Embedded Vision Summit. In this presentation, Warden outlines the key steps required to implement a large language model (LLM) on a Raspberry Pi. He begins by outlining the motivations…
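As a rough illustration of what on-device LLM inference can look like (not necessarily the workflow Warden presents), a small quantized GGUF model can be run on a Raspberry Pi with the llama-cpp-python bindings; the model filename below is a hypothetical example.

```python
# Sketch: run a small quantized LLM on a Raspberry Pi via llama.cpp bindings.
# Assumes: pip install llama-cpp-python, plus a GGUF model file downloaded
# locally (the path below is hypothetical).
from llama_cpp import Llama

llm = Llama(
    model_path="models/tinyllama-1.1b-chat.Q4_K_M.gguf",  # hypothetical file
    n_ctx=2048,   # context window in tokens
    n_threads=4,  # one thread per Pi core
)

result = llm("Q: What is edge AI? A:", max_tokens=64, stop=["Q:"])
print(result["choices"][0]["text"].strip())
```

Aggressive quantization (e.g., 4-bit weights, as in the Q4_K_M file above) is what makes models of this size fit in a Pi’s memory at usable speeds.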
The Next Frontier in Education: How Generative AI and XR will Evolve the World of Learning in the Next Decade
This blog post was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm. (Ai)Daptive XR empowers students through real-time personalization and collaborative learning. Envisioning the future of education, and the art of learning overall, is nothing new. Over 120 years ago, French artist Jean-Marc Côté suggested how learning may look…
“Transforming Enterprise Intelligence: The Power of Computer Vision and Gen AI at the Edge with OpenVINO,” a Presentation from Intel
Leila Sabeti, Americas AI Technical Sales Lead at Intel, presents the “Transforming Enterprise Intelligence: The Power of Computer Vision and Gen AI at the Edge with OpenVINO” tutorial at the May 2024 Embedded Vision Summit. In this talk, Sabeti focuses on the transformative impact of AI at the edge, highlighting…
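For readers unfamiliar with OpenVINO’s generative AI path, one common approach (an assumption for illustration, not necessarily what Sabeti demonstrates) is exporting a Hugging Face model to OpenVINO through the Optimum Intel integration; the model choice below is illustrative.

```python
# Sketch: run a generative model on Intel hardware through OpenVINO,
# using the Optimum Intel integration.
# Assumes: pip install optimum[openvino]
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

model_id = "gpt2"  # illustrative; any supported causal LM works
model = OVModelForCausalLM.from_pretrained(model_id, export=True)  # convert to OpenVINO IR
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Computer vision at the edge", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```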
Axelera AI Raises $68 Million Series B Funding to Accelerate Next-generation Artificial Intelligence
News Highlights: Powering Global Innovation: Mass adoption of Axelera AI’s Metis™ AI Processing Unit (AIPU), the world’s most powerful AIPU for edge devices, drives next-gen AI inference solutions for computer vision and generative AI. Accelerating Market Expansion: Europe’s largest, oversubscribed Series B funding in fabless semiconductors propels Axelera AI into new markets, including automotive and…
“Challenges and Solutions of Moving Vision LLMs to the Edge,” a Presentation from Expedera
Costas Calamvokis, Distinguished Engineer at Expedera, presents the “Challenges and Solutions of Moving Vision LLMs to the Edge” tutorial at the May 2024 Embedded Vision Summit. OEMs, brands and cloud providers want to move LLMs to the edge, especially for vision applications. What are the benefits and challenges of doing…
“Optimized Vision Language Models for Intelligent Transportation System Applications,” a Presentation from Nota AI
Tae-Ho Kim, Co-founder and CTO of Nota AI, presents the “Optimized Vision Language Models for Intelligent Transportation System Applications” tutorial at the May 2024 Embedded Vision Summit. In the rapidly evolving landscape of intelligent transportation systems (ITSs), the demand for efficient and reliable solutions has never been greater. In this…
How Edge Devices Can Help Mitigate the Global Environmental Cost of Generative AI
This blog post was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm. Exploring the role of edge devices in reducing energy consumption and promoting sustainability in AI systems. The economic value of generative artificial intelligence (AI) to the world is immense. Research from McKinsey estimates that generative AI could add the…
Nota AI and Advantech Sign Strategic MOU to Pioneer On-Device GenAI Market
Nota AI and Advantech sign MOU for edge AI collaboration. Partnership focuses on generative AI at the edge. Joint marketing and sales activities planned to expand market share. SEOUL, South Korea, June 7, 2024 /PRNewswire/ — AI model optimization technology company Nota AI® (Nota Inc.) has signed a strategic Memorandum of Understanding (MOU) with global industrial AIoT…
Intel AI Platforms Accelerate Microsoft Phi-3 GenAI Models
Intel, in collaboration with Microsoft, enables support for several Phi-3 models across its data center platforms, AI PCs and edge solutions. What’s New: Intel has validated and optimized its AI product portfolio across client, edge and data center for several of Microsoft’s Phi-3 family of open models. The Phi-3 family of small, open models can run on…
“What’s Next in On-device Generative AI,” a Presentation from Qualcomm
Jilei Hou, Vice President of Engineering and Head of AI Research at Qualcomm Technologies, presents the “What’s Next in On-device Generative AI” tutorial at the May 2024 Embedded Vision Summit. The generative AI era has begun! Large multimodal models are bringing the power of language understanding to machine perception, and transformer models are expanding to…