Multimodal Large Language Models

LLMs and MLLMs

The past decade-plus has seen remarkable progress in practical computer vision. Thanks to deep learning, computer vision is dramatically more robust and accessible, and has enabled compelling capabilities in thousands of applications, from automotive safety to healthcare. But today’s widely used deep learning techniques suffer from serious limitations. They often struggle when confronted with ambiguity (e.g., are those people fighting or dancing?) or with challenging imaging conditions (e.g., is that shadow in the fog a person or a shrub?). And for many product developers, computer vision remains out of reach because of the cost and complexity of obtaining the necessary training data, or because of a lack of the necessary technical skills.

Recent advances in large language models (LLMs) and their variants, such as vision language models (VLMs, which comprehend both images and text), hold the key to overcoming these challenges. VLMs are an example of multimodal large language models (MLLMs), which integrate multiple data modalities such as language, images, audio, and video to enable complex cross-modal understanding and generation tasks. MLLMs represent a significant evolution in AI, combining the capabilities of LLMs with multimodal processing to handle diverse inputs and outputs.

The purpose of this portal is to build awareness and understanding of the challenges and opportunities in using LLMs, VLMs, and other types of MLLMs in practical applications, especially applications involving edge AI and machine perception. The content that follows (which is updated regularly) discusses these topics. As a starting point, we encourage you to watch the recording of the symposium “Your Next Computer Vision Model Might be an LLM: Generative AI and the Move From Large Language Models to Vision Language Models,” sponsored by the Edge AI and Vision Alliance. A preview video of the symposium introduction by Jeff Bier, Founder of the Alliance, is also available.

If there are topics related to LLMs, VLMs, or other types of MLLMs that you’d like to learn about and don’t find covered below, please email us at [email protected] and we’ll consider adding content on these topics in the future.

View all LLM and MLLM Content

What’s Next in On-device Generative AI?

This blog post was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm. Upcoming generative AI trends and Qualcomm Technologies’ role in enabling the next wave of innovation on-device. The generative artificial intelligence (AI) era has begun. Generative AI innovations continue at a rapid pace and are being woven into

BrainChip Demonstration of the Power of Temporal Event-based Neural Networks (TENNs)

Todd Vierra, Vice President of Customer Engagement at BrainChip, demonstrates the company’s latest edge AI and vision technologies and products at the 2024 Embedded Vision Summit. Specifically, Vierra demonstrates the efficient processing of generative text using Temporal Event-based Neural Networks (TENNs) compared to ChatGPT. The TENN, an innovative, lightweight neural network architecture, combines convolution in

AMD to Acquire Silo AI to Expand Enterprise AI Solutions Globally

Europe’s largest private AI lab to accelerate the development and deployment of AMD-powered AI models and software solutions Enhances open-source AI software capabilities for efficient training and inference on AMD compute platforms SANTA CLARA, Calif. — July 10, 2024 — AMD (NASDAQ: AMD) today announced the signing of a definitive agreement to acquire Silo AI,

Decoding How the Generative AI Revolution BeGAN

This blog post was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA. NVIDIA Research’s GauGAN demo set the scene for a new wave of generative AI apps supercharging creative workflows. Editor’s note: This post is part of the AI Decoded series, which demystifies AI by making the technology more

The Next Frontier in Education: How Generative AI and XR will Evolve the World of Learning in the Next Decade

This blog post was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm. (Ai)Daptive XR empowers students through real-time personalization and collaborative learning. Envisioning the future of education, and the art of learning overall, is nothing new. Over 120 years ago, French artist Jean-Marc Côté suggested how learning may look

“Transforming Enterprise Intelligence: The Power of Computer Vision and Gen AI at the Edge with OpenVINO,” a Presentation from Intel

Leila Sabeti, Americas AI Technical Sales Lead at Intel, presents the “Transforming Enterprise Intelligence: The Power of Computer Vision and Gen AI at the Edge with OpenVINO” tutorial at the May 2024 Embedded Vision Summit. In this talk, Sabeti focuses on the transformative impact of AI at the edge, highlighting…

Axelera AI Raises $68 Million Series B Funding to Accelerate Next-generation Artificial Intelligence

News Highlights: Powering Global Innovation: Mass adoption of Axelera AI’s Metis™ AI Processing Unit (AIPU), the world’s most powerful AIPU for edge devices, drives next-gen AI inference solutions for computer vision and generative AI. Accelerating Market Expansion: Europe’s largest, oversubscribed Series B funding in fabless semiconductors propels Axelera AI into new markets, including automotive and

“Optimized Vision Language Models for Intelligent Transportation System Applications,” a Presentation from Nota AI

Tae-Ho Kim, Co-founder and CTO of Nota AI, presents the “Optimized Vision Language Models for Intelligent Transportation System Applications” tutorial at the May 2024 Embedded Vision Summit. In the rapidly evolving landscape of intelligent transportation systems (ITSs), the demand for efficient and reliable solutions has never been greater. In this…

How Edge Devices Can Help Mitigate the Global Environmental Cost of Generative AI

This blog post was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm. Exploring the role of edge devices in reducing energy consumption and promoting sustainability in AI systems. The economic value of generative artificial intelligence (AI) to the world is immense. Research from McKinsey estimates that generative AI could add the

Nota AI and Advantech Sign Strategic MOU to Pioneer On-Device GenAI Market

Nota AI and Advantech sign MOU for edge AI collaboration. Partnership focuses on generative AI at the edge. Joint marketing and sales activities planned to expand market share. SEOUL, South Korea, June 7, 2024 /PRNewswire/ — AI model optimization technology company Nota AI® (Nota Inc.) has signed a strategic Memorandum of Understanding (MOU) with global industrial AIoT

Intel AI Platforms Accelerate Microsoft Phi-3 GenAI Models

Intel, in collaboration with Microsoft, enables support for several Phi-3 models across its data center platforms, AI PCs and edge solutions. What’s New: Intel has validated and optimized its AI product portfolio across client, edge and data center for several of Microsoft’s Phi-3 family of open models. The Phi-3 family of small, open models can run on

“What’s Next in On-device Generative AI,” a Presentation from Qualcomm

Jilei Hou, Vice President of Engineering and Head of AI Research at Qualcomm Technologies, presents the “What’s Next in On-device Generative AI” tutorial at the May 2024 Embedded Vision Summit. The generative AI era has begun! Large multimodal models are bringing the power of language understanding to machine perception, and transformer models are expanding to

Here you’ll find a wealth of practical technical insights and expert advice to help you bring AI and visual intelligence into your products without flying blind.

Contact

Address

Berkeley Design Technology, Inc.
PO Box #4446
Walnut Creek, CA 94596

Phone

+1 (925) 954-1411