Software for Embedded Vision
Accelerating LLMs with llama.cpp on NVIDIA RTX Systems
This blog post was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA. The NVIDIA RTX AI for Windows PCs platform offers a thriving ecosystem of thousands of open-source models for application developers to leverage and integrate into Windows applications. Notably, llama.cpp is one popular tool, with over 65K GitHub
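As a rough illustration of the workflow the post describes, the sketch below uses the llama-cpp-python bindings to run a GGUF model with layers offloaded to an NVIDIA GPU. The model path and prompt are placeholders, and the bindings must be installed with CUDA support for the GPU offload setting to have any effect.

```python
# Minimal sketch (not from the article): running a GGUF model via llama.cpp's
# Python bindings with layers offloaded to an NVIDIA GPU. Model path and prompt
# are placeholders; install llama-cpp-python with CUDA enabled for GPU offload.
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-3.2-3b-instruct-q4_k_m.gguf",  # placeholder GGUF file
    n_gpu_layers=-1,   # offload all layers to the GPU (requires a CUDA build)
    n_ctx=4096,        # context window
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what llama.cpp does in one sentence."}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```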
“Adventures in Moving a Computer Vision Solution from Cloud to Edge,” a Presentation from MetaConsumer
Nate D’Amico, CTO and Head of Product at MetaConsumer, presents the “Adventures in Moving a Computer Vision Solution from Cloud to Edge” tutorial at the May 2024 Embedded Vision Summit. Optix is a computer vision-based AI system that measures advertising and media exposures on mobile devices for real-time marketing optimization. … “Adventures in Moving a Computer
Redefining Hybrid Meetings With AI-powered 360° Videoconferencing
This blog post was originally published at Ambarella’s website. It is reprinted here with the permission of Ambarella. The global pandemic catalyzed a boom in videoconferencing that continues to grow as companies embrace hybrid work models and seek more sustainable approaches to business communication with less travel. Now, with videoconferencing becoming a cornerstone of modern
“Bridging Vision and Language: Designing, Training and Deploying Multimodal Large Language Models,” a Presentation from Meta Reality Labs
Adel Ahmadyan, Staff Engineer at Meta Reality Labs, presents the “Bridging Vision and Language: Designing, Training and Deploying Multimodal Large Language Models” tutorial at the May 2024 Embedded Vision Summit. In this talk, Ahmadyan explores the use of multimodal large language models in real-world edge applications. He begins by explaining… “Bridging Vision and Language: Designing,
Qualcomm Partners with Meta to Support Llama 3.2. Why This is a Big Deal for On-device AI
This blog post was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm. On-device artificial intelligence (AI) is critical to making your everyday AI experiences fast and security-rich. That’s why it’s such a win that Qualcomm Technologies and Meta have worked together to support the Llama 3.2 large language models (LLMs)
“Introduction to Depth Sensing,” a Presentation from Meta
Harish Venkataraman, Depth Cameras Architecture and Tech Lead at Meta, presents the “Introduction to Depth Sensing” tutorial at the May 2024 Embedded Vision Summit. We live in a three-dimensional world, and the ability to perceive in three dimensions is essential for many systems. In this talk, Venkataraman introduces the main… “Introduction to Depth Sensing,” a
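The talk itself is not reproduced here, but the core relationship behind stereo depth sensing is compact enough to sketch: for a rectified, calibrated stereo pair, depth is inversely proportional to disparity (Z = f·B/d, with focal length f in pixels and baseline B in meters). The numbers below are illustrative only.

```python
# Illustrative stereo-depth arithmetic (values are placeholders, not from the talk):
# for a rectified stereo pair, depth Z = (focal_length_px * baseline_m) / disparity_px.
def disparity_to_depth(disparity_px: float, focal_length_px: float, baseline_m: float) -> float:
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px

# Example: 700 px focal length, 7.5 cm baseline, 35 px disparity -> 1.5 m
print(disparity_to_depth(35.0, 700.0, 0.075))  # 1.5
```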
Deploying Accelerated Llama 3.2 from the Edge to the Cloud
This blog post was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA. Expanding the open-source Meta Llama collection of models, the Llama 3.2 collection includes vision language models (VLMs), small language models (SLMs), and an updated Llama Guard model with support for vision. When paired with the NVIDIA accelerated
“Advancing Embedded Vision Systems: Harnessing Hardware Acceleration and Open Standards,” a Presentation from the Khronos Group
Neil Trevett, President of the Khronos Group, presents the “Advancing Embedded Vision Systems: Harnessing Hardware Acceleration and Open Standards” tutorial at the May 2024 Embedded Vision Summit. Offloading processing to accelerators enables embedded vision systems to process workloads that exceed the capabilities of CPUs. However, parallel processors add complexity as… “Advancing Embedded Vision Systems: Harnessing
When, Where and How AI Should Be Applied
Phil Koopman dissects strengths and weaknesses of machine learning-based AI. AI does amazing stuff. No question about it. But how hard have we really thought about “machine-learning capabilities” for applications? Phil Koopman, professor at Carnegie Mellon University, delivered a keynote on Sept. 11, 2024 at the Business of Semiconductor Summit (BOSS 2024), concentrating on
“Using AI to Enhance the Well-being of the Elderly,” a Presentation from Kepler Vision Technologies
Harro Stokman, CEO of Kepler Vision Technologies, presents the “Using Artificial Intelligence to Enhance the Well-being of the Elderly” tutorial at the May 2024 Embedded Vision Summit. This presentation provides insights into an innovative application of artificial intelligence and advanced computer vision technologies in the healthcare sector, specifically focused on… “Using AI to Enhance the
AI Model Training Costs Have Skyrocketed by More than 4,300% Since 2020
Over the past five years, AI models have become much more complex and capable, tailored to perform specific tasks across industries and provide better efficiency, accuracy and automation. However, the cost of training these systems has exploded. According to data presented by AltIndex.com, AI model training costs have skyrocketed by more than 4,300% since
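For a sense of scale, a quick back-of-the-envelope check of what a 4,300% increase implies (the baseline figure below is arbitrary, purely for illustration):

```python
# A 4,300% increase means the new cost is 1 + 4300/100 = 44 times the original.
increase_pct = 4300
multiplier = 1 + increase_pct / 100            # 44.0
baseline_2020 = 1_000_000                      # illustrative baseline, not a real figure
print(multiplier, baseline_2020 * multiplier)  # 44.0  44000000.0
```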
BrainChip Demonstration of LLM-RAG with a Custom Trained TENNs Model
Kurt Manninen, Senior Solutions Architect at BrainChip, demonstrates the company’s latest edge AI and vision technologies and products at the September 2024 Edge AI and Vision Alliance Forum. Specifically, Manninen demonstrates his company’s Temporal Event-Based Neural Network (TENN) foundational large language model with 330M parameters, augmented with a Retrieval-Augmented Generation (RAG) output to replace user
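BrainChip’s TENN/RAG stack is proprietary, but the general retrieval-augmented generation pattern the demo refers to can be sketched generically: embed the query, retrieve the most similar passages, and prepend them to the prompt. The embed() and generate() functions referenced below are hypothetical stand-ins, not BrainChip APIs.

```python
# Generic RAG sketch (not BrainChip's implementation): retrieve the passages most
# similar to the query and prepend them to the prompt. embed() and generate()
# are hypothetical stand-ins for an embedding model and an LLM.
import numpy as np

def retrieve(query_vec: np.ndarray, doc_vecs: np.ndarray, docs: list[str], k: int = 3) -> list[str]:
    # cosine similarity between the query vector and every document vector
    sims = doc_vecs @ query_vec / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9
    )
    top = np.argsort(sims)[::-1][:k]
    return [docs[i] for i in top]

def build_prompt(question: str, passages: list[str]) -> str:
    context = "\n".join(f"- {p}" for p in passages)
    return f"Use the context to answer.\nContext:\n{context}\n\nQuestion: {question}\nAnswer:"

# answer = generate(build_prompt(question, retrieve(embed(question), doc_vecs, docs)))
```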
Advex AI Demonstration of Accelerating Machine Vision with Synthetic AI Data
Pedro Pachuca, CEO at Advex AI, demonstrates the company’s latest edge AI and vision technologies and products at the September 2024 Edge AI and Vision Alliance Forum. Specifically, Pachuca demonstrates Advex’s ability to ingest just 10 images and produce thousands of labeled, synthetic images in just hours. These synthetic images cover the distribution of variations
“Testing Cloud-to-Edge Deep Learning Pipelines: Ensuring Robustness and Efficiency,” a Presentation from Instrumental
Rustem Feyzkhanov, Staff Machine Learning Engineer at Instrumental, presents the “Testing Cloud-to-Edge Deep Learning Pipelines: Ensuring Robustness and Efficiency” tutorial at the May 2024 Embedded Vision Summit. A cloud-to-edge deep learning pipeline is a fully automated conduit for training and deploying models to the edge. This enables quick model retraining… “Testing Cloud-to-Edge Deep Learning Pipelines:
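One common check in such a pipeline, sketched below under assumed names, is verifying that the exported edge model (for example a quantized build) still agrees with the cloud reference model within a tolerance before deployment; run_cloud_model() and run_edge_model() are hypothetical wrappers, not a real API.

```python
# Hypothetical parity test for a cloud-to-edge pipeline: the exported (e.g. quantized)
# edge model must stay close to the cloud reference model on a fixed validation batch.
# run_cloud_model() and run_edge_model() are assumed wrappers, not a real API.
import numpy as np

def test_edge_model_matches_cloud(run_cloud_model, run_edge_model, batch: np.ndarray,
                                  max_abs_diff: float = 0.05,
                                  min_top1_agreement: float = 0.98) -> None:
    cloud_out = run_cloud_model(batch)   # shape: (N, num_classes)
    edge_out = run_edge_model(batch)

    assert np.max(np.abs(cloud_out - edge_out)) <= max_abs_diff, "outputs drifted too far"
    agreement = np.mean(cloud_out.argmax(axis=1) == edge_out.argmax(axis=1))
    assert agreement >= min_top1_agreement, f"top-1 agreement too low: {agreement:.3f}"
```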
AI PCs to Make Up 60% of Total PC Sales in 2027, 3x More than This Year
Although traditional PCs remain the number one choice for most consumers buying a new device, the surging demand for AI-driven applications has boosted the popularity of AI PCs, both in consumer and professional sectors. This trend is expected to continue in the following years, with AI PCs gaining a much bigger market share than this
How AI and Smart Glasses Give You a New Perspective on Real Life
This blog post was originally published at Qualcomm’s website. It is reprinted here with the permission of Qualcomm. When smart glasses are paired with generative artificial intelligence, they become the ideal way to interact with your digital assistant. They may be shades, but smart glasses are poised to give you a clearer view of everything
“Real-time Retail Product Classification on Android Devices Inside the Caper AI Cart,” a Presentation from Instacart
David Scott, Senior Machine Learning Engineer at Instacart, presents the “Real-time Retail Product Classification on Android Devices Inside the Caper AI Cart” tutorial at the May 2024 Embedded Vision Summit. In this talk, Scott explores deploying an embedded computer vision model on Android devices for real-time product classification with the… “Real-time Retail Product Classification on
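The Caper AI Cart implementation itself runs on Android, but the shape of a real-time classification loop can be sketched in Python with the TensorFlow Lite interpreter; the model file, input handling, and preprocessing below are placeholders, not Instacart’s code.

```python
# Sketch of a TFLite classification loop in Python (the production version would be
# an Android app); the model path and preprocessing are placeholders.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="product_classifier.tflite")  # placeholder model
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

def classify(frame_rgb: np.ndarray) -> int:
    # frame_rgb: HxWx3 uint8 image already resized to the model's input resolution
    x = frame_rgb.astype(np.float32)[None, ...] / 255.0
    interpreter.set_tensor(inp["index"], x)
    interpreter.invoke()
    return int(np.argmax(interpreter.get_tensor(out["index"])))
```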
Using Generative AI to Enable Robots to Reason and Act with ReMEmbR
This blog post was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA. Vision-language models (VLMs) combine the powerful language understanding of foundational LLMs with the vision capabilities of vision transformers (ViTs) by projecting text and images into the same embedding space. They can take unstructured multimodal data, reason over
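The key idea the post describes, projecting text and images into the same embedding space so they can be compared directly, can be illustrated with a toy example; the random projection matrices and feature vectors below are stand-ins for trained encoders such as a ViT and an LLM text encoder.

```python
# Toy illustration of a shared embedding space (not a real VLM): image and text
# features are projected into the same d-dimensional space and compared by cosine
# similarity. The random matrices stand in for trained projection layers.
import numpy as np

rng = np.random.default_rng(0)
d = 64                                    # shared embedding dimension
W_img = rng.normal(size=(768, d))         # stands in for a trained image projection
W_txt = rng.normal(size=(512, d))         # stands in for a trained text projection

def embed(features: np.ndarray, W: np.ndarray) -> np.ndarray:
    z = features @ W
    return z / np.linalg.norm(z)          # unit-normalize so dot product = cosine sim

img_vec = embed(rng.normal(size=768), W_img)   # stand-in ViT image features
txt_vec = embed(rng.normal(size=512), W_txt)   # stand-in text features
print(float(img_vec @ txt_vec))                # similarity score in [-1, 1]
```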