Multimodal Archives

Rockets to Retail: Intel Core Ultra Delivers Edge AI for Video Management

Algorithms, Blog Posts, Intel, Multimodal, Network Optix, News, Processors, Software, Tools / April 9, 2025

At Intel Vision, Network Optix debuts natural language prompt prototype to redefine video management, offering industries faster AI-driven insights and efficiency. On the surface, aerospace manufacturers, shopping malls, universities, police departments and automakers might not have a lot in common. But they all use and manage hundreds to thousands of video cameras across their […]

R²D²: Advancing Robot Mobility and Whole-body Control with Novel Workflows and AI Foundation Models from NVIDIA Research

Algorithms, Blog Posts, Multimodal, NVIDIA, Processors, Robotics, Sensors and Cameras, Software, Tools / April 1, 2025

This blog post was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA. Welcome to the first edition of the NVIDIA Robotics Research and Development Digest (R2D2). This technical blog series will give developers and researchers deeper insight and access to the latest physical AI and robotics research breakthroughs across

Ambarella Debuts Next-generation Edge GenAI Technology at ISC West, Including Reasoning Models Running on its CVflow Edge AI SoCs

Ambarella, Multimodal, News, Processors, Tools / March 31, 2025

With Over 30 Million Edge AI Systems-on-Chip Shipped, Ambarella is Driving Innovation for a Broad Range of On-Device and On-Premise Generative AI Applications. SANTA CLARA, Calif., March 31, 2025: Ambarella, Inc. (NASDAQ: AMBA), an edge AI semiconductor company, today announced during the ISC West security expo that it is continuing to push the envelope

Video Understanding: Qwen2-VL, An Expert Vision-language Model

Algorithms, Articles, Multimodal, Software, Tenyks, Tools / March 21, 2025

This article was originally published at Tenyks’ website. It is reprinted here with the permission of Tenyks. Qwen2-VL, an advanced vision language model built on Qwen2 [1], sets new benchmarks in image comprehension across varied resolutions and ratios, while also tackling extended video content. Though Qwen2-VL excels on many fronts, this article explores the model’s

NVIDIA Announces Isaac GR00T N1 — the World’s First Open Humanoid Robot Foundation Model — and Simulation Frameworks to Speed Robot Development

Algorithms, Multimodal, NVIDIA, Processors, Robotics, Software, Tools / March 19, 2025

Now Available, Fully Customizable Foundation Model Brings Generalized Skills and Reasoning to Humanoid Robots. NVIDIA, Google DeepMind and Disney Research Collaborate to Develop Next-Generation Open-Source Newton Physics Engine. New Omniverse Blueprint for Synthetic Data Generation and Open-Source Dataset Jumpstart Physical AI Data Flywheel. March 18, 2025 (GTC): NVIDIA today announced a portfolio of technologies to supercharge humanoid

NVIDIA Announces Major Release of Cosmos World Foundation Models and Physical AI Data Tools

Algorithms, Automotive, Multimodal, News, NVIDIA, Processors, Robotics, Software, Tools / March 19, 2025

New Models Enable Prediction, Controllable World Generation and Reasoning for Physical AI. Two New Blueprints Deliver Massive Physical AI Synthetic Data Generation for Robot and Autonomous Vehicle Post-Training. 1X, Agility Robotics, Figure AI, Skild AI Among Early Adopters. March 18, 2025 (GTC): NVIDIA today announced a major release of new NVIDIA Cosmos™ world foundation models (WFMs), introducing

NVIDIA Unveils Open Physical AI Dataset to Advance Robotics and Autonomous Vehicle Development

Algorithms, Automotive, Multimodal, News, NVIDIA, Processors, Robotics, Software, Tools / March 19, 2025

Expected to become the world’s largest such dataset, the initial release of standardized synthetic data is now available to robotics developers as open source. Teaching autonomous robots and vehicles how to interact with the physical world requires vast amounts of high-quality data. To give researchers and developers a head start, NVIDIA is releasing a massive,

Build Real-time Multimodal XR Apps with NVIDIA AI Blueprint for Video Search and Summarization

Algorithms, Articles, Multimodal, NVIDIA, Processors, Robotics, Software, Tools / March 18, 2025

This article was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA. With the recent advancements in generative AI and vision foundational models, VLMs present a new wave of visual computing wherein the models are capable of highly sophisticated perception and deep contextual understanding. These intelligent solutions offer a promising

Scalable Video Search: Cascading Foundation Models

Algorithms, Articles, Multimodal, Object Identification, Object Tracking, Software, Tenyks, Tools / March 14, 2025

This article was originally published at Tenyks’ website. It is reprinted here with the permission of Tenyks. Video has become the lingua franca of the digital age, but its ubiquity presents a unique challenge: how do we efficiently extract meaningful information from this ocean of visual data? In Part 1 of this series, we navigate

Building a Simple VLM-based Multimodal Information Retrieval System with NVIDIA NIM

Algorithms, Articles, Multimodal, NVIDIA, Processors, Software, Tools / March 12, 2025

This article was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA. In today’s data-driven world, the ability to retrieve accurate information from even modest amounts of data is vital for developers seeking streamlined, effective solutions for quick deployments, prototyping, or experimentation. One of the key challenges in information retrieval
