Articles

LLM Benchmarking: Fundamental Concepts

This article was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA. The past few years have witnessed the rise in popularity of generative AI and large language models (LLMs), as part of a broad AI revolution. As LLM-based applications are rolled out across enterprises, there is a need to […]

Video Understanding: Qwen2-VL, An Expert Vision-language Model

This article was originally published at Tenyks’ website. It is reprinted here with the permission of Tenyks. Qwen2-VL, an advanced vision-language model built on Qwen2 [1], sets new benchmarks in image comprehension across varied resolutions and aspect ratios, while also tackling extended video content. Though Qwen2-VL excels on many fronts, this article explores the model’s

Build Real-time Multimodal XR Apps with NVIDIA AI Blueprint for Video Search and Summarization

This article was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA. With recent advances in generative AI and vision foundation models, VLMs represent a new wave of visual computing in which models are capable of highly sophisticated perception and deep contextual understanding. These intelligent solutions offer a promising

Scalable Video Search: Cascading Foundation Models

This article was originally published at Tenyks’ website. It is reprinted here with the permission of Tenyks. Video has become the lingua franca of the digital age, but its ubiquity presents a unique challenge: how do we efficiently extract meaningful information from this ocean of visual data? In Part 1 of this series, we navigate

Building a Simple VLM-based Multimodal Information Retrieval System with NVIDIA NIM

This article was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA. In today’s data-driven world, the ability to retrieve accurate information from even modest amounts of data is vital for developers seeking streamlined, effective solutions for quick deployments, prototyping, or experimentation. One of the key challenges in information retrieval

AutoML Decoded: The Ultimate Guide and Tools Comparison

This article was originally published at Tryolabs’ website. It is reprinted here with the permission of Tryolabs. The quest for efficient and user-friendly solutions has led to the emergence of a game-changing concept: Automated Machine Learning (AutoML). AutoML is the process of automating the tasks involved in the entire Machine Learning lifecycle, such as data

Zero-Shot AI: The End of Fine-tuning as We Know It?

This article was originally published at Tenyks’ website. It is reprinted here with the permission of Tenyks. Models like SAM 2, LLaVA and ChatGPT can perform tasks without task-specific training, prompting questions about whether the traditional approach to training AI (i.e., fine-tuning) is becoming outdated. In this article, we compare two models: YOLOv8 (fine-tuning)

Fine-tuning LLMs for Cost-effective GenAI Inference at Scale

This article was originally published at Tryolabs’ website. It is reprinted here with the permission of Tryolabs. Data is the new oil, fueling the AI revolution. From user-tailored shopping assistants to AI researchers, to recreating the King, the applicability of AI models knows no bounds. Yet these models are only as good as the data

SAM 2 + GPT-4o: Cascading Foundation Models via Visual Prompting (Part 2)

This article was originally published at Tenyks’ website. It is reprinted here with the permission of Tenyks. In Part 2 of our Segment Anything Model 2 (SAM 2) series, we show how foundation models (e.g., GPT-4o, Claude 3.5 Sonnet and YOLO-World) can be used to generate visual inputs (e.g., bounding boxes) for SAM 2. Learn

Here you’ll find a wealth of practical technical insights and expert advice to help you bring AI and visual intelligence into your products without flying blind.
