NVIDIA AI Workbench Simplifies Using GPUs on Windows

This blog post was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA.

NVIDIA AI Workbench is a free, user-friendly development environment manager that streamlines data science, ML, and AI projects on your system of choice: PC, workstation, datacenter, or cloud. You can develop, test, and prototype projects locally on Windows, macOS, and Ubuntu and easily transfer development environments and computational work between systems (local and remote) to optimize cost, availability, and scale.

AI Workbench is focused on streamlining the developer experience without blocking the kind of customization that power users need. That’s a major reason AI Workbench is built on containers: they are the easiest way to provide and modify the environments needed for GPU-accelerated work.

This focus also means working with ecosystem partners to improve the user experience. For example, the collaboration with Canonical facilitates using an Ubuntu WSL distribution for the AI Workbench install on Windows.

More recently, NVIDIA collaborated with the Docker Desktop team to create a feature that lets AI Workbench directly install Docker Desktop. This feature is available in the latest AI Workbench release and significantly streamlines the experience on Windows and macOS.

This kind of streamlining is what makes AI Workbench the easiest way to get started on your own system, from laptops and workstations all the way to servers and VMs.

Managed Docker Desktop install
Docker Desktop is the recommended container runtime for NVIDIA AI Workbench on Windows and macOS. However, selecting Docker previously required manual setup steps. To eliminate those steps, NVIDIA partnered with Docker on an AI Workbench-enabled installation of Docker Desktop for local systems.

This is the first time that Docker has enabled another application to do a managed installation for Docker Desktop. Thanks to the collaboration, installing Docker Desktop for AI Workbench is now straightforward. For more information, see Optimizing AI Application Development with Docker Desktop and NVIDIA AI Workbench (Docker website).

Selecting the Docker container runtime for AI Workbench results in the following tasks being automatically completed:

  • Installing Docker Desktop: Previously, you had to exit the AI Workbench installer and install Docker Desktop manually if it wasn’t already installed. Now you can have AI Workbench install Docker Desktop without needing to exit the AI Workbench installer.
  • Configuring Docker Desktop on Windows: AI Workbench uses its own WSL distribution, NVIDIA-Workbench. Previously, Windows users had to manually configure Docker Desktop to use this distribution. Now this happens automatically.

New AI Workbench projects

Included in this release is a new set of example projects for you to use and build from. An AI Workbench project is a structured Git repository that defines a containerized development environment in AI Workbench.

These projects support IDEs like Jupyter and Visual Studio Code as well as user-configured web applications. Everything is containerized, isolated, and easily modifiable. You can clone a project from GitHub or GitLab, and AI Workbench handles everything else, including connecting to GPUs.

The best example of this so far is the Hybrid-RAG project on GitHub. With AI Workbench, you can just clone the project and get the RAG application running in a few clicks. If you don’t have a local GPU, the project lets you use either cloud endpoints or a self-hosted NIM container to run the inference for you.

This release has some example AI Workbench projects on GitHub that continue developing the RAG theme. There are also some new Jupyter-based fine-tuning projects and a LlamaFactory project that supports the NVIDIA RTX AI Toolkit.

Agentic RAG

The Agentic RAG AI Workbench project lets you work with an AI agent that adds web-search tool-calling to your RAG pipeline. Instead of relying only on the documents in a database, the agent also dynamically searches for new documents online as a fallback to better respond to queries.

Figure 1. Structure of the agentic RAG example project.

LLM agents are systems designed to perceive and react to an environment, typically through tool-calling, to take more relevant actions. This project implements a LangGraph-based RAG agent with the following agentic elements to improve response generation (a minimal code sketch follows the list):

  • Routing: Route relevant questions to different pipelines based on the query topic.
  • Fallback: Fall back to web search if retrieved docs are not relevant to the query.
  • Self-reflection: Fix hallucinations and answers that don’t address the question.
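
The following is a minimal sketch of how these three elements can be wired together with LangGraph. The node names, the stubbed retrieval, search, and grading logic, and the RAGState fields are illustrative assumptions, not the project’s actual code:

```python
from typing import List, TypedDict

from langgraph.graph import END, StateGraph


class RAGState(TypedDict):
    question: str
    documents: List[str]
    answer: str


def retrieve(state: RAGState) -> dict:
    # Placeholder: query your vector database here.
    return {"documents": ["<doc fetched from the vector store>"]}


def web_search(state: RAGState) -> dict:
    # Fallback: placeholder for a web-search tool call.
    return {"documents": ["<doc fetched from a web search>"]}


def generate(state: RAGState) -> dict:
    # Placeholder: call an LLM with the question and retrieved docs.
    answer = f"Answer to {state['question']!r} from {len(state['documents'])} docs"
    return {"answer": answer}


def grade_documents(state: RAGState) -> str:
    # Routing/fallback decision: are the retrieved docs relevant?
    # A real grader would ask an LLM; this stub routes on emptiness only.
    return "relevant" if state["documents"] else "fallback"


def grade_answer(state: RAGState) -> str:
    # Self-reflection: regenerate if the answer hallucinates or does not
    # address the question. Stubbed here to accept every answer.
    return "useful"


graph = StateGraph(RAGState)
graph.add_node("retrieve", retrieve)
graph.add_node("web_search", web_search)
graph.add_node("generate", generate)
graph.set_entry_point("retrieve")
graph.add_conditional_edges(
    "retrieve", grade_documents, {"relevant": "generate", "fallback": "web_search"}
)
graph.add_edge("web_search", "generate")
graph.add_conditional_edges(
    "generate", grade_answer, {"useful": END, "retry": "generate"}
)

app = graph.compile()
print(app.invoke({"question": "What is NVIDIA AI Workbench?", "documents": [], "answer": ""}))
```

In the real project, the grading functions would call an LLM to judge document relevance and answer quality; the stubs above only show where routing, fallback, and self-reflection plug into the graph.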

Figure 2. Agentic RAG example project with a customizable Gradio chat UI

This project includes a customizable Gradio Chat app that enables you to run inference against remote endpoints and microservices: cloud endpoints from the NVIDIA API catalog, self-hosted endpoints using NVIDIA NIM, or third-party self-hosted microservices. You can easily switch the mode of inference in the Chat app.
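
Switching among these modes typically amounts to pointing an OpenAI-compatible client at a different base URL. Here is a minimal sketch, not the project’s actual code; the base URLs, environment variable, and model name are assumptions you would replace with your own endpoints:

```python
import os

from openai import OpenAI

# Assumed endpoints: the NVIDIA API catalog exposes an OpenAI-compatible
# API, and a self-hosted NIM container typically serves one locally.
ENDPOINTS = {
    "cloud": "https://integrate.api.nvidia.com/v1",  # NVIDIA API catalog
    "local": "http://localhost:8000/v1",             # self-hosted NIM
}


def chat(mode: str, prompt: str) -> str:
    client = OpenAI(
        base_url=ENDPOINTS[mode],
        # Cloud endpoints need an API key; a local NIM usually does not.
        api_key=os.environ.get("NVIDIA_API_KEY", "not-needed"),
    )
    response = client.chat.completions.create(
        model="meta/llama3-8b-instruct",  # assumed model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content


print(chat("local", "Summarize retrieval-augmented generation in one sentence."))
```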

NIM Anywhere

NIM Anywhere is an all-in-one project for building NIM-based RAG applications, and it includes a preconfigured RAG chatbot. The project provides the following key features:

Figure 3. NIM Anywhere example project

  • Docker automation: Run services like NIM, Milvus, and Redis as persistent containers alongside the main project (see the sketch after this list).
  • User-configurable models: Toggle between running RAG with a NIM microservice on the NVIDIA API catalog or a self-hosted NIM microservice running locally.
  • Customizable frontend: Add views to the frontend Gradio application to extend the project and build out new use cases.
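
As a rough illustration of the Docker automation idea, this sketch uses the Docker SDK for Python to start a persistent sidecar container; the image tag, container name, and port are assumptions, and NIM Anywhere’s own orchestration may differ:

```python
import docker  # Docker SDK for Python (pip install docker)

client = docker.from_env()

# Start Redis as a persistent sidecar container. Services such as Milvus
# or a NIM microservice would be started the same way, but need extra
# configuration (environment variables, volumes, GPU access) omitted here.
redis = client.containers.run(
    "redis:7",
    name="rag-redis",                          # assumed container name
    ports={"6379/tcp": 6379},
    detach=True,                               # keep running alongside the project
    restart_policy={"Name": "unless-stopped"}, # persist across restarts
)
print(f"started rag-redis: {redis.short_id}")
```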

NIM microservices are available as part of NVIDIA AI Enterprise, but you can also join the NVIDIA Developer Program to get started with NVIDIA NIM for free.

Fine-tuning projects

Finally, we introduce several fine-tuning workflows for exciting new models. Each of these projects features models that can be quantized to fit on a single GPU (a minimal quantized-loading sketch follows the list):

  • Mixtral 8x7B: First example project for AI Workbench that demonstrates fine-tuning a mixture of experts (MoE) model.
  • Llama 3 8B: An example project demonstrating two approaches: Supervised full fine-tuning (SFT), as well as Direct Preference Optimization (DPO).
  • Phi-3 Mini: A highly accessible fine-tuning example due to the small model size and quantization capability.
  • RTX AI Toolkit: Provides an end-to-end workflow for Windows application developers. You can use popular foundation models, customize them with fine-tuning techniques using Workbench Projects, and deploy the models into Windows applications for peak performance on a wide range of GPUs, from NVIDIA RTX PCs and workstations to the cloud. You can get started by using AI Workbench along with the LlamaFactory GUI.
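
As a rough sketch of the quantized fine-tuning setup that lets these models fit on a single GPU, the following loads Phi-3 Mini in 4-bit with bitsandbytes and attaches LoRA adapters with PEFT. The hyperparameters and target module names are illustrative assumptions, not the projects’ actual configurations:

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "microsoft/Phi-3-mini-4k-instruct"

# Quantize the base model to 4-bit so it fits on a single GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach small trainable LoRA adapters; the frozen 4-bit base stays intact.
lora_config = LoraConfig(
    r=16,                                   # illustrative rank
    lora_alpha=32,
    target_modules=["qkv_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
# From here, hand the model to an SFT or DPO trainer (for example, trl).
```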

Other new features

Our development process includes direct feature requests from users. The following features are based on this user feedback:

  • SSH Agent
  • Ubuntu 24.04
  • Logging

SSH Agent

Some enterprise users of AI Workbench need password-protected SSH keys for accessing remotes. This was addressed by adding SSH agent support to the 2024.07 release. You also have the option to use the earlier SSH key feature.

Ubuntu 24.04

Previously, Ubuntu 22.04 was the only Linux distribution supported for AI Workbench installation. The 2024.07 release added support for Ubuntu 24.04.

Logging

AI Workbench has multiple log files that can be complicated to find and interpret. To address this, the AI Workbench CLI now has a support command that lets you export metadata and logs into a zip file. This eliminates the need to find the files and includes metadata that can be sent to NVIDIA Support for faster diagnosis and remediation.

Coming soon

Here’s a sneak peek at where AI Workbench is heading: app sharing and multi-container support.

App sharing

Currently, a running application in a Workbench Project is only accessible to the user running that Workbench Project. Some users have requested the ability to share running applications.

In the next release, AI Workbench users will be able to securely share web apps in a Workbench Project through a link. Apps will be directly accessible to authenticated users in a web browser without requiring them to use AI Workbench.

Multi-container support

The current multi-container approach with the NIM Anywhere Project is a bit of a workaround. An upcoming AI Workbench release will have streamlined support for multi-container applications.

Next steps

Get started with AI Workbench by installing the application from the webpage. Users who already have AI Workbench can follow the instructions to update to the latest version. For more information, see Install AI Workbench on Windows.

Explore a range of example AI Workbench projects, from data science to RAG. Ask questions on the NVIDIA AI Workbench developer forum and learn more about how other developers are using AI Workbench.


Tyler Whitehouse
Senior Product Manager, AI Workbench, NVIDIA

Edward Li
Technical Marketing Engineer, Enterprise Computing, NVIDIA

Shruthii Sathyanarayanan
Product Marketing Manager, Enterprise Computing, NVIDIA
