New AI SDKs and Tools Released for NVIDIA Blackwell GeForce RTX 50 Series GPUs

This blog post was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA.

NVIDIA recently announced a new generation of PC GPUs—the GeForce RTX 50 Series—alongside new AI-powered SDKs and tools for developers. Powered by the NVIDIA Blackwell architecture, fifth-generation Tensor Cores and fourth-generation RT Cores, the GeForce RTX 50 Series delivers breakthroughs in AI-driven rendering, including neural shaders, digital human technologies, geometry and lighting.

Today, NVIDIA releases the first wave of SDKs for GeForce RTX 50 Series GPUs. As a developer, you can start integrating these updates into your apps to ensure software compatibility and optimal performance with NVIDIA Blackwell RTX GPUs, and expose the new features in GeForce RTX 50 Series GPUs.

This post details new and updated SDKs that enable developers to take full advantage of NVIDIA Blackwell GeForce RTX 50 Series GPUs.

Improved AI frameworks: CUDA, TensorRT, and PyTorch

To ensure compatibility with GeForce RTX 50 Series GPUs, developers should update to the latest versions of these AI frameworks.

CUDA Toolkit 12.8 and NVIDIA TensorRT 10.8 are now available for optimized AI performance from RTX 50 Series GPUs.
Updates to PyTorch for native Windows on NVIDIA Blackwell RTX GPUs have been upstreamed into the main PyTorch GitHub repo. PyPI binaries and packages for Windows will be updated shortly.
PyTorch for Linux x86_64 on NVIDIA Blackwell RTX GPUs is now supported in nightly builds.

For details about updating your applications to the latest AI frameworks, see Software Migration Guide for NVIDIA Blackwell RTX GPUs: A Guide to CUDA 12.8, PyTorch, TensorRT, and Llama.cpp.
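As a rough illustration of the version requirements above, the following sketch compares installed version strings against the minimums named in this post (CUDA Toolkit 12.8 and TensorRT 10.8). The helper names `parse_version` and `meets_minimum` are hypothetical and not part of any NVIDIA SDK, and the version strings are passed in by hand rather than queried from a real installation.

```python
import re

# Minimum versions named in this post for GeForce RTX 50 Series support.
MINIMUM_VERSIONS = {"cuda": (12, 8), "tensorrt": (10, 8)}


def parse_version(text):
    """Turn a string such as '12.8.1' or '12.8+cu128' into a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no numeric version in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(name, installed):
    """Return True if the installed version is at least the minimum for `name`."""
    minimum = MINIMUM_VERSIONS[name]
    # Compare only as many components as the minimum specifies (major, minor).
    return parse_version(installed)[: len(minimum)] >= minimum


if __name__ == "__main__":
    # Hypothetical version strings, used for illustration only.
    for name, installed in [("cuda", "12.6"), ("cuda", "12.8"), ("tensorrt", "10.8.0")]:
        status = "ok" if meets_minimum(name, installed) else "update needed"
        print(f"{name} {installed}: {status}")
```

In a real environment, the strings could come from attributes such as `torch.version.cuda` or `tensorrt.__version__`, assuming those packages are installed. Comparing integer tuples rather than raw strings avoids ranking "12.10" below "12.8".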

TensorRT 10.8 introduces support for FP4, which speeds up the latest diffusion-based models like Flux by more than 2x compared to FP16 precision on the RTX 4090. Additionally, TensorRT 10.8 offers weight-stripped engines, which prevent weight duplication when shipping dedicated engines for different GPU architecture families and thereby improve memory utilization. Moreover, NVIDIA TensorRT-Cloud now supports the latest GeForce RTX 50 Series GPUs, enabling developers to remotely build optimized inference engines.
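Separately from the speedup quoted above, lower precision also shrinks the space that model weights occupy. The sketch below is a back-of-the-envelope estimate only: the 12-billion parameter count is a hypothetical example, the function name `weight_gib` is made up for illustration, and the arithmetic ignores activations and the extra scale factors that quantization schemes typically store alongside the weights.

```python
# Bits used to store one weight at each precision.
BITS_PER_WEIGHT = {"FP16": 16, "FP4": 4}


def weight_gib(num_params, precision):
    """Approximate weight storage in GiB, ignoring scale factors and activations."""
    total_bytes = num_params * BITS_PER_WEIGHT[precision] / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    params = 12_000_000_000  # hypothetical parameter count, for illustration only
    for precision in ("FP16", "FP4"):
        print(f"{precision}: {weight_gib(params, precision):.1f} GiB")
```

Under these assumptions, FP4 weights take one quarter of the space of FP16 weights, since each value uses 4 bits instead of 16.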

AI-powered gaming

The GeForce RTX 50 Series GPUs and the latest SDK updates enable you, as a developer, to build revolutionary games using the following technologies.

Neural rendering with NVIDIA DLSS

NVIDIA DLSS is a suite of neural rendering technologies that uses AI to boost FPS, reduce latency, and improve image quality. Powered by GeForce RTX 50 Series GPUs and fifth-generation Tensor Cores, DLSS 4 introduces DLSS Multi Frame Generation, which generates up to three additional frames per rendered frame and works in unison with the complete suite of DLSS technologies to multiply frame rates by up to 8x over traditional brute-force rendering. Additionally, DLSS Ray Reconstruction, DLSS Super Resolution, and DLAA technologies are now powered by transformer-based models to improve image and lighting detail and stability for all GeForce RTX GPUs.
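The frame-rate figures above can be read as simple arithmetic. This is a minimal sketch, assuming the "up to three additional frames" are generated for each rendered frame. The function `displayed_fps` is hypothetical and has nothing to do with the DLSS or Streamline APIs.

```python
MAX_GENERATED_FRAMES = 3  # "up to three additional frames", per this post


def displayed_fps(rendered_fps, generated_per_rendered):
    """Frames shown per second when each rendered frame is followed by generated ones."""
    if not 0 <= generated_per_rendered <= MAX_GENERATED_FRAMES:
        raise ValueError("generated_per_rendered must be between 0 and 3")
    return rendered_fps * (1 + generated_per_rendered)


if __name__ == "__main__":
    base = 30  # hypothetical rendered frame rate
    for generated in range(MAX_GENERATED_FRAMES + 1):
        print(f"{generated} generated per rendered: {displayed_fps(base, generated)} fps")
```

With three generated frames, frame generation alone accounts for a 4x multiplier. On this reading, the remaining factor of about 2x in the "up to 8x" figure would come from the rest of the DLSS suite, such as rendering fewer pixels with Super Resolution. That split is an inference from the numbers, not a breakdown stated in this post.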

Get started with DLSS through NVIDIA Streamline, an open-source cross-IHV solution that simplifies integration of the latest NVIDIA and other super-resolution technologies into applications and games.

Bring game characters to life with NVIDIA ACE

NVIDIA ACE is a suite of digital human technologies that bring game characters and digital assistants to life with generative AI. ACE now enables you to easily add agentic capabilities to digital humans within your game or app. It includes the following:

New multimodal SLMs, in early access, for advanced and autonomous agentic workflows, with support for longer-context and complex reasoning tasks.
Audio2Face 3D NIM provides state-of-the-art lipsync and facial animation using real-time audio.

Streamline AI model deployment with NVIDIA In-Game Inferencing SDK

NVIDIA In-Game Inferencing (IGI) SDK streamlines AI model deployment and integration for PC game developers. The SDK preconfigures the PC with the necessary AI models, engines, and dependencies. It orchestrates in-process AI inference for C++ games with support for all major inference backends across different hardware accelerators (GPU, NPU, CPU). The IGI SDK is now available in beta for select partners with general availability coming soon.

Accelerated content creation

New and updated SDKs with support for content creation on Blackwell RTX GPUs include the following.

Enhance video conferencing with NVIDIA Maxine

NVIDIA Maxine is a collection of high-performance, easy-to-use, NVIDIA NIM microservices and SDKs for deploying AI features that enhance audio, video, and augmented reality effects for video conferencing and telepresence. New features include the following:

Studio Voice makes any mic sound like a professional mic.
Virtual Key Light relights a face as if it were using a virtual keylight (coming soon).

Generate photorealistic imagery with NVIDIA Iray

NVIDIA Iray SDK is an intuitive, physically based rendering technology to generate photorealistic imagery for interactive and batch rendering workflows. Updates include:

Improved diffuse and sheen BRDFs with the new NVIDIA MDL SDK 1.10
Improved geometry tessellation and displacement
Precise and robust rendering of caustics
New mode to automatically enable and disable the sampling of caustics, to improve either quality or performance of the renderings
Support for faster clustering or network rendering

Hardware-accelerated video encoding and decoding with NVIDIA Video Codec SDK

NVIDIA Video Codec SDK is a set of APIs for hardware-accelerated video encode and decode on Windows and Linux. Updates include:

Support for 4:2:2 H.264 and HEVC encoding and decoding, taking advantage of the ninth-generation NVENC encoder in Blackwell
Introduction of MV-HEVC and UHQ AV1 for improved encode quality
Decode memory optimizations and 2x H.264 decode throughput per NVDEC over previous generations

These updates are coming soon for use through FFmpeg and the Microsoft DXVA and MFT frameworks.
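To show what the 4:2:2 format mentioned above means for data volume, the following sketch counts the samples in one uncompressed frame for common chroma formats. It is independent of the Video Codec SDK. The function `samples_per_frame` and the 3840x2160 frame size are illustrative assumptions.

```python
# (horizontal, vertical) divisors applied to each of the two chroma planes.
CHROMA_DIVISORS = {"4:2:0": (2, 2), "4:2:2": (2, 1), "4:4:4": (1, 1)}


def samples_per_frame(width, height, chroma_format):
    """Count luma plus chroma samples in one uncompressed frame."""
    h_div, v_div = CHROMA_DIVISORS[chroma_format]
    luma = width * height
    chroma = 2 * (width // h_div) * (height // v_div)  # two chroma planes (Cb and Cr)
    return luma + chroma


if __name__ == "__main__":
    width, height = 3840, 2160  # hypothetical frame size
    for chroma_format in CHROMA_DIVISORS:
        count = samples_per_frame(width, height, chroma_format)
        print(f"{chroma_format}: {count / (width * height):.1f} samples per pixel")
```

4:2:2 keeps full vertical chroma resolution and halves only the horizontal, so it carries two samples per pixel. That compares with 1.5 for 4:2:0 and 3 for 4:4:4.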

Optimize ray tracing with NVIDIA OptiX

NVIDIA OptiX SDK is an application framework for achieving optimal ray tracing performance on the GPU. It provides a simple, recursive, and flexible pipeline for accelerating ray tracing algorithms. Updates to OptiX 9.0 include:

Clusters API to accelerate BVH builds of massive dynamic triangle meshes
Cooperative Vectors API for executing small AI networks within OptiX shader programs, accelerated by NVIDIA Tensor Cores
Hardware-accelerated linear curves on Blackwell GPUs
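For a sense of the kind of workload the "small AI networks" item above refers to, here is a plain Python sketch of a tiny fully connected network evaluated as a chain of matrix-vector products. It does not use the OptiX API, and it is not the Cooperative Vectors interface. The layer sizes, weights, and function names are all made up for illustration.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(values):
    """Clamp negative activations to zero."""
    return [max(0.0, v) for v in values]


def tiny_mlp(inputs, layers):
    """Evaluate a small network: matvec plus bias per layer, ReLU between layers."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = [a + b for a, b in zip(matvec(weights, activations), biases)]
        if index < len(layers) - 1:  # no activation after the final layer
            activations = relu(activations)
    return activations


if __name__ == "__main__":
    # Hypothetical 2-2-1 network with hand-picked weights.
    layers = [
        ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
        ([[1.0, 2.0]], [0.1]),
    ]
    print(tiny_mlp([2.0, 1.0], layers))
```

The per-layer matrix-vector product is the part that dominates the cost of such a network, so it is the natural candidate for hardware acceleration.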

Boost AI-enhanced effects with NVIDIA RTX Video SDK

NVIDIA RTX Video SDK provides AI-enhanced effects technologies for creative and media playback apps to improve sharpness, clarity, and the automatic conversion of SDR video to HDR. The updates bring new neural networks that are 40% more performant, apply AI upscaling to 10-bit HDR video, and add support for CUDA.

Get started

Ready to experiment, develop, and optimize with the latest AI capabilities on over 100 million RTX PCs worldwide? Get started with AI on NVIDIA RTX PCs. To learn more about adding support for NVIDIA Blackwell RTX GPUs to your AI application for maximum performance, check out the Software Migration Guide for NVIDIA Blackwell RTX GPUs: A Guide to CUDA 12.8, PyTorch, TensorRT, and Llama.cpp.

Annamalai Chockalingam
Product Marketing Manager, NeMo Megatron and NeMo NLP, NVIDIA
