NVIDIA Rubin Platform

Shaping the next generation of AI.

Overview

Driving the Era of Agentic AI

The NVIDIA Rubin platform is built for the age of agentic AI and reasoning, engineered to master multi-step problem-solving and massive long-context workflows at scale. By eliminating critical bottlenecks in communication and memory movement, the Rubin platform supercharges inference, delivering more tokens per watt and a lower cost per token than the NVIDIA Blackwell generation.

NVIDIA Kicks Off the Next Generation of AI With Rubin—Six New Chips, One Incredible AI Supercomputer

The leading-edge platform scales mainstream adoption, slashing cost per token with five breakthroughs for reasoning and agentic AI models.

Look Inside the Technological Breakthroughs

Transformer Engine

The Rubin platform features a new Transformer Engine with hardware-accelerated adaptive compression to boost NVFP4 performance while preserving accuracy, enabling up to 50 petaFLOPS of NVFP4 inference. Fully compatible with NVIDIA Blackwell, the Transformer Engine ensures seamless upgrades, so previously optimized code transitions effortlessly to the Rubin platform.

Third-Generation Confidential Computing

The third generation of NVIDIA Confidential Computing expands security to full-rack scale with NVIDIA Vera Rubin NVL72. This platform creates a unified trusted execution environment across all 36 NVIDIA Vera CPUs, 72 NVIDIA Rubin GPUs, and the NVIDIA NVLink™ fabric that seamlessly connects them. The platform maintains data security across CPU, GPU, and NVLink domains. With attestation services for cryptographic proof of compliance, it combines massive scale with uncompromised protection, all to protect the world’s largest proprietary models, training data, and inference workloads.

Sixth-Generation NVLink and NVLink Switch

The sixth-generation NVLink delivers a major leap for NVIDIA's high-speed GPU interconnect fabric, which unifies 72 NVIDIA Rubin GPUs into a single performance domain. Doubling NVIDIA Blackwell’s performance, Rubin delivers 3.6 terabytes per second (TB/s) of bandwidth per GPU and 260 TB/s of connectivity with low latency to facilitate faster communication. Combined with NVIDIA® Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)™, which reduces network congestion by up to 50 percent for collective operations, this next-generation interconnect accelerates training and inference for the world’s largest models, at scale and without compromise.
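
As a quick sanity check on the bandwidth figures above, the short sketch below multiplies the quoted per-GPU bandwidth by the number of GPUs in the NVLink domain. It assumes the 260 TB/s figure is simply the rounded sum of per-GPU bandwidth across all 72 GPUs, which the text does not state explicitly; the variable names are illustrative only.

```python
# Figures quoted in the text above.
GPUS_PER_DOMAIN = 72          # Rubin GPUs in one NVLink domain
NVLINK_TBPS_PER_GPU = 3.6     # TB/s of NVLink bandwidth per GPU

# Assumption: the rack-level figure is the plain sum of per-GPU bandwidth.
aggregate_tbps = GPUS_PER_DOMAIN * NVLINK_TBPS_PER_GPU

# "Doubling Blackwell's performance" implies half the per-GPU figure
# for the previous generation.
implied_blackwell_tbps = NVLINK_TBPS_PER_GPU / 2

print(f"Aggregate NVLink bandwidth: {aggregate_tbps:.1f} TB/s")
print(f"Implied Blackwell per-GPU bandwidth: {implied_blackwell_tbps:.1f} TB/s")
```

Under that assumption the product is 259.2 TB/s, which rounds to the quoted 260 TB/s.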

Second-Generation Reliability, Availability, and Serviceability (RAS) Engine

The NVIDIA Rubin platform delivers rack-scale resiliency with advanced reliability features. NVIDIA Rubin GPUs feature a dedicated second-generation RAS engine for proactive maintenance and real-time health checks without downtime, while NVIDIA Vera CPUs add enhanced serviceability with SOCAMM LPDDR5X and in-system tests for the CPU cores. The rack introduces modular, cable-free tray designs for 18x faster assembly and serviceability versus NVIDIA Blackwell, combined with intelligent resiliency and software-defined NVLink routing, which ensures continuous operation and reduces maintenance overhead.

NVIDIA Vera CPU

The NVIDIA Vera CPU is engineered for data movement and agentic reasoning across accelerated systems, with full confidential computing support. It pairs seamlessly with NVIDIA GPUs or operates independently for analytics, cloud, orchestration, storage, and high-performance computing (HPC) workloads. Vera combines 88 NVIDIA-designed cores, up to 1.2 TB/s of LPDDR5X memory bandwidth, and NVIDIA Scalable Coherency Fabric to deliver predictable, energy-efficient performance for data- and memory-intensive workloads with full Arm® compatibility. Integrated NVLink-C2C connectivity enables high-bandwidth, coherent CPU–GPU memory access to maximize system utilization and efficiency.

Explore NVIDIA Rubin Products

NVIDIA Vera Rubin NVL72

NVIDIA Vera Rubin NVL72 unifies 72 NVIDIA Rubin GPUs, 36 NVIDIA Vera CPUs, NVIDIA ConnectX®-9 SuperNICs, and NVIDIA BlueField®-4 DPUs. It scales up intelligence in a rack-scale platform with the sixth-generation NVLink and NVLink Switch and scales out with NVIDIA Quantum-X800 InfiniBand and NVIDIA Spectrum-X™ Ethernet to power the AI industrial revolution at scale.
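
To put the rack composition in numbers, the sketch below derives the GPU-to-CPU ratio and a theoretical rack-level NVFP4 peak from figures quoted on this page. It assumes the "up to 50 petaFLOPS" figure applies per GPU and that the peak scales linearly across the rack; neither is stated outright here, so treat the result as a back-of-the-envelope estimate and not a measured number.

```python
# Figures quoted on this page.
GPUS = 72                     # Rubin GPUs per NVL72 rack
CPUS = 36                     # Vera CPUs per NVL72 rack
NVFP4_PFLOPS_PER_GPU = 50     # assumption: "up to 50 petaFLOPS" is per GPU

# Each Vera CPU is paired with this many Rubin GPUs.
gpus_per_cpu = GPUS // CPUS

# Assumption: theoretical peak scales linearly with GPU count.
rack_nvfp4_exaflops = GPUS * NVFP4_PFLOPS_PER_GPU / 1000

print(f"GPUs per CPU: {gpus_per_cpu}")
print(f"Theoretical rack NVFP4 peak: {rack_nvfp4_exaflops} exaFLOPS")
```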

NVIDIA DGX Vera Rubin NVL72

NVIDIA DGX Vera Rubin NVL72 provides enterprises with a turnkey, ready-to-deploy AI infrastructure solution built upon the NVIDIA Rubin platform, purpose-built to be deployed at scale to accelerate the most complex AI models.

NVIDIA DGX Rubin NVL8

NVIDIA DGX Rubin NVL8 is a liquid-cooled AI system powered by eight NVIDIA Rubin GPUs and sixth-generation NVLink, purpose-built to accelerate training, inference, and post-training for every AI workload.

Inside the NVIDIA Rubin Platform: Six New Chips, One AI Supercomputer

Read this technical deep dive to learn how NVIDIA Vera Rubin treats the data center, not the chip, as the unit of compute, establishing a new foundation for producing intelligence efficiently, securely, and predictably at scale.