NVIDIA NVLink Fusion

Semi-custom AI infrastructure with industry-proven AI scale-up performance and rack-scale architecture.

Overview

Semi-Custom AI Factories with NVLink Fusion

NVIDIA NVLink™ Fusion is the high-bandwidth, low-latency connective technology and IP that enables hyperscalers and AI natives to deploy custom XPUs and CPUs into NVIDIA’s world-leading AI infrastructure platform. Leverage NVIDIA’s proven scale-up and scale-out technology stack and ecosystem, as well as the MGX™ rack-scale architecture, to reduce development complexity, increase performance, and accelerate time to market for semi-custom AI factories. By standardizing on a single unified architecture, NVLink Fusion simplifies operations across the data center, enables flexible reprovisioning of data center capacity, and allows custom XPUs to integrate seamlessly with GPUs for heterogeneous compute. 

AWS Integrates AI Infrastructure with NVIDIA NVLink Fusion for Trainium4 Deployment

Learn how AWS is using NVLink Fusion to accelerate Trainium4 deployment.

Integrating Semi-Custom Compute into Rack-Scale Architecture with NVIDIA NVLink Fusion

Learn how NVIDIA NVLink Fusion allows hyperscalers to build semi-custom AI infrastructure, integrating their ASICs or CPUs with NVIDIA GPUs, while standardizing on a single scalable hardware infrastructure.

Using NVLink Fusion, high-performance AI factories can scale quickly, benefiting from all the solution components that make up the NVIDIA rack-scale architecture.

Benefits

NVLink Fusion Benefits

World-Class Scale-Up Performance

Unlocking the full potential of AI factories requires swift, seamless communication among all accelerators. NVIDIA NVLink 6 can connect 72 XPUs all-to-all at 3.6 TB/s per XPU, with future roadmap configurations including domain sizes of up to 1,152 XPUs, to boost AI performance and return on investment.

Production-Proven Technology Ecosystem and Supply Chain

The comprehensive NVLink Fusion technology ecosystem, including XPU design partners, CPU partners, and IP suppliers, helps hyperscalers and AI natives optimize XPU designs and streamline development. The MGX ecosystem provides a comprehensive rack-scale architecture and connects adopters to the same proven supply chain that NVIDIA uses for its own MGX-based systems, eliminating the complexity of new rack designs and supplier management and accelerating time to market.

Flexible Reprovisioning and Deployment Risk Mitigation

A key benefit of adopting the MGX rack architecture is that XPU- and GPU-based systems (such as Vera Rubin NVL72) can easily be designed into the same data center, sharing the same racks and rack footprints, networking, cooling, power delivery, and management systems. This unified approach allows NVLink Fusion adopters to decouple data center design and buildout from silicon readiness and supply, and enables them to easily reprovision data center capacity with a different mix of XPU- or GPU-based systems as needs evolve.

Unified Architecture for Heterogeneous AI Infrastructure

NVLink Fusion adopters can deploy different types of XPUs, or XPUs and GPUs together, in the same data center, enabling heterogeneous compute for disaggregated inference and other asymmetric workloads.

The result is a single, semi-custom AI factory that no one company could build alone.

Platform

NVIDIA NVLink Fusion Technology

NVIDIA NVLink

NVIDIA NVLink 6 and NVLink Switch Chip enable 260 TB/s of bandwidth in a single 72-accelerator NVLink domain (NVL72) and deliver 4x bandwidth efficiency with NVIDIA Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)™ FP8 support.
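
As a rough sanity check, the 260 TB/s domain figure lines up with the per-XPU figure quoted under Benefits (72 XPUs at 3.6 TB/s each). The short Python sketch below assumes the domain total is simply the sum of per-XPU bandwidth; the function and constant names are illustrative only and are not part of any NVIDIA tool or API.

```python
# Figures quoted on this page.
XPUS_PER_DOMAIN = 72   # accelerators in one NVL72 NVLink domain
PER_XPU_TBPS = 3.6     # NVLink 6 bandwidth per XPU, in TB/s


def aggregate_bandwidth_tbps(xpus: int, per_xpu_tbps: float) -> float:
    """Assumed model: domain bandwidth is the sum of per-XPU bandwidth."""
    return xpus * per_xpu_tbps


total = aggregate_bandwidth_tbps(XPUS_PER_DOMAIN, PER_XPU_TBPS)
print(f"Aggregate NVLink bandwidth: {total:.1f} TB/s")
```

Under that assumption the product is 259.2 TB/s, which rounds to the 260 TB/s stated above.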

NVIDIA NVLink-C2C

NVIDIA NVLink-C2C extends the industry-leading NVLink technology to a chip-to-chip interconnect. This enables the creation of a new class of integrated products with NVIDIA partners, built via chiplets, allowing NVIDIA GPUs or CPUs to have a high-bandwidth coherent connection with custom silicon.

AI Infrastructure Platform

NVIDIA provides a modular portfolio of AI factory technology, including NVIDIA GPUs, NVIDIA Vera CPUs, co-packaged optics (CPO) switches, ConnectX® SuperNICs™, BlueField® DPUs, and Mission Control™ software for optimizing AI workflows and managing AI infrastructure.

Full-rack solutions are also available for semi-custom AI factory integration, including the Vera Rubin NVL72 rack, which can be mixed with XPU-based systems for disaggregated inference, the Vera CPU rack for supporting agentic AI systems and reinforcement learning, the NVIDIA LPX rack for assisting high-context and low-latency inference, the NVIDIA STX rack for AI-native storage, and the NVIDIA SPX rack for scale-out networking.

Adopters

NVLink Fusion Ecosystem

Scaling AI Inference Performance With NVLink Fusion

Learn how NVIDIA NVLink Fusion addresses the growing demands of complex AI models.