NVIDIA Quantum-2 InfiniBand Architecture

Extreme performance for cloud-native supercomputing at any scale

Record-Breaking Performance in Network Communications

NVIDIA Quantum-2, the seventh generation of the NVIDIA InfiniBand architecture, gives AI developers and scientific researchers the fastest networking performance and feature sets available to take on the world’s most challenging problems. NVIDIA Quantum-2 empowers the world’s leading supercomputing data centers with software-defined networking, In-Network Computing, performance isolation, advanced acceleration engines, remote direct-memory access (RDMA), and port speeds of up to 400Gb/s.

2X Data Throughput

Data Speed

4X MPI Performance

Improved Performance

5X Switch System Capacity

Improved TCO

6.5X Higher Scalability

Exascale Ready

32X More AI Acceleration

Accelerated Deep Learning

Performance that Impacts

Enhancing HPC and AI Supercomputers and Applications

Accelerated In-Network Computing

Today’s high-performance computing (HPC), AI, and hyperscale infrastructures require faster interconnects and more intelligent networks to analyze data and run complex simulations with greater speed and efficiency. NVIDIA InfiniBand enhances and extends its In-Network Computing with preconfigured and programmable compute engines, including the third generation of NVIDIA Scalable Hierarchical Aggregation and Reduction Protocol (SHARPv3)™, Message Passing Interface (MPI) Tag Matching, and MPI All-to-All, delivering the best cost per node and ROI.
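To make the idea of hierarchical aggregation concrete, the following is a minimal conceptual sketch in Python. It is not the SHARP protocol or any NVIDIA API. It only simulates how a tree of aggregation nodes can sum vectors from many hosts so that each node forwards a single partial result upward. The function names and the radix value are assumptions chosen for illustration.

```python
from typing import List

Vector = List[float]


def elementwise_sum(vectors: List[Vector]) -> Vector:
    """Add equal-length vectors position by position."""
    return [sum(values) for values in zip(*vectors)]


def tree_allreduce(host_vectors: List[Vector], radix: int) -> Vector:
    """Sum vectors level by level, as a tree of aggregation nodes might.

    `radix` is a hypothetical number of inputs each aggregation node combines.
    """
    if not host_vectors:
        raise ValueError("at least one host vector is required")
    if radix < 2:
        raise ValueError("radix must be at least 2")

    level = host_vectors
    while len(level) > 1:
        # Each node reduces up to `radix` inputs and forwards one partial result.
        level = [
            elementwise_sum(level[i:i + radix])
            for i in range(0, len(level), radix)
        ]
    # In an allreduce, this final result would then be sent back to every host.
    return level[0]


# Eight hypothetical hosts, each contributing a two-element vector.
hosts = [[float(h), 1.0] for h in range(8)]
result = tree_allreduce(hosts, radix=4)
print(result)  # [28.0, 8.0]
```

In this toy model the hosts never exchange full vectors with each other. The reduction work happens at the intermediate nodes, which is the general idea behind aggregating data inside the network rather than on the endpoints.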

 

Performance Isolation

The NVIDIA Quantum-2 InfiniBand platform provides proactive monitoring and congestion management to deliver traffic isolation, nearly eliminating performance jitter and ensuring predictable performance, as if the application were running on a dedicated system.

Cloud-Native Supercomputing

The NVIDIA Cloud-Native Supercomputing platform leverages the NVIDIA® BlueField® data processing unit (DPU) architecture with high-speed, low-latency NVIDIA Quantum-2 InfiniBand networking. The solution delivers bare-metal performance, user management and isolation, data protection, and on-demand HPC and AI services, simply and securely.

Data center with NVIDIA Quantum-2 InfiniBand architecture with SHARPv3 technology

Delivering Data at the Speed of Light

Host Channel Adapters

The NVIDIA ConnectX-7 InfiniBand host channel adapter (HCA), with PCIe Gen4 and Gen5 support, is available in various form factors, delivering single or dual network ports at 400Gb/s.

The ConnectX-7 InfiniBand HCAs include advanced In-Network Computing capabilities and additional programmable engines that enable data-preprocessing algorithms and offload application control paths to the network.

Fixed-Configuration Switches

The NVIDIA Quantum-2 family of fixed-configuration switches provides 64 ports of 400Gb/s or 128 ports of 200Gb/s on 32 physical octal small form-factor pluggable (OSFP) connectors. The compact 1U switch design includes air-cooled and liquid-cooled versions that are either internally or externally managed.

The NVIDIA Quantum-2 family of fixed-configuration switches delivers an aggregated 51.2 terabits per second (Tb/s) of bidirectional throughput with a capacity of more than 66.5 billion packets per second.
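The 51.2Tb/s figure follows from the port counts and port speeds given above, counting both directions of every port. A short Python sketch of that arithmetic is shown here. The function name is illustrative and is not part of any NVIDIA tool.

```python
def bidirectional_tbps(ports: int, gbps_per_port: int) -> float:
    """Aggregate throughput in Tb/s, counting send and receive on every port."""
    return ports * gbps_per_port * 2 / 1000


# The two port modes of a fixed-configuration switch.
full_speed = bidirectional_tbps(ports=64, gbps_per_port=400)
split_mode = bidirectional_tbps(ports=128, gbps_per_port=200)

print(full_speed, split_mode)  # 51.2 51.2
```

Both port modes give the same aggregate, since splitting a 400Gb/s port into two 200Gb/s ports does not change the total bandwidth of the switch.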

Modular Switches

The NVIDIA Quantum-2 family of modular switches provides these port configurations:

  •  2,048 ports of 400Gb/s or 4,096 ports of 200Gb/s
  •  1,024 ports of 400Gb/s or 2,048 ports of 200Gb/s
  •  512 ports of 400Gb/s or 1,024 ports of 200Gb/s

The largest modular switch carries a total bidirectional throughput of 1.64 petabits per second (Pb/s), 5X that of the previous-generation NVIDIA Quantum InfiniBand modular switch.
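The 1.64 petabits per second figure can be checked against the port configurations listed above. The sketch below, in Python, applies the same ports × speed × 2 calculation to each modular configuration. The constant and function names are illustrative assumptions.

```python
# 400Gb/s port counts of the three modular configurations listed above.
MODULAR_400G_PORTS = [2048, 1024, 512]


def bidirectional_pbps(ports_400g: int) -> float:
    """Aggregate bidirectional throughput in Pb/s for a count of 400Gb/s ports."""
    return ports_400g * 400 * 2 / 1_000_000


for ports in MODULAR_400G_PORTS:
    print(
        f"{ports} x 400Gb/s (or {ports * 2} x 200Gb/s): "
        f"{bidirectional_pbps(ports):.2f} Pb/s"
    )
```

For the largest configuration this gives 1.6384Pb/s, which rounds to the 1.64 quoted above.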

Transceivers and Cables

The NVIDIA Quantum-2 connectivity options provide maximum flexibility to build a topology of choice. They include a variety of transceivers and multi-fiber push-on connectors (MPOs), active copper cables (ACCs), and direct-attach copper cables (DACs) with 1-to-2 and 1-to-4 splitter options.

Backward compatibility is also available to connect new 400Gb/s clusters to existing 200Gb/s- or 100Gb/s-based infrastructures.

World-Leading Networking Performance, Scalability, and Efficiency

Performance

  • 400Gb/s bandwidth per port
  • 64 ports of 400Gb/s or 128 ports of 200Gb/s in a single switch
  • 2,048 ports of 400Gb/s or 4,096 ports of 200Gb/s in a single modular switch
  • Over 66.5 billion packets per second (bidirectional) from a single NVIDIA Quantum-2 switch device

Breaking Our Own Records

  • 2X the bandwidth per port versus previous generation
  • 3X the switch radix versus previous generation
  • 4X MPI performance
  • 32X higher AI acceleration power per switch versus previous generation
  • Over one million 400Gb/s nodes in a four-switch-tier (three hops) DragonFly+ network, 6.5X higher than the previous generation
  • 7% reduction in data center power and space

Key Features

  • Full transport offload
  • RDMA, GPUDirect® RDMA, GPUDirect Storage
  • Programmable In-Network Computing engines
  • MPI All-to-All hardware acceleration
  • MPI Tag Matching hardware acceleration
  • NVIDIA SHARPv3
  • Advanced adaptive routing, congestion control, and QoS
  • Self-healing networking
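For readers unfamiliar with MPI tag matching, the following Python sketch illustrates the general matching rule that such hardware acceleration targets: an incoming message is paired with the earliest posted receive whose source and tag match, with wildcards allowed, and unmatched messages wait in an unexpected-message queue. This is a conceptual model of standard MPI semantics only. The class and method names are hypothetical and do not describe how ConnectX-7 or NVIDIA Quantum-2 implement it.

```python
from typing import List, Optional, Tuple

ANY = -1  # wildcard, playing the role of MPI_ANY_SOURCE / MPI_ANY_TAG


def _matches(want_source: int, want_tag: int, source: int, tag: int) -> bool:
    """True if a posted receive (want_*) accepts a message from (source, tag)."""
    return want_source in (ANY, source) and want_tag in (ANY, tag)


class TagMatcher:
    """Toy model of posted-receive and unexpected-message queues."""

    def __init__(self) -> None:
        self.posted: List[Tuple[int, int, str]] = []      # (source, tag, buffer)
        self.unexpected: List[Tuple[int, int, str]] = []  # (source, tag, payload)

    def post_receive(self, source: int, tag: int, buffer: str) -> Optional[str]:
        """Post a receive; return a waiting payload if one already matches."""
        for i, (msg_source, msg_tag, payload) in enumerate(self.unexpected):
            if _matches(source, tag, msg_source, msg_tag):
                del self.unexpected[i]
                return payload
        self.posted.append((source, tag, buffer))
        return None

    def on_arrival(self, source: int, tag: int, payload: str) -> Optional[str]:
        """Handle an incoming message; return the buffer it lands in, if any."""
        for i, (want_source, want_tag, buffer) in enumerate(self.posted):
            if _matches(want_source, want_tag, source, tag):
                del self.posted[i]  # earliest matching receive wins
                return buffer
        self.unexpected.append((source, tag, payload))
        return None


matcher = TagMatcher()
matcher.post_receive(source=ANY, tag=7, buffer="buf_a")
matcher.post_receive(source=2, tag=ANY, buffer="buf_b")

first = matcher.on_arrival(source=2, tag=7, payload="x")   # "buf_a"
second = matcher.on_arrival(source=2, tag=9, payload="y")  # "buf_b"
third = matcher.on_arrival(source=3, tag=1, payload="z")   # None, queued
late = matcher.post_receive(source=3, tag=1, buffer="buf_c")  # "z"
print(first, second, third, late)
```

On a host CPU this search runs in software for every message. Offloading it to the network adapter is what the "MPI Tag Matching hardware acceleration" feature refers to.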

Read the full architecture brief to learn more about the NVIDIA Quantum-2 InfiniBand platform.