NVIDIA HGX AI Supercomputer

The world’s leading AI computing platform.

Purpose-Built for AI and HPC

AI, complex simulations, and massive datasets require multiple GPUs with extremely fast interconnections and a fully accelerated software stack. The NVIDIA HGX™ AI supercomputing platform brings together the full power of NVIDIA GPUs, NVLink®, NVIDIA networking, and fully optimized AI and high-performance computing (HPC) software stacks to provide the highest application performance and drive the fastest time to insights. 

Unmatched End-to-End Accelerated Computing Platform

The NVIDIA HGX H200 combines H200 Tensor Core GPUs with high-speed interconnects to form the world’s most powerful servers. Configurations of up to eight GPUs deliver unprecedented acceleration, with up to 1.1 terabytes (TB) of GPU memory and 38 terabytes per second (TB/s) of aggregate memory bandwidth. This, combined with a staggering 32 petaFLOPS of performance, creates the world’s most powerful accelerated scale-up server platform for AI and HPC.
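The aggregate figures above follow directly from NVIDIA's published per-GPU H200 specifications (141 GB of HBM3e and 4.8 TB/s of memory bandwidth per GPU, figures not stated on this page). A quick back-of-envelope check:

```python
# Sanity check of the HGX H200 8-GPU aggregate figures, assuming the
# published per-GPU H200 specs: 141 GB HBM3e, 4.8 TB/s memory bandwidth.
GPUS = 8
HBM_PER_GPU_GB = 141   # H200 HBM3e capacity per GPU
BW_PER_GPU_TBS = 4.8   # H200 memory bandwidth per GPU

total_memory_tb = GPUS * HBM_PER_GPU_GB / 1000   # ~1.1 TB
total_bandwidth_tbs = GPUS * BW_PER_GPU_TBS      # ~38.4 TB/s

print(f"Aggregate memory:    {total_memory_tb:.1f} TB")
print(f"Aggregate bandwidth: {total_bandwidth_tbs:.1f} TB/s")
```

Both results round to the 1.1 TB and 38 TB/s quoted above.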

Both the HGX H200 and HGX H100 include advanced networking options—at speeds up to 400 gigabits per second (Gb/s)—utilizing NVIDIA Quantum-2 InfiniBand and Spectrum™-X Ethernet for the highest AI performance. HGX H200 and HGX H100 also include NVIDIA® BlueField®-3 data processing units (DPUs) to enable cloud networking, composable storage, zero-trust security, and GPU compute elasticity in hyperscale AI clouds.

HGX Stack

Deep Learning Training: Performance and Scalability

Up to 5X Faster Training at Scale

NVIDIA H200 and H100 GPUs feature the Transformer Engine, with FP8 precision, which provides up to 5X faster training over the previous GPU generation for large language models. The combination of fourth-generation NVLink—which offers 900GB/s of GPU-to-GPU interconnect—PCIe Gen5, and Magnum IO™ software delivers efficient scalability, from small enterprises to massive unified GPU clusters. These infrastructure advances, working in tandem with the NVIDIA AI Enterprise software suite, make HGX H200 and HGX H100 the world’s leading AI computing platform.
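The 900GB/s figure follows from the published fourth-generation NVLink layout, which is not spelled out on this page: 18 links per GPU, each providing 50 GB/s of bidirectional bandwidth. A one-line sketch:

```python
# Where the 900GB/s GPU-to-GPU figure comes from, assuming NVIDIA's
# published fourth-generation NVLink layout (not stated on this page).
LINKS_PER_GPU = 18   # fourth-gen NVLink links per H100/H200 GPU
GB_S_PER_LINK = 50   # bidirectional bandwidth per link

gpu_to_gpu_bw = LINKS_PER_GPU * GB_S_PER_LINK
print(f"NVLink GPU-to-GPU bandwidth: {gpu_to_gpu_bw} GB/s")  # 900 GB/s
```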

Deep Learning Inference: Performance and Versatility

Up to 30X Higher AI Inference Performance on the Largest Models

Megatron chatbot inference with 530 billion parameters.

Real-Time Deep Learning Inference

AI solves a wide array of business challenges using an equally wide array of neural networks. A great AI inference accelerator must deliver not only the highest performance but also the versatility to accelerate these networks in any location customers choose to deploy them, from data center to edge.

HGX H200 and HGX H100 extend NVIDIA’s market leadership in inference, accelerating inference by up to 30X over the prior generation on Megatron 530 billion parameter chatbots.

HPC Performance

Up to 110X Higher Performance for HPC Applications

Memory bandwidth is crucial for high-performance computing applications because it enables faster data transfer and reduces bottlenecks in complex processing. For memory-intensive HPC applications like simulations, scientific research, and artificial intelligence, H200’s higher memory bandwidth ensures that data can be accessed and manipulated efficiently, delivering up to 110X faster time to results compared to CPUs.
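For a purely bandwidth-bound kernel, runtime scales inversely with memory bandwidth, so a single-GPU gain is easy to estimate; the 110X figure quoted above is an application-level result on the full platform, not a single-kernel bandwidth ratio. A rough sketch, where the H200 figure is NVIDIA's published spec and the CPU figure is an assumed ballpark for a modern dual-socket server:

```python
# Why memory bandwidth dominates for bandwidth-bound HPC kernels.
# H200 bandwidth is the published spec; the server figure is an
# ASSUMED ballpark (two sockets at ~0.3 TB/s each) and varies widely.
H200_BW_TBS = 4.8
CPU_SERVER_BW_TBS = 0.6  # assumption, not from this page

# A bandwidth-bound kernel (e.g., STREAM-like) runs faster roughly in
# proportion to the bandwidth ratio:
speedup = H200_BW_TBS / CPU_SERVER_BW_TBS
print(f"Bandwidth-bound speedup, single GPU vs. server: {speedup:.0f}x")
```

Under these assumptions a single GPU yields roughly an 8x kernel-level gain; the larger application-level speedups come from combining multiple GPUs, Tensor Core compute, and software optimization.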

Accelerating HGX With NVIDIA Networking

The data center is the new unit of computing, and networking plays an integral role in scaling application performance across it. Paired with NVIDIA Quantum InfiniBand, HGX delivers world-class performance and efficiency, which ensures the full utilization of computing resources.

For AI cloud data centers that deploy Ethernet, HGX is best used with the NVIDIA Spectrum-X networking platform, which powers the highest AI performance over 400Gb/s Ethernet. Featuring NVIDIA Spectrum™-4 switches and BlueField-3 DPUs, Spectrum-X delivers consistent, predictable outcomes for thousands of simultaneous AI jobs at every scale through optimal resource utilization and performance isolation. Spectrum-X enables advanced cloud multi-tenancy and zero-trust security. As a reference design for NVIDIA Spectrum-X, NVIDIA has designed Israel-1, a hyperscale generative AI supercomputer built with Dell PowerEdge XE9680 servers based on the NVIDIA HGX™ H100 eight-GPU platform, BlueField-3 DPUs, and Spectrum-4 switches.

Connecting HGX H200 or HGX H100 with NVIDIA Networking

NVIDIA Quantum-2 InfiniBand Platform:

Quantum-2 Switch, ConnectX-7 Adapter, BlueField-3 DPU

NVIDIA Spectrum-X Platform:

Spectrum-4 Switch, BlueField-3 DPU, Spectrum-X License

NVIDIA Spectrum Ethernet Platform:

Spectrum Switch, ConnectX Adapter, BlueField DPU


NVIDIA HGX Specifications

NVIDIA HGX is available in single baseboards with four or eight H200, H100, or A100 GPUs. These powerful combinations of hardware and software lay the foundation for unprecedented AI supercomputing performance.

HGX H200
                                              4-GPU                         8-GPU
GPUs                                          HGX H200 4-GPU                HGX H200 8-GPU
Form factor                                   4x NVIDIA H200 SXM            8x NVIDIA H200 SXM
HPC and AI compute (FP64/TF32/FP16/FP8/INT8)  268TF/4PF/8PF/16PF/16 POPS    535TF/8PF/16PF/32PF/32 POPS
Memory                                        Up to 564GB                   Up to 1.1TB
NVLink                                        Fourth generation             Fourth generation
NVSwitch                                      N/A                           Third generation
NVSwitch GPU-to-GPU bandwidth                 N/A                           900GB/s
Total aggregate bandwidth                     3.6TB/s                       7.2TB/s

HGX H100
                                              4-GPU                         8-GPU
GPUs                                          HGX H100 4-GPU                HGX H100 8-GPU
Form factor                                   4x NVIDIA H100 SXM            8x NVIDIA H100 SXM
HPC and AI compute (FP64/TF32/FP16/FP8/INT8)  268TF/4PF/8PF/16PF/16 POPS    535TF/8PF/16PF/32PF/32 POPS
Memory                                        Up to 320GB                   Up to 640GB
NVLink                                        Fourth generation             Fourth generation
NVSwitch                                      N/A                           Third generation
NVLink Switch                                 N/A                           N/A
NVSwitch GPU-to-GPU bandwidth                 N/A                           900GB/s
Total aggregate bandwidth                     3.6TB/s                       7.2TB/s

HGX A100
                                              4-GPU                         8-GPU
GPUs                                          HGX A100 4-GPU                HGX A100 8-GPU
Form factor                                   4x NVIDIA A100 SXM            8x NVIDIA A100 SXM
HPC and AI compute (FP64/TF32/FP16/INT8)      78TF/1.25PF/2.5PF/5 POPS      156TF/2.5PF/5PF/10 POPS
Memory                                        Up to 320GB                   Up to 640GB
NVLink                                        Third generation              Third generation
NVSwitch                                      N/A                           Second generation
NVSwitch GPU-to-GPU bandwidth                 N/A                           600GB/s
Total aggregate bandwidth                     2.4TB/s                       4.8TB/s
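The 8-GPU rows of the specifications above can be compared generation over generation with simple arithmetic. A short sketch, with all figures copied from the tables and the ratios derived from them:

```python
# 8-GPU baseboard figures from the HGX specification tables above.
BASEBOARDS = {
    "HGX H200 8-GPU": {"memory_tb": 1.1,  "agg_bw_tbs": 7.2},
    "HGX H100 8-GPU": {"memory_tb": 0.64, "agg_bw_tbs": 7.2},
    "HGX A100 8-GPU": {"memory_tb": 0.64, "agg_bw_tbs": 4.8},
}

h200 = BASEBOARDS["HGX H200 8-GPU"]
h100 = BASEBOARDS["HGX H100 8-GPU"]
a100 = BASEBOARDS["HGX A100 8-GPU"]

# H200 adds memory capacity over H100; H100 added NVLink bandwidth over A100.
print(f"H200 vs. H100 GPU memory:          {h200['memory_tb'] / h100['memory_tb']:.1f}x")
print(f"H100 vs. A100 aggregate bandwidth: {h100['agg_bw_tbs'] / a100['agg_bw_tbs']:.1f}x")
```

The memory jump (about 1.7x) comes entirely from HBM3e capacity, since H100 and H200 share the same fourth-generation NVLink fabric.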

Find out more about the NVIDIA H200 Tensor Core GPU.