NVIDIA Enterprise Reference Architectures

Enterprise Reference Architectures

Build AI Factories That Scale

Turn your data center into a high-performance AI factory with NVIDIA Enterprise Reference Architectures.

Overview

The Building Blocks for AI Success

NVIDIA Enterprise Reference Architectures (Enterprise RAs) enable organizations to design, deploy, and scale high-performance AI factories using validated, repeatable infrastructure. These designs combine certified compute, high-speed east-west and north-south networking, observability tools, and software to ensure scalable performance, from four-node clusters to enterprise-scale environments.

Palantir Teams With NVIDIA to Deliver Sovereign AI Operating System Reference Architecture

The Palantir Sovereign AI OS Reference Architecture is based on NVIDIA Enterprise RAs, tested and qualified to run Palantir's complete software suite on NVIDIA AI infrastructure.

Proven Design and Validated Performance

Learn how Enterprise RAs, built on real-world deployments and battle-tested configurations, simplify planning and maximize ROI for scalable AI infrastructure.

Enterprise Reference Architectures

Your Guide to the Complete Family

Enterprise RAs include a comprehensive suite of guides for setting up AI clusters in the data center, covering the areas below.

Infrastructure

NVIDIA Enterprise Reference Architectures start with validated hardware configurations, including CPU-GPU-networking node patterns, cabling diagrams, and infrastructure details.

Network Logic

The Networking Configuration and Logical Architecture Guide for Enterprise RAs provides instructions for node management and provisioning, covering VLAN design and network simulation on NVIDIA Air.
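As a rough illustration of the kind of VLAN separation such a guide describes, the sketch below carves management and provisioning networks out of a private address block. The VLAN IDs, subnet sizes, and network names are hypothetical placeholders, not values from the guide itself.

```python
import ipaddress

def plan_vlans(base_cidr: str, prefix: int, names: list[str],
               first_vlan_id: int = 100) -> dict:
    """Assign one subnet and VLAN ID per logical network (illustrative only)."""
    subnets = ipaddress.ip_network(base_cidr).subnets(new_prefix=prefix)
    return {
        name: {"vlan_id": first_vlan_id + i, "subnet": str(next(subnets))}
        for i, name in enumerate(names)
    }

# Hypothetical management/provisioning split for a small cluster.
plan = plan_vlans("10.0.0.0/16", 24, ["management", "provisioning"])
print(plan)
```

A real deployment would map these logical networks onto switch and host interface configurations; this sketch only shows the addressing side of the plan.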

Software

Our software reference stack for Enterprise RAs outlines the software for managing, provisioning, and sizing infrastructure clusters. Current releases focus on open-source Kubernetes, with NVIDIA AI Enterprise and NVIDIA Run:ai software.
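On a Kubernetes cluster with the NVIDIA device plugin or GPU Operator installed, workloads request GPUs through the `nvidia.com/gpu` extended resource. The sketch below builds a minimal Pod manifest that way; the pod name and container image are hypothetical examples, not part of the reference stack.

```python
def gpu_pod_spec(name: str, image: str, gpus: int) -> dict:
    """Minimal Kubernetes Pod manifest requesting NVIDIA GPUs via the
    nvidia.com/gpu extended resource exposed by the NVIDIA device plugin."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
            }],
            "restartPolicy": "Never",
        },
    }

# Hypothetical single-GPU inference pod.
spec = gpu_pod_spec("llm-inference", "nvcr.io/nvidia/tritonserver:latest", 1)
print(spec["spec"]["containers"][0]["resources"])
```

Serialized to YAML, a manifest like this could be applied with `kubectl apply -f`; finer-grained GPU sharing and quota policies are where software such as NVIDIA Run:ai comes in.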

Observability

The Observability Guide for NVIDIA Enterprise Reference Architectures utilizes open-source tools, such as Prometheus and Grafana, to monitor GPU and networking performance across the entire cluster. Dashboards provide real-time metrics for system health and workload efficiency.
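As a small sketch of how such dashboards are fed, the example below builds a Prometheus instant-query URL for a GPU-utilization metric. It assumes a DCGM exporter scraped by Prometheus (the `DCGM_FI_DEV_GPU_UTIL` gauge and `Hostname` label come from that exporter); the Prometheus endpoint is a hypothetical placeholder.

```python
from urllib.parse import urlencode

def prom_query_url(base_url: str, promql: str) -> str:
    """Build a Prometheus instant-query URL (HTTP API, /api/v1/query)."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"

# Average GPU utilization per node, assuming dcgm-exporter metrics
# are available in Prometheus.
query = "avg by (Hostname) (DCGM_FI_DEV_GPU_UTIL)"
url = prom_query_url("http://prometheus.cluster.local:9090", query)
print(url)
```

A Grafana panel would typically issue the same PromQL through its Prometheus data source rather than hitting the HTTP API directly.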

Deployment

The Deployment Guide for NVIDIA Enterprise Reference Architectures is a collection of infrastructure best practices that our team has learned from bringing up, deploying, testing, and validating the in-house clusters on which we’ve built our program.

Storage

The NVIDIA-Certified Storage Program is a complementary effort by select partners who have created storage guides designed to integrate into Enterprise RAs. Learn more about this unique program.

Use Cases

Designed for Every Use Case

Accelerate agentic AI, physical AI, high-performance computing (HPC), and AI simulation workloads with proven NVIDIA Enterprise Reference Architectures and NVIDIA-Certified Systems from global partners. The primary infrastructure cluster configurations for deploying enterprise AI factories are outlined below.

NVIDIA RTX PRO AI Factory

The NVIDIA RTX PRO™ AI Factory configuration is designed for a broad spectrum of enterprise workloads, including generative and agentic AI, data analytics, visual computing, and engineering simulation. Deployments are optimized around 16- and 32-node design points, providing an ideal balance of performance, scalability, and deployment efficiency. Designed for universal workload acceleration across enterprise AI, simulation, and visual computing, NVIDIA RTX PRO Servers are optimized for PCIe environments, making them ideal for space-, power-, and cooling-constrained data centers. Purpose-built for modern AI workloads, they deliver efficient performance for agentic AI and large language model (LLM) inference.

NVIDIA HGX AI Factory

The high-performance NVIDIA HGX™ AI Factory configuration is purpose-built for multi-node AI training and inference at scale, leveraging NVIDIA HGX systems. Available in 32-, 64-, and 128-node design points and supported by NVIDIA Spectrum-X™ networking, the architecture features a flexible, rail-optimized design that enables efficient integration across diverse rack layouts while delivering high-throughput, low-latency performance. It provides breakthrough performance for AI power users running the most demanding workloads, enables large-scale model training and fine-tuning, and dramatically accelerates inference. With next-generation precision and ultra-fast interconnects, the solution achieves up to 15x higher token throughput.

NVIDIA NVL72 AI Factory

The NVIDIA NVL72 AI Factory configuration is designed to train and deploy trillion-parameter models, delivering exascale computing power within a single rack. Built for massive model throughput, multi-user inference, and real-time inference at scale, it enables the next generation of AI-driven innovation. Deployment design points center on four- and eight-rack configurations. Built on a flexible, rail-optimized network, the architecture adapts to diverse rack layouts and system designs while delivering high-bandwidth, low-latency performance. The platform delivers exceptional AI factory output with industry-leading energy efficiency and is powered by fifth-generation NVIDIA NVLink™, FP4 Tensor Cores, and advanced thermal innovations.

Benefits

The Strategic Value of Enterprise RAs

Unlock scalable, high-performance AI infrastructure with proven, partner-ready configurations.

Peak Performance for AI Workloads

Meet the intensive demands of AI inference, fine-tuning, and training with architectures that ensure full GPU utilization and performance consistency across multi-node clusters.

Flexible Scaling, Simplified Operations

Easily expand your infrastructure and ensure scalable, streamlined deployment for up to 128 nodes. Build the foundation for full-stack solutions with the NVIDIA Enterprise AI Factory validated design, which leverages our software ecosystem.

Reduce Complexity and TCO

Simplify deployment with efficient, prescriptive designs that reduce complexity and total cost of ownership (TCO) while shortening time to value.

Supportability

Follow specific, standardized design patterns to achieve consistent operation from one installation to the next, reduce the need for frequent support, and enable faster resolution times.

Partners

Partnered for Performance

We’re proud to collaborate with leading partners as they bring Enterprise Reference Architectures and AI factory solutions to market. Designs from these partners have passed our Design Review Board, earning NVIDIA’s endorsement in one or more of the following categories: infrastructure, networking logic, and software.

Palantir Sovereign AI OS Reference Architecture With NVIDIA

The Palantir Sovereign AI OS Reference Architecture is based on NVIDIA Enterprise RAs, tested and qualified to run Palantir's complete software suite on NVIDIA AI infrastructure with our global system partners. This sovereign AI architecture is critical for customers with latency-sensitive workflows, data sovereignty requirements, and high geographic distribution. The architecture provides enterprises with total control over their data, AI models, and applications.

Resources

Learn More About Enterprise RAs

NVIDIA RTX PRO AI Factory Reference Architecture

The NVIDIA RTX PRO AI Factory configuration supports a broad range of enterprise workloads, including agentic AI inference, physical and industrial AI, visual computing, and high-performance computing for data analytics and simulation. This document details the hardware components underpinning this scalable and modular architecture.

NVIDIA HGX AI Factory Reference Architecture

The NVIDIA HGX AI Factory configuration is focused on high-performance AI inference, model training, and fine-tuning. This document outlines the hardware components of a scalable, modular architecture, including cluster guidance and network fabric topologies used to interconnect the cluster.

Unlock Massive Token Throughput with NVIDIA Run:ai

Joint benchmarking with Nebius shows that fractional GPU deployments using NVIDIA Run:ai on NVIDIA Enterprise Reference Architectures significantly improve throughput and utilization for production LLM workloads.

NVIDIA Enterprise Reference Architecture Overview

This whitepaper introduces NVIDIA Enterprise Reference Architectures, which provide proven guidance for designing and building AI factories for enterprise-class deployments ranging from 32 to 1,024 GPUs. These architectures help simplify AI infrastructure deployment, reduce operational complexity, and accelerate time to value.

North-South Networks: The Key to Faster Enterprise AI Workloads

NVIDIA Enterprise Reference Architectures guide organizations in deploying AI factories that utilize both north-south and east-west networks, providing design recipes for scalable, secure, and high-performing AI infrastructure.

Deploying NVIDIA H200 NVL at Scale With a New Enterprise Reference Architecture

NVIDIA H200 NVL accelerates AI deployment with enhanced memory, high-speed NVLink, and an optimized Enterprise RA configuration.

NVIDIA’s AI Factory Drives Enterprise Innovation at Scale

NVIDIA built a unified AI factory to scale generative AI and agentic workflows across the enterprise, ensuring security, performance, and consistency. The platform supports hundreds of AI agents that accelerate innovation, streamline software and hardware engineering, and optimize supply chain operations—reducing planning times by over 95 percent and achieving decades’ worth of engineering work in just one year.

NVIDIA Blackwell Ultra Delivers up to 50x Better Performance and 35x Lower Cost for Agentic AI

Built to accelerate the next generation of agentic AI, NVIDIA Blackwell Ultra delivers breakthrough inference performance with dramatically lower cost. Cloud providers such as Microsoft, CoreWeave, and Oracle Cloud Infrastructure are deploying NVIDIA GB300 NVL72 systems at scale for low-latency and long-context use cases, such as agentic coding and coding assistants.

This is enabled by deep co-design across NVIDIA Blackwell, NVLink™, and NVLink Switch for scale-up; NVFP4 for low-precision accuracy; and NVIDIA Dynamo and TensorRT™ LLM for speed and flexibility—as well as development with community frameworks such as SGLang and vLLM.

Next Steps

Ready to Get Started?

Learn more about NVIDIA Enterprise AI Factory.

Take a Deeper Dive Into NVIDIA Enterprise Reference Architectures

Explore how NVIDIA Enterprise Reference Architectures provide scalable, prescriptive blueprints for deploying high-performance AI infrastructure.

Cluster Configuration 2-8-5-200 Specs

Cluster Configuration 2-8-9-400 Specs

Cluster Configuration 2-4-6-400 Specs

Cisco is the worldwide technology leader that is revolutionizing the way organizations connect and protect in the AI era. For more than 40 years, Cisco has securely connected the world. With its industry-leading AI-powered solutions and services, Cisco enables its customers, partners, and communities to unlock innovation, enhance productivity, and strengthen digital resilience. With purpose at its core, Cisco remains committed to creating a more connected and inclusive future for all.

NVIDIA Design Review Board-endorsed solutions:

Dell Technologies helps organizations and individuals build their digital future and transform how they work, live, and play. The company provides customers with the industry’s broadest and most innovative technology and services portfolio for the AI era.

NVIDIA Design Review Board-endorsed solutions:

HPE is a leader in essential enterprise technology, bringing together the power of AI, cloud, and networking to help organizations achieve more. As pioneers of possibility, our innovation and expertise advance the way people live and work. We empower our customers across industries to optimize operational performance, transform data into foresight, and maximize their impact. Unlock your boldest ambitions with HPE.

NVIDIA Design Review Board-endorsed solutions:

Lenovo is a US$69B revenue global technology powerhouse, ranked #196 in the Fortune Global 500, and serving millions of customers every day in 180 markets. Focused on a bold vision to deliver Smarter Technology for All, our ongoing partnership with NVIDIA combines Lenovo servers with accelerated GPUs. The Lenovo Hybrid AI Advantage™ with NVIDIA boosts productivity and innovation with faster AI deployment, powered by the Lenovo AI Library and a full-stack portfolio of AI infrastructure, devices, solutions, and services.

NVIDIA Design Review Board-endorsed solutions:

Supermicro is a global leader in application-optimized total IT solutions. Founded and operating in San Jose, California, Supermicro is committed to delivering first-to-market innovation for enterprise, cloud, AI, and 5G telco/edge IT infrastructure. We are a total IT solutions provider with server, AI, storage, IoT, switch systems, software, and support services. Supermicro’s motherboard, power, and chassis design expertise further enables our development and production, enabling next-generation innovation from cloud to edge for our global customers.

NVIDIA Design Review Board-endorsed solutions: