NVIDIA Enterprise Reference Architectures

Enterprise Reference Architectures

Build AI Factories That Scale

Turn your data center into a high-performance AI factory with NVIDIA Enterprise Reference Architectures.

Overview

The Building Blocks for AI Success

NVIDIA Enterprise Reference Architectures (Enterprise RAs) enable organizations to design, deploy, and scale high-performance AI factories using validated, repeatable infrastructure. These designs combine certified compute, high-speed east-west and north-south networking, observability tools, and software to ensure scalable performance, from four-node clusters to enterprise-scale environments.

Palantir Teams With NVIDIA to Deliver Sovereign AI Operating System Reference Architecture

The Palantir Sovereign AI OS Reference Architecture is based on NVIDIA Enterprise RAs, tested and qualified to run Palantir's complete software suite on NVIDIA AI infrastructure.

Proven Design and Validated Performance

Learn how Enterprise RAs, built on real-world deployments and battle-tested configurations, simplify planning and maximize ROI for scalable AI infrastructure.

Enterprise Reference Architectures

Your Guide to the Complete Family

Enterprise RAs include a comprehensive suite of guides for setting up AI clusters in the data center, covering the areas below.

Infrastructure

NVIDIA Enterprise Reference Architectures start with validated hardware configurations, including CPU-GPU-networking node patterns, cabling diagrams, and infrastructure details.

Network Logic

The Networking Configuration and Logical Architecture Guide for Enterprise RAs provides instructions for node management and provisioning, covering VLAN design and network simulation on NVIDIA Air.
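As a rough illustration of the kind of VLAN separation such a guide describes, the sketch below carves management and provisioning networks out of a private address block. The VLAN IDs, subnet sizes, and network names are hypothetical placeholders, not values from the guide itself.

```python
import ipaddress

def plan_vlans(base_cidr: str, prefix: int, names: list[str],
               first_vlan_id: int = 100) -> dict:
    """Assign one subnet and VLAN ID per logical network (illustrative only)."""
    subnets = ipaddress.ip_network(base_cidr).subnets(new_prefix=prefix)
    return {
        name: {"vlan_id": first_vlan_id + i, "subnet": str(next(subnets))}
        for i, name in enumerate(names)
    }

# Hypothetical management/provisioning split for a small cluster.
plan = plan_vlans("10.0.0.0/16", 24, ["management", "provisioning"])
print(plan)
```

A real deployment would map these logical networks onto switch and host interface configurations; this sketch only shows the addressing side of the plan.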

Software

Our software reference stack for Enterprise RAs outlines the software for managing, provisioning, and sizing infrastructure clusters. Current releases focus on open-source Kubernetes, with NVIDIA AI Enterprise and NVIDIA Run:ai software.
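On a Kubernetes cluster with the NVIDIA device plugin or GPU Operator installed, workloads request GPUs through the `nvidia.com/gpu` extended resource. The sketch below builds a minimal Pod manifest that way; the pod name and container image are hypothetical examples, not part of the reference stack.

```python
def gpu_pod_spec(name: str, image: str, gpus: int) -> dict:
    """Minimal Kubernetes Pod manifest requesting NVIDIA GPUs via the
    nvidia.com/gpu extended resource exposed by the NVIDIA device plugin."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": name,
                "image": image,
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
            }],
            "restartPolicy": "Never",
        },
    }

# Hypothetical single-GPU inference pod.
spec = gpu_pod_spec("llm-inference", "nvcr.io/nvidia/tritonserver:latest", 1)
print(spec["spec"]["containers"][0]["resources"])
```

Serialized to YAML, a manifest like this could be applied with `kubectl apply -f`; finer-grained GPU sharing and quota policies are where software such as NVIDIA Run:ai comes in.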

Observability

The Observability Guide for NVIDIA Enterprise Reference Architectures utilizes open-source tools, such as Prometheus and Grafana, to monitor GPU and networking performance across the entire cluster. Dashboards provide real-time metrics for system health and workload efficiency.
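As a small sketch of how such dashboards are fed, the example below builds a Prometheus instant-query URL for a GPU-utilization metric. It assumes a DCGM exporter scraped by Prometheus (the `DCGM_FI_DEV_GPU_UTIL` gauge and `Hostname` label come from that exporter); the Prometheus endpoint is a hypothetical placeholder.

```python
from urllib.parse import urlencode

def prom_query_url(base_url: str, promql: str) -> str:
    """Build a Prometheus instant-query URL (HTTP API, /api/v1/query)."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"

# Average GPU utilization per node, assuming dcgm-exporter metrics
# are available in Prometheus.
query = "avg by (Hostname) (DCGM_FI_DEV_GPU_UTIL)"
url = prom_query_url("http://prometheus.cluster.local:9090", query)
print(url)
```

A Grafana panel would typically issue the same PromQL through its Prometheus data source rather than hitting the HTTP API directly.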

Deployment

The Deployment Guide for NVIDIA Enterprise Reference Architectures is a collection of infrastructure best practices that our team has learned from bringing up, deploying, testing, and validating the in-house clusters on which we’ve built our program.

Storage

The NVIDIA-Certified Storage Program is a complementary effort by select partners who have created storage guides designed to integrate into Enterprise RAs. Learn more about this unique program.

Use Cases

Designed for Every Use Case

Accelerate agentic AI, physical AI, high-performance computing (HPC), and AI simulation workloads with proven NVIDIA Enterprise Reference Architectures and NVIDIA-Certified Systems from global partners. The primary infrastructure cluster configurations for deploying enterprise AI factories are outlined below.

NVIDIA RTX PRO AI Factory

The NVIDIA RTX PRO™ AI Factory configuration is designed for a broad spectrum of enterprise workloads, including generative and agentic AI, data analytics, visual computing, and engineering simulation. Deployments are optimized around 16- and 32-node design points, providing an ideal balance of performance, scalability, and deployment efficiency. Designed for universal workload acceleration across enterprise AI, simulation, and visual computing, NVIDIA RTX PRO Servers are optimized for PCIe environments, making them ideal for space-, power-, and cooling-constrained data centers. Purpose-built for modern AI workloads, they deliver efficient performance for agentic AI and large language model (LLM) inference.

NVIDIA HGX AI Factory

The high-performance NVIDIA HGX™ AI Factory configuration is purpose-built for multi-node AI training and inference at scale, leveraging NVIDIA HGX systems. Available in 32-, 64-, and 128-node design points and supported by NVIDIA Spectrum-X™ networking, the architecture features a flexible, rail-optimized design that enables efficient integration across diverse rack layouts while delivering high-throughput, low-latency performance. It provides breakthrough performance for AI power users running the most demanding workloads, enables large-scale model training and fine-tuning, and dramatically accelerates inference. With next-generation precision and ultra-fast interconnects, the solution achieves up to 15x higher token throughput.

NVIDIA NVL72 AI Factory

The NVIDIA NVL72 AI Factory configuration is designed to train and deploy trillion-parameter models, delivering exascale computing power within a single rack. Built for massive model throughput, multi-user inference, and real-time inference at scale, it enables the next generation of AI-driven innovation. Deployment design points center on four- and eight-rack configurations. Built on a flexible, rail-optimized network, the architecture adapts to diverse rack layouts and system designs while delivering high-bandwidth, low-latency performance. The platform delivers exceptional AI factory output with industry-leading energy efficiency and is powered by fifth-generation NVIDIA NVLink™, FP4 Tensor Cores, and advanced thermal innovations.

Benefits

The Strategic Value of Enterprise RAs

Unlock scalable, high-performance AI infrastructure with proven, partner-ready configurations.

Peak Performance for AI Workloads

Meet the intensive demands of AI inference, fine-tuning, and training with architectures that ensure full GPU utilization and performance consistency across multi-node clusters.

Flexible Scaling, Simplified Operations

Easily expand your infrastructure and ensure scalable, streamlined deployment for up to 128 nodes. Build the foundation for full-stack solutions with the NVIDIA Enterprise AI Factory validated design, which leverages our software ecosystem.

Reduce Complexity and TCO

Simplify deployment with efficient, prescriptive designs that reduce complexity and total cost of ownership (TCO) while shortening time to value.

Supportability

Follow specific, standardized design patterns to achieve consistent operation from one installation to the next, reduce the need for frequent support, and enable faster resolution times.

Partners

Partnered for Performance

We’re proud to collaborate with leading partners as they bring Enterprise Reference Architectures and AI factory solutions to market. Designs from these partners have passed our Design Review Board, earning NVIDIA’s endorsement in one or more of the following categories: infrastructure, networking logic, and software.

Palantir Sovereign AI OS Reference Architecture With NVIDIA

The Palantir Sovereign AI OS Reference Architecture is based on NVIDIA Enterprise RAs, tested and qualified to run Palantir's complete software suite on NVIDIA AI infrastructure with our global system partners. This sovereign AI architecture is critical for customers with latency-sensitive workflows, data sovereignty requirements, and high geographic distribution. The architecture provides enterprises with total control over their data, AI models, and applications.

Resources

Learn More About Enterprise RAs

NVIDIA RTX PRO AI Factory Reference Architecture

The NVIDIA RTX PRO AI Factory configuration supports a broad range of enterprise workloads, including agentic AI inference, physical and industrial AI, visual computing, and high-performance computing for data analytics and simulation. This document details the hardware components underpinning this scalable and modular architecture.

NVIDIA HGX AI Factory Reference Architecture

The NVIDIA HGX AI Factory configuration is focused on high-performance AI inference, model training, and fine-tuning. This document outlines the hardware components of a scalable, modular architecture, including cluster guidance and network fabric topologies used to interconnect the cluster.

Unlock Massive Token Throughput with NVIDIA Run:ai

Joint benchmarking with Nebius shows that fractional GPU deployments using NVIDIA Run:ai on NVIDIA Enterprise Reference Architectures significantly improve throughput and utilization for production LLM workloads.

NVIDIA Enterprise Reference Architecture Overview

This whitepaper introduces NVIDIA Enterprise Reference Architectures, which provide proven guidance for designing and building AI factories for enterprise-class deployments ranging from 32 to 1,024 GPUs. These architectures help simplify AI infrastructure deployment, reduce operational complexity, and accelerate time to value.

North-South Networks: The Key to Faster Enterprise AI Workloads

NVIDIA Enterprise Reference Architectures guide organizations in deploying AI factories that utilize both north-south and east-west networks, providing design recipes for scalable, secure, and high-performing AI infrastructure.

Deploying NVIDIA H200 NVL at Scale With a New Enterprise Reference Architecture

NVIDIA H200 NVL accelerates AI deployment with enhanced memory, high-speed NVLink, and an optimized Enterprise RA configuration.

NVIDIA’s AI Factory Drives Enterprise Innovation at Scale

NVIDIA built a unified AI factory to scale generative AI and agentic workflows across the enterprise, ensuring security, performance, and consistency. The platform supports hundreds of AI agents that accelerate innovation, streamline software and hardware engineering, and optimize supply chain operations—reducing planning times by over 95 percent and achieving decades’ worth of engineering work in just one year.

NVIDIA Blackwell Ultra Delivers up to 50x Better Performance and 35x Lower Cost for Agentic AI

Built to accelerate the next generation of agentic AI, NVIDIA Blackwell Ultra delivers breakthrough inference performance with dramatically lower cost. Cloud providers such as Microsoft, CoreWeave, and Oracle Cloud Infrastructure are deploying NVIDIA GB300 NVL72 systems at scale for low-latency and long-context use cases, such as agentic coding and coding assistants.

This is enabled by deep co-design across NVIDIA Blackwell, NVLink™, and NVLink Switch for scale-up; NVFP4 for low-precision accuracy; and NVIDIA Dynamo and TensorRT™ LLM for speed and flexibility—as well as development with community frameworks such as SGLang and vLLM.

Next Steps

Ready to Get Started?

Learn more about NVIDIA Enterprise AI Factory.

Take a Deeper Dive Into NVIDIA Enterprise Reference Architectures

Explore how NVIDIA Enterprise Reference Architectures provide scalable, prescriptive blueprints for deploying high-performance AI infrastructure.

Cluster Configuration 2-8-5-200 Specs

Cluster Configuration 2-8-9-400 Specs

Cluster Configuration 2-4-6-400 Specs

Cisco is the worldwide technology leader that is revolutionizing the way organizations connect and protect in the AI era. For more than 40 years, Cisco has securely connected the world. With its industry-leading AI-powered solutions and services, Cisco enables its customers, partners, and communities to unlock innovation, enhance productivity, and strengthen digital resilience. With purpose at its core, Cisco remains committed to creating a more connected and inclusive future for all.

NVIDIA Design Review Board-endorsed solutions:

Dell Technologies helps organizations and individuals build their digital future and transform how they work, live, and play. The company provides customers with the industry’s broadest and most innovative technology and services portfolio for the AI era.

NVIDIA Design Review Board-endorsed solutions:

HPE is a leader in essential enterprise technology, bringing together the power of AI, cloud, and networking to help organizations achieve more. As pioneers of possibility, our innovation and expertise advance the way people live and work. We empower our customers across industries to optimize operational performance, transform data into foresight, and maximize their impact. Unlock your boldest ambitions with HPE.

NVIDIA Design Review Board-endorsed solutions:

Lenovo is a US$69B revenue global technology powerhouse, ranked #196 in the Fortune Global 500, and serving millions of customers every day in 180 markets. Focused on a bold vision to deliver Smarter Technology for All, our ongoing partnership with NVIDIA combines Lenovo servers with accelerated GPUs. The Lenovo Hybrid AI Advantage™ with NVIDIA boosts productivity and innovation with faster AI deployment, powered by the Lenovo AI Library and a full-stack portfolio of AI infrastructure, devices, solutions, and services.

NVIDIA Design Review Board-endorsed solutions:

Supermicro is a global leader in application-optimized total IT solutions. Founded and operating in San Jose, California, Supermicro is committed to delivering first-to-market innovation for enterprise, cloud, AI, and 5G telco/edge IT infrastructure. We are a total IT solutions provider with server, AI, storage, IoT, switch systems, software, and support services. Supermicro’s motherboard, power, and chassis design expertise further enables our development and production, enabling next-generation innovation from cloud to edge for our global customers.

NVIDIA Design Review Board-endorsed solutions: