Turn your data center into a high-performance AI factory with NVIDIA Enterprise Reference Architectures.
Overview
NVIDIA Enterprise Reference Architectures (Enterprise RAs) enable organizations to design, deploy, and scale high-performance AI factories using validated, repeatable infrastructure. These designs combine certified compute, high-speed east-west and north-south networking, observability tools, and software to ensure scalable performance, from four-node clusters to enterprise-scale environments.
Enterprise Reference Architectures
A comprehensive suite of instructions for setting up AI clusters in the data center is now available.
Use Cases
Accelerate agentic AI, physical AI, high-performance computing (HPC), and AI simulation workloads with proven NVIDIA Enterprise Reference Architectures and NVIDIA-Certified Systems from global partners. The primary infrastructure cluster configurations for deploying enterprise AI factories are outlined below.
The NVIDIA RTX PRO™ AI Factory configuration is designed for a broad spectrum of enterprise workloads, including generative and agentic AI, data analytics, visual computing, and engineering simulation. Deployments are optimized around 16- and 32-node design points, providing an ideal balance of performance, scalability, and deployment efficiency. Built for universal workload acceleration across enterprise AI, simulation, and visual computing, NVIDIA RTX PRO Servers are optimized for PCIe environments, making them ideal for space-, power-, and cooling-constrained data centers. Purpose-built for modern AI workloads, they deliver efficient performance for agentic AI and large language model (LLM) inference.
The high-performance NVIDIA HGX™ AI Factory configuration is purpose-built for multi-node AI training and inference at scale, leveraging NVIDIA HGX systems. Available in 32-, 64-, and 128-node design points and supported by NVIDIA Spectrum-X™ networking, the architecture features a flexible, rail-optimized design that enables efficient integration across diverse rack layouts while delivering high-throughput, low-latency performance. It provides breakthrough performance for AI power users running the most demanding workloads, enables large-scale model training and fine-tuning, and dramatically accelerates inference. With next-generation precision and ultra-fast interconnects, the solution achieves up to 15x higher token throughput.
The NVIDIA NVL72 AI Factory configuration is designed to train and deploy trillion-parameter models, delivering exascale computing power within a single rack. Built for massive model throughput, multi-user inference, and real-time inference at scale, it enables the next generation of AI-driven innovation. Deployment design points center on four- and eight-rack configurations. Built on a flexible, rail-optimized network, the architecture adapts to diverse rack layouts and system designs while delivering high-bandwidth, low-latency performance. The platform delivers exceptional AI factory output with industry-leading energy efficiency and is powered by fifth-generation NVIDIA NVLink™, FP4 Tensor Cores, and advanced thermal innovations.
Benefits
Unlock scalable, high-performance AI infrastructure with proven, partner-ready configurations.
Meet the intensive demands of AI inference, fine-tuning, and training with architectures that ensure full GPU utilization and performance consistency across multi-node clusters.
Easily expand your infrastructure and ensure scalable, streamlined deployment for up to 128 nodes. Build the foundation for full-stack solutions with the NVIDIA Enterprise AI Factory validated design, which leverages our software ecosystem.
Simplify deployment with efficient designs that reduce complexity and total cost of ownership (TCO) while accelerating time to value.
Follow specific, standardized design patterns to achieve consistent operation from one installation to the next, reduce the need for frequent support, and enable faster resolution times.
Partners
We’re proud to collaborate with leading partners as they bring Enterprise Reference Architectures and AI factory solutions to market. Designs from these partners have passed our Design Review Board, earning our endorsement in one or more of the following categories: infrastructure, networking logic, and software.
The Palantir Sovereign AI OS Reference Architecture is based on NVIDIA Enterprise RAs, tested and qualified to run Palantir's complete software suite on NVIDIA AI infrastructure with our global system partners. This sovereign AI architecture is critical for customers with latency-sensitive workflows, data sovereignty requirements, and high geographic distribution. The architecture provides enterprises with total control over their data, AI models, and applications.
Resources
NVIDIA built a unified AI factory to scale generative AI and agentic workflows across the enterprise, ensuring security, performance, and consistency. The platform supports hundreds of AI agents that accelerate innovation, streamline software and hardware engineering, and optimize supply chain operations—reducing planning times by over 95 percent and achieving decades’ worth of engineering work in just one year.