NVIDIA Mission Control

Run models, automate the essentials.

Overview

Bringing the World’s Most Advanced AI Factory Expertise to Every Business

NVIDIA Mission Control™ streamlines every aspect of the AI factory, from developer workload scheduling and orchestration to monitoring and autonomous recovery, while empowering platform teams to operate efficiently and scale confidently with fully supported software. It powers NVIDIA Blackwell and NVIDIA Rubin data centers for the newest frontiers of AI, combining real-time visibility, precise control over performance, power, and cooling, and always-on resilience to maximize AI factory ROI. Mission Control lets every enterprise run AI with the efficiency of today’s hyperscalers, accelerating AI token production.

Manage and Run AI Factories

NVIDIA Mission Control simplifies AI operations, from cluster deployment to workload orchestration to building management integration, bringing agility, resilience, and hyperscale efficiency to enterprises.

Automating AI Factory Operations

Accelerate AI experimentation with the new software platform for NVIDIA Blackwell infrastructure that powers every aspect of AI factory operations.

Technology

AI Data Center Operations and Orchestration

Simplify how AI factories are deployed and operated throughout the entire cluster life cycle.

Advanced Power Optimizations

Run at 85% power while retaining 93% of performance throughput in power-constrained or cost-conscious environments, using validated implementations of NVIDIA’s latest power innovations.
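
To put the 85% power and 93% throughput figures in perspective, the short Python sketch below works through the arithmetic. It is only an illustration: the function names, the 1,000 kW budget, and the 100 kW per-system draw are hypothetical values chosen for the example, not NVIDIA specifications or part of any Mission Control interface. Under those assumptions, throughput per unit of power improves by about 9.4%, and a fixed power budget can host more systems.

```python
def efficiency_gain(power_fraction: float, throughput_fraction: float) -> float:
    """Relative change in throughput per unit of power versus the uncapped baseline."""
    if power_fraction <= 0:
        raise ValueError("power_fraction must be positive")
    return throughput_fraction / power_fraction - 1.0


def systems_within_budget(budget_kw: float, system_kw: float,
                          power_fraction: float = 1.0) -> int:
    """Number of whole systems that fit in a power budget at a given power cap."""
    return int(budget_kw // (system_kw * power_fraction))


if __name__ == "__main__":
    # Figures from the text: 85% power, 93% throughput.
    gain = efficiency_gain(0.85, 0.93)
    print(f"Throughput per watt changes by {gain:.1%}")

    # Hypothetical facility: 1,000 kW budget, 100 kW per system at full power.
    uncapped = systems_within_budget(1000.0, 100.0)
    capped = systems_within_budget(1000.0, 100.0, power_fraction=0.85)
    print(f"Systems at full power: {uncapped}, at 85% power: {capped}")
    # Aggregate throughput, measured in units of one full-power system.
    print(f"Aggregate throughput: {uncapped * 1.0:.2f} vs {capped * 0.93:.2f}")
```

Whether the extra systems are worth deploying depends on costs the sketch ignores, such as hardware, cooling, and networking.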

Building Management Integration

Improve control over power and cooling events, including rapid leak detection, through closer coordination between systems and data center facilities, supported by automation and integrated dashboards.

Autonomous Recovery Engine

Identify, isolate, and recover from problems 10x faster, without manual intervention, leading to faster training and inference runs for maximized developer productivity and built-in infrastructure resiliency.

Continuous Health Checks

Validate hardware and cluster performance throughout the life cycle of your infrastructure with health checks that can optionally trigger automated actions based on NVIDIA’s preset rules.
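
As a rough illustration of how rule-driven health checks can map readings to automated actions, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Rule` type, the metric names, the thresholds, and the action labels are hypothetical and do not represent Mission Control’s actual rules, metrics, or API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Rule:
    """Hypothetical preset rule: trigger `action` when `metric` exceeds `threshold`."""
    metric: str
    threshold: float
    action: str


def evaluate(rules: List[Rule], readings: Dict[str, float]) -> List[str]:
    """Return the actions of every rule whose metric reading exceeds its threshold."""
    actions = []
    for rule in rules:
        value = readings.get(rule.metric)
        # Metrics with no reading are skipped rather than treated as failures.
        if value is not None and value > rule.threshold:
            actions.append(rule.action)
    return actions


# Illustrative rules only; the thresholds are made up for this example.
RULES = [
    Rule("gpu_temp_c", 90.0, "drain_node"),
    Rule("ecc_errors_per_hour", 10.0, "open_ticket"),
]

if __name__ == "__main__":
    readings = {"gpu_temp_c": 95.0, "ecc_errors_per_hour": 2.0}
    print(evaluate(RULES, readings))
```

Keeping rules as plain data separates the policy (what counts as unhealthy) from the mechanism that runs the checks, which is one common way to make preset rules auditable.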

Dynamic Workload Orchestration

Boost GPU availability and utilization with included NVIDIA Run:ai technology or integrate Slurm and bring-your-own Kubernetes with our cluster management platform.

Flexible, Secure Configuration

Integrate NVIDIA Mission Control services with trusted ISV solutions for flexible, secure configurations that provide validated namespace isolation and meet your organization’s needs.

New Releases

NVIDIA Mission Control 2.3

NVIDIA Mission Control 2.3 is fully integrated across the NVIDIA ecosystem, with support for NVIDIA GB200 NVL72 and NVIDIA GB300 NVL72. It features new unified authentication across services and an optional virtualized control plane for improved flexibility and scalability. In addition, Mission Control now supports deployment in air-gapped environments and provides leak detection validation checks. NVIDIA DGX™ systems based on the NVIDIA Blackwell architecture now also have access to the full scope of Mission Control capabilities, including the autonomous recovery engine suite.

NVIDIA Mission Control includes access to NVIDIA’s latest power optimization innovations in a validated workflow, with easy-to-use graphical interfaces for monitoring and managing actions at the cluster, system, and workload levels. With Mission Control, administrators can access the domain power service and set cluster-wide, dynamic, job-aware policies for optimizing power.

Benefits

Why NVIDIA Mission Control?

Instant Operational Agility

Bring agility to AI factory operations with seamless multi-node training and inference orchestration, flexibility to integrate with third-party software, and advanced power and cooling automation.

Extensive Monitoring

Gain deep visibility into workload uptime, cluster infrastructure, and facilities with integrated, ready-to-use Grafana dashboards and always-on health checks that reduce alert fatigue and optimize performance.

Built-in Resiliency

Redefine modern data center resiliency with an end-to-end autonomous recovery engine that spans from anomaly detection to isolation to fast job restart and automated hardware remediation.

Accelerated AI Token Production

Maximize AI factory output with end-to-end validated workflows, continuous operations for improved revenue potential, and NVIDIA Enterprise Support for a new standard of enterprise AI at scale.

Partners

Deploy and Run AI Factories With Leading System Providers

Configure, validate, and operate AI factories built on NVIDIA Grace™ Blackwell NVL72 from leading system providers who have tested and validated NVIDIA Mission Control for their systems.

Solutions

Everything You Need for a World-Class AI Factory

NVIDIA delivers all the building blocks for an AI factory. Together, NVIDIA Mission Control and NVIDIA AI Enterprise provide state-of-the-art infrastructure and workload management plus developer tools for production AI, allowing enterprises to harness the transformative power of AI with unprecedented, practical scale.

NVIDIA DGX SuperPOD

Leadership-class AI infrastructure purpose-built for the unique demands of AI.

NVIDIA DGX SuperPOD™ is a turnkey AI data center infrastructure solution that delivers uncompromising performance for every user and workload. Configurable with any NVIDIA DGX™ system, DGX SuperPOD provides leadership-class accelerated infrastructure with scalable performance for the most demanding AI training and inference workloads.

NVIDIA AI Enterprise

Cloud-native software platform that optimizes production AI with tools built for developers.

The NVIDIA AI Enterprise software suite includes NVIDIA’s best data science tools, pretrained models, optimized frameworks, and more, fully backed with NVIDIA enterprise support. NVIDIA AI Enterprise is optimized to run on top of Mission Control.

Next Steps

Ready to Get Started?

Unlock streamlined AI operations with NVIDIA Mission Control to power your enterprise’s AI moonshot.

Need Support for NVIDIA Mission Control?

Get expert support, faster results, and guidance with NVIDIA Enterprise Support Services.

NVIDIA Mission Control Documentation

Access user guides and release notes for NVIDIA Mission Control.