NVIDIA Multi-Instance GPU

Seven independent instances in a single GPU.

Multi-Instance GPU (MIG) expands the performance and value of NVIDIA Rubin, NVIDIA Blackwell, and NVIDIA Hopper GPUs. MIG can partition the GPU into as many as seven instances, each fully isolated with its own high-bandwidth memory, cache, and compute cores. This gives administrators the ability to support every workload, from the smallest to the largest, with guaranteed quality of service (QoS) and extending the reach of accelerated computing resources to every user.

Benefits
NVIDIA Blackwell and Hopper
Workloads
Specifications

Benefits
NVIDIA Blackwell and Hopper
Workloads
Specifications

Benefits Overview

Expand GPU Access

With MIG, you can achieve up to 7x more GPU resources on a single GPU. MIG gives researchers and developers more resources and flexibility than ever before.

Optimize GPU Utilization

MIG provides the flexibility to choose many different instance sizes, which allows provisioning of the right-sized GPU instance for each workload, ultimately optimizing utilization and maximizing data center investment.

Run Simultaneous Workloads

MIG enables inference, training, and high-performance computing (HPC) workloads to run at the same time on a single GPU with deterministic latency and throughput. Unlike time slicing, each workload runs in parallel, delivering higher performance.

How the Technology Works

Without MIG, different jobs running on the same GPU, such as different AI inference requests, compete for the same resources. A job consuming larger memory bandwidth starves others, resulting in several jobs missing their latency targets. With MIG, jobs run simultaneously on different instances, each with dedicated resources for compute, memory, and memory bandwidth, resulting in predictable performance with QoS and maximum GPU utilization.

Provision and Configure Instances as Needed

A GPU can be partitioned into different-sized MIG instances. For example, on NVIDIA GB200, an administrator could create two instances with 93GB of memory each, four instances with 46GB each, or seven instances with 23GB each.

MIG instances can also be dynamically reconfigured, enabling administrators to shift GPU resources in response to changing user and business demands. For example, seven MIG instances can be used during the day for low-throughput inference and reconfigured to one large MIG instance at night for deep learning training.

Run Workloads in Parallel, Securely

With a dedicated set of hardware resources for compute, memory, and cache, each MIG instance delivers guaranteed QoS and fault isolation. That means that a failure in an application running on one instance doesn’t impact applications running on other instances.

It also means that different instances can run different types of workloads—interactive model development, deep learning training, AI inference, or HPC applications. Since the instances run in parallel, the workloads also run in parallel—but separate and isolated—on the same physical GPU.

MIG in NVIDIA Blackwell and Hopper GPUs

NVIDIA Blackwell and Hopper GPUs support MIG with multi-tenant, multi-user configurations in virtualized environments across up to seven GPU instances, securely isolating each instance with confidential computing at the hardware and hypervisor level. Dedicated video decoders for each MIG instance deliver secure, high-throughput intelligent video analytics (IVA) on shared infrastructure. With concurrent MIG profiling, administrators can monitor right-sized GPU acceleration and allocate resources for multiple users.

For researchers with smaller workloads, rather than renting a full cloud instance, they can use MIG to isolate a portion of a GPU securely while being assured that their data is secure at rest, in transit, and in use. This improves flexibility for cloud service providers to price and address smaller customer opportunities.

Watch MIG in Action

Running Multiple Workloads on a Single A100 GPU

This demo runs AI and high-performance computing (HPC) workloads simultaneously on the same A100 GPU.

Watch Video

Boosting Performance and Utilization with Multi-Instance GPU

This demo shows inference performance on a single slice of MIG and then scales linearly across the entire A100.

Watch Video

Built for IT and DevOps

MIG enables fine-grained GPU provisioning by IT and DevOps teams. Each MIG instance behaves like a standalone GPU to applications, so there’s no change to the CUDA® platform. MIG can be used in all major enterprise computing environments.

Deploy from Data Center to Edge

Use MIG on premises, in the cloud, and at the edge.

Learn More

Leverage Containers

Run containerized applications on MIG instances.

Learn More

Support Kubernetes

Schedule Kubernetes pods on MIG instances.

Learn More

Virtualize Applications

Run applications on MIG instances inside a virtual machine.

Learn More

MIG Specifications

	NVIDIA Blackwell Ultra GPU*		NVIDIA Blackwell GPU*
	NVIDIA GB300 NVL72	NVIDIA HGX B300	NVIDIA GB200 NVL72	NVIDIA HGX B200
AI Security	Yes	Yes	Yes	Yes
Instance Types	7x 34 GB 4x 69 GB 2x 139 GB 1x 279 GB	7x 32 GB 4x 67 GB 2x 135 GB 1x 270 GB	7x 23 GB 4x 46 GB 2x 93 GB 1x 186 GB	7x 21 GB 4x 45 GB 2x 90 GB 1x 180 GB
GPU Profiling and Monitoring	Concurrently on all instances	Concurrently on all instances	Concurrently on all instances	Concurrently on all instances
Secure Tenants	7x	7x	7x	7x
Media Decoders	Dedicated NVJPEG and NVDEC per instance	Dedicated NVJPEG and NVDEC per instance	Dedicated NVJPEG and NVDEC per instance	Dedicated NVJPEG and NVDEC per instance

Preliminary specifications. All values are up to and may be subject to change.
* Sizes shown are per GPU specifications for each system. For further information, refer to the technical documentation.

Learn More About MIG.

Learn More