NVIDIA GPUs For Virtualization


NVIDIA virtual GPU (vGPU) software runs on NVIDIA GPUs based on the NVIDIA Ampere, Turing, Volta, Pascal, and Maxwell GPU architectures. Match your needs with the right GPU below.

NVIDIA Ampere Architecture and RTX

A100
  • GPU architecture: NVIDIA Ampere
  • Virtualization workload: Virtualized compute workloads such as AI, deep learning, and high-performance computing (HPC) with NVIDIA Virtual Compute Server (vCS). Upgrade path for V100/V100S Tensor Core GPUs.
  • vGPU software support: NVIDIA Virtual Compute Server (vCS)

A40<sup>1</sup>
  • GPU architecture: NVIDIA Ampere
  • Virtualization workload: Mid-range to high-end 3D design and creative workflows with NVIDIA® Quadro® Virtual Data Center Workstation (Quadro vDWS). Virtualized AI with NVIDIA vCS. Upgrade path for Quadro RTX 8000, Quadro RTX 6000.
  • vGPU software support: Quadro vDWS, NVIDIA GRID® Virtual PC (vPC), GRID Virtual Apps (vApps), vCS

Quadro RTX 8000
  • GPU architecture: NVIDIA Turing
  • Virtualization workload: High-end rendering, 3D design, and creative workflows with Quadro vDWS.
  • vGPU software support: Quadro vDWS, GRID vPC, GRID vApps, vCS

Quadro RTX 6000
  • GPU architecture: NVIDIA Turing
  • Virtualization workload: Mid-range to high-end rendering, 3D design, and creative workflows with Quadro vDWS.
  • vGPU software support: Quadro vDWS, GRID vPC, GRID vApps, vCS

Previous Generation

V100
  • GPU architecture: NVIDIA Volta
  • CUDA cores: 5,120
  • Memory size: 32/16 GB HBM2
  • Virtualization workload: Ultra-high-end rendering, simulation, and 3D design with NVIDIA Quadro vDWS. AI, deep learning, and data science with NVIDIA vCS. Ideal upgrade path for V100.
  • vGPU software support: Quadro vDWS, GRID vPC, GRID vApps, vCS

T4
  • GPU architecture: NVIDIA Turing
  • CUDA cores: 2,560
  • Memory size: 16 GB GDDR6
  • Virtualization workload: Entry-level 3D design and engineering workflows with Quadro vDWS. High-density, low-power GPU acceleration for knowledge workers with NVIDIA GRID software. AI, deep learning, and data science with vCS.
  • vGPU software support: Quadro vDWS, GRID vPC, GRID vApps, vCS

M10
  • GPU architecture: NVIDIA Maxwell
  • CUDA cores: 2,560 (640 per GPU)
  • Memory size: 32 GB GDDR5 (8 GB per GPU)
  • Virtualization workload: Knowledge workers using modern productivity apps and Windows 10 requiring best density and total cost of ownership (TCO). Multi-monitor support with NVIDIA GRID vPC and vApps.
  • vGPU software support: Quadro vDWS, GRID vPC, GRID vApps

P6
  • GPU architecture: NVIDIA Pascal
  • CUDA cores: 2,048
  • Memory size: 16 GB GDDR5
  • Virtualization workload: For customers requiring GPUs in a blade-server form factor. Ideal upgrade path for M6.
  • vGPU software support: Quadro vDWS, GRID vPC, GRID vApps, vCS
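As a rough illustration of matching a need to a GPU, the vGPU software support rows above can be encoded as a small lookup. This is only a sketch: the dictionary name `VGPU_SUPPORT` and the helper `gpus_supporting` are hypothetical and not part of any NVIDIA tool, and the data is copied from the comparison above rather than from a live support matrix.

```python
# Hypothetical encoding of the "vGPU software support" rows listed above.
# Keys are GPU names; values are the vGPU software products listed for each.
VGPU_SUPPORT = {
    "A100": {"vCS"},
    "A40": {"Quadro vDWS", "GRID vPC", "GRID vApps", "vCS"},
    "Quadro RTX 8000": {"Quadro vDWS", "GRID vPC", "GRID vApps", "vCS"},
    "Quadro RTX 6000": {"Quadro vDWS", "GRID vPC", "GRID vApps", "vCS"},
    "V100": {"Quadro vDWS", "GRID vPC", "GRID vApps", "vCS"},
    "T4": {"Quadro vDWS", "GRID vPC", "GRID vApps", "vCS"},
    "M10": {"Quadro vDWS", "GRID vPC", "GRID vApps"},
    "P6": {"Quadro vDWS", "GRID vPC", "GRID vApps", "vCS"},
}


def gpus_supporting(*products):
    """Return the GPUs (in listing order) that support every requested product."""
    wanted = set(products)
    return [gpu for gpu, supported in VGPU_SUPPORT.items() if wanted <= supported]


if __name__ == "__main__":
    # GPUs that can host both compute (vCS) and virtual PC (GRID vPC) workloads.
    print(gpus_supporting("vCS", "GRID vPC"))
```

With the data above, the A100 drops out of that query because it lists only vCS, and the M10 drops out because it does not list vCS.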

Performance Optimized

NVIDIA A100 Tensor Core

  • Delivers unprecedented acceleration at every scale for AI, data analytics, and HPC
  • Supports Multi-Instance GPU (MIG) technology, enabling partitioning into up to seven isolated GPU instances to accelerate workloads of all sizes
  • Provides the ideal upgrade path from V100
  • Enables data scientists and researchers to speed time to market, while achieving improved security and manageability with NVIDIA Virtual Compute Server

NVIDIA V100S Tensor Core

  • Accelerates the most demanding, double-precision compute workflows
  • Provides the ideal upgrade path from P100 and V100
  • Equips data scientists and researchers with the power to work efficiently with improved security and manageability with NVIDIA Virtual Compute Server

NVIDIA A40<sup>1</sup>

  • Fast, interactive performance powered by the NVIDIA Ampere GPU Architecture – with ultra-fast on-board graphics memory technology and optimized software drivers for professional applications
  • 2nd-generation RT Cores to accelerate photorealistic ray-traced rendering up to 2X faster than the previous generation
  • 3rd-generation Tensor Cores to accelerate AI workloads – bringing AI capabilities to graphics with features like DLSS, AI denoising, and enhanced editing for select applications
  • Supports larger, more powerful virtual workstation instances for remote users, enabling larger workflows for high-end design with Quadro vDWS or AI and compute with NVIDIA vCS

NVIDIA Quadro RTX 8000

  • Includes 48 gigabytes (GB) of memory, double the frame buffer of the RTX 6000
  • Enables designers and artists to work with the largest and most complex ray-tracing and visual computing workloads 
  • Delivers the ultimate flexibility with Quadro vDWS software, powering virtual design workstations and render nodes to propel creative workflows

NVIDIA Quadro RTX 6000

  • Improves graphics performance by up to 75 percent versus the P40
  • Combined with Quadro vDWS software, enables artists to access high-powered virtual design workstations and render nodes to speed design workflows and arrive at their best creations faster 
  • With RT Cores, a large frame buffer, and multiple profile sizes, gives artists and designers the flexibility to run the most demanding workloads from the data center 
  • Enables organizations to virtualize both accelerated graphics and single-precision compute (NVIDIA CUDA® and OpenCL) workloads

Density Optimized

NVIDIA Tesla M10


  • Designed for data centers needing graphics acceleration for high-density virtual desktop environments to meet the needs of the modern digital workplace 
  • Provides the ideal solution for organizations migrating to Windows 10—the most graphics-intensive operating system to date—or deploying additional infrastructure for disaster recovery and high availability due to business or regulatory requirements 
  • With a dual-slot PCI Express form factor for rack and tower servers, supports up to 64 concurrent users (512 MB profile)
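The 64-user figure is consistent with the M10 memory listed in the comparison (32 GB total, 8 GB per GPU, so four GPUs per board). A minimal sketch of that arithmetic follows. It assumes 1 GB = 1,024 MB and that the whole frame buffer is available to user profiles; the function name `max_users` is hypothetical, and real deployments may be limited by licensing, CPU, or host memory before frame buffer.

```python
def max_users(frame_buffer_gb, profile_mb):
    """Number of fixed-size vGPU profiles that fit in a given frame buffer.

    Assumes 1 GB = 1,024 MB and no frame buffer reserved for other uses.
    """
    return (frame_buffer_gb * 1024) // profile_mb


# Figures taken from the M10 entry above: 32 GB total, 8 GB per GPU.
M10_TOTAL_GB = 32
M10_PER_GPU_GB = 8
M10_GPU_COUNT = M10_TOTAL_GB // M10_PER_GPU_GB  # four GPUs per board

users_per_gpu = max_users(M10_PER_GPU_GB, 512)    # 8,192 MB / 512 MB = 16
users_per_board = users_per_gpu * M10_GPU_COUNT   # 16 * 4 = 64

if __name__ == "__main__":
    print(users_per_gpu, users_per_board)
```

Larger profiles reduce density proportionally under the same assumptions: a 1 GB profile would halve the per-board count to 32.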

NVIDIA T4 Tensor Core

  • Delivers up to 2X the frame buffer versus P4 
  • Delivers up to 2X the performance versus M60
  • With Quadro vDWS software, provides the ideal solution for entry-level to high-end 3D design and engineering workflows 
  • Performs deep learning inferencing in a virtual environment with NVIDIA Virtual Compute Server
  • With a single-slot, low-profile form factor and only 70 watts (W) of power consumption, achieves maximum GPU density per server node

Blade Optimized

NVIDIA P6

  • Designed for blade servers and supports multiple data center workloads, including deep learning, high-performance computing, and graphics virtualization
  • Delivers higher graphics performance, improved energy efficiency, and increased user density when compared to M6, making it an ideal upgrade path
  • Comes in a mobile PCI Express module (MXM) form factor that runs at less than 90 W for high-density data centers with blade servers and converged infrastructure

Additional GPUs Supported for Virtualization