Power the Most Compute-Intensive Server Workloads with Virtual GPUs
NVIDIA Virtual Compute Server (vCS) enables data centers to accelerate server virtualization with the latest NVIDIA data center GPUs, including the NVIDIA A100 Tensor Core GPU¹, so that the most compute-intensive workloads, such as artificial intelligence, deep learning, and data science, can run in a virtual machine (VM).
Fractional GPU sharing is possible only with NVIDIA vGPU technology. It enables multiple VMs to share a single GPU, maximizing utilization for lighter workloads that still require GPU acceleration.
With GPU aggregation, a VM can access more than one GPU, which is often required for compute-intensive workloads. vCS supports both multi-vGPU and peer-to-peer computing. With multi-vGPU, the GPUs aren’t directly connected; with peer-to-peer, they are connected directly through NVLink for higher bandwidth.
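Whether GPU pairs communicate over NVLink or fall back to PCIe can be checked from the host with `nvidia-smi`. The sketch below wraps the commands in a function so it can be reviewed anywhere but only invoked on a multi-GPU server with the NVIDIA driver installed; exact output varies by driver release and topology.

```shell
# Sketch: inspect GPU-to-GPU interconnect topology (requires nvidia-smi
# on a multi-GPU host; do not expect output on a GPU-less machine).
show_gpu_topology() {
  # Matrix showing the link type (NVLink, PCIe, etc.) between each GPU pair
  nvidia-smi topo -m
  # Per-GPU NVLink link status and capabilities
  nvidia-smi nvlink --status
}
# show_gpu_topology   # uncomment on a server with NVLink-connected GPUs
```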
vCS provides support for app-, guest-, and host-level monitoring. In addition, proactive management features provide the ability to do live migration, suspend and resume, and create thresholds that expose consumption trends impacting user experiences, all through the vGPU management SDK.
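On the hypervisor host, the same monitoring data surfaced by the vGPU management SDK can be sampled interactively with the `nvidia-smi vgpu` subcommand. A minimal sketch, assuming a host running the NVIDIA vGPU Manager (long-option spellings can vary by driver release):

```shell
# Sketch: host-level vGPU monitoring (NVIDIA vGPU Manager hosts only).
monitor_vgpus() {
  nvidia-smi vgpu                  # list active vGPUs per physical GPU
  nvidia-smi vgpu --query          # detailed per-vGPU state
  nvidia-smi vgpu --utilization    # per-vGPU engine utilization samples
}
# monitor_vgpus   # uncomment on a vGPU host
```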
NVIDIA GPU Cloud (NGC) is a hub for GPU-optimized software that simplifies workflows for deep learning, machine learning, and HPC, and now supports virtualized environments with NVIDIA vCS.
NVIDIA® NVLink™ is a high-speed, direct GPU-to-GPU interconnect that provides higher bandwidth, more links, and improved scalability for multi-GPU system configurations—now supported virtually with NVIDIA virtual GPU (vGPU) technology.
Error correction code (ECC) and page retirement provide higher reliability for compute applications that are sensitive to data corruption. They’re especially important in large-scale cluster-computing environments where GPUs process very large datasets and/or run applications for extended periods.
Multi-Instance GPU (MIG) is a revolutionary technology that extends the capabilities of the data center by enabling each NVIDIA A100 Tensor Core GPU¹ to be partitioned into as many as seven instances, each fully isolated and secured at the hardware level with its own high-bandwidth memory, cache, and compute cores. With vCS software, a VM can run on each of these MIG instances, so organizations can take advantage of the management, monitoring, and operational benefits of hypervisor-based server virtualization.
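The MIG partitioning described above is driven from the host with `nvidia-smi`. A sketch, assuming a MIG-capable A100 at index 0 and a driver with MIG support; the profile ID 19 (a 1g.5gb slice on A100-40GB) is illustrative and differs across products:

```shell
# Sketch: partition an A100 into MIG instances (MIG-capable GPU and
# driver required; profile IDs below are illustrative).
create_mig_instances() {
  nvidia-smi -i 0 -mig 1             # enable MIG mode on GPU 0
  nvidia-smi mig -i 0 -cgi 19,19 -C  # create two 1g.5gb GPU instances and
                                     # their default compute instances
  nvidia-smi mig -lgi                # list the resulting GPU instances
}
# create_mig_instances   # uncomment on an A100 host (may need a GPU reset)
```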
GPUDirect® RDMA (remote direct memory access) enables network devices to directly access GPU memory, bypassing CPU host memory, decreasing GPU-to-GPU communication latency, and completely offloading the CPU.
GRID vPC/vApps and Quadro vDWS are client compute products for virtual graphics designed for knowledge workers and creative or technical professionals. vCS is for compute-intensive server workloads, such as AI, deep learning, and data science.
No, vCS is licensed differently from GRID vPC/vApps and Quadro vDWS. GRID vPC/vApps and Quadro vDWS are licensed by concurrent user (CCU), either as a perpetual license or a yearly subscription. Since vCS is for server compute workloads, the license is tied to the GPU rather than to a user. As such, vCS is licensed per GPU as a yearly subscription. Additional details on licensing can be found in the NVIDIA Virtual GPU Packaging, Pricing and Licensing Guide.
Please see the GPU Recommendations table above. In addition to the NVIDIA V100S, V100, T4, and RTX 8000 recommended for vCS, NVIDIA P100, P40, and P6 are also supported. Support for the NVIDIA A100 Tensor Core GPU is coming soon.
Refer to the vGPU Certified Servers page for a full list of certified servers for all vGPU products.
Yes, containers can be run in VMs with vCS. NVIDIA NGC offers a comprehensive catalog of GPU-accelerated containers for deep learning, machine learning, and HPC. Workloads can also be run directly in a VM, without containers, using vCS.
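Inside a GPU-enabled VM, pulling and running an NGC container follows the standard Docker workflow. A sketch, assuming Docker with the NVIDIA Container Toolkit is installed in the guest; the TensorFlow image tag shown is illustrative (current tags are listed in the NGC catalog):

```shell
# Sketch: run an NGC deep learning container in a GPU-enabled VM.
# Assumes Docker + NVIDIA Container Toolkit; the image tag is illustrative.
run_ngc_container() {
  docker pull nvcr.io/nvidia/tensorflow:20.06-tf2-py3
  docker run --rm --gpus all nvcr.io/nvidia/tensorflow:20.06-tf2-py3 \
    nvidia-smi                   # confirm the vGPU is visible in-container
}
# run_ngc_container   # uncomment in a VM with a vCS-licensed vGPU
```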
View product release notes and supported third-party software products.