NVIDIA vComputeServer

Power the Most Compute-Intensive Server Workloads with Virtual GPUs

Virtualize Compute for AI, Deep Learning, and Data Science

NVIDIA Virtual Compute Server (vComputeServer) enables data centers to accelerate server virtualization with GPUs so that the most compute-intensive workloads, such as artificial intelligence, deep learning, and data science, can be run in a virtual machine (VM).

Features

GPU Sharing

GPU sharing (fractional) is only possible with NVIDIA vGPU technology. It enables multiple VMs to share a GPU, maximizing utilization for lighter workloads that require GPU acceleration.
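As a rough illustration, the sketch below (assuming the nvidia-ml-py package, which provides the pynvml module, is installed inside the guest VM) shows that a fractional vGPU enumerates like an ordinary GPU: standard NVML queries work unchanged, and the reported framebuffer reflects the slice assigned by the vGPU profile rather than the whole physical GPU.

```python
# Minimal sketch, run inside a VM with a vGPU assigned.
import pynvml

pynvml.nvmlInit()
try:
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        name = pynvml.nvmlDeviceGetName(handle)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        # For a fractional vGPU, mem.total is the framebuffer slice granted
        # by the vGPU profile, not the full physical GPU memory.
        print(f"GPU {i}: {name}, {mem.total / 1024**2:.0f} MiB framebuffer")
finally:
    pynvml.nvmlShutdown()
```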

GPU Aggregation

With GPU aggregation, a VM can access more than one GPU, which is often required for compute-intensive workloads. vComputeServer supports both multi-vGPU and peer-to-peer computing. With multi-vGPU, the GPUs are assigned to the VM but aren't directly connected to each other; with peer-to-peer computing, they are connected through NVLink for higher bandwidth. A guest-side sketch follows below.
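From inside the guest, multiple vGPUs simply appear as multiple CUDA devices. The sketch below, assuming PyTorch is installed in the VM, spreads a model across every visible device; no vGPU-specific code is required.

```python
# Minimal multi-GPU sketch, assuming PyTorch and at least one CUDA device in the VM.
import torch

num_gpus = torch.cuda.device_count()
print(f"vGPUs visible to this VM: {num_gpus}")

model = torch.nn.Linear(1024, 1024)
if num_gpus > 1:
    # DataParallel replicates the model across all visible vGPUs.
    model = torch.nn.DataParallel(model)
model = model.cuda()

x = torch.randn(64, 1024, device="cuda")
y = model(x)
print(y.shape)
```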

Management and Monitoring

vComputeServer provides support for app-, guest-, and host-level monitoring. In addition, proactive management features enable live migration, suspend and resume, and threshold-based alerting that exposes consumption trends affecting user experience, all through the vGPU management SDK.
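The vGPU management SDK itself is a host-level C API; as a simple guest-side analogue, the sketch below uses the NVML Python bindings (nvidia-ml-py, assumed installed) to poll the kind of utilization and memory data such monitoring relies on.

```python
# Guest-side monitoring sketch: poll GPU and memory utilization once per second.
import time
import pynvml

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    for _ in range(5):
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"gpu={util.gpu}% mem_util={util.memory}% "
              f"mem_used={mem.used / 1024**2:.0f} MiB")
        time.sleep(1)
finally:
    pynvml.nvmlShutdown()
```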

NGC

NVIDIA GPU Cloud (NGC) is a hub for GPU-optimized software that simplifies workflows for deep learning, machine learning, and HPC, and now supports virtualized environments with NVIDIA vComputeServer.

Peer-to-Peer Computing

NVIDIA® NVLink is a high-speed, direct GPU-to-GPU interconnect that provides higher bandwidth, more links, and improved scalability for multi-GPU system configurations—now supported virtually with NVIDIA virtual GPU (vGPU) technology.
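A minimal sketch, assuming PyTorch in a VM with two NVLink-connected vGPUs: it checks whether direct peer-to-peer access is available and performs a GPU-to-GPU tensor copy that can bypass host memory when P2P is enabled.

```python
# Peer-to-peer sketch; requires at least two CUDA devices in the VM.
import torch

if torch.cuda.device_count() >= 2:
    p2p = torch.cuda.can_device_access_peer(0, 1)
    print(f"Peer-to-peer access between GPU 0 and GPU 1: {p2p}")

    src = torch.randn(16, 1024 * 1024, device="cuda:0")  # ~64 MiB of FP32 data
    dst = src.to("cuda:1")  # direct GPU-to-GPU copy when P2P/NVLink is enabled
    torch.cuda.synchronize()
    print(dst.device, dst.shape)
```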

ECC & Page Retirement

Error correction code (ECC) and page retirement provide higher reliability for compute applications that are sensitive to data corruption. They’re especially important in large-scale cluster-computing environments where GPUs process very large datasets and/or run applications for extended periods.
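For illustration, the sketch below (again assuming the nvidia-ml-py package) reads the ECC mode, the volatile corrected/uncorrected error counters, and the number of pages retired because of double-bit errors.

```python
# ECC and page-retirement status sketch for the first visible GPU.
import pynvml

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)

    current, pending = pynvml.nvmlDeviceGetEccMode(handle)
    print(f"ECC enabled: current={bool(current)}, pending={bool(pending)}")

    corrected = pynvml.nvmlDeviceGetTotalEccErrors(
        handle, pynvml.NVML_MEMORY_ERROR_TYPE_CORRECTED, pynvml.NVML_VOLATILE_ECC)
    uncorrected = pynvml.nvmlDeviceGetTotalEccErrors(
        handle, pynvml.NVML_MEMORY_ERROR_TYPE_UNCORRECTED, pynvml.NVML_VOLATILE_ECC)
    print(f"Volatile ECC errors: corrected={corrected}, uncorrected={uncorrected}")

    retired = pynvml.nvmlDeviceGetRetiredPages(
        handle, pynvml.NVML_PAGE_RETIREMENT_CAUSE_DOUBLE_BIT_ECC_ERROR)
    print(f"Pages retired for double-bit ECC errors: {len(retired)}")
finally:
    pynvml.nvmlShutdown()
```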

GPU Recommendations

                                NVIDIA A100    NVIDIA V100S   NVIDIA RTX 8000   NVIDIA RTX 6000   NVIDIA T4
RT Cores                        -              -              72                72                40
Tensor Cores                    TBD            640            576               576               320
CUDA® Cores                     TBD            5,120          4,608             4,608             2,560
Memory                          40 GB HBM2     32 GB HBM2     48 GB GDDR6       24 GB GDDR6       16 GB GDDR6
Peak FP32                       19.5 TFLOPS    16.4 TFLOPS    14.9 TFLOPS       14.9 TFLOPS       8.1 TFLOPS
Peak FP64                       9.7 TFLOPS     8.2 TFLOPS     -                 -                 -
NVLink: Number of GPUs per VM   Up to 8        Up to 8        2                 2                 -
ECC and Page Retirement         Yes            Yes            Yes               Yes               Yes
Multi-GPU per VM                Up to 16       Up to 16       Up to 16          Up to 16          Up to 16

Learn More About NVIDIA Virtual GPU Software

View product release notes and supported third-party software products.