NVIDIA vComputeServer

Power the Most Compute-Intensive Server Workloads with Virtual GPUs

Virtualize Compute for AI, Deep Learning, and Data Science

NVIDIA Virtual Compute Server (vComputeServer) enables data centers to accelerate server virtualization with the latest NVIDIA data center GPUs, including the NVIDIA A100 Tensor Core GPU, so that the most compute-intensive workloads, such as artificial intelligence, deep learning, and data science, can run in a virtual machine (VM).

Features

GPU Sharing

GPU sharing (fractional) is only possible with NVIDIA vGPU technology. It enables multiple VMs to share a GPU, maximizing utilization for lighter workloads that require GPU acceleration.
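Fractional sharing works by assigning each VM a fixed-size vGPU profile, a slice of the physical GPU's frame buffer. As a minimal illustrative sketch (the profile size below is a hypothetical example, not an official NVIDIA profile), here is how capacity planning for one shared GPU might look:

```python
# Illustrative sketch of fractional vGPU packing (not NVIDIA's scheduler).
# The profile size used in the example is hypothetical.

def max_vms_per_gpu(gpu_memory_gb: int, profile_gb: int) -> int:
    """Number of VMs with a fixed-size vGPU profile that fit on one GPU."""
    if profile_gb <= 0 or profile_gb > gpu_memory_gb:
        return 0
    return gpu_memory_gb // profile_gb

# Example: a 16 GB GPU shared using a hypothetical 4 GB compute profile.
print(max_vms_per_gpu(16, 4))  # -> 4
```

In practice, the hypervisor enforces the profile size, so each VM sees only its slice of frame buffer while time-sharing the GPU's compute engines.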

GPU Aggregation

With GPU aggregation, a VM can access more than one GPU, which is often required for compute-intensive workloads. vComputeServer supports both multi-vGPU and peer-to-peer computing. With multi-vGPU, the GPUs aren’t directly connected; with peer-to-peer, they are through NVLink for higher bandwidth.

Management and Monitoring

vComputeServer provides support for app-, guest-, and host-level monitoring. In addition, proactive management features provide the ability to do live migration, suspend and resume, and create thresholds that expose consumption trends impacting user experiences, all through the vGPU management SDK.
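At the host level, per-GPU utilization and memory consumption can be scraped with the `nvidia-smi` query interface and fed into whatever thresholding a monitoring pipeline needs. The sketch below parses sample CSV output of the form produced by `nvidia-smi --query-gpu=index,utilization.gpu,memory.used --format=csv,noheader,nounits`; the sample text and the 80% threshold are assumptions for illustration (on a real host the text would come from a subprocess call):

```python
# Host-level monitoring sketch: parse nvidia-smi CSV query output and
# flag GPUs above a utilization threshold. SAMPLE is hypothetical data.
import csv
import io

SAMPLE = """\
0, 87, 10240
1, 12, 2048
"""

def parse_gpu_stats(text: str):
    """Parse index, GPU utilization (%), and used memory (MiB) per GPU."""
    rows = []
    for index, util, mem in csv.reader(io.StringIO(text)):
        rows.append({
            "index": int(index),
            "utilization_pct": int(util),
            "memory_used_mib": int(mem),
        })
    return rows

# Flag GPUs whose utilization crosses an example threshold of 80%.
busy = [r["index"] for r in parse_gpu_stats(SAMPLE) if r["utilization_pct"] > 80]
print(busy)  # -> [0]
```

Guest- and app-level metrics, as well as live migration and suspend/resume, go through the vGPU management SDK rather than this kind of ad hoc scraping.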

NGC

NVIDIA GPU Cloud (NGC) is a hub for GPU-optimized software that simplifies workflows for deep learning, machine learning, and HPC, and now supports virtualized environments with NVIDIA vComputeServer.

Peer-to-Peer Computing

NVIDIA® NVLink is a high-speed, direct GPU-to-GPU interconnect that provides higher bandwidth, more links, and improved scalability for multi-GPU system configurations—now supported virtually with NVIDIA virtual GPU (vGPU) technology.

ECC & Page Retirement

Error correction code (ECC) and page retirement provide higher reliability for compute applications that are sensitive to data corruption. They’re especially important in large-scale cluster-computing environments where GPUs process very large datasets and/or run applications for extended periods.

Multi-Instance GPU (MIG)

Multi-Instance GPU (MIG) extends the capabilities of the data center by enabling each NVIDIA A100 Tensor Core GPU to be partitioned into up to seven instances, each fully isolated and secured at the hardware level with its own high-bandwidth memory, cache, and compute cores. With vComputeServer software, a VM can run on each of these MIG instances, so organizations can take advantage of the management, monitoring, and operational benefits of hypervisor-based server virtualization.
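The seven-instance limit comes from the A100's seven compute slices: each MIG profile name (e.g. 3g.20gb) encodes how many slices and how much memory it consumes. As a simplified sketch of capacity checking (real MIG placement has additional rules beyond the slice budget shown here), one might verify that a requested mix of profiles fits on a single A100 40GB:

```python
# Illustrative sketch: check whether a set of MIG instance profiles fits
# within the seven compute slices of one A100 40GB GPU. Real MIG placement
# has additional constraints not modeled here.

A100_TOTAL_SLICES = 7

# Compute slices consumed by each A100 40GB MIG profile ("Ng" = N slices).
PROFILE_SLICES = {
    "1g.5gb": 1,
    "2g.10gb": 2,
    "3g.20gb": 3,
    "4g.20gb": 4,
    "7g.40gb": 7,
}

def fits_on_a100(requested: list) -> bool:
    """True if the requested MIG instances can coexist on one GPU."""
    used = sum(PROFILE_SLICES[p] for p in requested)
    return used <= A100_TOTAL_SLICES

print(fits_on_a100(["1g.5gb"] * 7))          # -> True (maximum partitioning)
print(fits_on_a100(["3g.20gb", "4g.20gb"]))  # -> True
print(fits_on_a100(["4g.20gb", "4g.20gb"]))  # -> False
```

With vComputeServer, each instance that fits can then back its own VM, combining MIG's hardware isolation with hypervisor-level management.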

GPU Recommendations

|  | NVIDIA A100 | NVIDIA V100S | NVIDIA RTX 8000 | NVIDIA RTX 6000 | NVIDIA T4 |
| --- | --- | --- | --- | --- | --- |
| Memory | 40 GB HBM2 | 32 GB HBM2 | 48 GB GDDR6 | 24 GB GDDR6 | 16 GB GDDR6 |
| Peak FP32 | 19.5 TFLOPS | 16.4 TFLOPS | 14.9 TFLOPS | 14.9 TFLOPS | 8.1 TFLOPS |
| Peak FP64 | 9.7 TFLOPS | 8.2 TFLOPS | - | - | - |
| NVLink: GPUs per VM | Up to 8 | Up to 8 | 2 | 2 | - |
| ECC and page retirement |  |  |  |  |  |
| Multi-vGPU configuration per VM¹ | Up to 16 | Up to 16 | Up to 16 | Up to 16 | Up to 16 |

Learn More About NVIDIA Virtual GPU Software

View product release notes and supported third-party software products.