NVIDIA NVQLink

Real-time accelerated computing for every quantum processor.

Connect with NVIDIA quantum experts to learn more about NVQLink.

Overview

Scale QPU Integration With the NVIDIA NVQLink Reference Platform

The open NVIDIA® NVQLink™ platform architecture tightly integrates quantum hardware with state-of-the-art accelerated computing to power the development of quantum processing units (QPUs) at scale. Using real-time APIs within the NVIDIA CUDA-Q™ software platform, researchers can easily leverage NVQLink for the low-latency, high-throughput connections they need to perform control tasks like calibration and quantum error correction (QEC). QPUs equipped with NVQLink allow QPU operators to unify quantum and accelerated compute resources to develop hybrid quantum applications.

World’s Leading Supercomputing Centers Adopt NVIDIA NVQLink to Integrate Quantum Processors

Supercomputing centers globally are adopting NVIDIA NVQLink, an open and universal interconnect, to integrate quantum processors and power large-scale quantum-classical workflows.

Read the Press Release

NVIDIA NVQLink Builds a Bridge From Accelerated Computing to the Quantum Processor

See how NVIDIA's NVQLink achieves groundbreaking microsecond-level latency by connecting accelerated GPU computing with quantum processors using RDMA over Ethernet—finally making high-performance computing a seamless, real-time partner in quantum control and error correction.

Read the Technical Blog

Build Real-Time Applications With QPU Access Through CUDA-Q

The CUDA-Q real-time API allows developers to take advantage of NVQLink’s low-latency, high-throughput connection to quantum hardware. A simple remote function API call within CUDA-Q’s kernel-based programming model makes it easy to accelerate hybrid applications and develop scalable QEC workflows.

Explore CUDA-Q


Highlights

High Performance in Real Time

Leading-Edge AI

40 PFLOPS (FP4¹)

Maximum GPU-QPU Throughput

400 Gb/s

Minimum GPU-QPU Latency
(FPGA to GPU to FPGA)

<4.0 microseconds

¹ With sparsity.

Workloads

Optimized for Large-Scale, Real-Time Quantum Computing

Accelerating the quantum workflow from calibration to full fault tolerance.

QPU Calibration

NVQLink provides the tight coupling of compute to quantum control required for real-time QPU calibration, maximizing the fidelity of quantum operations and bringing QPU downtime to zero.

QEC Decoding

NVQLink accelerates quantum error correction (QEC) decoding by providing high-throughput compute at low latency, transforming a noisy QPU into a functional, logical QPU.

Seamlessly deploy QEC-encoded programs and decoders using the CUDA-Q QEC library.
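
To make the idea of decoding concrete, here is a minimal classical sketch of a lookup-table decoder for the 3-bit bit-flip repetition code. It is a toy illustration only: it does not use the CUDA-Q QEC library, and the names `SYNDROME_TO_FLIP`, `measure_syndrome`, `decode`, and `logical_value` are hypothetical helpers introduced for this example.

```python
from typing import List, Optional, Tuple

# Lookup table for the 3-bit repetition code (toy example, not a CUDA-Q API).
# The syndrome is the pair of parities (b0 xor b1, b1 xor b2).
# The value is the index of the bit to flip, or None when no error is detected.
SYNDROME_TO_FLIP = {
    (0, 0): None,
    (1, 0): 0,
    (1, 1): 1,
    (0, 1): 2,
}


def measure_syndrome(bits: List[int]) -> Tuple[int, int]:
    """Return the two parity checks of a 3-bit word."""
    return (bits[0] ^ bits[1], bits[1] ^ bits[2])


def decode(bits: List[int]) -> List[int]:
    """Correct at most one bit flip using the syndrome lookup table."""
    corrected = list(bits)
    flip: Optional[int] = SYNDROME_TO_FLIP[measure_syndrome(bits)]
    if flip is not None:
        corrected[flip] ^= 1
    return corrected


def logical_value(bits: List[int]) -> int:
    """Read the logical bit by majority vote."""
    return int(sum(bits) >= 2)


if __name__ == "__main__":
    noisy = [0, 1, 0]  # logical 0 with a flip on the middle bit
    print(decode(noisy), logical_value(decode(noisy)))
```

Production decoders for codes such as the surface code work on much larger syndrome streams and must finish within a fixed time window, which is why the page stresses high-throughput compute at low latency.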

Logical Orchestration

NVQLink facilitates the execution of complex logical programs by enabling just-in-time compilation and dynamic routing for advanced QEC protocols such as lattice surgery and on-the-fly decoder reconfiguration.

Benefits

NVQLink and CUDA-Q

Together, NVQLink and CUDA-Q provide a platform for error-corrected quantum applications.

Programmable

Develop and deploy error-corrected applications to your QPU using extensible CUDA-Q libraries for QEC and more.

Interoperable

Work across all major quantum controllers and quantum processor modalities.

High Performance

Move hundreds of gigabits per second of data between the quantum controller and compute host with the industry’s most scalable, low-latency network.

Future Proof

Join a rapidly growing software ecosystem for accelerated quantum supercomputing.

Providers

NVQLink Ecosystem

Resources

Learn More About NVQLink


NVIDIA NVQLink Architecture Integrates Accelerated Computing With Quantum Processors

NVIDIA NVQLink brings accelerated computing into the quantum stack, enabling today’s GPU superchips to support the online workloads of the QPU itself.

Read Blog

Real-Time Decoding, Algorithmic GPU Decoders, and AI Inference Enhancements in NVIDIA CUDA-Q QEC

Real-time decoding is crucial to fault-tolerant quantum computers. By enabling decoders to operate with low latency concurrently with a QPU, we can apply corrections to the device within the coherence time.

Read Blog

View All Blogs

NVQLink: Unlocking Quantum-GPU Supercomputing

AI supercomputing is the missing puzzle piece for running and controlling large-scale quantum computers.

Watch Video (01:58)

Accelerated Computing for the QPU

In building toward a future where quantum computing produces compelling and durable use cases, two trends in HPC-QPU integration indicate where computer architecture will play an important role.

Watch Video (30:01)

Building the Accelerated Quantum Supercomputer

NVIDIA and partners QuEra and Quantinuum dive into the concept of accelerated quantum supercomputing.

Watch Video (03:22)

View All Videos

FAQ

NVQLink is a platform architecture for tightly coupling a GPU-accelerated server to a quantum processing unit (QPU).

NVQLink serves two needs:

It offers accelerated computing at real-time latencies to QPU systems, offloading computationally demanding calibration and control tasks including quantum error correction (QEC).
It offers a standard tech stack to the supercomputer or datacenter that hosts the QPU, so that programmers can seamlessly write hybrid quantum applications.

The elements that define the NVQLink architecture are:

Real-time Host – A GPU-accelerated server capable of running CUDA code.
Quantum System Controller (QSC) – A system performing quantum coherent control and readout on a quantum system, typically a quantum processing unit (QPU).
Real-time Network – A network connecting the Real-time Host to the QSC. NVIDIA provides a reference architecture, but integrators are free to provide their own networking stack.
cudaq-realtime API – A runtime library of CUDA-Q enabling programmers to refer to devices (CPUs, GPUs, FPGAs) in the Real-time Host and QSC and coordinate work among them. Developers are free to build on the NVQLink architecture with software other than CUDA-Q, but the system must support the cudaq-realtime API.

Validation of an NVQLink system is performed by a cudaq-realtime library function that measures the round-trip latency of a QSC-Host callback. The recognized implementation of this benchmark is provided in the open source CUDA-Q repository and exercises the core functionality of cudaq-realtime. Because cudaq-realtime is the supported way to build real-time applications on NVQLink, its API is a requisite for NVQLink compatibility.
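
The general shape of a round-trip latency measurement can be sketched in a few lines. The example below is not the cudaq-realtime benchmark and does not call any cudaq-realtime function. It times a hypothetical in-process echo callback (`loopback_echo`) purely to show the methodology: warm up, take many samples, and report the tail as well as the median. A Python harness like this carries far more overhead than a microsecond-scale hardware path, so the numbers it prints say nothing about NVQLink itself.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_round_trip_ns(
    callback: Callable[[bytes], bytes],
    payload: bytes,
    iterations: int = 1000,
    warmup: int = 100,
) -> List[int]:
    """Time `iterations` request/reply round trips, in nanoseconds."""
    for _ in range(warmup):  # discard cold-start effects
        callback(payload)
    samples: List[int] = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        reply = callback(payload)
        elapsed = time.perf_counter_ns() - start
        if reply != payload:
            raise RuntimeError("echo reply does not match the payload")
        samples.append(elapsed)
    return samples


def summarize(samples: List[int]) -> Dict[str, int]:
    """Report min, median, 99th percentile, and max of the samples."""
    ordered = sorted(samples)
    p99_index = min(len(ordered) - 1, int(0.99 * len(ordered)))
    return {
        "min_ns": ordered[0],
        "median_ns": int(statistics.median(ordered)),
        "p99_ns": ordered[p99_index],
        "max_ns": ordered[-1],
    }


def loopback_echo(payload: bytes) -> bytes:
    """Hypothetical stand-in for a QSC-Host callback: returns its input."""
    return payload


if __name__ == "__main__":
    print(summarize(measure_round_trip_ns(loopback_echo, b"\x00" * 64)))
```

Reporting the 99th percentile and maximum alongside the median matters for real-time work, because a control loop has to meet its deadline on every iteration, not just on average.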

Users of NVQLink are free to choose among many options for the Real-time Host, the Quantum System Controller, and the Real-time Network. NVIDIA provides a reference implementation for the Real-time Network, and third-party network architectures are compatible if they support the cudaq-realtime library.

The reference architecture for the NVQLink network is based on RoCE (RDMA over Converged Ethernet), a widely deployed, ultrahigh-performance form of Ethernet, and has the following elements:

Holoscan Sensor Bridge (HSB) IP – An open source FPGA core that the QSC builder can easily integrate into their FPGA firmware, without disclosing their IP.
ConnectX NIC – A standard NVIDIA network interface card installed on the Real-time Host.
DOCA and HSB SDK – An open access networking stack on the Real-time Host to define the RoCE verbs and facilitate optimized kernel definitions.
Spectrum-X switch (optional) – If needed, an Ethernet switch to expand the network radix and aggregate data to the Real-time Host from many points in the QSC.

There are many types of QPU, and there are various functions that QPU builders and users may want to offload to the Real-time Host with vastly different response time requirements. Because of this, we don’t prescribe a specific latency number.

NVQLink defines a common, open-source benchmark to ensure all compatible systems deliver transparent and reproducible network latency, allowing users to select the best solution for their needs.

In the context of NVQLink, the Real-time Host is intended to perform latency-critical compute to support quantum error correction (QEC) and online autocalibration of the QPU. These workloads have latency requirements from the millisecond range to the microsecond range on various QPU types, but regardless of the QPU type these workloads are essential for maximizing QPU performance and uptime.
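
A rough way to reason about these latency requirements is a simple budget: subtract the network round trip and the payload's time on the wire from the deadline, and what remains is available for compute. The sketch below uses the figures quoted on this page (a round trip under 4 microseconds and 400 Gb/s of throughput) as inputs. The 10-microsecond deadline and 1,000-byte payload are hypothetical values chosen for illustration, and treating wire time as additional to the round trip is a conservative assumption rather than a statement about how the benchmark is defined.

```python
def wire_time_us(payload_bytes: int, link_gbps: float) -> float:
    """Serialization time in microseconds for a payload on a link.

    A link of `link_gbps` gigabits per second carries link_gbps * 1e3 bits
    per microsecond.
    """
    return payload_bytes * 8 / (link_gbps * 1e3)


def compute_budget_us(
    deadline_us: float,
    round_trip_us: float,
    payload_bytes: int,
    link_gbps: float,
) -> float:
    """Time left for compute after network costs; negative means a miss."""
    return deadline_us - round_trip_us - wire_time_us(payload_bytes, link_gbps)


if __name__ == "__main__":
    # Hypothetical workload: 1,000 bytes of syndrome data, 10 us deadline.
    budget = compute_budget_us(
        deadline_us=10.0,
        round_trip_us=4.0,   # upper bound quoted on this page
        payload_bytes=1000,
        link_gbps=400.0,     # maximum throughput quoted on this page
    )
    print(f"compute budget: {budget:.2f} us")
```

With these inputs the wire time is a small fraction of a microsecond, so the round-trip latency dominates the network cost. Workloads with millisecond-scale deadlines leave far more headroom than those with microsecond-scale deadlines.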

NVQLink Real-time Hosts are available from our partners, each of which may have their own differentiated offerings. Ask your vendor what CUDA-Q realtime latency they support.

Creating a Real-time Host also requires support from the connected QSC. Please ensure your QSC provider is supporting the upgrade.

An existing CUDA-capable server can be converted to an NVQLink Real-time Host by ensuring the Real-time Network components are installed: an NVIDIA ConnectX or BlueField NIC and cudaq-realtime (available after March 2026 in the cuda-quantum repository).

A Real-time Host connected to a compatible QSC is validated using the CUDA-Q realtime callback latency benchmark.

Get Started

Inquire Now

Connect with NVIDIA quantum experts to learn more about NVQLink.