High-Performance Data Science

Harness the power of GPUs to easily accelerate your data science, machine learning, and AI workflows.

Run entire data science workflows with high-speed GPU compute and parallelize data loading, data manipulation, and machine learning for 50X faster end-to-end data science pipelines.

Why RAPIDS?


Building a High-Performance Ecosystem

RAPIDS is a suite of open-source software libraries and APIs for executing data science pipelines entirely on GPUs—and can reduce training times from days to minutes. Built on NVIDIA® CUDA-X AI, RAPIDS unites years of development in graphics, machine learning, deep learning, high-performance computing (HPC), and more.

Faster Execution Time

Data science is all about speed to results. RAPIDS leverages NVIDIA CUDA® under the hood to accelerate your workflows by running the entire data science training pipeline on GPUs. This cuts training times from days to minutes and lets you iterate on and deploy models far more often.

Use the Same Tools

By hiding the complexities of working with the GPU and even the behind-the-scenes communication protocols within the data center architecture, RAPIDS creates a simple way to get data science done. As more data scientists use Python and other high-level languages, providing acceleration without code change is essential to rapidly improving development time.
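A minimal sketch of the "acceleration without code change" idea: cuDF mirrors the pandas API, so existing pandas code can move to the GPU by swapping the import. The fallback here is an assumption added so the sketch also runs on machines without a GPU.

```python
# cuDF (RAPIDS) mirrors the pandas API; the same code runs on either.
# Fall back to pandas so this sketch runs without a GPU installed.
try:
    import cudf as xd  # GPU DataFrame library from RAPIDS
except ImportError:
    import pandas as xd  # CPU stand-in with the same API surface

df = xd.DataFrame({"store": ["a", "a", "b"], "sales": [10, 20, 30]})
totals = df.groupby("store").sales.sum()
print(totals.to_dict())  # -> {'a': 30, 'b': 30}
```

The import alias is the only line that changes between the CPU and GPU versions, which is the whole point of keeping the APIs aligned.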

Run Anywhere at Scale

RAPIDS can be run anywhere—cloud or on-prem. You can easily scale from a workstation to multi-GPU servers to multi-node clusters, and deploy it in production with Dask, Spark, MLflow, and Kubernetes.
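A toy, stdlib-only sketch of the partition-and-parallelize execution model that Dask (and dask-cudf) generalizes across GPUs and nodes: split the data, process partitions concurrently, then combine. In RAPIDS each partition would be a GPU DataFrame rather than a Python list.

```python
# Partition the data, fan work out to workers, then reduce —
# the pattern Dask scales from one machine to a multi-node cluster.
from concurrent.futures import ThreadPoolExecutor

data = list(range(1_000))
partitions = [data[i::4] for i in range(4)]  # 4 partitions

def partial_sum(part):
    return sum(part)

with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(partial_sum, partitions))
print(total)  # same answer as sum(data)
```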

Lightning-Fast Performance on Big Data

Results show that GPUs provide dramatic cost and time savings for small and large-scale big data analytics problems. Using familiar APIs like pandas and Dask, at 10-terabyte scale, RAPIDS runs up to 20X faster on GPUs than the top CPU baseline. Using just 16 NVIDIA DGX A100 systems to match the performance of 350 CPU-based servers, NVIDIA's solution is 7X more cost-effective while delivering HPC-level performance.


Faster Data Access, Less Data Movement

Common data processing tasks have many steps (data pipelines), which Hadoop can't handle efficiently. Apache Spark solved this problem by holding all the data in system memory, which allowed more flexible and complex data pipelines but introduced new bottlenecks. Analyzing even a few hundred gigabytes (GB) of data could take hours, if not days, on Spark clusters with hundreds of CPU nodes. To tap the true potential of data science, GPUs have to be at the center of data center design, which consists of five elements: compute, networking, storage, deployment, and software. Generally speaking, end-to-end data science workflows on GPUs are 10X faster than on CPUs.


Data Processing Evolution

RAPIDS Everywhere

RAPIDS provides a foundation for a new high-performance data science ecosystem and lowers the barrier of entry for new libraries through interoperability. Integration with leading data science frameworks like Apache Spark, CuPy, Dask, and Numba, as well as numerous deep learning frameworks such as PyTorch, TensorFlow, and Apache MXNet, helps broaden adoption and encourages integration with others.

Featured Projects

BlazingSQL is a high-performance distributed SQL engine in Python, built on RAPIDS to ETL massive datasets on GPUs.


Built on RAPIDS, NVTabular accelerates feature engineering and preprocessing for recommender systems on GPUs.


Based on Streamz, written in Python, and built on RAPIDS, cuStreamz accelerates streaming data processing on GPUs.


Integrated with RAPIDS, Plotly Dash enables real-time, interactive visual analytics of multi-gigabyte datasets even on a single GPU.


The RAPIDS Accelerator for Apache Spark provides a set of plug-ins for Apache Spark that leverage GPUs to accelerate processing via RAPIDS and UCX software.

Contributors: Anaconda, BlazingSQL, Capital One, CuPy, Chainer, Deepwave Digital, Gunrock, Quansight, Walmart

Adopters: Booz Allen Hamilton, Capital One, Databricks, Graphistry, H2O.ai, IBM, Iguazio, Inria, Kinetica, MapR, OmniSci, Preferred Networks, PyTorch, Uber, Ursa Labs, Walmart

Open Source: Apache Arrow, BlazingSQL, CuPy, Dask, GPU Open Analytics Initiative (GOAI), Nuclio, Numba, scikit-learn, DMLC XGBoost

Technology at the Core

RAPIDS relies on CUDA primitives for low-level compute optimization but exposes that GPU parallelism and high memory bandwidth through user-friendly Python interfaces. RAPIDS supports end-to-end data science workflows, from data loading and preprocessing to machine learning, graph analytics, and visualization. It’s a fully functional Python stack that scales to enterprise big-data use cases.

Data Loading and Preprocessing

RAPIDS’s data loading, preprocessing, and ETL features are built on Apache Arrow for loading, joining, aggregating, filtering, and otherwise manipulating data, all in a pandas-like API familiar to data scientists. Users can expect typical speedups of 10X or greater.
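The loading, joining, filtering, and aggregating steps the paragraph lists can be sketched with the pandas-like API it describes. This is a hedged example, not RAPIDS documentation: `cudf.read_csv`, `merge`, and `groupby` mirror their pandas counterparts, and the pandas fallback is an assumption so the sketch runs without a GPU.

```python
# A small ETL pass — load, join, filter, aggregate — written once
# against the shared pandas/cuDF API surface.
import io
try:
    import cudf as xd  # GPU path (RAPIDS)
except ImportError:
    import pandas as xd  # CPU fallback, same calls

csv = io.StringIO("id,amount\n1,5.0\n2,12.5\n3,7.5\n")
orders = xd.read_csv(csv)                                   # load
users = xd.DataFrame({"id": [1, 2, 3],
                      "region": ["east", "west", "east"]})
joined = orders.merge(users, on="id")                       # join
big = joined[joined.amount > 6.0]                           # filter
by_region = big.groupby("region").amount.sum()              # aggregate
print(by_region.to_dict())
```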

Machine Learning

RAPIDS’s machine learning algorithms and mathematical primitives follow a familiar scikit-learn-like API. Popular tools like XGBoost, Random Forest, and many others are supported for both single GPU and large data center deployments. For large datasets, these GPU-based implementations can complete 10-50X faster than their CPU equivalents.
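The "scikit-learn-like API" means estimators expose `fit()` and `predict()`, so swapping `sklearn` for `cuml` moves the work to the GPU with no other changes. As a hedged, dependency-free illustration of that interface shape (not cuML itself), here is a tiny pure-Python least-squares estimator:

```python
# A minimal 1-D least-squares estimator with the sklearn-style
# fit()/predict() interface that cuML mirrors on the GPU.
class TinyLinearRegression:
    def fit(self, X, y):
        n = len(X)
        mx, my = sum(X) / n, sum(y) / n
        self.coef_ = (sum((x - mx) * (t - my) for x, t in zip(X, y))
                      / sum((x - mx) ** 2 for x in X))
        self.intercept_ = my - self.coef_ * mx
        return self

    def predict(self, X):
        return [self.coef_ * x + self.intercept_ for x in X]

# Fit y = 2x + 1 and extrapolate one step past the data.
model = TinyLinearRegression().fit([0, 1, 2, 3], [1, 3, 5, 7])
print(model.predict([4]))  # -> [9.0]
```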

Graph Analytics

RAPIDS’s graph algorithms, like PageRank, exposed through a NetworkX-like API, make efficient use of the massive parallelism of GPUs to accelerate analysis of large graphs by over 1,000X. Explore up to 200 million edges on a single NVIDIA A100 Tensor Core GPU and scale to billions of edges on NVIDIA DGX A100 clusters.
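For concreteness, here is a hedged, pure-Python sketch of the PageRank iteration that cuGraph parallelizes across GPU threads (cuGraph exposes it through a NetworkX-like call); the toy 3-node graph is an assumption for illustration.

```python
# Power-iteration PageRank on an edge list (src, dst).
def pagerank(edges, n, damping=0.85, iters=50):
    out_deg = [0] * n
    for src, _ in edges:
        out_deg[src] += 1
    rank = [1.0 / n] * n
    for _ in range(iters):
        nxt = [(1.0 - damping) / n] * n
        for src, dst in edges:
            # Each node shares its rank among its outgoing edges.
            nxt[dst] += damping * rank[src] / out_deg[src]
        rank = nxt
    return rank

# 0 -> 1 -> 2 -> 0 is a symmetric cycle, so all ranks converge to 1/3.
ranks = pagerank([(0, 1), (1, 2), (2, 0)], n=3)
print(ranks)
```

On a GPU, the inner edge loop is the part that parallelizes: every edge's contribution can be scattered concurrently.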

Visualization

RAPIDS’s visualization features support GPU-accelerated cross-filtering. Inspired by the original JavaScript crossfilter library, they enable interactive, super-fast multidimensional filtering of tabular datasets with over 100 million rows.
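Cross-filtering itself is a simple operation: a selection on one dimension instantly narrows every linked view of the same table. A hedged pandas stand-in for what RAPIDS accelerates on the GPU (the column names and values are illustrative assumptions):

```python
# Brushing one axis of a linked dashboard re-filters all other views.
import pandas as pd

trips = pd.DataFrame({
    "hour":     [7, 9, 9, 18, 23],
    "distance": [1.2, 3.4, 0.8, 5.0, 2.2],
})

# Selecting the morning rush on the "hour" dimension...
mask = trips.hour.between(7, 10)
# ...updates the linked "distance" view in the same pass.
morning_km = trips.loc[mask, "distance"].sum()
print(morning_km)
```

At the 100-million-row scale the text cites, recomputing these masks and aggregates per interaction is what makes GPU acceleration matter.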

Machine Learning to Deep Learning: All on GPU

Deep Learning Integration

While deep learning is effective in domains like computer vision, natural language processing, and recommenders, there are areas where its use isn’t mainstream. Tabular data problems, which consist of columns of categorical and continuous variables, commonly make use of techniques like XGBoost, gradient boosting, or linear models. RAPIDS streamlines preprocessing of tabular data on GPUs and provides a seamless handoff of data directly to any framework supporting DLPack, like PyTorch, TensorFlow, and MXNet. These integrations open up new opportunities for creating rich workflows, even those previously out of reach, like feeding features created by deep learning frameworks back into machine learning algorithms.
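The DLPack handoff can be seen with any two libraries that speak the protocol. A hedged sketch using NumPy (which implements DLPack in recent versions, assumed here) as both producer and consumer; with RAPIDS, the producer would be a cuDF column and the consumer a PyTorch or TensorFlow tensor, with no copy either way:

```python
# Exchange a tensor through the DLPack protocol without copying.
import numpy as np

a = np.arange(4, dtype=np.float32)
b = np.from_dlpack(a)  # consumer view over the producer's memory

a[0] = 99.0            # mutate through the producer...
print(b[0])            # ...and the consumer sees it: zero-copy
```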

Modern Data Centers for Data Science

There are five key ingredients to building AI-optimized data centers in the enterprise. The key to the design is placing GPUs at the center.

Compute

With their tremendous computational performance, systems with NVIDIA GPUs are the core compute building block for AI data centers. NVIDIA DGX systems deliver groundbreaking AI performance and can replace, on average, 50 dual-socket CPU servers. This is the first step to giving data scientists the industry’s most powerful tools for data exploration.

Software

By hiding the complexities of working with the GPU and the behind-the-scenes communication protocols within the data center architecture, RAPIDS creates a simple way to get data science done. As more data scientists use Python and other high-level languages, providing acceleration without code change is essential to rapidly improving development time.

Networking

Remote direct memory access (RDMA) in NVIDIA Mellanox® network interface controllers (NICs), NCCL (the NVIDIA Collective Communications Library), and OpenUCX (an open-source point-to-point communication framework) have led to tremendous improvements in training speed. With RDMA allowing GPUs to communicate directly with each other across nodes at up to 100 gigabits per second (Gb/s), they can span multiple nodes and operate as if they were on one massive server.

Deployment

Enterprises are moving to Kubernetes and Docker containers for deploying pipelines at scale. Combining containerized applications with Kubernetes enables businesses to change priorities on what task is the most important and adds resiliency, reliability, and scalability to AI data centers.

Storage

GPUDirect® Storage allows both NVMe and NVMe over Fabric (NVMe-oF) to read and write data directly to the GPU, bypassing the CPU and system memory. This frees up the CPU and system memory for other tasks, while giving each GPU access to orders of magnitude more data at up to 50 percent greater bandwidth.

Our Commitment to Open-Source Data Science

NVIDIA is committed to simplifying, unifying, and accelerating data science for the open-source community. By optimizing the whole stack—from hardware to software—and by removing bottlenecks for iterative data science, NVIDIA is helping data scientists everywhere do more than ever with less. This translates into more value for enterprises from their most precious resources: their data and data scientists. As Apache 2.0 open-source software, RAPIDS brings together an ecosystem on GPUs.

Without compute power, data scientists had to ‘dumb down’ their algorithms so they would run fast enough. No longer. GPUs allow us to do things we couldn’t do before.

- Bill Groves, Chief Data Officer, Walmart

NASA’s global models produce terabytes of data. Before RAPIDS, you would hit the button and wait six or seven hours to get the results. Speeding up the training cycle was a total game changer for developing the models.

- Dr. John Keller, NASA Goddard Space Flight Center

With 100X improvement in model training times and a cost savings of 98 percent, Capital One sees RAPIDS.ai and Dask as the next big things for data science and machine learning.

- Mike McCarty, Director of Software Engineering, Capital One Center for Machine Learning

Get Started Today