RAPIDS Accelerator for Apache Spark

The RAPIDS™ Accelerator for Apache Spark is a plug-in that uses RAPIDS libraries and GPUs to accelerate data processing and machine learning pipelines on Apache Spark. It works with existing pipelines without any code changes.
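
As a rough illustration of what "without any code change" can look like in practice, the sketch below assembles a spark-submit command that turns the plug-in on through configuration alone, leaving the job script untouched. The keys `spark.plugins` and `spark.rapids.sql.enabled` are the plug-in's documented settings; the helper name `build_spark_submit`, the jar path, and the job file name are hypothetical placeholders, and a real cluster will likely need additional GPU resource settings.

```python
from typing import Dict, List, Optional

# Settings that load the RAPIDS Accelerator and enable GPU SQL execution.
RAPIDS_CONF: Dict[str, str] = {
    "spark.plugins": "com.nvidia.spark.SQLPlugin",
    "spark.rapids.sql.enabled": "true",
}


def build_spark_submit(
    app: str,
    plugin_jar: str,
    extra_conf: Optional[Dict[str, str]] = None,
) -> List[str]:
    """Return spark-submit arguments that enable the plug-in for `app`.

    `app` is the unmodified Spark job; `plugin_jar` is a placeholder path
    to the accelerator jar (hypothetical here, adjust to your install).
    """
    conf = {**RAPIDS_CONF, **(extra_conf or {})}
    cmd = ["spark-submit", "--jars", plugin_jar]
    for key, value in conf.items():
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app)
    return cmd


if __name__ == "__main__":
    # Placeholder names: the job script itself is not edited.
    print(" ".join(build_spark_submit("etl_job.py", "rapids-4-spark.jar")))
```

The design point is that acceleration is opted into at submission time, so the same job can be run with or without these settings for a side-by-side comparison.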

Explore the Benefits of Acceleration

Faster Execution Time

Speed up data preparation tasks so you can train AI models sooner and get analytics results faster.

Reduced Infrastructure Costs

Do more with less: Spark on NVIDIA GPUs completes jobs faster and with less hardware than Spark on CPUs, saving time as well as on-premises capital costs or cloud operating costs.

Quick Time to Value

Experience benefits quickly with no required code changes. Included tools identify the best jobs for GPU acceleration and calculate optimal configurations.

Use Cases

How the RAPIDS Accelerator Is Being Used

All kinds of enterprises use Apache Spark for business process analytics, loading of data into data warehouses, and data preprocessing at the start of machine learning pipelines.

Data Processing Scalability

Growing volumes of data stress IT resources. GPU acceleration enhances compute infrastructure so it can process vastly more data. By accelerating its operations, Taboola was able to meet its processing deadlines on growing data volumes within its existing data center footprint.

AI Pipelines

AI pipelines consist of multiple steps, including data preparation, transformation, feature engineering, and data extraction. Accelerating these operations with GPUs shortens the time to training and dramatically reduces infrastructure costs. AT&T reduced both the cost and time of its AI pipeline by 70 percent.

Faster Analytics

Businesses rely on the latest data to make critical operational decisions. GPU acceleration lets them work with up-to-date information and get insights faster. Using GPUs, Capgemini helped an international retailer reduce transaction processing time from days to hours.

Assistance for Migration at Scale

Project Aether

Automate the qualifying, testing, and configuring of your Spark jobs for GPU acceleration, using AI to optimize configurations for maximum performance.

Large-scale migrations that would otherwise take weeks or months can be completed in hours or days, enabling quicker time to value and significant savings. Apply to be considered for this free service by filling out the interest form.

TCO Analysis Tool

How Fast Can You Go?

Evaluate your own Apache Spark workloads for GPU acceleration potential and learn how to configure a cluster for optimal cost savings.

The Cloudera and NVIDIA integration will empower us to use data-driven insights to power mission-critical use cases. We are currently implementing this integration and already seeing over 10X speed improvements at half the cost for our data engineering and data science workflows.

— Joe Ansaldi, Technical Branch Chief of Research Applied Analytics and Statistics, IRS

We’re seeing significantly faster performance with NVIDIA-accelerated Spark 3 compared to running Spark on CPUs. With these game-changing GPU performance gains, entirely new possibilities open up for enhancing AI-driven features in our full suite of Adobe Experience Cloud apps.

— William Yan, Senior Director of Machine Learning, Adobe

Our continued work with NVIDIA improves performance with RAPIDS optimizations for Apache Spark 3 and Databricks to benefit our joint customers like Adobe. These contributions lead to faster data pipelines, model training and scoring, that directly translate to more breakthroughs and insights for our community of data engineers and data scientists.

— Matei Zaharia, Original Creator of Apache Spark and Chief Technologist at Databricks

Starting Options

Get Started With the RAPIDS Accelerator for Apache Spark

Learn how to take GPU-accelerated data analytics from development to production.

Develop

Start using RAPIDS open-source libraries today to accelerate data science pipelines. Explore the latest technical resources and get started with the RAPIDS Accelerator for Apache Spark.

Deploy

Accelerate data science with NVIDIA AI Enterprise, an end-to-end, secure, cloud-native AI software platform. NVIDIA AI Enterprise provides security, manageability, and API stability to mitigate the potential risks of open-source software.

Request a 90-Day License

Get the Free Ebook to Learn More

To unlock the value of AI-powered big data and learn more about the next evolution of Apache Spark, download the ebook Accelerating Apache Spark 3.x—Leveraging NVIDIA GPUs to Power the Next Era of Analytics and AI.
