Deep Learning Institute

 Instructor-Led Workshop
Accelerating CUDA C++ Applications with Multiple GPUs


Computationally intensive CUDA® C++ applications in high-performance computing, data science, bioinformatics, and deep learning can be accelerated by using multiple GPUs, which can increase throughput, decrease total runtime, or both. When multi-GPU execution is combined with the concurrent overlap of computation and memory transfers, computation can be scaled across multiple GPUs without increasing the cost of memory transfers. For organizations with multi-GPU servers, whether in the cloud or on NVIDIA DGX™ systems, these techniques enable you to achieve peak performance from GPU-accelerated applications. It is also important to implement these single-node, multi-GPU techniques before scaling your applications across multiple nodes.

This workshop covers how to write CUDA C++ applications that efficiently and correctly utilize all available GPUs in a single node, dramatically improving the performance of your applications and making the most cost-effective use of systems with multiple GPUs.

 

Learning Objectives


By participating in this workshop, you’ll:
  • Use concurrent CUDA streams to overlap memory transfers with GPU computation
  • Scale workloads across all available GPUs on a single node
  • Combine the use of copy/compute overlap with multiple GPUs
  • Use the NVIDIA Nsight™ Systems Visual Profiler timeline to identify improvement opportunities and observe the impact of the techniques covered in the workshop

Download workshop datasheet (PDF 243 KB)

Workshop Outline

Introduction
(15 mins)
  • Meet the instructor.
  • Create an account at courses.nvidia.com/join
Using JupyterLab
(15 mins)
  • Get familiar with your GPU-accelerated interactive JupyterLab environment.
Application Overview
(15 mins)
  • Orient yourself with a single-GPU CUDA C++ application that will be the starting point for the course.
  • Observe the current performance of the single-GPU CUDA C++ application using Nsight Systems.
Introduction to CUDA Streams
(90 mins)
  • Learn the rules that govern concurrent CUDA stream behavior.
  • Use multiple CUDA streams to perform concurrent host-to-device and device-to-host memory transfers.
  • Utilize multiple CUDA streams for launching GPU kernels.
  • Observe multiple streams in the Nsight Systems Visual Profiler timeline view.
Break (60 mins)
Copy/Compute Overlap with CUDA Streams
(90 mins)
  • Learn the key concepts for effectively performing copy/compute overlap.
  • Explore robust indexing strategies for the flexible use of copy/compute overlap in applications.
  • Refactor the single-GPU CUDA C++ application to perform copy/compute overlap.
  • See copy/compute overlap in the Nsight Systems Visual Profiler timeline.
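
As an illustration of the kind of indexing this module refers to, the following is a minimal host-side sketch, not the workshop's own code. It is plain C++ that also compiles as CUDA C++ host code, and the names `Chunk` and `chunk_for_stream` are hypothetical. It assumes the data is a flat array of `n` elements split across a fixed number of streams, and uses ceiling division so that an element count that does not divide evenly is still fully covered.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper type: the slice of the data that one stream handles.
struct Chunk {
    std::size_t offset;  // index of the first element in the chunk
    std::size_t size;    // number of elements in the chunk (may be 0)
};

// Split n elements across num_streams chunks using ceiling division.
// Leading chunks share one size; trailing chunks are clamped so that
// no chunk extends past element n (they may be shorter or empty).
inline Chunk chunk_for_stream(std::size_t n, std::size_t num_streams,
                              std::size_t stream) {
    assert(num_streams > 0 && stream < num_streams);
    const std::size_t chunk_size = (n + num_streams - 1) / num_streams;
    const std::size_t offset = std::min(stream * chunk_size, n);
    const std::size_t size = std::min(chunk_size, n - offset);
    return Chunk{offset, size};
}
```

In an application, each stream would typically issue its asynchronous host-to-device copy (for example with `cudaMemcpyAsync`), its kernel launch, and its device-to-host copy over the range `[offset, offset + size)`, skipping any chunk whose size is zero.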
Multiple GPUs with CUDA C++
(60 mins)
  • Learn the key concepts for effectively using multiple GPUs on a single node with CUDA C++.
  • Explore robust indexing strategies for the flexible use of multiple GPUs in applications.
  • Refactor the single-GPU CUDA C++ application to utilize multiple GPUs.
  • See multiple-GPU utilization in the Nsight Systems Visual Profiler timeline.
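
One possible way to divide work across GPUs is a balanced partition, where any remainder is spread one element at a time over the first few devices. The sketch below is host-side C++ only and is an assumption about how such indexing could look, not the workshop's implementation; `GpuRange` and `range_for_gpu` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper type: the contiguous range of elements owned by one GPU.
struct GpuRange {
    std::size_t offset;  // index of the first element for this GPU
    std::size_t size;    // number of elements for this GPU
};

// Balanced partition of n elements over num_gpus devices.
// The first (n % num_gpus) devices each receive one extra element,
// so range sizes differ by at most one.
inline GpuRange range_for_gpu(std::size_t n, std::size_t num_gpus,
                              std::size_t gpu) {
    assert(num_gpus > 0 && gpu < num_gpus);
    const std::size_t base = n / num_gpus;
    const std::size_t rem = n % num_gpus;
    // Each earlier GPU contributes `base` elements, plus one for every
    // earlier GPU that received an extra element.
    const std::size_t offset = gpu * base + std::min(gpu, rem);
    const std::size_t size = base + (gpu < rem ? 1 : 0);
    return GpuRange{offset, size};
}
```

In a CUDA application, `num_gpus` would usually come from `cudaGetDeviceCount`, and the host would call `cudaSetDevice(gpu)` before allocating memory or launching kernels for that device's range.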
Break (15 mins)
Copy/Compute Overlap with Multiple GPUs
(60 mins)
  • Learn the key concepts for effectively performing copy/compute overlap on multiple GPUs.
  • Explore robust indexing strategies for the flexible use of copy/compute overlap on multiple GPUs.
  • Refactor the single-GPU CUDA C++ application to perform copy/compute overlap on multiple GPUs.
  • Observe performance benefits for copy/compute overlap on multiple GPUs.
  • See copy/compute overlap on multiple GPUs in the Nsight Systems Visual Profiler timeline.
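
Combining the two techniques means indexing at two levels: first the range owned by a GPU, then the chunk handled by one stream on that GPU. The following host-side C++ sketch shows one way to compute that, under the assumption that every GPU uses the same number of streams; `Slice`, `split_range`, and `slice_for` are hypothetical names and this is not the workshop's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper type: a contiguous slice of the global data.
struct Slice {
    std::size_t offset;  // global index of the first element
    std::size_t size;    // number of elements (may be 0)
};

// Ceiling-division split of the range [begin, begin + n) into `parts`
// pieces; returns piece `i`, clamped so it never extends past the range.
inline Slice split_range(std::size_t begin, std::size_t n,
                         std::size_t parts, std::size_t i) {
    assert(parts > 0 && i < parts);
    const std::size_t piece = (n + parts - 1) / parts;
    const std::size_t local = std::min(i * piece, n);
    return Slice{begin + local, std::min(piece, n - local)};
}

// Two-level indexing: pick the GPU's range, then the stream's chunk
// inside that range. The returned offset is global, so it can index
// the host array directly.
inline Slice slice_for(std::size_t n, std::size_t num_gpus,
                       std::size_t streams_per_gpu,
                       std::size_t gpu, std::size_t stream) {
    const Slice gpu_slice = split_range(0, n, num_gpus, gpu);
    return split_range(gpu_slice.offset, gpu_slice.size,
                       streams_per_gpu, stream);
}
```

Iterating over GPUs in the outer loop and streams in the inner loop visits the slices in order and covers every element exactly once. Device-side buffers would normally be indexed relative to the GPU's own range rather than by the global offset.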
Course Assessment (30 mins)
Final Review
(30 mins)
  • Review key learnings.
  • Learn to build your own training environment from the DLI base environment container.
  • Complete the workshop survey.
 

Workshop Details

Duration: 8 hours

Price: Contact us for pricing.

Prerequisites:

  • Professional experience programming CUDA C/C++ applications, including the use of the nvcc compiler, kernel launches, grid-stride loops, host-to-device and device-to-host memory transfers, and CUDA error handling
  • Familiarity with the Linux command line
  • Experience using makefiles to compile C/C++ code

Suggested resources to satisfy prerequisites: Fundamentals of Accelerated Computing with CUDA C/C++, Ubuntu Command Line for Beginners (sections 1 through 5), Makefile Tutorial (through the Simple Examples section)

Technologies: CUDA C++, Nsight Systems

Certificate: Upon successful completion of the assessment, participants will receive an NVIDIA DLI certificate to recognize their subject matter competency and support professional career growth.

Hardware Requirements: Desktop or laptop computer capable of running the latest version of Chrome or Firefox. Each participant will be provided with dedicated access to a fully configured, GPU-accelerated server in the cloud.

Languages: English, Simplified Chinese

Upcoming Workshops

  • Thu, Apr 15, 2021, 9:00 a.m.–5:00 p.m. Central European Time (GTC)
  • Fri, Apr 16, 2021, 9:00 a.m.–5:00 p.m. Pacific Time (GTC)
  • Wed, Jun 23, 2021, 9:00 a.m.–5:00 p.m. Pacific Time

If your organization is interested in boosting and developing key skills in AI, accelerated data science, or accelerated computing, you can request instructor-led training from the NVIDIA DLI.

Request a Workshop

Questions?

Read our FAQs.

NVIDIA Deep Learning Institute services

Inquire about NVIDIA Deep Learning Institute services.

NVIDIA Developer Forums

For technical questions, check out the NVIDIA Developer Forums.

Copyright © 2023 NVIDIA Corporation