NVIDIA Deep Learning Institute

Training You to Solve the World’s Most Challenging Problems

The NVIDIA Deep Learning Institute (DLI) offers hands-on training in AI and accelerated computing to solve real-world problems. Through self-paced online and instructor-led training powered by GPUs in the cloud, developers, data scientists, researchers, and students can get practical experience and earn a certificate of competency to support professional growth. We also offer resources for business executives and university educators.

  • Developers
  • Executives
  • Educators
  • Students

Learn how to apply AI to your projects and how to accelerate your applications with CUDA and OpenACC. Through hands-on training, you’ll gain practical skills for your work and earn a certificate of subject matter competency.

Online training with DLI

Dive into hands-on, self-paced training online from anywhere at any time, with access to a fully configured, GPU-accelerated workstation in the cloud. Choose an 8-hour course to implement and deploy an end-to-end project, or a 2-hour course to apply a specific technology or technique. Most online 8-hour courses offer a certificate of subject matter competency to support your professional growth.

Certificate Available

Deep Learning Courses

If you’re new to deep learning, start with Fundamentals to learn how to train and deploy a neural network to solve real-world problems. Once you have a basic understanding of deep learning, you’ll be able to apply your knowledge to more advanced, industry-specific DLI training.

DEEP LEARNING FUNDAMENTALS

  • Fundamentals of Deep Learning for Computer Vision 

    Prerequisites: Familiarity with basic programming fundamentals such as functions and variables

    Framework: Caffe, DIGITS

    Assessment Type: Code-based

    Duration: 8 hours

    Languages: English

    Price: $90

    Certificate Available

    Explore the fundamentals of deep learning by training neural networks and using results to improve performance and capabilities.

    In this course, you’ll learn the basics of deep learning by training and deploying neural networks. You’ll learn how to:

    • Implement common deep learning workflows, such as image classification and object detection
    • Experiment with data, training parameters, network structure, and other strategies to increase performance and capability
    • Deploy your neural networks to start solving real-world problems

    Upon completion, you’ll be able to start solving problems on your own with deep learning.

  • Getting Started on AI with Jetson Nano

    Prerequisites: Familiarity with Python (helpful, not required)

    Tools, Libraries, Frameworks: PyTorch, Jetson Nano

    Duration: 8 hours

    Languages: English

    Price: Free

    Certificate: Available

    The power of AI is now in the hands of makers, self-taught developers, and embedded technology enthusiasts everywhere with the NVIDIA Jetson Nano Developer Kit. This easy-to-use, powerful computer lets you run multiple neural networks in parallel for applications like image classification, object detection, segmentation, and speech processing. In this course, you'll use Jupyter iPython notebooks on your own Jetson Nano to build a deep learning classification project with computer vision models.

    You'll learn how to:

    • Set up your Jetson Nano and camera
    • Collect image data for classification models
    • Annotate image data for regression models
    • Train a neural network on your data to create your own models
    • Run inference on the Jetson Nano with the models you create

    Upon completion, you'll be able to create your own deep learning classification and regression models with the Jetson Nano. Hardware is required to complete this course (view details).

  • Image Classification with DIGITS

    Prerequisites: None

    Framework: Caffe (with DIGITS interface)

    Duration: 2 hours

    Languages: English, Chinese, Japanese

    Price: $30

    Deep learning enables entirely new solutions by replacing hand-coded instructions with models learned from examples. Train a deep neural network to recognize handwritten digits by:

    • Loading image data to a training environment
    • Choosing and training a network
    • Testing with new data and iterating to improve performance

    Upon completion, you’ll be able to assess what data you should be using for training.

  • Object Detection with DIGITS

    Prerequisites: Basic experience with neural networks

    Framework: Caffe (with DIGITS interface)

    Duration: 2 hours

    Languages: English, Chinese

    Price: $30

    Learn to apply deep learning to object detection through the challenge of detecting whale faces from aerial images by:

    • Combining traditional computer vision with deep learning
    • Performing minor “brain surgery” on an existing neural network using the deep learning framework Caffe
    • Harnessing the knowledge of the deep learning community by identifying and using a purpose-built network and end-to-end labeled data

    Upon completion, you’ll be able to solve custom problems with deep learning.

  • Optimization and Deployment of TensorFlow Models with TensorRT

    Prerequisites: Experience with TensorFlow and Python

    Framework: TensorFlow, Python, TensorRT (TF-TRT)

    Duration: 2 hours

    Languages: English

    Price: $30

    Learn the fundamentals of generating high-performance deep learning models in the TensorFlow platform using the built-in TensorRT library (TF-TRT) and Python. You'll explore:

    • How to pre-process classification models and freeze graphs and weights in order to perform optimization
    • The fundamentals of graph optimization and quantization using FP32, FP16, and INT8
    • How to use the TF-TRT API to optimize subgraphs and select optimization parameters that best fit your model
    • How to design and embed custom operations in Python to work around unsupported layers and optimize detection models

    Upon completion, you'll understand how to utilize TF-TRT to achieve deployment-ready optimized models.
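
    To make the quantization idea above concrete, here is a minimal sketch of symmetric INT8 quantization in plain Python. It is not the TF-TRT API and does not reproduce TensorRT's calibration procedure; the function names and the sample weights are hypothetical, chosen only to show how floating-point values are mapped to 8-bit integers and back with a single scale factor.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127].

    Assumption: one scale for the whole list, derived from the largest magnitude.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


# Hypothetical weights, for illustration only.
weights = [0.4, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# The rounding error per value is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, max_error)
```

    The trade-off this illustrates is the general one behind reduced precision: fewer bits per value in exchange for a bounded rounding error.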

  • Deep Learning Workflows with TensorFlow, MXNet, and NVIDIA Docker

    Prerequisites: Basic experience with a bash terminal

    Framework: TensorFlow, MXNet

    Duration: 2 hours

    Languages: English, Japanese

    Price: $30

    The NVIDIA Docker plugin makes it possible to containerize production-grade deep learning workflows using GPUs. Learn to reduce host configuration and administration by:

    • Learning to work with Docker images and manage the container lifecycle
    • Accessing images on the public Docker image registry—DockerHub—for maximum reuse in creating composable lightweight containers
    • Training neural networks using both TensorFlow and MXNet frameworks

    Upon completion, you’ll be able to containerize and distribute pre-configured images for deep learning.

  • Accelerating Data Science Workflows with RAPIDS

    Prerequisites: Advanced competency in Pandas, NumPy, and scikit-learn

    Framework: None

    Duration: 2 hours

    Languages: English

    Price: $30

    The open source RAPIDS project allows data scientists to GPU-accelerate their data science and data analytics applications from beginning to end, creating possibilities for drastic performance gains and techniques not available through traditional CPU-only workflows.

    Learn how to GPU-accelerate your data science applications by:

    • Utilizing key RAPIDS libraries like cuDF (GPU-enabled Pandas-like dataframes) and cuML (GPU-accelerated machine learning algorithms)
    • Learning techniques and approaches to end-to-end data science, made possible by rapid iteration cycles created by GPU acceleration
    • Understanding key differences between CPU-driven and GPU-driven data science, including API specifics and best practices for refactoring

    Upon completion, you'll be able to refactor existing CPU-only data science workloads to run much faster on GPUs and write accelerated data science workflows from scratch.

  • Image Segmentation with TensorFlow

    Prerequisites: Basic experience with neural networks

    Framework: TensorFlow

    Duration: 2 hours

    Languages: English

    Price: $30

    Image (or semantic) segmentation is the task of placing each pixel of an image into a specific class. Learn how to segment MRI images to measure parts of the heart by:

    • Comparing image segmentation with other computer vision problems
    • Experimenting with TensorFlow tools such as TensorBoard and the TensorFlow Python API
    • Learning to implement effective metrics for assessing model performance

    Upon completion, you’ll be able to set up most computer vision workflows using deep learning.
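
    As an illustration of what a segmentation metric can look like, the sketch below computes the Dice coefficient and intersection over union (IoU) for two binary masks. These are common choices for segmentation, but the course description does not say which metrics it uses, so treat this as an assumption. The masks are flattened lists of 0/1 values, and the sample data is made up.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for flattened binary masks."""
    intersection = sum(p & t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def iou(pred, truth):
    """IoU = |A ∩ B| / |A ∪ B| for flattened binary masks."""
    intersection = sum(p & t for p, t in zip(pred, truth))
    union = sum(p | t for p, t in zip(pred, truth))
    return 1.0 if union == 0 else intersection / union


# Hypothetical predicted and ground-truth masks (1 = pixel belongs to the class).
predicted = [0, 1, 1, 1, 0, 0]
ground_truth = [0, 0, 1, 1, 1, 0]

dice = dice_coefficient(predicted, ground_truth)
overlap = iou(predicted, ground_truth)
print(dice, overlap)
```

    Unlike plain pixel accuracy, both metrics ignore the true-negative background pixels, which matters when the segmented structure covers only a small part of the image.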

  • Image Classification with Microsoft Cognitive Toolkit

    Prerequisites: None

    Framework: Microsoft Cognitive Toolkit

    Duration: 2 hours

    Languages: English

    Price: $30

    Learn to train a neural network using the Microsoft Cognitive Toolkit framework. You’ll build and train increasingly complex networks to:

    • Compare the expression of a neural network using BrainScript’s “Simple Network Builder” vs. the more generalizable “Network Builder”
    • Visualize neural network graphs
    • Train and test a neural network to classify handwritten digits

    Upon completion, you’ll have basic knowledge of convolutional neural networks (CNNs) and be prepared to move to the more advanced usage of Microsoft Cognitive Toolkit.

  • Linear Classification with TensorFlow

    Prerequisites: None

    Framework: TensorFlow

    Duration: 2 hours

    Languages: English

    Price: $30

    Learn how to make predictions from structured data using TensorFlow’s TFLearn API. Through the challenge of predicting personal income when given census data, you’ll:

    • Load, view, and organize data from a CSV for machine learning
    • Split an existing dataset into features and labels (input, output) of a neural network
    • Build from linear to deep models and assess the difference in performance

    Upon completion, you’ll be able to make predictions from your own structured data.
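
    The step of splitting a CSV into features and labels can be sketched with the Python standard library alone. The column names and rows below are invented for illustration and are not taken from the course dataset; the course itself uses TensorFlow rather than this hand-rolled loader.

```python
import csv
import io

# Made-up rows standing in for census-style data.
CSV_TEXT = """age,hours_per_week,education_years,income_over_50k
39,40,13,0
50,13,13,0
38,40,9,0
52,45,9,1
"""


def split_features_labels(csv_text, label_column):
    """Return (features, labels): one dict of floats per row, one int label per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    features, labels = [], []
    for row in reader:
        # Remove the label from the row so it cannot leak into the inputs.
        labels.append(int(row.pop(label_column)))
        features.append({name: float(value) for name, value in row.items()})
    return features, labels


features, labels = split_features_labels(CSV_TEXT, "income_over_50k")
print(features[0], labels)
```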

  • Signal Processing with DIGITS

    Prerequisites: Basic experience training neural networks

    Framework: Caffe, DIGITS

    Duration: 2 hours

    Languages: English, Chinese

    Price: $30

    Deep neural networks can match or exceed human accuracy on some image classification tasks, which has implications beyond what we expect of computer vision. Learn how to convert radio frequency (RF) signals into images to detect a weak signal corrupted by noise. You’ll learn how to:

    • Treat non-image data as image data
    • Implement a deep learning workflow (load, train, test, adjust) in DIGITS
    • Test performance programmatically and guide performance improvements

    Upon completion, you’ll be able to classify both image and image-like data using deep learning.

DEEP LEARNING FOR DIGITAL CONTENT CREATION

  • Image Creation Using GANs with TensorFlow and DIGITS

    Prerequisites: Experience with CNNs

    Frameworks: TensorFlow

    Duration: 2 hours

    Languages: English

    Price: $30

    Learn how to train a Generative Adversarial Network (GAN) to generate image contents in DIGITS. You’ll learn how to:

    • Use GANs to create handwritten numbers
    • Visualize the feature space and use attribute vector to generate image analogies
    • Train a GAN to generate images with set attributes

    Upon completion, you’ll be able to use GANs to generate images by manipulating feature space.

  • Image Style Transfer with Torch

    Prerequisites: Experience with CNNs

    Frameworks: Torch

    Duration: 2 hours

    Languages: English

    Price: $30

    Explore how to transfer the look and feel of one image to another image by extracting distinct visual features. See how convolutional neural networks (CNNs) are used for feature extraction, and how these features feed into a generator to create a new image. You’ll learn how to:

    • Transfer the look and feel of one image to another image by extracting distinct visual features
    • Qualitatively determine whether a style is transferred correctly using different techniques
    • Use architectural innovations and training techniques for arbitrary style transfer

    Upon completion, you’ll be able to use neural networks for arbitrary style transfer at a speed that's effective for video.

  • Rendered Image Denoising Using Autoencoders

    Prerequisites: Experience with CNNs

    Frameworks: TensorFlow

    Duration: 2 hours

    Languages: English

    Price: $30

    Learn how neural networks with autoencoders can be used to dramatically speed up the removal of noise in ray traced images. You’ll learn how to:

    • Determine whether noise exists in rendered images
    • Use a pre-trained network to denoise some sample images or your own images
    • Train your own denoiser using the provided dataset

    Upon completion, you’ll be able to use autoencoders inside neural networks to train your own rendered image denoiser.

  • Image Super Resolution Using Autoencoders

    Prerequisites: Experience with CNNs

    Frameworks: Keras

    Duration: 2 hours

    Languages: English

    Price: $30

    Leverage the power of a neural network with autoencoders to create high-quality images from low-quality source images. In this mini course, you'll:

    • Understand and design an autoencoder
    • Learn various methods for rigorously measuring image quality

    Upon completion, you'll be able to use autoencoders inside neural networks to significantly enhance image quality.

DEEP LEARNING FOR HEALTHCARE

  • Modeling Time Series Data with Recurrent Neural Networks in Keras

    Prerequisites: Basic experience with deep learning

    Frameworks: Keras

    Duration: 2 hours

    Languages: English

    Price: $30

    Recurrent Neural Networks (RNNs) allow models to classify or forecast time-series data, like natural language, markets, and even a patient’s health over time. You'll learn how to:

    • Create training and testing datasets using electronic health records in HDF5 (hierarchical data format version five)
    • Prepare datasets for use with recurrent neural networks, which allows modeling of very complex data sequences
    • Construct a Long Short-Term Memory (LSTM) model, a specific RNN architecture, using the Keras library running on top of Theano to evaluate model performance against baseline data

    Upon completion, you’ll be able to model time-series data using RNNs.

  • Medical Image Classification Using the MedNIST Dataset

    Prerequisites: Basic experience with Python

    Frameworks: PyTorch

    Duration: 2 hours

    Languages: English

    Price: $30

    Get a hands-on practical introduction to deep learning for radiology and medical imaging. You'll learn how to:

    • Collect, format, and standardize medical image data
    • Architect and train a convolutional neural network (CNN) on a dataset
    • Use the trained model to classify new medical images

    Upon completion, you’ll be able to apply CNNs to classify images in a medical imaging dataset.

  • Data Science Workflows for Deep Learning in Medical Applications

    Prerequisites: Basic experience with Python and CNNs

    Frameworks: PyTorch

    Duration: 2 hours

    Languages: English

    Price: $30

    Medical datasets present special challenges for the application of deep learning. You will:

    • Learn introductory techniques in data augmentation and standardization
    • Experiment with these techniques on a simple medical imaging dataset
    • Validate your techniques by training a convolutional neural network on the augmented dataset

    Upon completion, you'll be able to apply simple data manipulation techniques to your medical imaging datasets.
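
    The two techniques named above, augmentation and standardization, can be sketched on a tiny grayscale image stored as nested lists. This is an assumption about what the simplest versions look like (a horizontal flip and a zero-mean, unit-variance rescale), not the course's actual PyTorch code, and the pixel values are made up.

```python
from statistics import mean, pstdev


def horizontal_flip(image):
    """Augmentation: mirror each row to create an extra training example."""
    return [list(reversed(row)) for row in image]


def standardize(image):
    """Standardization: rescale pixels to zero mean and unit variance."""
    pixels = [p for row in image for p in row]
    mu, sigma = mean(pixels), pstdev(pixels)
    if sigma == 0:
        # A constant image carries no contrast; map it to all zeros.
        return [[0.0 for _ in row] for row in image]
    return [[(p - mu) / sigma for p in row] for row in image]


# Hypothetical 2x2 grayscale image.
image = [[0, 50], [100, 250]]
flipped = horizontal_flip(image)
normalized = standardize(image)
print(flipped, normalized)
```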

  • Medical Image Segmentation Using DIGITS

    Prerequisites: Basic experience with CNNs and basic experience with Python

    Frameworks: DIGITS, Caffe

    Duration: 2 hours

    Languages: English

    Price: $30

    Image (or semantic) segmentation is the task of placing each pixel of an image into a specific class. You’ll segment MRI images to measure parts of the heart by:

    • Extending Caffe with custom Python layers
    • Implementing the process of transfer learning
    • Creating fully convolutional networks (FCNs) from popular image classification networks

    Upon completion, you’ll be able to set up most computer vision workflows using deep learning.

  • Image Classification with TensorFlow: Radiomics—1p19q Chromosome Status Classification

    Prerequisites: Basic experience with CNNs and basic experience with Python

    Frameworks: TensorFlow

    Duration: 2 hours

    Languages: English

    Price: $30

    Thanks to work being performed at the Mayo Clinic, using deep learning techniques to detect radiomics from MRI imaging has led to more effective treatments and better health outcomes for patients with brain tumors. Learn to detect the 1p19q co-deletion biomarker by:

    • Designing and training convolutional neural networks (CNNs)
    • Using imaging genomics (radiomics) to create biomarkers that identify the genomics of a disease without the use of an invasive biopsy
    • Exploring the radiogenomics work being done at the Mayo Clinic

    Upon completion, you’ll have unique insight into the novelty and promising results of using deep learning to predict radiomics.

  • Medical Image Analysis with R and MXNet

    Prerequisites: Basic experience with CNNs and basic experience with Python

    Frameworks: MXNet

    Duration: 2 hours

    Languages: English

    Price: $30

    Convolutional neural networks (CNNs) can be applied to medical image analysis to infer patient status from non-visible images. Learn how to train a CNN to infer the volume of the left ventricle of the human heart from time-series MRI data. You'll explore how to:

    • Extend a canonical 2D CNN to more complex data
    • Use MXNet through the standard Python API and R
    • Process high-dimensionality imagery that may be volumetric and have a temporal component

    Upon completion, you’ll know how to use CNNs for non-visible images.

  • Data Augmentation and Segmentation with Generative Networks for Medical Imaging

    Prerequisites: Experience with CNNs

    Frameworks: TensorFlow

    Duration: 2 hours

    Languages: English

    Price: $30

    A generative adversarial network (GAN) is a pair of deep neural networks: a generator that creates new examples based on the training data provided and a discriminator that attempts to distinguish between genuine and simulated data. As both networks improve together, the examples created become increasingly realistic. This technology is promising for healthcare, because it can augment smaller datasets for training of traditional networks. You'll learn how to:

    • Generate synthetic brain MRIs
    • Apply GANs for segmentation
    • Use GANs for data augmentation to improve accuracy

    Upon completion, you'll be able to apply GANs to medical imaging use cases.

  • Coarse-to-Fine Contextual Memory for Medical Imaging

    Prerequisites: Experience with CNNs

    Frameworks: TensorFlow

    Duration: 2 hours

    Languages: English

    Price: $30

    Coarse-to-fine contextual memory (CFCM) is a technique developed for image segmentation using very deep architectures and incorporating features from many different scales with convolutional long short-term memory (LSTM). You’ll:

    • Take a deep dive into encoder-decoder architectures for medical image segmentation
    • Get to know common building blocks (convolutions, pooling layers, residual nets, etc.)
    • Investigate different strategies for skip connections

    Upon completion, you'll be able to apply CFCM techniques to medical image segmentation and similar imaging tasks.

  • Deep Learning for Genomics Using DragoNN with Keras and Theano

    Prerequisites: Basic experience with convolutional neural networks (CNNs) and basic experience with Python

    Frameworks: Keras, Theano

    Duration: 2 hours

    Languages: English

    Price: $30

    Learn to interpret deep learning models to discover predictive genome sequence patterns. Use the deep regulatory genomics neural network (DragoNN) toolkit on simulated and real regulatory genomic data to:

    • Demystify popular DragoNN architectures
    • Explore guidelines for modeling and interpreting regulatory sequence using DragoNN models
    • Identify when DragoNN is a good choice for a learning problem in genomics and high-performance models

    Upon completion, you’ll be able to use the discovery of predictive genome sequence patterns to gain new biological insights.

DEEP LEARNING FOR INTELLIGENT VIDEO ANALYTICS

  • Deployment for Intelligent Video Analytics using TensorRT

    Prerequisites: Basic experience with CNNs and C++

    Frameworks: TensorRT

    Duration: 2 hours

    Languages: English

    Price: $30

    When a trained neural network is tasked to find the answer on new data inputs, it is referred to as deployment. TensorRT is the primary tool for deployment, with various options to improve inference performance of neural networks. In this mini course, you'll:

    • Learn how to use giexec to run inferencing
    • Use mixed precision INT8 to optimize inferencing
    • Leverage custom layers API for plugins

    Upon completion, you'll know how to use TensorRT to accelerate inferencing performance for neural networks.

  • AI Workflows for Intelligent Video Analytics with DeepStream

    Prerequisites: Experience with C++ and GStreamer

    Frameworks: DeepStream 3.0

    Duration: 2 hours

    Languages: English

    Price: $30

    The DeepStream 3.0 framework features hardware-accelerated building blocks of Intelligent Video Analytics (IVA) applications. This allows developers to focus on building core deep learning networks. The DeepStream SDK underpins a variety of use cases and offers flexibility on the deployment medium.

    You’ll learn how to:

    • Deploy a DeepStream pipeline for parallel, multi-stream video processing and deliver applications with maximum throughput at scale
    • Configure the processing pipeline and create intuitive, graph-based applications
    • Leverage multiple deep network models to process video streams and achieve more intelligent insights

    Upon completion, you'll know how to create AI-based video analytics applications using DeepStream to transform video streams into actionable insights.

Accelerated Computing Courses

If you’re new to accelerated computing, get started by learning how to accelerate your applications with CUDA and OpenACC.

  • Fundamentals of Accelerated Computing with CUDA C/C++ 

    Prerequisites: Basic C/C++ competency including familiarity with variable types, loops, conditional statements, functions, and array manipulations.

    Assessment Type: Code-based

    Duration: 8 hours

    Languages: English

    Price: $90

    Certificate Available

    The CUDA computing platform enables the acceleration of CPU-only applications to run on the world’s fastest massively parallel GPUs. Experience C/C++ application acceleration by:

    • Accelerating CPU-only applications to run their latent parallelism on GPUs
    • Utilizing essential CUDA memory management techniques to optimize accelerated applications
    • Exposing accelerated application potential for concurrency and exploiting it with CUDA streams
    • Leveraging command line and visual profiling to guide and check your work

    Upon completion, you’ll be able to accelerate and optimize existing C/C++ CPU-only applications using the most essential CUDA tools and techniques.

  • Fundamentals of Accelerated Computing with CUDA Python

    Prerequisites: Basic Python competency including familiarity with variable types, loops, conditional statements, functions, and array manipulations. NumPy competency including the use of ndarrays and ufuncs.

    Assessment Type: Code-based

    Duration: 8 hours

    Languages: English

    Price: $90

    Certificate Available

    This course explores how to use Numba—the just-in-time, type-specializing Python function compiler—to accelerate Python programs to run on massively parallel NVIDIA GPUs. You’ll learn how to:

    • Use Numba to compile CUDA kernels from NumPy universal functions (ufuncs)
    • Use Numba to create and launch custom CUDA kernels
    • Apply key GPU memory management techniques

    Upon completion, you’ll be able to use Numba to compile and launch CUDA kernels to accelerate your Python applications on NVIDIA GPUs.

  • Fundamentals of Accelerated Computing with OpenACC

    Prerequisites: Basic experience with C/C++

    Duration: 8 hours

    Languages: English

    Price: $90

    Learn the basics of OpenACC, a high-level, directive-based programming model for GPUs. This course is for anyone with some C/C++ experience who is interested in accelerating the performance of their applications beyond the limits of CPU-only programming. In this course, you’ll learn:

    • Four simple steps to accelerating your already existing application with OpenACC
    • How to profile and optimize your OpenACC codebase
    • How to program on multi-GPU systems by combining OpenACC with the message passing interface (MPI)

    Upon completion, you’ll be able to build and optimize accelerated heterogeneous applications on multi-GPU clusters using a combination of OpenACC, CUDA-aware MPI, and NVIDIA profiling tools.

  • Accelerating Applications with CUDA C/C++

    Prerequisites: Basic experience with C/C++

    Duration: 2 hours

    Languages: English, Japanese

    Price: $30

    Learn how to accelerate your C/C++ application using CUDA to harness the massively parallel power of NVIDIA GPUs. You'll learn how to program with CUDA in order to:

    • Accelerate SAXPY algorithms
    • Accelerate Matrix Multiply algorithms
    • Accelerate heat conduction algorithms

    Upon completion, you'll be able to use the CUDA platform to accelerate C/C++ applications.
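
    For readers unfamiliar with SAXPY ("single-precision a times x plus y"), the sketch below is a CPU reference version in Python. The course itself uses CUDA C/C++, so this is only meant to show the arithmetic being accelerated. Each output element depends on one element of x and one of y, which is why the loop maps naturally onto one GPU thread per index.

```python
def saxpy(a, x, y):
    """Return a * x + y element by element (CPU reference, not GPU code)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    # No iteration depends on another, so all elements could be computed in parallel.
    return [a * xi + yi for xi, yi in zip(x, y)]


result = saxpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
print(result)  # [12.0, 24.0, 36.0]
```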

  • OpenACC – 2X in 4 Steps

    Prerequisites: Basic experience with C/C++

    Duration: 2 hours

    Languages: English

    Price: $30

    Learn how to accelerate your C/C++ or Fortran application using OpenACC to harness the massively parallel power of NVIDIA GPUs. OpenACC is a directive-based approach to computing where you provide compiler hints to accelerate your code, instead of writing the accelerator code yourself. Get started on the four-step process for accelerating applications using OpenACC:

    • Characterize and profile your application
    • Add compute directives
    • Add directives to optimize data movement
    • Optimize your application using kernel scheduling

    Upon completion, you will be ready to use a profile-driven approach to rapidly accelerate your C/C++ applications using OpenACC directives.

  • Introduction to Accelerated Computing

    Prerequisites: Basic experience with C/C++

    Duration: 2 hours

    Languages: English

    Price: $30

    Explore the three techniques for accelerating code on a GPU:

    • Using GPU-accelerated libraries
    • Using compiler directives like OpenACC
    • Writing code directly in CUDA-enabled languages

    Upon completion, you'll understand the potential speed-ups and relative ease of porting code to the GPU.

  • GPU Memory Optimizations with CUDA C/C++

    Prerequisites: “Accelerating Applications with CUDA C/C++” or similar experience

    Duration: 2 hours

    Languages: English

    Price: $30

    Explore memory optimization techniques for programming with CUDA C/C++ on an NVIDIA GPU, and how to use the NVIDIA Visual Profiler (NVVP) to support these optimizations. You'll learn how to:

    • Implement a naive matrix transposing algorithm
    • Perform several cycles of profiling the algorithm with NVVP and optimize its performance

    Upon completion, you'll know how to analyze and improve global and shared memory access patterns, and how to optimize your accelerated C/C++ applications.

  • Accelerating Applications with GPU-Accelerated Libraries in C/C++

    Prerequisites: “Accelerating Applications with CUDA C/C++” or similar experience

    Duration: 2 hours

    Languages: English

    Price: $30

    Learn how to accelerate your C/C++ application using drop-in libraries to harness the massively parallel power of NVIDIA GPUs. You'll work through three exercises, including how to:

    • Use cuBLAS to accelerate a basic matrix multiply
    • Combine libraries by adding some cuRAND API calls to the previous cuBLAS calls
    • Use nvprof to profile code and optimize with some CUDA Runtime API calls

    Upon completion, you'll be ready to utilize several CUDA enabled libraries for rapid application acceleration in your existing CPU-only C/C++ programs.

  • Accelerating Applications with GPU-Accelerated Libraries in Python

    Prerequisites: Basic experience with Python

    Duration: 2 hours

    Languages: English

    Price: $30

    Learn how to use GPU libraries to accelerate Python code on NVIDIA GPUs by:

    • Using the cuRAND library to accelerate a Monte Carlo pricer
    • Optimizing data movement between the CPU and GPU

    Upon completion, you'll be able to begin using GPU-accelerated Python libraries to accelerate your CPU-only Python code.
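
    A Monte Carlo pricer of the kind mentioned above can be sketched on the CPU with the standard library. This version prices a European call option under geometric Brownian motion, which is an assumption: the course description does not say which instrument or model it uses, and all parameter values here are hypothetical. The random draws in the loop are the part a GPU library such as cuRAND would generate in bulk.

```python
import math
import random


def monte_carlo_call_price(spot, strike, rate, volatility, maturity, n_paths, seed=0):
    """Estimate a European call price by averaging discounted simulated payoffs."""
    rng = random.Random(seed)
    drift = (rate - 0.5 * volatility ** 2) * maturity
    diffusion = volatility * math.sqrt(maturity)
    payoff_sum = 0.0
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)  # one standard normal draw per path
        terminal = spot * math.exp(drift + diffusion * z)
        payoff_sum += max(terminal - strike, 0.0)
    return math.exp(-rate * maturity) * payoff_sum / n_paths


def black_scholes_call_price(spot, strike, rate, volatility, maturity):
    """Closed-form price, used here only to sanity-check the simulation."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * volatility ** 2) * maturity) / (volatility * sqrt_t)
    d2 = d1 - volatility * sqrt_t

    def normal_cdf(x):
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    return spot * normal_cdf(d1) - strike * math.exp(-rate * maturity) * normal_cdf(d2)


# Hypothetical contract parameters.
params = dict(spot=100.0, strike=100.0, rate=0.05, volatility=0.2, maturity=1.0)
estimate = monte_carlo_call_price(n_paths=20000, **params)
exact = black_scholes_call_price(**params)
print(estimate, exact)
```

    Every path is independent of the others, so both the random number generation and the payoff evaluation parallelize well, and the estimate tightens as the number of paths grows.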

  • Using Thrust to Accelerate C++

    Prerequisites: “Accelerating Applications with CUDA C/C++” or similar experience

    Duration: 2 hours

    Languages: English

    Price: $30

    Thrust is a parallel algorithms library loosely based on the C++ Standard Template Library. It enables developers to quickly embrace the power of parallel computing and supports multiple system back-ends such as OpenMP and Intel's Threading Building Blocks. Use Thrust to accelerate C++ through exercises that cover:

    • Basic Iterators, Containers, and Functions
    • Built-in and Custom Functors
    • Portability to CPU processing

    Upon completion, you'll be ready to harness the power of the Thrust library to accelerate your C/C++ applications.

Online Training with Partners

DLI collaborates with leading educational organizations to expand the reach of deep learning training to developers worldwide.

Resources

Explore a wide range of technical resources on AI and accelerated computing.

Stay ahead of the curve by training your technical workforce on AI and accelerated computing. Plus, access resources to learn about the business impact of AI.

Instructor-led training

DLI will bring instructor-led workshops to your site to train teams of developers, data scientists, researchers, and engineers. Led by DLI-certified instructors, full-day workshops offer lectures, hands-on training to implement and deploy an end-to-end project, and a certificate of subject matter competency.

Certificate Available

Deep Learning Workshops

If your team is new to deep learning, start with Fundamentals to learn how to train and deploy a neural network to solve real-world problems. Once your team has a basic understanding of deep learning, they’ll be able to apply their knowledge to more advanced, industry-specific DLI training.

DEEP LEARNING FUNDAMENTALS

  • Fundamentals of Deep Learning for Computer Vision 

    Prerequisites: Familiarity with basic programming fundamentals such as functions and variables

    Frameworks: Caffe, DIGITS

    Assessment Type: Code-based

    Languages: English, Chinese, Japanese, Korean

    Certificate Available

    Explore the fundamentals of deep learning by training neural networks and using results to improve performance and capabilities.

    In this workshop, you’ll learn the basics of deep learning by training and deploying neural networks. You’ll learn how to:

    • Implement common deep learning workflows, such as image classification and object detection
    • Experiment with data, training parameters, network structure, and other strategies to increase performance and capability
    • Deploy your neural networks to start solving real-world problems

    Upon completion, you’ll be able to start solving problems on your own with deep learning.

  • Fundamentals of Deep Learning for Multiple Data Types 

    Prerequisites: Familiarity with basic Python (functions and variables) and prior experience training neural networks

    Frameworks: TensorFlow

    Assessment Type: Multiple choice

    Languages: English, Chinese, Japanese, Korean

    Certificate Available

    This workshop explores how convolutional and recurrent neural networks can be combined to generate effective descriptions of content within images and video clips.

    Learn how to train a network using TensorFlow and the Microsoft Common Objects in Context (COCO) dataset to generate captions from images and video by:

    • Implementing deep learning workflows like image segmentation and text generation
    • Comparing and contrasting data types, workflows, and frameworks
    • Combining computer vision and natural language processing

    Upon completion, you’ll be able to solve deep learning problems that require multiple types of data inputs.

  • Fundamentals of Deep Learning for Natural Language Processing 

    Prerequisites: Basic experience with neural networks and Python programming, familiarity with linguistics

    Frameworks: TensorFlow, Keras

    Assessment Type: Code-based, multiple choice

    Languages: English, Chinese

    Certificate Available

    Learn the latest deep learning techniques to understand textual input using natural language processing (NLP). You’ll learn how to:

    • Convert text to machine-understandable representations, including classical approaches
    • Implement distributed representations (embeddings) and understand their properties
    • Train machine translators from one language to another

    Upon completion, you’ll be able to apply embeddings to similar NLP applications.

  • Fundamentals of Deep Learning for Multi-GPUs 

    Prerequisites: Experience with stochastic gradient descent mechanics

    Frameworks: TensorFlow

    Assessment Type: Code-based

    Languages: English

    Certificate Available

    The computational requirements of deep neural networks used to enable AI applications like self-driving cars are enormous. A single training cycle can take weeks on a single GPU, or even years for larger datasets like those used in self-driving car research. Using multiple GPUs for deep learning can significantly shorten the time required to train on large amounts of data, making it feasible to solve complex problems with deep learning.

    This workshop will teach you how to use multiple GPUs to train neural networks. You'll learn:

    • Approaches to multi-GPU training
    • Algorithmic and engineering challenges to large-scale training
    • Key techniques used to overcome the challenges mentioned above

    Upon completion, you'll be able to effectively parallelize training of deep neural networks using TensorFlow.
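
    One common approach to multi-GPU training is data parallelism: each device computes gradients on its own shard of the data, and the gradients are averaged before the shared weights are updated. The sketch below simulates that idea in plain Python for a one-parameter linear model. The "workers" are ordinary function calls standing in for GPUs, and the function names, learning rate, and data are illustrative assumptions, not the workshop's material or TensorFlow's API.

```python
def shard(data, num_workers):
    """Split the dataset into num_workers interleaved shards."""
    return [data[i::num_workers] for i in range(num_workers)]


def local_gradient(w, samples):
    """Gradient of mean squared error for the model y = w * x on one shard."""
    return sum(2 * (w * x - y) * x for x, y in samples) / len(samples)


def data_parallel_sgd(data, num_workers=4, lr=0.01, steps=200):
    """Simulated data-parallel SGD: average per-worker gradients each step."""
    w = 0.0
    shards = shard(data, num_workers)
    for _ in range(steps):
        # Each (hypothetical) worker computes a gradient on its own shard.
        grads = [local_gradient(w, s) for s in shards]
        # Averaging plays the role of an all-reduce across devices.
        w -= lr * sum(grads) / len(grads)
    return w


# Toy data generated from y = 3x, so the learned weight should approach 3.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w = data_parallel_sgd(data)
print(round(w, 4))
```

    Because the shards here are equal in size, the averaged gradient matches the full-batch gradient; with uneven shards, a weighted average would be needed to keep that property.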

DEEP LEARNING BY INDUSTRY

  • Deep Learning for Autonomous Vehicles—Perception

    Prerequisites: Experience with CNNs

    Frameworks: TensorFlow, DIGITS, TensorRT

    Languages: English, Chinese, Japanese

    In this workshop, you’ll learn how to design, train, and deploy deep neural networks for autonomous vehicles using the NVIDIA DRIVE PX development platform.

    Learn how to:

    • Integrate sensor input using the DriveWorks software stack
    • Train a semantic segmentation neural network
    • Optimize, validate, and deploy a trained neural network using TensorRT

    Upon completion, you’ll be able to create and optimize perception components for autonomous vehicles using NVIDIA DRIVE PX.

  • Deep Learning for Finance Trading Strategy

    Prerequisites: Experience with neural networks and knowledge of the financial industry

    Frameworks: TensorFlow

    Languages: English

    Linear techniques like principal component analysis (PCA) are the workhorses of creating “eigenportfolios” for use in statistical arbitrage strategies. Other techniques using time series financial data are also prevalent. But now, trading strategies can be advanced with the power of deep neural networks.

    In this workshop, you’ll learn how to:

    • Prepare time series data and test network performance using training and test datasets
    • Structure and train a long short-term memory (LSTM) network to accept vector inputs and make predictions
    • Use an autoencoder as an anomaly detector to create an arbitrage strategy

    Upon completion, you’ll be able to use time series financial data to make predictions and exploit arbitrage using neural networks.

  • Deep Learning for Digital Content Creation Using Autoencoders

    Prerequisites: Basic familiarity with deep learning concepts such as CNNs and experience with the Python programming language

    Frameworks: Torch, TensorFlow

    Assessment Type: Multiple choice

    Languages: English

    Certificate Available

    Explore the latest techniques for designing, training, and deploying neural networks for digital content creation. You’ll learn how to:

    • Apply the architectural innovations and training techniques used to perform arbitrary video style transfer
    • Train your own denoiser for rendered images
    • Upscale images with super resolution AI

    Upon completion, you’ll be able to start creating digital assets using deep learning approaches.

  • Deep Learning for Digital Content Creation Using GANs

    Prerequisites: Basic familiarity with deep learning concepts such as CNNs and experience with the Python programming language

    Frameworks: TensorFlow, Torch

    Assessment Type: Multiple choice

    Languages: English

    Certificate Available

    Explore advanced techniques for designing, training, and deploying neural networks for digital content creation. You’ll learn how to:

    • Train a generative adversarial network (GAN) to generate images
    • Create analogous images from one theme to another
    • Convert text to images using deep learning

    Upon completion, you’ll be able to start creating digital assets using deep learning approaches.

  • Deep Learning for Game Development

    Prerequisites: Basic familiarity with deep learning concepts such as CNNs and experience with the Python programming language

    Frameworks: TensorFlow, Theano

    Assessment Type: Multiple choice

    Languages: English, Chinese

    Certificate Available

    Learn the latest techniques for designing, training, and deploying neural networks for game development. You’ll learn how to:

    • Train a phase-functioned neural network that animates characters
    • Translate an input image to an output image
    • Train a deep reinforcement learning agent to play StarCraft II

    Upon completion, you’ll understand various ways to incorporate deep learning techniques into game development.

  • Deep Learning for Healthcare Image Analysis

    Prerequisites: Basic familiarity with deep neural networks, basic coding experience in Python or a similar language

    Frameworks: Caffe, DIGITS, R, MXNet, TensorFlow

    Assessment Type: Code-based

    Languages: English, Japanese

    Certificate Available

    This workshop explores how to apply convolutional neural networks (CNNs) to MRI scans to perform a variety of medical tasks and calculations. You’ll learn how to:

    • Perform image segmentation on MRI images to determine the location of the left ventricle
    • Calculate ejection fractions by measuring differences between diastole and systole using CNNs applied to MRI scans to detect heart disease
    • Apply CNNs to MRI scans of low-grade gliomas (LGGs) to determine 1p/19q chromosome co-deletion status

    Upon completion, you’ll be able to apply CNNs to MRI scans to conduct a variety of medical tasks.
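
    For reference, ejection fraction is the share of blood in the left ventricle at end-diastole that is pumped out by end-systole: EF = (EDV − ESV) / EDV × 100. The sketch below shows only that final arithmetic. It assumes the two volumes have already been estimated (for example, from segmented MRI slices); the function name and the sample volumes are hypothetical and not taken from the workshop.

```python
def ejection_fraction(edv_ml, esv_ml):
    """Return ejection fraction in percent.

    edv_ml: end-diastolic volume (ventricle at its fullest), in milliliters.
    esv_ml: end-systolic volume (ventricle after contraction), in milliliters.
    """
    if edv_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    if not 0 <= esv_ml <= edv_ml:
        raise ValueError("end-systolic volume must lie between 0 and EDV")
    stroke_volume = edv_ml - esv_ml  # blood ejected in one beat
    return stroke_volume / edv_ml * 100.0


# Hypothetical volumes, chosen only to illustrate the calculation.
ef = ejection_fraction(edv_ml=120.0, esv_ml=50.0)
print(f"{ef:.1f}%")
```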

  • Deep Learning for Healthcare Genomics

    Prerequisites: Basic familiarity with deep neural networks, basic coding experience in Python or a similar language

    Frameworks: TensorFlow, Caffe, DIGITS, Theano, DragoNN

    Assessment Type: Multiple choice

    Languages: English, Japanese

    Certificate Available

    This workshop teaches you how to apply deep learning to detect chromosome co-deletion and search for motifs in genomic sequences. You’ll learn how to:

    • Understand the basics of convolutional neural networks (CNNs) and how they work
    • Apply CNNs to MRI scans of low-grade gliomas (LGGs) to determine 1p/19q chromosome co-deletion status
    • Use the DragoNN toolkit to simulate genomic data and to search for motifs

    Upon completion, you’ll be able to understand how CNNs work, evaluate MRI images using CNNs, and use real regulatory genomic data to research new motifs.

  • Deep Learning for Intelligent Video Analytics

    Prerequisites: Experience with deep networks (specifically variations of CNNs), intermediate-level experience with C++ and Python

    Frameworks: TensorFlow, TensorRT, Caffe

    Assessment Type: Code-based

    Languages: English

    Certificate Available

    We live in a data-hungry world fueled by the public's desire for high-quality video feeds. Every day, more than a billion cameras capture almost every event around the world. In order to process these video feeds, we need advanced techniques to transform them into actionable analytics. This involves identification, classification, segmentation, prediction, and recommendation.

    You’ll learn how to:

    • Train and evaluate deep learning models using the TensorFlow object detection application programming interface (API)
    • Explore the strategies and trade-offs involved in developing high-quality neural network models to track moving objects in large-scale video datasets
    • Optimize inference times using TensorRT for real-time applications

    Upon completion, you’ll be able to deploy object detection and tracking networks to work on real-time, large-scale video streams.

  • Deep Learning for Robotics

    Prerequisites: Basic familiarity with deep neural networks, basic coding experience in Python or a similar language

    Frameworks: DIGITS

    Assessment Type: Code-based

    Languages: English

    Certificate Available

    AI is revolutionizing the acceleration and development of robotics across a broad range of industries. Explore how to create robotics solutions on a Jetson for embedded applications. You’ll learn how to:

    • Apply computer vision models to perform detection
    • Prune and optimize the model for embedded applications
    • Train a robot to actuate the correct output based on the visual input

    Upon completion, you’ll know how to deploy high-performance deep learning applications for robotics.

Accelerated Computing Workshops

If your team is new to accelerated computing, they can start by learning how to accelerate applications with CUDA and OpenACC.

  • Fundamentals of Accelerated Computing with CUDA C/C++ 

    Prerequisites: Basic C/C++ competency including familiarity with variable types, loops, conditional statements, functions, and array manipulations.

    Duration: 8 hours

    Assessment Type: Code-based

    Languages: English, Chinese, Japanese, Korean

    Certificate Available

    The CUDA computing platform enables the acceleration of CPU-only applications to run on the world’s fastest massively parallel GPUs. Experience C/C++ application acceleration by:

    • Accelerating CPU-only applications to run their latent parallelism on GPUs
    • Utilizing essential CUDA memory management techniques to optimize accelerated applications
    • Exposing accelerated application potential for concurrency and exploiting it with CUDA streams
    • Leveraging command line and visual profiling to guide and check your work

    Upon completion, you’ll be able to accelerate and optimize existing C/C++ CPU-only applications using the most essential CUDA tools and techniques. You’ll understand an iterative style of CUDA development that will allow you to ship accelerated applications fast.

  • Fundamentals of Accelerated Computing with CUDA Python

    Prerequisites: Basic Python competency including familiarity with variable types, loops, conditional statements, functions, and array manipulations. NumPy competency including the use of ndarrays and ufuncs.

    Duration: 8 hours

    Assessment Type: Code-based

    Languages: English

    Certificate Available

    This workshop explores how to use Numba—the just-in-time, type-specializing Python function compiler—to accelerate Python programs to run on massively parallel NVIDIA GPUs. You’ll learn how to:

    • Use Numba to compile CUDA kernels from NumPy universal functions (ufuncs)
    • Use Numba to create and launch custom CUDA kernels
    • Apply key GPU memory management techniques

    Upon completion, you’ll be able to use Numba to compile and launch CUDA kernels to accelerate your Python applications on NVIDIA GPUs.
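
    A custom CUDA kernel typically maps each thread to one array element through its block and thread indices, with a bounds check for threads that fall past the end of the data. The sketch below emulates that indexing pattern on the CPU in plain Python so it runs without a GPU. It is not Numba code: `launch` and `add_kernel` are hypothetical names, and the sequential loops only stand in for threads that would run in parallel on a device.

```python
def add_kernel(block_idx, block_dim, thread_idx, a, b, out):
    """Body executed by one emulated thread: add a single pair of elements."""
    i = block_idx * block_dim + thread_idx  # global thread index
    if i < len(out):  # guard: the last block may have surplus threads
        out[i] = a[i] + b[i]


def launch(kernel, blocks, threads_per_block, *args):
    """Emulate a 1D kernel launch by visiting every (block, thread) pair."""
    for block_idx in range(blocks):
        for thread_idx in range(threads_per_block):
            kernel(block_idx, threads_per_block, thread_idx, *args)


n = 10
a = list(range(n))
b = [10 * x for x in range(n)]
out = [0] * n

threads_per_block = 4
# Round up so that every element is covered by some thread.
blocks = (n + threads_per_block - 1) // threads_per_block

launch(add_kernel, blocks, threads_per_block, a, b, out)
print(out)
```

    With 10 elements and 4 threads per block, three blocks are launched and the two surplus threads in the last block do nothing, which is why the bounds check matters.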

Enterprise Solution

If you’re looking for more comprehensive enterprise training, the DLI Enterprise Solution offers a package of training and lectures to meet your organization’s unique needs. From hands-on online and onsite training to executive briefings and enterprise-level reporting, DLI can help your company transform into an AI organization. Contact us to learn more.

Resources

Explore a wide range of resources on the fundamentals of AI and how AI can impact your business.

Download course materials from NVIDIA to boost your university curriculum or get certified to deliver DLI workshops on campus.

Teaching Kits

DLI Teaching Kits are available to qualified university educators looking for course solutions across deep learning, accelerated computing, and robotics. Educators can integrate lecture materials, hands-on courses, GPU cloud resources, and more into their curriculum.

University Ambassador Program

The DLI University Ambassador Program enables qualified educators to deliver hands-on DLI workshops to university faculty, students, and researchers at no cost. Educators are encouraged to download the DLI Teaching Kits to be evaluated for participation in the Ambassador Program.

DLI has certified University Ambassadors at hundreds of universities, including:

Arizona State University
Columbia
The Hong Kong University of Science and Technology