Deep Learning

Deep learning is a subset of artificial intelligence (AI) and machine learning (ML) that uses multi-layered artificial neural networks to deliver state-of-the-art accuracy in tasks like object detection, speech recognition, language translation, and others.

The progression of AI, machine learning, and deep learning.

What is Deep Learning?

Deep learning is a subset of machine learning, with the difference that DL algorithms can automatically learn representations from data such as images, video, or text, without introducing human domain knowledge. The "deep" in deep learning refers to the many layers of the neural networks used to recognize patterns in data. DL's highly flexible architectures can learn directly from raw data, similar to the way the human brain operates, and can increase their predictive accuracy when provided with more data.

Further, deep learning is the primary technology that allows high precision and accuracy in tasks such as speech recognition, language translation, and object detection. It has led to many recent breakthroughs in AI, including Google DeepMind’s AlphaGo, self-driving cars, intelligent voice assistants, and many others.

 

How Deep Learning Works

Deep learning uses multi-layered artificial neural networks (ANNs), which are networks composed of several "hidden layers" of nodes between the input and output. 

Image recognition.

An artificial neural network transforms input data by applying a nonlinear function to a weighted sum of the inputs. This computation is carried out by a neural unit, and a group of units applying such transformations to the same inputs forms a neural layer.

An artificial neural network.

The intermediate outputs of one layer, called features, are used as the input into the next layer. The neural network learns multiple layers of nonlinear features (like edges and shapes) through repeated transformations, which it then combines in a final layer to create a prediction (of more complex objects). 
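
As a minimal sketch of this forward pass (NumPy, with made-up layer sizes), each layer below applies a nonlinear function to a weighted sum of its inputs and hands the result on as features for the next layer:

```python
import numpy as np

def relu(x):
    # Nonlinear activation applied elementwise
    return np.maximum(0, x)

rng = np.random.default_rng(0)

# Made-up sizes: 4 raw input values, two hidden layers of 8 units, 1 output
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 8)), np.zeros(8)
W3, b3 = rng.normal(size=(1, 8)), np.zeros(1)

x = rng.normal(size=4)      # raw input data

h1 = relu(W1 @ x + b1)      # layer 1: nonlinear function of a weighted sum of the inputs
h2 = relu(W2 @ h1 + b2)     # layer 2: the features from layer 1 become its inputs
prediction = W3 @ h2 + b3   # final layer combines the learned features into a prediction
print(prediction)
```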

Through backpropagation, used within an optimization process called gradient descent, the error is sent back through the network and the weights are adjusted, improving the model. The neural network learns by varying its weights, or parameters, so as to minimize the difference between its predictions and the desired values. This process is repeated thousands of times, adjusting the model's weights in response to the error it produces, until the error can no longer be reduced. The phase in which the artificial neural network learns from the data is called training. During this process, the layers learn the optimal features for the model, which has the advantage that features do not need to be predetermined.
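
To make the training loop concrete, here is a minimal sketch in PyTorch with synthetic data and arbitrary layer sizes: the loss measures the difference between the network's predictions and the desired values, the backward call performs backpropagation, and the optimizer applies gradient descent to adjust the weights.

```python
import torch
from torch import nn

# Synthetic data: 256 samples, 10 input features each, one target value per sample
X = torch.randn(256, 10)
y = torch.randn(256, 1)

# A small multi-layered network with a nonlinear (ReLU) hidden layer
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # gradient descent
loss_fn = nn.MSELoss()  # error between predictions and desired values

for step in range(1000):         # training repeats this adjustment many times
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)  # how far the predictions are from the targets
    loss.backward()              # backpropagation: send the error back through the network
    optimizer.step()             # adjust the weights to reduce the error
```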

NVIDIA DIGITS and NVIDIA TensorRT.

GPUs: Key to Deep Learning

Architecturally, the CPU is composed of just a few cores with lots of cache memory that can handle a few software threads at a time. In contrast, a GPU is composed of hundreds of cores that can handle thousands of threads simultaneously.

Difference between a CPU and GPU.

State-of-the-art deep learning neural networks can have from millions to well over one billion parameters to adjust via backpropagation. They also require a large amount of training data to achieve high accuracy, meaning hundreds of thousands to millions of input samples must be run through both a forward and backward pass. Because neural nets are created from large numbers of identical neurons, they're highly parallel by nature. This parallelism maps naturally to GPUs, which provide a significant computation speedup over CPU-only training, making GPUs the platform of choice for training large, complex neural network-based systems. The parallel nature of inference operations also lends itself well to execution on GPUs.
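
As an illustration, deep learning frameworks make it a one-line change to run the same computation on a GPU. The sketch below (PyTorch, assuming a CUDA-capable GPU is available and falling back to the CPU otherwise) runs a large batched matrix multiply, the core operation behind both the forward and backward passes:

```python
import torch

# Use the GPU when one is available; otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# A large number of identical, independent multiply-adds maps naturally onto GPU cores
a = torch.randn(4096, 4096, device=device)
b = torch.randn(4096, 4096, device=device)
c = a @ b  # executed in parallel across thousands of threads when device == "cuda"
print(c.shape, device)
```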

Deep Learning Use Cases

Deep learning is commonly used across apps in computer vision, conversational AI, and recommendation systems. Computer vision apps use deep learning to gain knowledge from digital images and videos. Conversational AI apps help computers understand and communicate through natural language. And recommendation systems use images, language, and a user’s interests to offer meaningful and relevant search results and services.

Deep learning is enabling self-driving cars, smart personal assistants, and smarter web services. Applications of deep learning, such as fraud detection and supply chain modernization, are also being used by the world’s most advanced teams and organizations.

There are different variations of deep learning algorithms, such as the following (minimal code sketches of these architectures appear after the list):

  • ANNs where information is only fed forward from one layer to the next are called feedforward artificial neural networks. Multilayer perceptrons (MLPs) are a type of feedforward ANN consisting of at least three layers of nodes: an input layer, a hidden layer, and an output layer. MLPs are good at classification prediction problems using labeled inputs. They're flexible networks that can be applied to a variety of scenarios.
  • Convolutional neural networks (CNNs) are the image crunchers used to identify objects. In some scenarios, CNN-based image recognition outperforms humans, from recognizing cats to identifying indicators of cancer in blood and tumors in MRI scans. CNNs are today's eyes of autonomous vehicles, oil exploration, and fusion energy research. In healthcare, they can help spot diseases faster in medical imaging and save lives.
  • Recurrent neural networks (RNNs) are the mathematical engines for parsing language patterns and sequenced data.

    • These networks are revving up a voice-based computing revolution and provide the natural language processing brains that give ears and speech to Amazon's Alexa, Google's Assistant, and Apple's Siri. They also lend clairvoyant-like magic to Google's autocomplete feature, which fills in lines of your search queries.
    • RNN applications extend beyond natural language processing and speech recognition. They’re used in language translation, stock predictions, and algorithmic trading as well.
    • To detect fraud in finance, anomalous spending patterns can be red-flagged using RNNs, which are particularly good at guessing what comes next in a sequence of data. American Express has deployed deep-learning-based models optimized with NVIDIA® TensorRT and running on NVIDIA Triton Inference Server to detect fraud.
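
The sketch below (PyTorch, with made-up layer sizes and input shapes) shows minimal versions of these three architecture families: a feedforward MLP, a small CNN for 32x32 RGB images, and an RNN that processes a sequence step by step.

```python
import torch
from torch import nn

# Feedforward MLP: input layer -> hidden layer -> output layer
mlp = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 3))

# CNN: convolutional layers scan images for local patterns such as edges and shapes
cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 16 * 16, 10),  # assumes 32x32 RGB inputs and 10 output classes
)

# RNN: carries a hidden state from one step of the sequence to the next
rnn = nn.RNN(input_size=8, hidden_size=32, batch_first=True)

print(mlp(torch.randn(4, 20)).shape)         # (4, 3)
print(cnn(torch.randn(4, 3, 32, 32)).shape)  # (4, 10)
out, h = rnn(torch.randn(4, 50, 8))          # 4 sequences of 50 steps
print(out.shape)                             # (4, 50, 32)
```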

Deep Learning Benefits

One benefit of deep learning is its inherent flexibility in approximating very diverse sets of data: data scientists can use deep learning and neural networks to develop approximations for just about anything. And when trained with huge amounts of data, the accuracy of deep learning's predictions and analysis is unparalleled.

A clear advantage of deep learning over classical machine learning is its ability to perform feature engineering on its own. A deep learning algorithm can scan data for features that correlate and combine them to enable faster learning, without any human intervention.

Why Deep Learning Matters to Researchers and Data Scientists

With NVIDIA GPU-accelerated deep learning frameworks, researchers and data scientists can cut deep learning training that would otherwise take days or weeks down to hours or days. When models are ready for deployment, developers can rely on GPU-accelerated inference platforms for the cloud, embedded devices, or self-driving cars to deliver high-performance, low-latency inference for the most computationally intensive deep neural networks.

NVIDIA Deep Learning for Developers

GPU-accelerated deep learning frameworks offer flexibility to design and train custom deep neural networks and provide interfaces to commonly used programming languages such as Python and C/C++. Widely used deep learning frameworks such as MXNet, PyTorch, TensorFlow, and others rely on NVIDIA GPU-accelerated libraries to deliver high-performance, multi-GPU accelerated training.
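
As a minimal sketch of multi-GPU training (PyTorch, assuming more than one CUDA GPU is visible), nn.DataParallel replicates the model and splits each input batch across the available GPUs; for larger-scale jobs, frameworks also provide distributed variants such as DistributedDataParallel.

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))

if torch.cuda.device_count() > 1:
    # Replicate the model on each GPU and split every input batch across them
    model = nn.DataParallel(model)

model = model.to("cuda" if torch.cuda.is_available() else "cpu")
```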

Widely used deep learning frameworks.

Next Steps