Deep learning is a subset of artificial intelligence (AI) and machine learning (ML) that uses multi-layered artificial neural networks to deliver state-of-the-art accuracy in tasks like object detection, speech recognition, language translation, and others.
Deep learning is distinguished from other machine learning methods in that DL algorithms can automatically learn representations from data such as images, video, or text, without human domain knowledge having to be hand-engineered into them. The word "deep" in deep learning refers to the many layers of the neural network that are used to recognize patterns in data. DL’s highly flexible architectures can learn directly from raw data, similar to the way the human brain operates, and can increase their predictive accuracy when provided with more data.
Further, deep learning is the primary technology enabling high precision and accuracy in tasks such as speech recognition, language translation, and object detection. It has led to many recent breakthroughs in AI, including Google DeepMind’s AlphaGo, self-driving cars, intelligent voice assistants, and many others.
Deep learning uses multi-layered artificial neural networks (ANNs), which are networks composed of several "hidden layers" of nodes between the input and output.
An artificial neural network transforms input data by applying a nonlinear function to a weighted sum of the inputs. The transformation is known as a neural layer and the function is referred to as a neural unit.
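As a rough illustration, a single neural unit can be sketched in a few lines of Python. The function name, the input values, and the choice of ReLU as the nonlinearity below are illustrative assumptions rather than part of any particular framework:

```python
import numpy as np

def neural_unit(x, w, b):
    """One neural unit: a nonlinear function applied to a weighted sum of the inputs."""
    z = np.dot(w, x) + b          # weighted sum of the inputs plus a bias
    return np.maximum(0.0, z)     # ReLU, one common choice of nonlinearity

# Example inputs and weights (arbitrary values, for illustration only).
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.1, 0.4, -0.2])
print(neural_unit(x, w, b=0.05))
```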
The intermediate outputs of one layer, called features, are used as the input into the next layer. The neural network learns multiple layers of nonlinear features (like edges and shapes) through repeated transformations, which it then combines in a final layer to create a prediction (of more complex objects).
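As a minimal sketch of this layered structure, the snippet below stacks a few fully connected layers in PyTorch (one of the frameworks mentioned later in this article). The layer sizes and batch shape are arbitrary choices for illustration:

```python
import torch
from torch import nn

# Each hidden layer transforms the features produced by the previous layer;
# the final layer combines the learned features into a prediction.
model = nn.Sequential(
    nn.Linear(784, 256), nn.ReLU(),   # first layer of learned features
    nn.Linear(256, 64), nn.ReLU(),    # higher-level features
    nn.Linear(64, 10),                # final layer: scores for 10 classes
)

x = torch.randn(32, 784)              # a batch of 32 flattened input samples
predictions = model(x)                # forward pass: shape (32, 10)
```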
The neural network learns by varying its weights, or parameters, so as to minimize the difference between its predictions and the desired values. Through backpropagation, used within a process called gradient descent, the errors are sent back through the network and the weights are adjusted, improving the model. This process is repeated thousands of times, adjusting the model's weights in response to the error it produces, until the error can't be reduced any further. This phase, in which the artificial neural network learns from the data, is called training. During training, the layers learn the optimal features for the model, which has the advantage that features do not need to be predetermined.
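A minimal training loop along these lines might look like the sketch below, assuming PyTorch; the model, the random stand-in data, the learning rate, and the number of steps are placeholder choices, and real training would iterate over batches of actual data:

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
inputs = torch.randn(256, 784)              # stand-in training inputs
targets = torch.randint(0, 10, (256,))      # stand-in class labels

loss_fn = nn.CrossEntropyLoss()             # measures prediction error
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # gradient descent

for step in range(1000):                    # repeated many times
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)  # forward pass: how wrong is the model?
    loss.backward()                         # backpropagation: errors sent back as gradients
    optimizer.step()                        # adjust the weights to reduce the error
```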
Architecturally, the CPU is composed of just a few cores with lots of cache memory that can handle a few software threads at a time. In contrast, a GPU is composed of hundreds of cores that can handle thousands of threads simultaneously.
State-of-the-art deep learning neural networks can have from millions to well over one billion parameters to adjust using backpropagation. They also require a large amount of training data to achieve high accuracy, meaning hundreds of thousands to millions of input samples will have to be run through both a forward and backward pass. Because neural nets are created from large numbers of identical neurons, they’re highly parallel by nature. This parallelism maps naturally to GPUs, providing a significant computation speedup over CPU-only training and making GPUs the platform of choice for training large, complex neural network-based systems. The parallel nature of inference operations also lends itself well to execution on GPUs.
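As a small illustration of how this looks in practice, the PyTorch sketch below places a model and a batch of data on a GPU when one is available, so the forward pass runs on the GPU's parallel cores; the model and shapes are placeholders:

```python
import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10)).to(device)
batch = torch.randn(64, 784, device=device)   # data placed on the same device
predictions = model(batch)                    # runs on the GPU when available
```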
Deep learning is commonly used across apps in computer vision, conversational AI, and recommendation systems. Computer vision apps use deep learning to gain knowledge from digital images and videos. Conversational AI apps help computers understand and communicate through natural language. And recommendation systems use images, language, and a user’s interests to offer meaningful and relevant search results and services.
Deep learning is enabling self-driving cars, smart personal assistants, and smarter web services. Applications of deep learning, such as fraud detection and supply chain modernization, are also being used by the world’s most advanced teams and organizations.
There are different variations of deep learning algorithms. Recurrent neural networks (RNNs), for example, are the mathematical engines for parsing language patterns and sequential data, as in the sketch below.
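As a rough sketch of a recurrent network in PyTorch, the snippet below runs an LSTM over a batch of token sequences and turns its final hidden state into a prediction; the vocabulary size, dimensions, and two-class output are illustrative assumptions:

```python
import torch
from torch import nn

embedding = nn.Embedding(num_embeddings=5000, embedding_dim=64)    # token ids -> vectors
lstm = nn.LSTM(input_size=64, hidden_size=128, batch_first=True)   # reads the sequence step by step
classifier = nn.Linear(128, 2)                                     # e.g. a two-class prediction

tokens = torch.randint(0, 5000, (8, 20))         # batch of 8 sequences, 20 tokens each
outputs, (hidden, cell) = lstm(embedding(tokens))
prediction = classifier(hidden[-1])              # prediction from the last hidden state
```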
One benefit of deep learning is its inherent flexibility: data scientists can develop approximations for just about any kind of data using deep learning and neural networks. And when trained with large amounts of data, the accuracy of deep learning's predictions and analysis is unparalleled.
A clear advantage of deep learning over traditional machine learning is its ability to carry out feature engineering on its own. A deep learning algorithm can scan the data for features that correlate and then combine them, enabling faster learning without human intervention.
With NVIDIA GPU-accelerated deep learning frameworks, researchers and data scientists can significantly speed up deep learning training, reducing jobs that would otherwise take days or weeks to just hours or days. When models are ready for deployment, developers can rely on GPU-accelerated inference platforms for the cloud, embedded devices, or self-driving cars to deliver high-performance, low-latency inference for the most computationally intensive deep neural networks.
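At the framework level, running a trained model for inference typically follows a pattern like the PyTorch sketch below; dedicated deployment platforms sit beneath this layer, and the model here is only a placeholder:

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
model.eval()                                    # switch to inference mode

with torch.no_grad():                           # no gradients needed for prediction
    scores = model(torch.randn(1, 784))         # a single input sample
    predicted_class = scores.argmax(dim=1)
```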
GPU-accelerated deep learning frameworks offer flexibility to design and train custom deep neural networks and provide interfaces to commonly used programming languages such as Python and C/C++. Widely used deep learning frameworks such as MXNet, PyTorch, TensorFlow, and others rely on NVIDIA GPU-accelerated libraries to deliver high-performance, multi-GPU accelerated training.
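As a minimal illustration of multi-GPU training at the framework level, the PyTorch sketch below wraps a model in DataParallel, which splits each batch across the available GPUs; for larger-scale work DistributedDataParallel is the usual choice, and the model and batch here are placeholders:

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))

if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)              # replicate the model across GPUs
model = model.to("cuda" if torch.cuda.is_available() else "cpu")

batch = torch.randn(64, 784, device=next(model.parameters()).device)
predictions = model(batch)                      # each GPU processes a slice of the batch
```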