Conversational AI

Accelerate the Full Pipeline, from Speech Recognition to Language Understanding and Speech Synthesis

AI-driven services in speech and language present a revolutionary path for personalized natural conversation, but they face strict accuracy and latency requirements for real-time interactivity. With NVIDIA’s conversational AI SDK, developers can quickly build and deploy state-of-the-art AI services to power applications across a single unified architecture, delivering highly accurate, low-latency systems with little upfront investment.

Conversational AI Models From NGC

State-of-the-Art Models

Harness conversational AI models from NGC that are trained on various open and proprietary datasets for more than 100,000 hours on NVIDIA DGX systems.

Multimodal Solutions to Build Human-Like Interactive Skills

Custom Skills

Customize speech and language skills on your domain using TAO Toolkit.

Deploy Optimized Models in the Cloud & Data Center

Rapid Deployment

Deploy optimized models in the cloud, in the data center, and at the edge with a single command.

End-to-End Acceleration to Execute Model Inference Under the 300ms Latency Bound

End-to-End Acceleration

Accelerate at pipeline scale and execute model inference in well under the 300 millisecond (ms) latency bound.

Introduction to Conversational AI

Download our e-book for an introduction to conversational AI, how it works, and how it’s applied in industry today.

True End-to-End Acceleration

Fully Accelerated Pipeline

Full Pipeline Inference in Fractions of a Second

Execute full conversational AI pipelines consisting of automatic speech recognition (ASR) for audio transcription, natural language understanding (NLU), and text-to-speech (TTS) in well under the 300 ms latency bound for real-time interactions, freeing up room to increase pipeline complexity without sacrificing user experience. 
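To make the 300 ms bound concrete, the pipeline's stages can be reasoned about as a serial latency budget. The per-stage numbers below are illustrative assumptions, not measured figures:

```python
# Back-of-envelope latency budget for a serial ASR -> NLU -> TTS pipeline.
# Stage latencies (ms) are illustrative assumptions, not measured numbers.
STAGE_BUDGET_MS = {
    "asr": 120.0,  # automatic speech recognition (audio transcription)
    "nlu": 60.0,   # natural language understanding
    "tts": 80.0,   # text-to-speech
}

TOTAL_BOUND_MS = 300.0  # the real-time interactivity bound cited above

def pipeline_latency(stages):
    """Sum per-stage latencies for a serially executed pipeline."""
    return sum(stages.values())

def headroom(stages, bound=TOTAL_BOUND_MS):
    """Remaining budget available for added pipeline complexity."""
    return bound - pipeline_latency(stages)

total = pipeline_latency(STAGE_BUDGET_MS)
print(f"pipeline: {total:.0f} ms, headroom: {headroom(STAGE_BUDGET_MS):.0f} ms")
```

The headroom figure is what "increase pipeline complexity without sacrificing user experience" refers to: any added model must fit inside the remaining budget.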

The NVIDIA A100 Tensor Core GPU delivered record-setting performance in the MLPerf Training v0.7 benchmark, clocking in at 6.53 hours per accelerator for BERT on Wikipedia and 0.83 minutes at scale.

NVIDIA Solutions For
Conversational AI Applications

Train and Deploy with Purpose-Built Systems

Train at Scale

NVIDIA DGX A100 features eight NVIDIA A100 Tensor Core GPUs—the most advanced data center accelerator ever made. Tensor Float 32 (TF32) precision delivers a 20X AI performance improvement over previous generations—without any code change—and an additional 2X performance boost by leveraging structural sparsity across common NLP models. Third-generation NVIDIA® NVLink®, second-generation NVIDIA NVSwitch, and NVIDIA Mellanox® InfiniBand enable ultra-high-bandwidth and low-latency connections between all the GPUs. This allows multiple DGX A100 systems to train massive billion-parameter models at scale to deliver state-of-the-art accuracy. And with NVIDIA NeMo, an open-source toolkit, developers can build, train, and fine-tune DGX-accelerated conversational AI models with only a few lines of code.
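The "few lines of code" workflow with NeMo can be sketched as below. The checkpoint name (`stt_en_conformer_ctc_small`) is an assumption that may differ across NeMo releases, and the fallback guard is added so the sketch degrades gracefully where NeMo is not installed:

```python
def transcribe_files(paths, model_name="stt_en_conformer_ctc_small"):
    """Transcribe audio files with a pretrained NeMo ASR model from NGC.

    Returns None when the NeMo toolkit is not available in the environment.
    The model name is an illustrative assumption; check NGC for current names.
    """
    try:
        import nemo.collections.asr as nemo_asr
    except ImportError:
        return None  # NeMo not installed; pip install nemo_toolkit[asr]
    # Download the pretrained checkpoint from NGC and run inference.
    model = nemo_asr.models.ASRModel.from_pretrained(model_name=model_name)
    return model.transcribe(paths)

result = transcribe_files(["sample.wav"])
print(result)
```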

NVIDIA DGX A100 - Universal System for AI Infrastructure

Deploy at the Edge

NVIDIA EGX Platform makes it possible to drive real-time conversational AI while avoiding networking latency by processing high-volume speech and language data at the edge. With NVIDIA TensorRT, developers can optimize models for inference and deliver conversational AI applications with low latency and high throughput. With the NVIDIA Triton Inference Server, the models can then be deployed in production. TensorRT and Triton Inference Server work with NVIDIA Riva, an application framework for conversational AI, for building and deploying end-to-end, GPU-accelerated pipelines on EGX. Under the hood, Riva applies TensorRT, configures the Triton Inference Server, and exposes services through a standard API, deploying with a single command through Helm charts on a Kubernetes cluster.
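Once a model is served by Triton, clients reach it through the KServe v2 HTTP/REST inference protocol that Triton implements. A minimal sketch of building such a request follows; the model name (`riva_nlu`), input tensor name, and layout are illustrative assumptions:

```python
import json

def build_infer_request(model_name, input_name, data, datatype="FP32"):
    """Build the URL path and JSON body for POST /v2/models/{name}/infer
    against a Triton Inference Server (KServe v2 protocol)."""
    body = {
        "inputs": [
            {
                "name": input_name,       # must match the model's input tensor
                "shape": [1, len(data)],  # batch of one, illustrative layout
                "datatype": datatype,
                "data": data,
            }
        ]
    }
    return f"/v2/models/{model_name}/infer", json.dumps(body)

path, payload = build_infer_request("riva_nlu", "input__0", [0.1, 0.2, 0.3])
print(path)  # /v2/models/riva_nlu/infer
```

In a Riva deployment this request construction happens behind the standard API the framework exposes, which is why a single Helm command is enough to stand up the whole serving stack.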

AI-Driven Skills

Multi-Speaker Transcription

Classic speech-to-text algorithms have evolved: it's now possible to transcribe meetings, lectures, and social conversations while simultaneously identifying speakers and labeling their contributions. NVIDIA Riva allows you to create accurate transcriptions in call centers and video conferencing meetings, and to automate clinical note-taking during physician-patient interactions. With Riva, you can also customize models and pipelines to meet your specific use case needs.
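The core of multi-speaker transcription is aligning two streams: ASR word timings and speaker-diarization turns. A minimal sketch of that alignment step, with hypothetical timings as input, could look like this:

```python
# Illustrative post-processing for multi-speaker transcription:
# label each ASR word with the speaker whose turn overlaps it most.
# The word timings and speaker turns below are hypothetical inputs.

def overlap(a_start, a_end, b_start, b_end):
    """Length of the intersection of two time intervals (seconds)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))

def label_words(words, turns):
    """words: [(text, start, end)]; turns: [(speaker, start, end)]."""
    labeled = []
    for text, w_start, w_end in words:
        # Pick the speaker turn with maximum temporal overlap.
        speaker = max(turns, key=lambda t: overlap(w_start, w_end, t[1], t[2]))[0]
        labeled.append((speaker, text))
    return labeled

words = [("hello", 0.0, 0.4), ("doctor", 0.5, 1.0), ("hi", 1.2, 1.4)]
turns = [("patient", 0.0, 1.1), ("physician", 1.1, 2.0)]
print(label_words(words, turns))
# [('patient', 'hello'), ('patient', 'doctor'), ('physician', 'hi')]
```

A production pipeline runs ASR and diarization as accelerated models; the merge itself is the cheap bookkeeping shown here.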

NVIDIA Riva Enables the Fusion of Multi-Sensor Audio and Vision Data
AI-Driven Services to Engage With Customers

Virtual Assistant

Virtual assistants can engage with customers in a nearly human-like way, powering interactions in contact centers, smart speakers, and in-car intelligent assistants. AI-driven services like speech recognition, language understanding, voice synthesis, and vocoding alone cannot support such a system, as they’re missing key components such as dialogue tracking. Riva supplements these backbone services with easy-to-use components that can be extended for any application.
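Dialogue tracking, the component the paragraph notes is missing from the backbone services alone, amounts to carrying slot state across turns so the assistant only asks for what is still unknown. A toy sketch, with hypothetical slot names:

```python
# Minimal dialogue-state tracker: accumulate slot values across user turns.
# The required slots and extracted values are illustrative assumptions.

REQUIRED_SLOTS = ("intent", "date", "time")

class DialogState:
    def __init__(self):
        self.slots = {}

    def update(self, extracted):
        """Merge non-empty slot values extracted from the latest turn."""
        self.slots.update({k: v for k, v in extracted.items() if v})

    def missing(self):
        """Slots the assistant still needs to ask about."""
        return [s for s in REQUIRED_SLOTS if s not in self.slots]

state = DialogState()
state.update({"intent": "book_table", "date": "Friday"})  # turn 1
state.update({"time": "7pm"})                             # turn 2
print(state.missing())  # []
```

In a real assistant the `extracted` values come from an NLU model; the tracker is the extendable component layered on top of the speech services.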

Accelerating Enterprises and Developer Libraries

  • Ecosystem Partners
  • Developer Libraries

GPU-accelerate top speech, vision, and language workflows to meet enterprise-scale requirements.

Intelligent Voice

Build GPU-accelerated, state-of-the-art deep learning models with popular conversational AI libraries.

Hugging Face

Industry Use Cases

Curai’s Platform to Enhance Patient Experience

Chat-Based App Enhances Patient Experience

Using natural language processing, Curai's platform allows patients to share their conditions with their doctors and access their own medical records, and helps providers extract data from medical conversations to better inform treatment.

Square Takes Edge Off Conversational AI with GPUs

Learn about Square Assistant, a conversational AI engine that empowers small businesses to communicate with their customers more efficiently.

Transforming Financial Services with Conversational AI

Discover what the enterprise journey toward successful implementation should look like and how conversational AI can deliver ROI for your business.

Get Started Accelerating Conversational AI Today

Train Smarter with NVIDIA TAO Toolkit

Run Training on NVIDIA DGX A100 Systems

Simplify Deployment with NVIDIA Riva

Deploy to the Edge on the NVIDIA EGX Platform