August 30 – September 3, 2021

Join us at INTERSPEECH, a technical conference focused on the latest research and technologies in speech processing. NVIDIA will present accepted papers on our latest research in speech recognition and speech synthesis.


Explore NVIDIA’s work in conversational AI research across automatic speech recognition, natural language processing, and text-to-speech. This chapter of I AM AI reveals how NVIDIA developers and creators deploy state-of-the-art models for expressive speech synthesis.

Conference Schedule at a Glance

Come check out NVIDIA’s papers at this year’s hybrid INTERSPEECH event. They cover a wide range of groundbreaking research in the field of conversational AI, including datasets, pre-trained models, and real-world applications for speech recognition and text-to-speech.

SPGISpeech: 5,000 Hours of Transcribed Financial Audio for Fully Formatted End-to-End Speech Recognition
Patrick K. O'Neill, Vitaly Lavrukhin, Somshubra Majumdar, Vahid Noroozi, Yuekai Zhang, Oleksii Kuchaiev, Jagadeesh Balam, Yuliya Dovzhenko, Keenan Freyberg, Michael D. Shulman, Boris Ginsburg, Shinji Watanabe, Georg Kucsko
11:00 a.m. - 01:00 p.m. CET

TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction
Stanislav Beliaev, Boris Ginsburg
04:00 - 06:00 p.m. CET

Compressing 1D Time-Channel Separable Convolutions Using Sparse Random Ternary Matrices
Gonçalo Mordido, Matthijs Van Keirsbilck, Alexander Keller
04:00 - 06:00 p.m. CET

NeMo Inverse Text Normalization: From Development To Production
Yang Zhang, Evelina Bakhturina, Kyle Gorman, Boris Ginsburg
04:00 - 06:00 p.m. CET

Scene-Agnostic Multi-Microphone Speech Dereverberation
Yochai Yemini, Ethan Fetaya, Haggai Maron, Sharon Gannot
07:00 - 09:00 p.m. CET

Hi-Fi Multi-Speaker English TTS Dataset
Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg, Yang Zhang
07:00 - 09:00 p.m. CET

Deep Dive

Get Started With Pre-trained Models

NVIDIA offers pre-trained models for speech recognition, language understanding, and speech synthesis through the NGC catalog. These models are highly accurate and have been trained on a variety of open and proprietary datasets for thousands of hours using GPUs. The NGC models are seamlessly integrated with SDKs such as NVIDIA NeMo for building, training, and fine-tuning conversational AI models.


Create Cutting-Edge Conversational AI Models

Explore NVIDIA NeMo, an open-source toolkit for researchers developing new state-of-the-art conversational AI models. It provides a collection of modules and models for automatic speech recognition, natural language processing, and text-to-speech. NeMo modules and models are highly interoperable with popular PyTorch and PyTorch Lightning frameworks, giving researchers exceptional flexibility.

Develop Conversational AI Apps For Enterprise

NVIDIA offers Riva, a GPU-accelerated SDK to help enterprises develop multimodal conversational AI applications. It includes highly accurate pre-trained models in NGC, tools for fine-tuning these models on custom datasets, and optimized real-time speech and language skills for tasks like transcription and natural language understanding.

NVIDIA Deep Learning Institute (DLI)

With the NVIDIA Deep Learning Institute (DLI), developers, data scientists, researchers, and students can access hands-on training in AI, accelerated computing, and accelerated data science to advance their knowledge in topics like AI for speech processing.

Use code INTERSPEECH25 to receive 25% off the upcoming workshops:

Building Transformer-Based Natural Language Processing Applications
September 23, 2021 at 9:00am-5:00pm PDT

Building Conversational AI Applications
November 24, 2021 at 9:00am-5:00pm CET

NVIDIA Developer Program
Join the NVIDIA Inception Program

Unlock Your Startup’s Potential

NVIDIA Inception nurtures cutting-edge startups that are revolutionizing industries with artificial intelligence. Our acceleration platform offers go-to-market support, expertise, and technology—all tailored to a new business’s evolution.



At NVIDIA, you’ll solve some of the world’s hardest problems and discover never-before-seen ways to improve the quality of life for people everywhere. Our work spans healthcare, robots, self-driving cars, and blockbuster movies, with a growing list of new opportunities every single day. Explore all of our open roles, including internships and new college graduate positions.

Learn more about our career opportunities by exploring current job openings as well as university jobs.

