Generative AI for Customer Service and Support

Enhance customer experiences and improve business processes in telecommunications with generative AI.


Conversational AI / NLP



Business Goal



NVIDIA AI Enterprise

How Telecom Companies Are Driving Results With Generative AI

Telecommunications companies need to deliver exceptional customer service while navigating the complexities of maintaining high network availability, performance, and security—all essential for running applications and services. This comes at a time when the industry is investing heavily in 5G and the expansion of fiber networks, significantly increasing capital expenditures. The challenge is providing accurate, reliable support through well-informed customer service agents.

Telecom operators are invested in call centers and improving end-to-end customer experiences, including order orchestration, order management, and case summarization. Improvement in customer experiences not only yields cost savings—it also increases revenue opportunities.

Elevate Customer Experiences and Employee Productivity While Reducing Costs

Telecommunications operators are turning to generative AI to improve customer experiences, enhance employee productivity, and reduce network operating costs. Using software and accelerated infrastructure enables telcos to harness the power of generative AI efficiently, quickly, and safely without needing to build or customize their models. 

The NVIDIA AI Foundry for Generative AI

The NVIDIA AI foundry—which includes NVIDIA AI Foundation models, the NVIDIA NeMo™ framework and tools, and NVIDIA DGX™ Cloud—gives enterprises an end-to-end solution for creating custom generative AI models. 

Amdocs, a leading software and services provider, plans to build custom large language models (LLMs) for the $1.7 trillion global telecommunications industry using NVIDIA AI foundry service on Microsoft Azure. Initial use cases span customer care scenarios, including accelerating resolution of customer inquiries by drawing information from across company data using retrieval-augmented generation (RAG). In network operations, Amdocs and NVIDIA are exploring ways to generate solutions that address configuration, coverage, and performance issues as they arise.

As part of NVIDIA AI Enterprise, NeMo provides an end-to-end solution—including enterprise-grade with support, security, and stability—across the LLM pipeline, from data processing to training to inference of generative AI models. It allows telcos to quickly train, customize, and deploy LLMs at scale, reducing time to solution while increasing return on investment.

The NVIDIA AI foundry includes NVIDIA AI Foundation models, the NeMo framework and tools, and NVIDIA DGX Cloud, giving enterprises an end-to-end solution for creating custom generative AI models.

Once generative AI models are built, fine-tuned, and trained, NeMo enables seamless deployment through optimized inference on virtually any data center or cloud. NeMo Retriever, a collection of generative AI microservices, provides world-class information retrieval with the lowest latency, highest throughput, and maximum data privacy, enabling organizations to generate insights in real time. NeMo Retriever enhances generative AI applications with enterprise-grade RAG, which can be connected to business data wherever it resides. 

NVIDIA DGX Cloud is an AI-training-as-a-service platform, offering a serverless experience for enterprise developers that’s optimized for generative AI. Enterprises can experience performance-optimized, enterprise-grade NVIDIA AI Foundation models directly from a browser and customize them using proprietary data with NeMo on DGX Cloud. 

NVIDIA Riva and Omniverse for Digital Avatars

As part of their efforts to improve productivity for its more than 150,000 employees, AT&T is moving to adopt NVIDIA Omniverse™ Avatar Cloud Enginer (ACE) and NVIDIA Tokkio. ACE and Tokkio are cloud-native AI microservices, workflows, and application frameworks for easily building, customizing, and deploying interactive avatars that see, perceive, intelligently converse, and provide recommendations to enhance customer experiences. For speech AI, the carrier also uses NVIDIA Riva and is examining other customer service and operations use cases for digital twins and generative AI.

T-Mobile uses Riva, GPU-accelerated multilingual speech and translation software, and NeMo, the open-source framework for building, training, and fine-tuning state-of-the-art conversational AI models. These NVIDIA tools allowed T-Mobile engineers to fine-tune ASR models on their custom datasets and interpret customer languages accurately across noisy environments. The T-Mobile team fine-tuned speech AI models using NeMo, slashing word error rate (WER) by 10 percent across noisy production environments. They reduced latency by 10X using Riva, achieving the highest level of real-time performance for thousands of concurrent users.

KT, South Korea’s leading mobile operator with over 22 million subscribers, is training smart speakers and customer call centers with NVIDIA AI. Their popular AI voice assistant, GiGA Genie, has conversed with 8 million people through smart speakers. In KT's AI Contact Center (AICC), AI voice agents manage more than 100,000 calls daily, either providing requested information or connecting customers to human agents for answers to more detailed inquiries. KT built billion-parameter LLMs trained with the NVIDIA DGX SuperPOD platform and NeMo framework. Using hyperparameter optimization tools in NeMo, KT trained their LLMs 2X faster than with other frameworks. With LLMs, GiGA Genie gained better language understanding and can generate more human-like sentences and AICC reduced consultation times by 15 seconds.

Quick Links

“With NVIDIA® Riva services, fine-tuned using T-Mobile data, we’re building products to help us resolve customer issues in real time. After evaluating several automatic speech recognition (ASR) solutions, T-Mobile has found Riva to deliver a quality model at extremely low latency, enabling experiences our customers love.”

Matthew Davis, Vice President of Product and Technology, T-Mobile

Getting Started With Generative AI for Customer Support

Telecommunications operators looking to build custom generative AI models for enterprise applications can employ the NVIDIA AI foundry, which has four distinct steps:

  1. Start with state-of-the-art generative AI models: Leading foundation models include Meta Llama 3, Google Gemma 7B, Mixtral 8x7B, and NVIDIA’s Nemotron-3 8B family, optimized for the highest performance per cost.
  2. Customize foundation models: Tune and test the models with proprietary data using NVIDIA NeMo, the end-to-end, cloud-native framework for building, customizing, and deploying generative AI models anywhere.
  3. Build models faster in your own AI factory: Streamline AI development on NVIDIA DGX Cloud, a serverless AI-training-as-a-service platform with multi-node training capabilities and near-limitless GPU resource scalability.
  4. Deploy and scale: Run generative AI applications anywhere—cloud, data center, or edge—by deploying with NVIDIA AI Enterprise, the production-grade, secure, end-to-end software platform that includes generative AI reference applications and enterprise support.


NVIDIA NIM, part of NVIDIA AI Enterprise, is an easy-to-use runtime designed to accelerate the deployment of generative AI across your enterprise. This versatile microservice supports open community models and NVIDIA AI Foundation models from the NVIDIA API catalog, as well as custom AI models. NIM builds on NVIDIA Triton™ Inference Server, a powerful and scalable open source platform for deploying AI models, and is optimized for large language model (LLM) inference on NVIDIA GPUs with NVIDIA TensorRT-LLM. NIM is engineered to facilitate seamless AI inferencing with the highest throughput and lowest latency, while preserving the accuracy of predictions. You can now deploy AI applications anywhere with confidence, whether on-premises or in the cloud.

NVIDIA NeMo Retriever

NeMo Retriever is a collection of CUDA-X microservices enabling semantic search of enterprise data to deliver highly accurate responses using retrieval augmentation. Developers can use these GPU-accelerated microservices for specific tasks including ingesting, encoding, and storage large volumes of data, interacting with existing relational databases, and searching for relevant pieces of information to answer business questions.

Getting Started With Multilingual Speech and Translation AI Applications

NVIDIA® Riva is a set of GPU-accelerated multilingual speech and translation microservices for building fully customizable, real-time conversational AI pipelines. Riva includes automatic speech recognition (ASR), text-to-speech (TTS), and neural machine translation (NMT) and is deployable in all clouds, in data centers, at the edge, or on embedded devices. With Riva, organizations can add speech and translation interfaces with large language models (LLMs) and retrieval-augmented generation (RAG). Riva is part of the NVIDIA AI Enterprise software platform, which streamlines development and deployment of production AI.

Riva features the following benefits:

  • Multilingual high-accuracy and expressive voices: Riva achieves high transcription accuracy for bilingual and multilingual translations of English, Spanish, Mandarin, Hindi, Russian, Arabic, Japanese, Korean, German, Portuguese, French, and Italian. It includes two out-of-the-box (OOTB) expressive professional voices—female and male—for English, Spanish, German, Italian, and French, with state-of-the-art models pretrained on thousands of hours of audio on NVIDIA supercomputers.
  • Fully customizable: With Riva, ASR pipelines can be customized for different languages, accents, domains, vocabulary, and context for the best possible accuracy for specific use cases. Across TTS pipelines, voice and intonation can be customized.
  • Flexible deployments: Consistent experiences are delivered to customers, with higher inference performance than existing technology and on any deployment—in data centers, in the cloud, at the edge, or in embedded devices.


Offering High-Quality Automatic Speech Recognition

Every speech AI application relies on ASR—converting human voice into readable text. Often the first step in the speech pipeline, ASR quality influences the effectiveness of downstream conversational AI tasks. Riva offers world-class, OOTB ASR models in diverse languages, empowering enterprises to deploy high-quality speech AI applications globally. These models can be further customized through word boosting, customized punctuation, and inverse text normalization. Models can also be fine-tuned for custom jargon, domain-specific words, and noisy environments.

Riva ASR pipeline: GPU-optimized for high performance and accuracy.

Creating Expressive, Human-Like Text-to-Speech

Generating human voices from text is important for customer-facing services in conversational applications. Producing an expressive and engaging human-like voice requires state-of-the-art AI models, is compute-intensive, and calls for a mature pipeline to fine-tune and express voices. Riva provides professional female and male OOTB voices and lets users easily customize the models and pipeline for voice pronunciation, pitch, volume, and speed.

Riva TTS pipeline: The robust pipeline enables easy tone and accent adjustments.

Quick Links

Speech-to-text technology, also known as speech recognition, has become a game-changer in the world of customer service. By automating tasks such as call routing, call categorization, and voice authentication, businesses can greatly reduce wait times and guarantee customers are directed to the most qualified agents to handle their requests. Generative AI recommends next-best actions, identifies call sentiment, predicts customer satisfaction, and even measures agent quality and compliance.

AI improves contact centers in these ways:

  • Optimizes staffing resources
  • Personalizes ‌customer experiences
  • Assists agents by offering actionable insights

Although speech AI can drive significant improvements to call centers, successfully implementing speech-to-text comes with a few challenges, including:

  • Phonetic ambiguity
  • Diverse speaking styles
  • Noisy environments
  • Limitations of telephony
  • Domain-specific vocabulary

Telcos must consider many factors when implementing cutting-edge speech AI technology—accuracy, latency, scalability, security, and operational costs. They must evaluate vendors based on pricing models, the total cost of ownership, and hidden costs. Achieving comprehensive language, accent, and dialect coverage is critical for speech recognition accuracy for all languages. Speech AI models must achieve low latency to provide better real-time experiences for both agents and customers

Quick Links

Enhance Customer Service and Support With Generative AI

Telecommunications operators can support customer support agents with generative AI that accurately transcribes customer conversations and provides real-time recommendations to quickly resolve customer queries. This produces better customer experiences and lowers costs for contact centers.

Customer Success

The Future of Customer Service With AT&T

Learn about cutting-edge NVIDIA Riva speech AI technology solutions that are revolutionizing telecommunications.

Speech-to-Text at Scale With T-Mobile

See how T-Mobile developed speech AI models with NVIDIA NeMo, managed cloud deployment with NVIDIA Riva, and identified and removed bias in their AI models.

Unveiling End-to-End Speech and Translation AI Magic

Get insights on overcoming the challenges in deploying cutting-edge, multilingual speech technologies for business outcomes.

Additional Resources

Insights From Industry Experts: Speech AI for Impactful Contact Centers

Explore the benefits and challenges of using automatic speech recognition, multi-language translation, and text-to-speech to deliver faster and more accurate customer service.

Transforming Customer Service With Speech AI Applications

Learn about the challenges of and the best practices for building highly accurate, multi-language, voice-enabled applications using NVIDIA Riva and audio transcription and intelligent virtual assistant workflows.