Hardware/Semiconductor
Image courtesy of MediaTek
MediaTek, a leading global semiconductor company, is strategically investing in cutting-edge large language models (LLMs) to highlight its AI hardware capabilities, cultivate a developer ecosystem, and reinforce its brand as a full-stack AI innovator—driving long-term demand for its AI-capable devices. To accelerate its enterprise AI efforts and enhance productivity across various domains, MediaTek established an on-premises AI factory powered by NVIDIA DGX SuperPOD™. By centralizing its AI compute, MediaTek has streamlined R&D workflows; achieved significant inference performance gains with NVIDIA AI Enterprise software, including NVIDIA NIM™ and NVIDIA® Riva; and ensured the secure handling of proprietary data for applications like AI-assisted code completion and automated documentation.
MediaTek
Generative AI / LLMs
MediaTek enables nearly two billion connected devices annually with innovative systems-on-chip (SoC) for mobile, home entertainment, connectivity, and IoT products. The company is at the forefront of integrating AI into its core operations to enhance productivity and competitiveness.
MediaTek’s strategic development of cutting-edge LLMs, including the Breeze series and a 480B-parameter Traditional Chinese model, is mission critical. This investment reinforces the company’s market position by showcasing AI hardware capabilities on Dimensity chips with integrated specialized AI processors, positioning its SoCs as credible AI platforms. By open-sourcing models, MediaTek actively builds a robust developer ecosystem that fuels platform adoption and establishes the company as a trusted AI provider. This effort advances MediaTek into a full-stack AI innovator, driving long-term hardware demand by offering turnkey, localized AI solutions.
Image courtesy of MediaTek
As MediaTek deepened its commitment to AI, the company encountered significant challenges in establishing and scaling the computing infrastructure its AI initiatives demand. The scale of ongoing developer workloads, thousands of model-training iterations and billions of inference tokens processed every month, presented a formidable hurdle and required a robust, scalable, and efficient computing environment.
MediaTek needed a solution that could not only handle its current extensive workloads but also provide a foundation for future growth and increasingly complex AI models, all while maintaining operational efficiency and cost-effectiveness. Another crucial consideration was the ability to evaluate newer models on local machines without exposing proprietary data, ensuring data security and compliance.
To address these challenges and realize a more cost-effective, scalable, and secure platform for its AI factory, MediaTek chose NVIDIA DGX SuperPOD with NVIDIA Blackwell-based systems. This decision was driven by the platform's ability to provide a scalable foundation that accelerates the development and deployment of AI applications. The DGX SuperPOD specifically enabled the training of much larger language models and the handling of larger datasets within the same timeframes as previous systems, significantly boosting MediaTek’s competitiveness and innovation in AI.
To tackle the high rack density, power consumption, and thermal demands, MediaTek incorporated NVIDIA's DGX SuperPOD reference architecture into its new data center designs. This approach has profoundly enhanced MediaTek’s AI research and development capabilities, providing a secure and controlled environment for sensitive data.
The implementation of the NVIDIA DGX SuperPOD has transformed MediaTek’s AI development lifecycle. The high compute utilization required to drive this level of AI development underscores the critical need for a powerful on-premises solution to manage such extensive and continuous AI workloads.
“Our AI factory, powered by DGX SuperPOD, processes approximately 60 billion tokens per month for inference and completes thousands of model-training iterations every month,” said David Ku, Co-COO and CFO at MediaTek.
Model inferencing, particularly with cutting-edge LLMs, requires loading entire models into GPU memory. Models with hundreds of billions of parameters can easily exceed the memory capacity of a single GPU server, requiring partitioning across multiple GPUs. The DGX SuperPOD, comprising tightly coupled DGX systems and high-performance NVIDIA networking, is purpose-built to deliver ultra-fast, coordinated GPU memory and compute needed for efficient training and inference with today’s largest AI workloads.
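The idea behind partitioning a model across GPUs can be illustrated with a small tensor-parallel sketch. This is not MediaTek's actual stack; frameworks such as TensorRT-LLM or Megatron shard weights across GPUs linked by NVLink and InfiniBand, but the underlying math, splitting a layer's weight matrix into column slices, computing partial outputs in parallel, and gathering the results, can be simulated on CPU with NumPy:

```python
import numpy as np

# Illustrative column-parallel sharding of one linear layer, simulated on
# CPU. In a real deployment each shard would live on a separate GPU and
# the gather would run over high-speed interconnect.
rng = np.random.default_rng(0)
hidden, ffn, n_shards = 8, 16, 4   # toy sizes; real LLM layers are far larger

x = rng.standard_normal((1, hidden))     # one activation vector
W = rng.standard_normal((hidden, ffn))   # full weight matrix (too big for one "GPU")

# Each "GPU" holds only a column slice of W and computes a partial output.
shards = np.split(W, n_shards, axis=1)
partials = [x @ w for w in shards]       # these run concurrently on real hardware

# Concatenating the partial outputs reproduces the full-layer result.
y_parallel = np.concatenate(partials, axis=1)
y_reference = x @ W
assert np.allclose(y_parallel, y_reference)
```

Because each device stores only `1/n_shards` of the weights, a model with hundreds of billions of parameters becomes feasible once enough GPUs are tightly coupled, which is precisely the role the DGX SuperPOD interconnect plays.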
"The DGX SuperPOD is indispensable for our inference workloads. It allows us to deploy and run massive models that wouldn't fit on a single GPU or even a single server, ensuring we achieve the best performance and accuracy for our most demanding AI applications," said Ku.
MediaTek leverages these large models for core research and development and for a centralized, high-demand API, subsequently distilling smaller versions for specific edge or mobile applications. This right-sized approach delivers the best performance and accuracy for each deployment target.
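Distilling a smaller model from a larger one typically means training the student to match the teacher's temperature-softened output distribution. The sketch below shows the standard knowledge-distillation objective in NumPy; the temperature and logits are illustrative, not MediaTek's actual recipe:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-softened softmax over a 1-D logit vector."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as in standard knowledge distillation. Details here
    are a generic sketch, not a specific production pipeline."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * (np.log(p) - np.log(q)))) * T * T

# A student that matches the teacher incurs (near) zero loss;
# a mismatched student is penalized.
t = [2.0, 1.0, 0.1]
matched = distillation_loss(t, t)
mismatched = distillation_loss(t, [0.0, 0.0, 3.0])
```

In practice this loss is summed over every token position and combined with the ordinary cross-entropy loss on ground-truth labels, letting a compact edge model inherit behavior from the large centrally hosted one.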
With the DGX platform, MediaTek has streamlined its product development pipeline by integrating AI agents into R&D workflows. For example, AI-assisted code completion has significantly reduced programming time and error rates. An AI agent built with domain-adapted LLMs helps engineers understand, analyze, and optimize designs by extracting information from design flowcharts and state diagrams as part of the chip design process. This agent can now produce technical documentation in days, compared to weeks earlier.
NVIDIA NeMo™, a software suite for building, training, and deploying large language models, is also leveraged to fine-tune these models, ensuring optimal performance and domain-specific accuracy.
“With DGX SuperPOD, we’ve gone from training 7-billion-parameter models in a week to training models exceeding 480 billion parameters in the same timeframe—a dramatic leap that has accelerated our AI capabilities.”
David Ku
Co-Chief Operating Officer and Chief Financial Officer, MediaTek
MediaTek has experienced significant benefits in streamlining daily IT operations using NVIDIA Mission Control for its DGX SuperPOD cluster. NVIDIA Mission Control, a workload orchestration and infrastructure management software, saves the company’s engineers invaluable time daily by consolidating GPU provisioning, Linux cluster deployment, and system monitoring into one powerful tool.
“In particular, NVIDIA Mission Control provides helpful features such as consistent provisioning across our DGX nodes, comprehensive system health checks with alerts, and robust resource monitoring,” Ku stated. “Its real-time GPU/CPU utilization metrics are particularly useful for managing our DGX SuperPOD cluster’s performance, offering real-time observability to optimize parallel computation processing and fine-tune batch sizes.”
MediaTek leverages the NVIDIA AI Enterprise software suite to maximize the performance and efficiency of its AI factory. Implementing NVIDIA NIM with TensorRT-LLM on DGX has yielded impressive performance gains in model inference. "Thanks to NVIDIA NIM and TensorRT-LLM running on DGX, we've seen a 40% improvement in inference speed and a 60% increase in token throughput. This directly translates to faster insights for our business and more responsive AI applications," said Ku.
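NIM microservices expose an OpenAI-compatible HTTP API, so an application can talk to a locally hosted model with a standard chat-completions request. The sketch below only builds the JSON body; the endpoint URL and model name are placeholders, not MediaTek's deployment:

```python
import json

# Assumed local NIM endpoint; substitute your own deployment's URL.
NIM_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model, prompt, max_tokens=256, temperature=0.2):
    """Return the JSON body for an OpenAI-style chat-completions call,
    the request shape NIM's OpenAI-compatible endpoint accepts."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

# Model name is a placeholder for whichever NIM container is running.
body = build_chat_request("meta/llama-3.1-8b-instruct",
                          "Summarize this design review note.")
print(json.dumps(body, indent=2))
# Send with any HTTP client, e.g.:
#   requests.post(NIM_URL, json=body).json()
```

Because the interface matches the OpenAI API, existing client libraries and internal tools can be pointed at the on-premises NIM endpoint without code changes, keeping proprietary prompts and data inside the AI factory.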
For agentic voice control, MediaTek has integrated NVIDIA Riva automatic speech recognition (ASR), small language models (SLMs), and text-to-speech (TTS) on NVIDIA DGX Spark™, enabling voice commands for services like internet search, calendar, and messaging. NVIDIA Riva significantly accelerated this agentic development, enabling MediaTek to deploy ASR and TTS efficiently. NVIDIA Riva provides production-grade implementations of globally top-ranked ASR models, and MediaTek uses these versions for its ASR pipelines, meeting its requirements for both speed and accuracy.
The NVIDIA Infrastructure Specialist (NVIS) team has also played a crucial role in MediaTek's AI journey. MediaTek collaborated closely with NVIDIA AI technical experts, who provided substantial support for DGX SuperPOD installation and performance tuning. The NVIDIA technical team also delivered NVIDIA NeMo software training and TensorRT-LLM implementation, further optimizing the LLMs deployed on MediaTek’s NVIDIA DGX SuperPOD.
MediaTek’s on-premises AI factory, powered by NVIDIA DGX SuperPOD and supported by NVIDIA’s comprehensive software stack and expert services, has significantly enhanced the company’s ability to innovate and compete in the rapidly evolving AI landscape. By establishing a scalable and efficient AI infrastructure, MediaTek advances its pioneering lead in chip design, software development, and advanced AI applications, driving productivity and pushing the boundaries of technological advancement.
NVIDIA DGX SuperPOD offers leadership-class accelerated infrastructure and scalable performance for the most challenging AI workloads—with industry-proven results.