Build, customize, and deploy generative AI models.
NVIDIA NeMo™, part of the NVIDIA AI platform, is an end-to-end, cloud-native enterprise framework to build, customize, and deploy generative AI models with billions of parameters.
The NeMo framework provides an accelerated workflow for training with 3D parallelism techniques, a choice of several customization techniques, and optimized at-scale inference of large-scale models for language and image applications, with multi-GPU and multi-node configurations. NeMo makes generative AI model development easy, cost-effective, and fast for enterprises.
The NeMo framework supports the development of text-to-text, text-to-image, and image-to-image foundation models.
Use state-of-the-art training techniques to maximize throughput and minimize training time for foundation models with billions or trillions of parameters.
A cloud-native framework with all dependencies pre-packaged and installed, plus validated recipes for training language and image generative AI models to convergence and deploying them for inference.
An open-source approach offering full flexibility across the pipeline—from data processing, to training, to inference of generative AI models.
Train and deploy foundation models of any size on any GPU infrastructure. Supported on all NVIDIA DGX™ systems, NVIDIA DGX™ Cloud, Microsoft Azure, Oracle Cloud Infrastructure, and Amazon Web Services.
Offers tools to customize foundation models for enterprise hyper-personalization.
Battle-hardened, tested, and verified containers built for enterprises.
The NeMo framework delivers high training efficiency, making training of large-scale foundation models possible through 3D parallelism: data, tensor, and pipeline parallelism.
In addition, selective activation recomputation optimizes recomputation and memory usage across tensor-parallel devices during backpropagation.
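As a rough illustration of the tensor-parallelism idea (plain NumPy, not NeMo code): a layer's weight matrix is split column-wise across devices, each device computes its output shard independently, and the shards are gathered afterwards.

```python
import numpy as np

# Toy sketch of tensor (intra-layer) parallelism, not NeMo's implementation:
# split a layer's weight matrix column-wise across "devices", compute each
# output shard independently, then gather the shards.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))   # activations: batch x hidden
W = rng.standard_normal((8, 6))   # layer weights: hidden x output

# Full (single-device) result for comparison.
Y_full = X @ W

# "Two-device" tensor-parallel result: each rank holds half the columns.
W_shards = np.split(W, 2, axis=1)
Y_shards = [X @ w for w in W_shards]      # computed independently per rank
Y_tp = np.concatenate(Y_shards, axis=1)   # all-gather along the output dim

assert np.allclose(Y_full, Y_tp)
```

Pipeline parallelism instead assigns whole layers to different devices, and data parallelism replicates the model across device groups; NeMo combines all three for large models.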
NeMo framework makes enterprise AI practical by offering tools to:
Deploy generative AI models for inference using NVIDIA Triton Inference Server™. With powerful optimizations from FasterTransformer, you can achieve state-of-the-art accuracy, latency, and throughput inference performance on single-GPU, multi-GPU, and multi-node configurations.
Bring your own dataset and tokenize it into a digestible format. NeMo includes comprehensive preprocessing capabilities for data filtration, deduplication, blending, and formatting on language datasets such as the Pile and multilingual C4 (mC4). These help researchers and engineers save months of development and compute time, letting them focus on building applications.
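A minimal sketch of two of these steps, filtration and exact deduplication, using only the standard library (a hypothetical mini-pipeline, not NeMo's actual API):

```python
import hashlib

# Hypothetical mini-pipeline (not NeMo's API): drop short documents and
# exact duplicates by content hash, the same kind of cleanup NeMo's
# data-preparation scripts perform at scale.
docs = [
    {"text": "The quick brown fox jumps over the lazy dog."},
    {"text": "The quick brown fox jumps over the lazy dog."},  # duplicate
    {"text": "hi"},                                            # too short
    {"text": "Large language models are trained on web-scale corpora."},
]

def clean(records, min_chars=20):
    seen = set()
    for rec in records:
        text = rec["text"].strip()
        if len(text) < min_chars:
            continue                       # filtration: drop short docs
        digest = hashlib.sha256(text.encode()).hexdigest()
        if digest in seen:
            continue                       # deduplication: exact match
        seen.add(digest)
        yield rec

cleaned = list(clean(docs))
print(len(cleaned))  # 2 documents survive
```

Production pipelines add fuzzy deduplication, language identification, and quality scoring on top of this basic pattern.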
NeMo framework makes generative AI possible from day one with prepackaged scripts, reference examples, and documentation across the entire pipeline.
Building foundation models is also made easy through an auto-configurator tool, which automatically searches for the best hyperparameter configurations to optimize training and inference for any given multi-GPU configuration and set of training or deployment constraints.
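The idea behind such a search can be sketched as follows; this is an illustrative stand-in, not NeMo's actual tool, and the memory model and throughput proxy are assumptions for the example:

```python
from itertools import product

# Illustrative stand-in for an auto-configurator: enumerate
# (tensor, pipeline, micro-batch) combinations that fit the GPU count
# and a per-GPU memory budget, then rank by a crude throughput proxy.
NUM_GPUS = 8
MEM_PER_GPU_GB = 80
MODEL_GB = 320          # assumed total model + optimizer footprint

candidates = []
for tp, pp, mbs in product([1, 2, 4, 8], [1, 2, 4], [1, 2, 4]):
    if tp * pp > NUM_GPUS:
        continue                               # must fit the cluster
    if MODEL_GB / (tp * pp) > MEM_PER_GPU_GB:
        continue                               # must fit device memory
    # Crude proxy: bigger micro-batches and more data-parallel replicas
    # help; deeper pipelines add bubble overhead.
    throughput = mbs * (NUM_GPUS // (tp * pp)) / (1 + 0.1 * (pp - 1))
    candidates.append(((tp, pp, mbs), throughput))

best = max(candidates, key=lambda c: c[1])
print("best (TP, PP, micro-batch):", best[0])
```

A real auto-configurator replaces the proxy with short profiling runs, but the structure (prune infeasible configurations, then rank the survivors) is the same.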
Cloud service for enterprise hyper-personalization and at-scale deployment of intelligent large language models.
Accelerated cloud service for enterprises that need custom generative AI models for creating high-res, photorealistic images, videos, and 3D content.
AI Sweden accelerated LLM industry applications by making the power of a 100-billion-parameter model for regional languages easily accessible to the Nordic ecosystem. AI Sweden is digitizing Sweden’s historical records and building language models from this unstructured data that can be commercialized in enterprise applications.
Image Courtesy of Korea Telecom
South Korea’s leading mobile operator builds billion-parameter large language models, trained with the NVIDIA DGX SuperPOD platform and the NeMo framework, to power smart speakers and customer call centers.
Learn how to download, optimize, and deploy a 1.3-billion-parameter GPT-3 model with NeMo, NVIDIA’s generative AI framework.
Learn how to preprocess data in a multi-node environment, automatically select the best hyperparameters to minimize training time for multiple GPT-3 and T5 configurations, train the model at scale, and deploy the model in a multi-node production setting with an easy-to-use set of scripts.
Bootstrap your enterprise’s LLM journey using pretuned hyperparameter configurations for GPT-3 models. Learn how to train a large-scale NLP model with the NeMo framework.