Ways to Get Started With NVIDIA NeMo

Accelerate AI agent specialization, optimization, and governance with an open suite of libraries and agent skills.

The Journey From AI Models to Agentic AI Applications

Explore Models

Try NVIDIA-optimized foundation models like NVIDIA Nemotron™.

Specialize With NeMo

Specialize, optimize, and govern AI agents with NVIDIA NeMo™.

Deploy Agentic Workflows

Jump-start building your AI solutions with NVIDIA Blueprints.

Tools and Skills for Accelerating Specialized AI Agent Optimization

The AI agent lifecycle is an end-to-end process for developing and improving AI agents in production applications. NeMo integrates with existing AI tools and agent frameworks to optimize specialized agents across their lifecycle.

Features Use This Tool Get Started
Build agentic AI applications using open, highly accurate, energy-efficient models. <strong>NVIDIA Nemotron</strong><br> Use advanced, multimodal AI reasoning models with open weights, open data, and recipes. <div class="nv-text"> <ul> <li><a href="https://build.nvidia.com/search/models?filters=publisher%3Anvidia&q=Nemotron&ncid=no-ncid" target="_blank">Try Nemotron Models</a></li> <li><a href="https://huggingface.co/collections/nvidia/nvidia-nemotron-v3" target="_blank">Download Models</a></li> <li><a href="https://openrouter.ai/nvidia" target="_blank">Nemotron APIs on OpenRouter</a></li> </ul> </div>
Build, fine-tune, and align generative AI models at scale with code-level control and flexibility. <strong>NeMo Framework</strong><br> Collection of open source libraries for data generation, data curation, pretraining, post-training, reinforcement learning, evaluation, and guardrailing of multimodal models, scaling from a single GPU to thousands. <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html" target="_blank">Documentation</a></li> <li><a href="https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo" target="_blank">Download Container</a></li> <li><a href="https://github.com/NVIDIA-NeMo" target="_blank">Access Open Source Code</a></li> </ul> </div>
Prepare large multimodal datasets for AI development pipelines. <strong>NeMo Curator</strong><br> Clean, filter, and prepare multimodal data with an open, GPU-accelerated Python library. <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/curator/latest/" target="_blank">Documentation</a></li> <li><a href="https://github.com/NVIDIA/NeMo-Curator" target="_blank">Access Open Source Code</a></li> <li><a href="https://github.com/NVIDIA-NeMo/Curator/tree/main/.claude/skills" target="_blank">Agent Skills</a></li> </ul> </div>
Generate high-quality synthetic datasets for training, fine-tuning, or evaluating models. <strong>NeMo Data Designer</strong><br> Design domain-specific datasets from scratch or seed examples to eliminate data bottlenecks and accelerate AI development. <div class="nv-text"> <ul> <li><a href="https://nvidia-nemo.github.io/DataDesigner/latest/" target="_blank">Documentation</a></li> <li><a href="https://github.com/NVIDIA-NeMo/DataDesigner" target="_blank">Access Open Source Code</a></li> <li><a href="https://github.com/NVIDIA-NeMo/DataDesigner/tree/main/skills/data-designer" target="_blank">Agent Skills</a></li> </ul> </div>
<strong>NeMo Anonymizer</strong><br> Perform context-aware data anonymization to protect PII while preserving insights. <div class="nv-text"> <ul> <li><a href="https://nvidia-nemo.github.io/Anonymizer/latest/" target="_blank">Documentation</a></li> <li><a href="https://github.com/NVIDIA-NeMo/Anonymizer" target="_blank">Access Open Source Code</a></li> <li><a href="https://github.com/NVIDIA-NeMo/Anonymizer/tree/main/skills/anonymizer" target="_blank">Agent Skills</a></li> </ul> </div>
<strong>NeMo Safe Synthesizer</strong><br> Generate safe, synthetic versions of your sensitive datasets with no one-to-one mapping to original records. <div class="nv-text"> <ul> <li><a href="https://nvidia-nemo.github.io/Safe-Synthesizer/latest/" target="_blank">Documentation</a></li> <li><a href="https://github.com/NVIDIA-NeMo/Safe-Synthesizer" target="_blank">Access Open Source Code</a></li> <li><a href="https://github.com/NVIDIA-NeMo/Safe-Synthesizer/blob/main/.agents/README.md" target="_blank">Agent Skills</a></li> </ul> </div>
Train and specialize agents with reinforcement learning (RL). <strong>NeMo RL</strong><br> Post-train and align models at scale with advanced RL techniques. <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/rl/latest/index.html" target="_blank">Documentation</a></li> <li><a href="https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo-rl?version=v0.6.0" target="_blank">Download Container</a></li> <li><a href="https://github.com/NVIDIA-NeMo/RL" target="_blank">Access Open Source Code</a></li> <li><a href="https://github.com/NVIDIA-NeMo/RL/tree/main/skills" target="_blank">Agent Skills</a></li> </ul> </div>
Define and manage scalable RL environments for agent training. <strong>NeMo Gym</strong><br> Build, manage, and scale RL environments to generate high-quality, verifiable rollout data for RL training needed for agent specialization. <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/gym/main/about/" target="_blank">Documentation</a></li> <li><a href="https://github.com/NVIDIA-NeMo/Gym" target="_blank">Access Open Source Code</a></li> <li><a href="https://github.com/NVIDIA-NeMo/Gym/tree/main/.claude/skills" target="_blank">Agent Skills</a></li> </ul> </div>
Integrate and expose easy-to-use APIs to accelerate model fine-tuning and alignment to power agentic workflows. <strong>NeMo Customizer</strong><br> Microservice for easy and scalable fine-tuning and reinforcement learning with proprietary domain data. <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/microservices/latest/fine-tune/index.html" target="_blank">Documentation</a></li> <li><a href="https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo-microservices/containers/customizer" target="_blank">Download Microservice</a></li> </ul> </div>
Evaluate the performance of your model and agent pipeline. <strong>NeMo Evaluator</strong><br> Evaluate model and agent performance with streamlined deployment, benchmark support, and advanced harnesses. <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/evaluator/latest/" target="_blank">Documentation</a></li> <li><a href="https://github.com/NVIDIA-NeMo/Evaluator" target="_blank">Access Open Source SDK</a></li> <li><a href="https://github.com/NVIDIA-NeMo/Evaluator/tree/main#agentic-skills" target="_blank">Agent Skills</a></li> </ul> </div>
Validate agent and model safety before launch. <strong>NeMo Auditor</strong><br> Red-team and scan for vulnerabilities to harden agents and LLMs prior to deployment. <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/microservices/latest/audit/index.html" target="_blank">Documentation</a></li> <li><a href="https://github.com/NVIDIA/garak" target="_blank">Garak Library</a></li> </ul> </div>
Ensure your agent’s responses are safe and on-topic. <strong>NeMo Guardrails</strong><br> A programmable orchestration layer for ensuring safety, security, and topical relevance at runtime. <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/guardrails/latest/" target="_blank">Documentation</a></li> <li><a href="https://huggingface.co/collections/nvidia/nemoguard" target="_blank">Try Hugging Face Models</a></li> <li><a href="https://github.com/NVIDIA/NeMo-Guardrails" target="_blank">Access Open Source Toolkit</a></li> </ul> </div>
Deploy your model for high-performance inference. <strong>NVIDIA NIM<sup>TM</sup></strong><br> Containerized microservices for secure and reliable deployment of AI models anywhere. <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nim/" target="_blank">Documentation</a></li> <li><a href="https://build.nvidia.com/explore/discover" target="_blank">Try NIM Microservices</a></li> </ul> </div>
Optimize agent harnesses with intercepts, telemetry, and guardrails. <strong>NeMo Relay</strong><br> Connect black-box and general-purpose agent harnesses to the NeMo Platform and observe them. <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/relay/latest/" target="_blank">Documentation</a></li> <li><a href="https://github.com/NVIDIA/NeMo-Relay" target="_blank">Access Open-Source Code</a></li> </ul> </div>

FAQs

NVIDIA NeMo is an agent-first, open suite of libraries and skills for accelerating AI agent specialization, optimization, and governance.

NeMo integrates with existing AI tools and agent frameworks to optimize specialized agents across any cloud, on-premises, or hybrid environment.

NeMo is available open source and supported as part of NVIDIA AI Enterprise. Pricing and licensing details can be found here.

NeMo can be used to customize large language models (LLMs), vision language models (VLMs), automatic speech recognition (ASR) models, and text-to-speech (TTS) models.

NVIDIA AI Enterprise includes NVIDIA Business-Standard Support. For additional available support and services, such as NVIDIA Business-Critical Support, a technical account manager, training, and professional services, see the NVIDIA Enterprise Support and Service Guide.

NVIDIA NeMo framework is an open-source generative AI framework built for researchers and developers who are looking for fine-grained control and code-level flexibility to efficiently build generative AI models at scale. It supports pretraining, post-training, and reinforcement learning of multimodal generative AI models.


NVIDIA NeMo microservices is an enterprise-ready, API-first modular offering built on the NeMo framework that enables developers to customize and deploy AI agents rapidly at scale. It simplifies model fine-tuning, evaluation, guardrailing, and synthetic data generation. The microservices integrate into existing AI platforms, so enterprises can accelerate custom AI agent development and continuously optimize agents through data flywheel workflows.

NeMo Curator is an open-source library that improves generative AI model accuracy by curating high-quality multimodal datasets. It consists of a set of Python modules expressed as APIs that use Dask, cuDF, cuGraph, and PyTorch to scale data curation tasks, such as data download, text extraction, cleaning, filtering, exact/fuzzy deduplication, and text classification, to thousands of compute cores.
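To make "exact/fuzzy deduplication" concrete, here is a minimal, standard-library-only sketch of the two ideas. It is not the NeMo Curator API: the function names (`exact_dedup`, `fuzzy_dedup`), the word 3-gram shingling, and the 0.7 Jaccard threshold are all illustrative assumptions, and the pairwise comparison shown here does not scale the way a GPU-accelerated pipeline would.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return re.sub(r"\s+", " ", text.strip().lower())


def exact_dedup(docs: list[str]) -> list[str]:
    """Keep the first document for each distinct hash of the normalized text."""
    seen: set[str] = set()
    kept: list[str] = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(doc)
    return kept


def shingles(text: str, n: int = 3) -> set[str]:
    """Return the set of word n-grams (hypothetical choice: n=3)."""
    words = normalize(text).split()
    if len(words) < n:
        return {" ".join(words)}
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def fuzzy_dedup(docs: list[str], threshold: float = 0.7) -> list[str]:
    """Drop a document if it is at least `threshold` similar to one already kept."""
    kept: list[str] = []
    kept_shingles: list[set[str]] = []
    for doc in docs:
        current = shingles(doc)
        if all(jaccard(current, other) < threshold for other in kept_shingles):
            kept.append(doc)
            kept_shingles.append(current)
    return kept


docs = [
    "The quick brown fox jumps over the lazy dog",
    "the quick  brown fox jumps over the lazy dog",   # exact duplicate after normalization
    "The quick brown fox jumps over the lazy cat",    # near duplicate
    "Completely different sentence about GPUs and data",
]
unique = fuzzy_dedup(exact_dedup(docs))
```

Exact deduplication runs first because hashing is cheap; the fuzzy pass then only compares the documents that survive it.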

NeMo Data Designer is a purpose-built microservice (also available as an open library) for AI developers that provides a programmatic way to generate synthetic data through configurable schemas and AI-powered generation models. It’s designed to integrate seamlessly into your AI development workflow.

NeMo Anonymizer is an open library that supports context-aware data anonymization within free text. After detecting sensitive entities, you can label, redact, hash, replace, or fully rewrite them.

NeMo Safe Synthesizer is an open library for generating synthetic versions of sensitive datasets. It creates entirely novel records with no one-to-one mapping to the originals, so you can draw insights from your data while maintaining data privacy.

NeMo Customizer is a high-performance, scalable microservice that simplifies the customization and alignment of LLMs for domain-specific use cases using advanced fine-tuning and reinforcement learning techniques.

NeMo Auditor probes LLMs offline, in batch, with a wide range of automated attacks and edge-case prompts to uncover safety and security vulnerabilities. It’s built on the open NVIDIA Garak library and is designed to plug into evaluation workflows and CI/CD pipelines, so teams can run audits regularly and use the resulting reports to harden models and deployments over time.

NeMo Evaluator provides scalable benchmarking for generative AI applications, including LLMs, RAG pipelines, and AI agents. It features an open source SDK for flexible experimentation with over 100 benchmarks, alongside a cloud-native microservice that automates enterprise-grade evaluation workflows using LLM-as-a-judge scoring and specialized performance metrics.

NeMo Guardrails enables developers to apply programmable guardrails to LLMs and agentic AI applications. It includes a suite of NVIDIA Nemotron Safety models for content safety, PII, jailbreak detection, and topic control with advanced reasoning capabilities and multilingual and multimodal support. With NeMo Guardrails, organizations can control how agents respond and which tools and data sources they can access, ensuring compliance with custom policies and safety controls.

NeMo RL is an open source library for advanced reinforcement learning algorithms and scalable post-training to optimize and align AI agents at enterprise scale.

NeMo Gym is an open source library for building reinforcement learning (RL) training environments for large language models (LLMs). NeMo Gym provides infrastructure to develop environments, scale rollout collection, and integrate seamlessly with your preferred training framework. It helps generate data needed for RL training to equip AI agents and models with domain-specific skills.
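The idea of collecting verifiable rollouts can be sketched in a few lines of plain Python. This is not the NeMo Gym API: `Task`, `Rollout`, `verify`, and `collect_rollouts` are hypothetical names, the policy is a stand-in callable rather than a model, and the verifier is a simple exact-match check chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Task:
    """A prompt paired with a reference answer that can be checked programmatically."""
    prompt: str
    expected: str


@dataclass(frozen=True)
class Rollout:
    """One policy attempt at a task, with the reward assigned by the verifier."""
    prompt: str
    response: str
    reward: float


def verify(task: Task, response: str) -> float:
    """Return 1.0 when the response matches the reference exactly, else 0.0."""
    return 1.0 if response.strip() == task.expected else 0.0


def collect_rollouts(tasks: list[Task], policy: Callable[[str], str]) -> list[Rollout]:
    """Run the policy once per task and record the verified reward."""
    rollouts: list[Rollout] = []
    for task in tasks:
        response = policy(task.prompt)
        rollouts.append(Rollout(task.prompt, response, verify(task, response)))
    return rollouts


def toy_policy(prompt: str) -> str:
    """Stand-in for a model: always answers "5"."""
    return "5"


tasks = [Task("What is 2 + 3?", "5"), Task("What is 4 * 4?", "16")]
rollouts = collect_rollouts(tasks, toy_policy)
rewards = [r.reward for r in rollouts]
```

Because the reward comes from a programmatic check rather than a learned judge, each rollout can be re-verified later, which is what makes the resulting data suitable as an RL training signal.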

NeMo Retriever is an open source library, featuring Nemotron RAG models, for building scalable pipelines that extract multimodal data from complex documents. It delivers up to 50% better accuracy and 15x faster multimodal PDF extraction.

NeMo Relay is an integration layer for connecting black-box and general-purpose agent harnesses into the NeMo Platform. It helps teams bring existing agent workflows into NeMo so they can use platform capabilities for optimization, evaluation, and continuous improvement without rebuilding their agent stack from scratch.

NVIDIA NIM, part of NVIDIA AI Enterprise, is an easy-to-use runtime designed to accelerate the deployment of generative AI across enterprises. This versatile microservice supports a broad spectrum of AI models, from open-source community models to NVIDIA AI Foundation models, as well as custom AI models. Built on robust inference engines, it’s engineered for AI inferencing at scale, so AI applications can be deployed across the cloud, data center, and workstation.

Retrieval-augmented generation (RAG) is a technique that lets LLMs create responses from the latest information by connecting them to a company’s knowledge base. NeMo works with various third-party and community tools, including Milvus, LlamaIndex, and LangChain, to extract relevant snippets of information from a vector database and feed them to the LLM to generate responses in natural language. Explore the AI Chatbot Using RAG Workflow page to get started building production-quality AI chatbots that can accurately answer questions about your enterprise data.
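The retrieve-then-generate flow described above can be sketched without any external services. Everything here is an illustrative assumption rather than a NeMo, Milvus, LlamaIndex, or LangChain API: `embed` is a toy bag-of-words stand-in for a real embedding model, an in-memory list stands in for the vector database, and the final prompt would be sent to an LLM of your choice.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: term counts over lowercase alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, snippets: list[str], k: int = 2) -> list[str]:
    """Return the k snippets most similar to the query."""
    q = embed(query)
    ranked = sorted(snippets, key=lambda s: cosine(q, embed(s)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, snippets: list[str], k: int = 2) -> str:
    """Place the retrieved snippets ahead of the question for the LLM."""
    context = "\n".join(f"- {s}" for s in retrieve(query, snippets, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


knowledge_base = [
    "The refund window is 30 days from purchase.",
    "Support is available on weekdays from 9 to 5.",
    "Our office is located in Santa Clara.",
]
question = "How many days is the refund window?"
prompt = build_prompt(question, knowledge_base, k=1)
```

In a production pipeline the ranking step would be a nearest-neighbor query against a vector index, but the shape of the prompt (retrieved context first, question last) stays the same.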

NVIDIA Blueprints are comprehensive reference workflows built with NVIDIA Nemotron open models and libraries. Each blueprint includes reference code, deployment tools, customization guides, and a reference architecture, accelerating the deployment of AI solutions like AI agents and digital twins, from prototype to production.

NVIDIA AI Enterprise brings together microservices, frameworks, and libraries for AI development with advanced GPU orchestration and infrastructure management in a fully supported, production-ready, commercial software suite. Deploy leading open source tools and AI models with confidence, improve productivity and time to value, and run AI workloads at scale with optimized resource utilization.