Manage the AI agent lifecycle with a comprehensive toolkit for building, monitoring, and optimizing AI agents in production.
Experience the end-to-end, enterprise-ready toolkit for managing AI agents across their lifecycle.
The NVIDIA NeMo software toolkit is a comprehensive platform for building, customizing, and deploying AI agents. It combines open source libraries and microservices to accelerate AI development.
| Features | Use This Tool | Get Started |
|---|---|---|
| Build agentic AI applications using open, highly accurate, energy-efficient models. | <strong>NVIDIA Nemotron</strong><br> State-of-the-art open models for reasoning, RAG, speech, vision, and safety, to build enterprise-ready agentic AI applications. | <div class="nv-text"> <ul> <li><a href="https://build.nvidia.com/search/models?filters=publisher%3Anvidia&q=Nemotron&ncid=no-ncid" target="_blank">Try Nemotron Models</a></li> </ul> </div> |
| Build, fine-tune, and align generative AI models at scale with code-level control and flexibility. | <strong>NeMo Framework</strong><br> Collection of open source libraries for data generation, data curation, pretraining, post-training, reinforcement learning, evaluation, and guardrailing of multimodal models, scaling from a single GPU to thousands. | <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/#framework" target="_blank">Documentation</a></li> <li><a href="https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo" target="_blank">Download Container</a></li> <li><a href="https://github.com/NVIDIA/NeMo" target="_blank">Access Open-Source Code</a></li> </ul> </div> |
| Generate high-quality synthetic datasets for training, fine-tuning, or evaluating models. | <strong>NeMo Data Designer</strong><br> Design domain-specific datasets from scratch or seed examples to eliminate data bottlenecks and accelerate AI development. | <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/microservices/latest/generate-synthetic-data/index.html" target="_blank">Documentation</a></li> <li><a href="https://build.nvidia.com/nemo/data-designer" target="_blank">Try Data Designer</a></li> <li><a href="https://github.com/NVIDIA/GenerativeAIExamples/tree/main/nemo/NeMo-Data-Designer" target="_blank">Example Notebooks</a></li> </ul> </div> |
| Prepare large multimodal datasets for AI development pipelines. | <strong>NeMo Curator</strong><br> Clean, filter, and prepare multimodal data with an open, GPU-accelerated Python library. | <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/curator/latest/" target="_blank">Documentation</a></li> <li><a href="https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo-curator" target="_blank">Download Container</a></li> <li><a href="https://github.com/NVIDIA/NeMo-Curator" target="_blank">Access Open-Source Code</a></li> </ul> </div> |
| Train and specialize agents with reinforcement learning (RL) | <strong>NeMo RL</strong><br> Post-train and align models at scale with advanced RL techniques. | <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/rl/latest/index.html" target="_blank">Documentation</a></li> <li><a href="https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo-rl?version=v0.5.0" target="_blank">Download Container</a></li> <li><a href="https://github.com/NVIDIA-NeMo/RL" target="_blank">Access Open-Source Code</a></li> </ul> </div> |
| Define and manage scalable RL environments for agent training | <strong>NeMo Gym</strong><br> Build, manage, and scale RL environments to generate high-quality, verifiable rollout data for RL training needed for agent specialization. | <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/gym/0.1.0/index.html" target="_blank">Documentation</a></li> <li><a href="https://github.com/NVIDIA-NeMo/Gym" target="_blank">Access Open-Source Code</a></li> </ul> </div> |
| Integrate and expose easy-to-use APIs to accelerate model fine-tuning and alignment and power agentic AI workflows. | <strong>NeMo Customizer</strong><br> Simplify and scale fine-tuning with proprietary domain data. | <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/microservices/latest/fine-tune/index.html" target="_blank">Documentation</a></li> <li><a href="https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo-microservices/containers/customizer" target="_blank">Download Microservice</a></li> </ul> </div> |
| Evaluate the performance of your model and agent pipeline. | <strong>NeMo Evaluator</strong><br> Evaluate model and agent performance with streamlined deployment, benchmark support, and advanced harnesses. | <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/microservices/latest/evaluate/index.html" target="_blank">Documentation</a></li> <li><a href="https://github.com/NVIDIA-NeMo/Evaluator" target="_blank">Access Open-Source SDK</a></li> <li><a href="https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo-microservices/containers/evaluator" target="_blank">Download Microservice</a></li> </ul> </div> |
| Build RAG pipelines to connect your agent to data. | <strong>NeMo Retriever</strong><br> Build high-accuracy, scalable pipelines to extract text, tables, charts, and images from complex documents, powered by open Nemotron models, and prepare enterprise data for RAG. | <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/retriever/latest/" target="_blank">Documentation</a></li> <li><a href="https://huggingface.co/collections/nvidia/nemotron-rag-68f01e412f2dc5a5db5f30ed" target="_blank">Try Hugging Face Models</a></li> <li><a href="https://build.nvidia.com/explore/retrieval" target="_blank">Try Retriever Models</a></li> </ul> </div> |
| Ensure your agent's responses are safe and on topic. | <strong>NeMo Guardrails</strong><br> Tap into a programmable orchestration layer to ensure safety, security, and topical relevance at runtime. | <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/guardrails/latest/" target="_blank">Documentation</a></li> <li><a href="https://huggingface.co/collections/nvidia/nemoguard" target="_blank">Try Hugging Face Models</a></li> <li><a href="https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo-microservices/containers/guardrails" target="_blank">Download Microservice</a></li> <li><a href="https://github.com/NVIDIA/NeMo-Guardrails" target="_blank">Access Open-Source Toolkit</a></li> </ul> </div> |
| Deploy your model for high-performance inference. | <strong>NVIDIA NIM</strong><br> Securely and reliably deploy AI models anywhere with containerized microservices. | <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nim/" target="_blank">Documentation</a></li> <li><a href="https://build.nvidia.com/explore/discover" target="_blank">Try NIM Microservices</a></li> </ul> </div> |
| Monitor and optimize the performance of your AI agent. | <strong>NeMo Agent Toolkit</strong><br> Profile, evaluate, and optimize agentic systems with open-source, framework-agnostic library. | <div class="nv-text"> <ul> <li><a href="https://docs.nvidia.com/nemo/agent-toolkit/latest/" target="_blank">Documentation</a></li> <li><a href="https://github.com/NVIDIA/NeMo-Agent-Toolkit" target="_blank">Access Open-Source Code</a></li> </ul> </div> |
NVIDIA NeMo is a comprehensive, enterprise-ready platform for building, deploying, and optimizing agentic systems—from data curation, model customization and evaluation, to deployment, orchestration, and continuous optimization. It seamlessly integrates with existing AI ecosystems and platforms to create a foundation for building AI agents, fast-tracking the path to production of agentic systems on any cloud, on-premises, or hybrid environment. It supports rapid scaling and effortless creation of data flywheels that continuously improve AI agents with the latest information.
NeMo is available open source and supported as part of NVIDIA AI Enterprise. Pricing and licensing details can be found here.
NeMo can be used to customize large language models (LLMs), vision language models (VLMs), automatic speech recognition (ASR), and text-to-speech (TTS) models.
NVIDIA AI Enterprise includes NVIDIA Business-Standard Support. For additional available support and services, such as NVIDIA Business-Critical Support, a technical account manager, training, and professional services, see the NVIDIA Enterprise Support and Service Guide.
NVIDIA NeMo framework is an open-source generative AI framework built for researchers and developers who are looking for fine-grained control and code-level flexibility to efficiently build generative AI models at scale. It supports pretraining, post-training, and reinforcement learning of multimodal generative AI models.
NVIDIA NeMo microservices is an enterprise-ready, API-first modular offering built on the NeMo framework, to enable developers to easily and rapidly customize and deploy AI agents at scale. It simplifies model fine-tuning, evaluation, guardrailing, and synthetic data generation. They seamlessly integrate into existing AI platforms, enabling enterprises to accelerate custom AI agent development and continuously optimize them through data flywheel workflows.
NeMo Data Designer is a purpose-built microservice (also available as an open library) for AI developers that provides a programmatic way to generate synthetic data through configurable schemas and AI-powered generation models. It’s designed to integrate seamlessly into your AI development workflow.
NeMo Curator is an open-source library that improves generative AI model accuracy by curating high-quality multimodal datasets. It consists of a set of Python modules expressed as APIs that make use of Dask, cuDF, cuGraph, and Pytorch to scale data curation tasks, such as data download, text extraction, cleaning, filtering, exact/fuzzy deduplication, and text classification to thousands of compute cores
NeMo Customizer is a high-performance, scalable microservice that simplifies the customization and alignment of LLMs for domain-specific use cases using advanced fine-tuning and reinforcement learning techniques.
NeMo Auditor probes LLMs offline, in batch, with a wide range of automated attacks and edge-case prompts to uncover safety and security vulnerabilities. It’s built on the open NVIDIA Garak library and is designed to plug in to evaluation workflows and CI/CD pipelines, so teams can regularly run audits and use the resulting reports to harden models and deployments over time.
NeMo Evaluator provides scalable benchmarking for generative AI applications, including LLMs, RAG pipelines, and AI agents. It features an open source SDK for flexible experimentation with over 100 benchmarks, alongside a cloud-native microservice that automates enterprise-grade evaluation workflows using LLM-as-a-judge scoring and specialized performance metrics.
NeMo Guardrails enables developers to apply programmable guardrails to LLMs and agentic AI applications. It includes a suite of NVIDIA Nemotron Safety models for content safety, PII, jailbreak detection, and topic control with advanced reasoning capabilities and multilingual and multimodal support. With NeMo Guardrails, organizations can control agentic data access, how they should respond, and which tools and data sources they can access to ensure custom policy compliance and safety controls.
NeMo RL is an open source library for advanced reinforcement learning algorithms and scalable post-training to optimize and align AI agents at enterprise scale.
NeMo Gym is an open source library for building reinforcement learning (RL) training environments for large language models (LLMs). NeMo Gym provides infrastructure to develop environments, scale rollout collection, and integrate seamlessly with your preferred training framework. It helps generate data needed for RL training to equip AI agents and models with domain-specific skills.
NeMo Retriever is an open source library, featuring Nemotron RAG models, for building scalable pipelines to extract multimodal data from complex documents that delivers up to 50% better accuracy and 15x faster multimodal PDF extraction.
The open-source NVIDIA NeMo Agent Toolkit delivers framework-agnostic profiling, evaluation, and optimization for production AI agent systems. It captures granular metrics on cross-agent coordination, tool usage efficiency, and computational costs, enabling data-driven optimizations through NVIDIA Accelerated Computing. It can be used to parallelize slow workflows, cache expensive operations, and maintain system accuracy during model updates. Compatible with OpenTelemetry and major agent frameworks, the toolkit reduces cloud spend while providing insights to scale from single agents to enterprise-grade digital workforces.
NVIDIA NIM, part of NVIDIA AI Enterprise, is an easy-to-use runtime designed to accelerate the deployment of generative AI across enterprises. This versatile microservice supports a broad spectrum of AI models—from open-source community models to NVIDIA AI Foundation models, as well as bespoke custom AI models. Built on the robust foundations of the inference engines, it’s engineered to facilitate seamless AI inferencing at scale, ensuring that AI applications can be deployed across the cloud, data center, and workstation.
Retrieval-augmented generation is a technique that lets LLMs create responses from the latest information by connecting them to the company’s knowledge base. NeMo works with various third-party and community tools, including Milvus, Llama Index, and LangChain, to extract relevant snippets of information from the vector database and feed them to the LLM to generate responses in natural language. Explore the AI Chatbot Using RAG Workflow page to get started building production-quality AI chatbots that can accurately answer questions about your enterprise data.
NVIDIA Blueprints are comprehensive reference workflows built with NVIDIA Nemotron open models and libraries. Each blueprint includes reference code, deployment tools, customization guides, and a reference architecture, accelerating the deployment of AI solutions like AI agents and digital twins, from prototype to production.
NVIDIA AI Enterprise is an end-to-end, cloud-native software platform that accelerates data science pipelines and streamlines the development and deployment of production-grade AI applications, including generative AI, computer vision, speech AI, and more. It includes best-in-class development tools, frameworks, pretrained models, microservices for AI practitioners, and reliable management capabilities for IT professionals to ensure performance, API stability, and security.