Get Started With NVIDIA Triton

Find the right license to deploy, run, and scale AI for any application on any platform.

NVIDIA Triton Licensing Options

GitHub

For individuals looking to access the Triton open-source code for development.

NVIDIA NGC™

For individuals looking to access free Triton containers for development.  

NVIDIA AI Enterprise

For enterprises looking to purchase Triton for production. 

Features

- NVIDIA Triton™ Inference Server
- Custom builds (Windows, NVIDIA® Jetson™) and PyTriton
- Prebuilt Docker containers (with versioned CUDA® and framework backend dependencies)
- Triton Management Service: model orchestration for large-scale deployment
- Enterprise support: 24x7 case filing, 8-5 NVIDIA live agent
- Long-term support branch for up to three years
- CVE scans, security notifications, timely patches, and maintenance releases
- API stability with production releases
- NVIDIA customer portal and knowledge base
- Access to NVIDIA AI workflows and reference architectures
- Management and orchestration of workloads and infrastructure
- Hands-on NVIDIA LaunchPad labs

FAQs

What is NVIDIA Triton Inference Server?

NVIDIA Triton Inference Server, or Triton for short, is open-source inference-serving software. It lets teams deploy, run, and scale AI models from any framework (TensorFlow, NVIDIA TensorRT™, PyTorch, ONNX, XGBoost, Python, custom, and more) on any GPU- or CPU-based infrastructure (cloud, data center, or edge). For more information, please visit the Triton webpage.
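As a quick illustration, here is a minimal sketch of querying a running Triton server with the official tritonclient Python package. The server address, model name (my_model), and tensor names (INPUT0, OUTPUT0) are placeholder assumptions; substitute the values from your own model's configuration.

```python
# Minimal sketch: send an inference request to a Triton server over HTTP.
# Assumes `pip install tritonclient[http]` and a server on localhost:8000.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Build the input tensor; shape and dtype must match the model's config.
data = np.random.rand(1, 16).astype(np.float32)
infer_input = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
infer_input.set_data_from_numpy(data)

# Run inference and read back the requested output tensor.
response = client.infer(
    model_name="my_model",  # hypothetical model in the server's repository
    inputs=[infer_input],
    outputs=[httpclient.InferRequestedOutput("OUTPUT0")],
)
print(response.as_numpy("OUTPUT0"))
```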

What is Triton Management Service?

Triton Management Service provides automated, resource-efficient orchestration of models for inference at scale. It automates deployment of Triton instances, model loading on demand, unloading when not in use, and more. Triton Management Service is available exclusively with NVIDIA AI Enterprise, an enterprise-ready AI software platform.

What is Triton Model Analyzer?

Triton Model Analyzer is an offline tool for optimizing inference deployment configurations (batch size, number of model instances, and so on) against throughput, latency, and/or memory constraints on the target GPU or CPU. It supports analysis of a single model, model ensembles, and multiple concurrent models.
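For a sense of how it's used, here is a sketch that drives the Model Analyzer command-line tool from Python. The repository path and model name are assumptions for illustration; in practice you would typically run the model-analyzer command directly from a shell.

```python
# Illustrative sketch: invoking the Triton Model Analyzer CLI.
import subprocess

subprocess.run(
    [
        "model-analyzer", "profile",
        "--model-repository", "/models",  # path to your Triton model repository (assumption)
        "--profile-models", "my_model",   # hypothetical model whose configs will be swept
        "--output-model-repository-path", "/tmp/analyzer_output",
    ],
    check=True,
)
```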

How can I purchase Triton for production?

Triton is included with NVIDIA AI Enterprise, an end-to-end AI software platform that offers enterprise-grade support, security, stability, and manageability for the entire software stack across the data center and cloud.

What support is included with NVIDIA AI Enterprise?

NVIDIA AI Enterprise includes business-standard support. Additional support and services are available, including business-critical support, access to a technical account manager, training, and professional services. For more information, please visit the Enterprise Support and Services User Guide.

Can I try Triton on NVIDIA LaunchPad?

Yes, there are several labs in NVIDIA LaunchPad that use Triton.

Is Triton available in cloud marketplaces?

Yes. Triton is a leading ecosystem choice for AI inference and model deployment. It's available with NVIDIA AI Enterprise in the AWS, Microsoft Azure, and Google Cloud marketplaces, and also in Alibaba Cloud, Amazon Elastic Kubernetes Service (EKS), Amazon Elastic Container Service (ECS), Amazon SageMaker, Google Kubernetes Engine (GKE), Google Vertex AI, HPE Ezmeral, Microsoft Azure Kubernetes Service (AKS), Azure Machine Learning, and Oracle Cloud Infrastructure Data Science Platform.

Stay up to date on the latest AI inference news from NVIDIA.