NVIDIA Triton Management Service

Automate the deployment of multiple Triton Inference Server instances in Kubernetes with resource-efficient model orchestration.

What Is NVIDIA Triton Management Service?

NVIDIA Triton™, part of the NVIDIA® AI platform, offers a capability called Triton Management Service (TMS) that automates the deployment of multiple Triton Inference Server instances in Kubernetes with resource-efficient model orchestration on GPUs and CPUs. This software application manages the deployment of Triton Inference Server instances with one or more AI models, allocates models to individual GPUs and CPUs, and efficiently collocates models by framework. TMS, available exclusively with NVIDIA AI Enterprise, an enterprise-grade AI software platform, enables large-scale inference deployment with high performance and hardware utilization.
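As an illustration, once TMS has deployed Triton Inference Server instances into a cluster, they can be inspected with standard Kubernetes tooling. The sketch below uses the official Python Kubernetes client; the namespace and label selector are assumptions chosen for illustration, not values defined by TMS.

```python
# Minimal sketch: list the Triton Inference Server pods running in a
# Kubernetes cluster, e.g. those deployed by TMS. The namespace and
# label selector are illustrative assumptions, not values prescribed by TMS.
from kubernetes import client, config

config.load_kube_config()      # use local kubeconfig credentials
v1 = client.CoreV1Api()

pods = v1.list_namespaced_pod(
    namespace="triton",                             # assumed namespace
    label_selector="app=triton-inference-server",   # assumed label
)

for pod in pods.items:
    print(pod.metadata.name, pod.status.phase)
```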

Explore the Benefits of Triton Management Service

Simplified Deployment

Automates the deployment and management of Triton Inference Server instances on Kubernetes and groups models from different frameworks for efficient use of memory.

Resource Maximization

Loads models on demand, unloads them when not in use via a lease system, and places as many models as possible on a single GPU server.

Monitoring and Autoscaling

Monitors each Triton Inference Server’s health and capacity, and autoscales based on latency and hardware utilization.

Large-Scale Inference

Use Triton Management Service to efficiently manage inference deployments ranging from a single model to hundreds of models. Deploy on premises or on any public cloud.
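For example, once TMS has placed a model on a Triton Inference Server instance, applications can query it with the standard Triton client libraries. The sketch below uses the Python `tritonclient` HTTP client; the server URL, model name, and tensor names are assumptions for illustration and should be replaced with the values from your own deployment.

```python
# Minimal sketch: send an inference request to a model served by a
# TMS-managed Triton Inference Server instance. The URL, model name,
# and tensor names are illustrative assumptions.
import numpy as np
import tritonclient.http as httpclient

triton = httpclient.InferenceServerClient(url="triton.example.com:8000")

model_name = "my_model"                      # assumed model name
if triton.is_model_ready(model_name):
    data = np.random.rand(1, 3).astype(np.float32)
    infer_input = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
    infer_input.set_data_from_numpy(data)

    result = triton.infer(model_name=model_name, inputs=[infer_input])
    print(result.as_numpy("OUTPUT0"))
```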

More Resources

Get an Introduction

Understand the key functionality of TMS that helps automate the deployment of multiple Triton Inference Server instances.

Explore the AI Inference Platform

Explore NVIDIA’s end-to-end AI inference platform, which includes all the hardware and software necessary for driving faster, more accurate AI inference.

Get the Latest News

Read about the latest inference updates and announcements.

Check Out an Ebook

Discover the modern landscape of AI inference, production use cases from companies, and real-world challenges and solutions.
