NVIDIA AI Foundation Models and Endpoints

Optimized for enterprise generative AI.

What Are NVIDIA AI Foundation Models and Endpoints?

Achieve the best performance on NVIDIA accelerated infrastructure and streamline the transition to production AI with NVIDIA AI Foundation Models. With pretrained generative AI models, enterprises can create custom models faster and take advantage of the latest training and inference techniques. With NVIDIA AI Foundation Endpoints, they can connect their applications to these models running on a fully accelerated stack to test performance.

Accelerate Time to Custom Enterprise Models

NVIDIA AI Foundation Models are optimized, tested, and hosted on the NVIDIA AI platform, making it fast and easy to evaluate, further customize, and seamlessly run them at peak performance on any accelerated platform.

Build Custom Generative AI Models for Enterprise Applications

NVIDIA AI foundry service, a collection of NVIDIA AI Foundation Models, NVIDIA NeMo™ framework and tools, and NVIDIA DGX™ Cloud, gives enterprises an end-to-end solution for creating custom generative AI models.

Start with State-of-the-Art Generative AI Models

Try leading foundation models, including Llama 2, Stable Diffusion, and NVIDIA’s Nemotron-3 8B family, optimized for the highest performance efficiency.

Customize the Foundation Models

Tune and test the models with proprietary data using NVIDIA NeMo.

Build Models Faster in the Cloud

Customize models on DGX Cloud, a serverless AI-training-as-a-service platform for enterprise developers.

Benefits of NVIDIA AI Foundation Models and Endpoints

Performance Optimized

Lower your TCO and increase energy efficiency by running inference up to 4x faster.


Use lean, high-performing large language models (LLMs) built from responsibly sourced datasets.

Try Models on the Fly

Experience the models’ peak performance directly from a browser, through either a GUI or an API.

Ready-to-Integrate APIs

Connect your applications to API endpoints and test their real-world performance running on a fully accelerated stack.
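As a rough illustration of what connecting an application to a hosted model endpoint and timing the call can look like, here is a minimal Python sketch. Everything specific in it is an assumption made for illustration: the URL is a non-resolving placeholder, the `prompt` and `max_tokens` payload fields are hypothetical, and `fake_send` stands in for a real network call. It does not reproduce the actual NVIDIA AI Foundation Endpoints API, so consult the official API documentation for real request formats.

```python
import json
import time
import urllib.request

# Hypothetical placeholders: not a real endpoint or credential.
ENDPOINT_URL = "https://example.invalid/v1/generate"
API_KEY = "YOUR_API_KEY"


def build_request(prompt, max_tokens=128):
    """Build (but do not send) a JSON POST request for a text-generation call.

    The payload field names are assumptions for illustration only.
    """
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


def timed(call, *args):
    """Run call(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = call(*args)
    return result, time.perf_counter() - start


def fake_send(request):
    """Stand-in for urllib.request.urlopen(request); echoes the prompt back."""
    return {"echo": json.loads(request.data)["prompt"]}


request = build_request("Summarize our Q3 support tickets.")
response, seconds = timed(fake_send, request)
print(response, f"{seconds:.6f}s")
```

In a real application, `fake_send` would be replaced by an actual HTTP call, and the measured latency would then reflect the endpoint's end-to-end response time rather than local overhead.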

Deploy Your Models Anywhere

Run the model anywhere, from cloud to data center to workstations, with NVIDIA AI Enterprise.

Experience Optimized Generative AI Models

NVIDIA AI Foundation Models include leading community- and NVIDIA-built models to support various use cases, including content generation, image creation, drug discovery, and IT service automation.

Llama 2

Llama 2 is a large language model capable of generating text and code in response to prompts.

Stable Diffusion XL

Stable Diffusion XL (SDXL) generates expressive images from shorter prompts and can insert words inside images.


Nemotron-3 8B

Nemotron-3 8B is an enterprise-grade question-answering LLM that enterprises can customize for their domains.

Power Your Enterprise Applications With Retrieval-Augmented Generation (RAG)

Build AI chatbots that connect with your custom LLMs and knowledge bases to accurately and naturally answer domain-specific questions in real time.
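To make the retrieval-augmented pattern concrete, here is a minimal Python sketch of its two core steps: retrieving the most relevant passage from a knowledge base, then assembling a prompt that grounds the LLM's answer in that passage. It is a toy under stated assumptions. The knowledge base entries are invented, the function names are hypothetical, and the simple word-count cosine similarity stands in for the embedding models and vector databases a production system would typically use. The final LLM call is omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True
    )
    return ranked[:k]


def build_prompt(question, passages):
    """Assemble a prompt that asks the model to answer from retrieved context."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Invented example knowledge base (illustrative only).
KNOWLEDGE_BASE = [
    "Password resets are handled through the self-service portal.",
    "VPN access requires a hardware token issued by the IT desk.",
    "Expense reports are due on the fifth business day of each month.",
]

question = "How do I reset my password?"
passages = retrieve(question, KNOWLEDGE_BASE, k=1)
prompt = build_prompt(question, passages)
print(prompt)
```

The resulting prompt would then be sent to the custom LLM, which answers from the retrieved passage instead of relying only on what it learned during training.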

Success Stories

Generative AI is impacting every industry today, from IT services and telecommunications to finance and retail. Putting generative AI into practice requires enterprises to have access to an AI foundry where they can build custom models using proprietary data and deploy them at scale. See how the world’s leading organizations are serving their customers with NVIDIA AI.

Let’s Get Started

Try the latest, fully optimized NVIDIA AI Foundation Models today from the NGC catalog, Azure ML model catalog, or Hugging Face.

Explore additional generative AI resources and tools.