Accelerate Innovation in the Cloud

Diagnosing cancer. Predicting hurricanes. Automating business operations. These are some of the breakthroughs possible when you use accelerated computing to unveil the insights hiding in vast volumes of data. Amazon Web Services (AWS) and NVIDIA have collaborated since 2010 to deliver the most powerful and advanced GPU-accelerated cloud to help customers build a more intelligent future.

Power New Capabilities With AWS and NVIDIA

Healthcare

Accelerate drug discovery and genomic analysis using NVIDIA BioNeMo™ and NIM™ microservices on AWS HealthOmics. Researchers can access optimized AI models for protein structure prediction and generative chemistry, reducing time-to-insight and enabling cost-effective, scalable biology workflows.

Financial Services

Enhance fraud detection and identity verification with the NVIDIA AI Blueprint for financial fraud detection on AWS, which helps financial institutions identify subtle patterns and anomalies in transaction data.

Automotive and Manufacturing

Simulate physically accurate industrial digital twins, processes, and operations with NVIDIA Omniverse™ on AWS. Automakers and logistics companies, including Amazon Robotics, simulate production lines and autonomous mobile robots in virtual environments to optimize workflows before physical deployment.

Public Sector

Enable agencies to harness large-scale AI and HPC with full-stack accelerated computing to support missions such as generative AI, large-scale data analytics, physics simulations, and physical AI. AWS European Sovereign Cloud, powered by the NVIDIA Blackwell platform, NVIDIA Run:ai, and NVIDIA AI Enterprise, enables European organizations to securely deploy AI applications.

Telecommunications

Optimize network operations and customer experiences with the Telco AI Fellowship, a collaboration between AWS and NVIDIA. Utilize agentic AI to drive operational efficiency and new revenue streams across voice, video, and data.

Media and Entertainment

Streamline content creation with cloud-based virtual workstations using NVIDIA RTX™ on AWS. AI-accelerated production pipelines deliver higher-quality content faster, data analytics provide deeper insights, distribution and monetization are optimized, and software-defined infrastructure enhances live entertainment.

Energy

Accelerate subsurface exploration and production, optimize field equipment and operations, increase grid reliability and resiliency, and boost renewable energy generation.

Explore Success Stories

Perplexity: Serving 800+ Million User Queries per Month With AI

Perplexity built pplx-api using NVIDIA A100 Tensor Core GPUs on AWS and NVIDIA TensorRT™-LLM, achieving up to 3.1x lower latency and 4.3x lower first-token latency compared to other platforms. The startup slashed inference costs by 4x, saving $600,000 annually, while scaling to hundreds of GPUs, with NVIDIA H100 GPUs delivering 50% lower latency and 200% higher throughput than A100s.

Noetik: Powering Precision Cancer Therapies With Machine Learning

Noetik, a member of the NVIDIA Inception Program, uses NVIDIA Hopper™ Tensor Core GPUs on Amazon SageMaker HyperPod to train multimodal foundation models for precision cancer immunotherapy. This enables processing of 1 petabyte of human tumor data, profiling over 200 million cells, to accelerate therapeutic discovery and unlock treatments tailored to individual patients.

Fireworks.ai: Generative AI Inference for Developers

Fireworks.ai built a lightning-fast, cost-optimized generative AI inference solution using Amazon EC2 P5 instances powered by NVIDIA H100 Tensor Core GPUs. The platform delivers 4x higher throughput per instance than open-source solutions, cuts latency by up to 50%, and reduces overall costs by 4x for some customers. Developers can run, fine-tune, and customize foundation models including Llama 2, Stable Diffusion XL, and StarCoder while meeting HIPAA and SOC2 Type II compliance standards.

A-Alpha Bio: AI-Accelerated Drug Discovery

A-Alpha Bio accelerated drug discovery by deploying NVIDIA BioNeMo™ on AWS, achieving 12x faster inference and processing 108 million protein-binding predictions, 10x more than initially planned. Using Amazon EC2 P5 instances powered by NVIDIA H100 Tensor Core GPUs, the biotech startup reduced experimental cycles by 1–2 iterations, cutting costs while discovering superior monoclonal antibody candidates for therapeutics.

Synthesia: AI-Enhanced Video Production

Synthesia transformed AI video production by deploying NVIDIA GPU-powered Amazon EC2 instances, achieving a 30x improvement in ML model training throughput. Using Amazon EC2 P5 instances with NVIDIA H100 Tensor Core GPUs and P4 instances with NVIDIA A100 GPUs, the AI startup reduced training time for voice models from days to hours while supporting 456% user growth.

Innophore: Advancing Speed, Accuracy, and Scale in Drug Discovery

Innophore accelerates drug discovery using NVIDIA BioNeMo to analyze protein structures with its Catalophore technology. The platform completed mapping of the entire human organism's protein structures in two weeks, a task that previously took over a year. This improves accuracy in predicting off-target drug effects by 30% within top-ranked hits.

NVIDIA Accelerated Infrastructure—From Cloud to Edge—on AWS

Amazon Elastic Compute Cloud (Amazon EC2)

Access a broad range of NVIDIA GPU-accelerated instances on Amazon EC2 on demand to meet the diverse computational requirements of AI, machine learning, data analytics, graphics, cloud gaming, virtual desktops, and HPC applications. From single-GPU instances to thousands of GPUs in EC2 UltraClusters, AWS customers can provision right-sized GPU capacity to accelerate time to solution and reduce the total cost of running their cloud workloads.

Amazon EC2 P6e With NVIDIA GB300 NVL72

Amazon EC2 P6e UltraServers, powered by NVIDIA GB300 NVL72 systems, deliver breakthrough AI performance. P6e-GB300 provides 1.5x GPU memory and compute for frontier models, making it ideal for training reasoning models—including mixture-of-experts (MoE) architectures—and for inference of enterprise copilots and agentic AI applications.

Amazon EC2 P6 With NVIDIA B300

Amazon EC2 P6 instances, powered by the NVIDIA Blackwell platform, deliver up to 2x performance improvements for AI training and inference. P6-B300 provides 1.5x the GPU memory and compute of P6-B200, making it ideal for large-scale distributed training as well as medium- to large-scale MoE models and agentic AI applications.

Amazon EC2 G7e With NVIDIA RTX PRO 6000 Blackwell Server Edition

Amazon EC2 G7e instances with NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs are available to advance AI inference, scientific computing, and spatial computing workloads. G7e instances deliver up to 2.3x the inference performance of G6e instances, with 1.85x the GPU memory bandwidth. Built on the AWS Nitro System to optimize compute and memory resource management, G7e instances help secure sensitive AI workloads and data.

AWS Integration With NVLink Fusion

AWS will support NVIDIA NVLink™ Fusion, a platform for building custom AI infrastructure, enabling AWS to deploy its custom-designed silicon, including Trainium4 chips for inference and agentic AI model training, Graviton CPUs for a broad range of workloads, and the Nitro System virtualization infrastructure.

AWS and NVIDIA Physical AI

AWS and NVIDIA are deepening their collaboration to accelerate physical AI, which enables autonomous machines such as robots and self-driving cars to perceive, understand, reason, and perform complex actions in the real, physical world. By combining AWS’s scalable cloud infrastructure with NVIDIA’s full-stack solution, developers can train, simulate, and deploy physical AI more efficiently.

Train on NVIDIA Cosmos World Foundation Models

Available as NVIDIA NIM microservices on Amazon EKS and AWS Batch, NVIDIA Cosmos™ world foundation models (WFMs) help developers build physical AI applications that understand complex physical interactions. These models simulate real-world physics and scenarios, enabling robots to reason about their environments. They are critical for training general-purpose foundation models for humanoid robots, such as NVIDIA GR00T, on AWS infrastructure.
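NIM microservices are invoked over HTTP once deployed. As a minimal client-side sketch, assuming a locally deployed NIM endpoint (the URL, path, and payload fields below are hypothetical placeholders, not the documented Cosmos API; consult the model's NIM documentation for the actual schema):

```python
import json

# Placeholder endpoint for a NIM microservice deployed on EKS or AWS Batch.
NIM_URL = "http://localhost:8000/v1/infer"  # hypothetical URL and path

def build_request(prompt: str, num_frames: int = 121) -> dict:
    """Assemble an illustrative world-generation request payload.

    Both fields are placeholders for whatever the deployed model's
    documented request schema actually requires.
    """
    return {
        "prompt": prompt,          # text conditioning for the world model
        "num_frames": num_frames,  # length of the generated clip (hypothetical field)
    }

# Serialized body that an HTTP client (e.g., requests.post) would send to NIM_URL.
payload = json.dumps(build_request("a robot arm sorting packages"))
```

The same pattern applies to other NIM microservices: deploy the container, then send JSON requests to its HTTP endpoint from your training or data pipeline.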

Simulate Using NVIDIA Isaac Lab and Isaac Sim

The open NVIDIA Isaac™ Lab and Isaac Sim™ frameworks are now available on Amazon EC2 G6e instances, giving teams a scalable way to run robot learning and simulation in the cloud. Developers can train policies in Isaac Lab and validate behavior in Isaac Sim using physically accurate virtual environments and synthetic data generation before deploying to real robots. The workflow can connect with AWS to speed up perception model training and reinforcement learning at scale.

Deploy on NVIDIA Jetson Thor

NVIDIA Jetson Thor™ series modules provide the ultimate platform for physical AI and robotics, delivering up to 2070 FP4 TFLOPS of AI compute and 128 GB of memory. The NVIDIA Blackwell-powered robotics supercomputer enables key workloads across humanoid robotics, spatial intelligence, multi-sensor processing, and agentic AI.

Simplify Development and Maximize Performance With NVIDIA-Optimized Software

NVIDIA-Optimized Software on AWS

Access the computational power of NVIDIA GPU-accelerated instances on AWS to develop and deploy your applications at scale with fewer compute resources, accelerating time to solution and reducing TCO. To maximize performance and developer productivity, NVIDIA offers GPU-optimized software for a broad range of workloads, including data science, data analytics, AI and machine learning training and inference, HPC, and graphics.

NVIDIA Nemotron Nano 3 on Amazon Bedrock

Amazon Bedrock now supports the NVIDIA Nemotron™ Nano 3 30B A3B model, NVIDIA's latest breakthrough in efficient language modeling, delivering high reasoning performance, native tool-calling support, and extended context processing with a 256K-token context window. The model employs an efficient hybrid MoE architecture to achieve higher throughput than its predecessors on agentic and coding workloads while maintaining the reasoning depth of a larger model.

NVIDIA AI Enterprise on AWS Marketplace

NVIDIA AI Enterprise is a secure, end-to-end, cloud-native suite of AI software. It accelerates data science pipelines and streamlines the development, deployment, and management of predictive AI models to automate essential processes and deliver rapid insights from data. NVIDIA AI Enterprise includes an extensive library of full-stack software, including NVIDIA AI workflows, frameworks, pretrained models, and infrastructure optimization. Global enterprise support and regular security reviews ensure business continuity and that AI projects stay on track.

NVIDIA Run:ai on AWS Marketplace

NVIDIA Run:ai simplifies AI infrastructure management for organizations by providing a control plane for GPU infrastructure in Kubernetes-native environments. The platform addresses GPU utilization, workload prioritization, and visibility into GPU consumption by introducing a virtual GPU pool and enabling dynamic, policy-based scheduling. NVIDIA Run:ai integrates with various AWS services, including Amazon EC2, EKS, SageMaker HyperPod, IAM, and CloudWatch, to optimize performance, simplify operations, and provide a unified foundation for AI/ML workloads.

NVIDIA-Accelerated AWS Services

NVIDIA and AWS collaborate closely on integrations to bring the power of NVIDIA-accelerated computing to a broad range of AWS services. Whether you provision and manage the NVIDIA GPU-accelerated instances on AWS yourself or leverage them in managed services like Amazon SageMaker or Amazon Elastic Kubernetes Service (EKS), you have the flexibility to choose the optimal level of abstraction you need.

Amazon EMR

Leverage the NVIDIA RAPIDS™ Accelerator for Apache Spark within Amazon EMR to accelerate Apache Spark 3.x data science pipelines without any code changes on NVIDIA GPU-accelerated AWS instances. This integration enables data scientists to run extract, transform, and load (ETL), data processing, and machine learning pipelines at massive scale and lower cloud costs by getting more done in less time and with fewer cloud-based instances.
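On supported EMR releases, the RAPIDS Accelerator is typically enabled through cluster configuration classifications rather than application code changes. An abridged sketch of such a configuration (property names and values should be verified against the EMR documentation for your release):

```json
[
  {
    "Classification": "spark",
    "Properties": { "enableSparkRapids": "true" }
  },
  {
    "Classification": "spark-defaults",
    "Properties": {
      "spark.plugins": "com.nvidia.spark.SQLPlugin",
      "spark.rapids.sql.enabled": "true"
    }
  }
]
```

With the plugin enabled, supported Spark SQL and DataFrame operations are transparently scheduled onto the cluster's GPUs; unsupported operations fall back to the CPU.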

Amazon SageMaker AI

NVIDIA AI software and GPU-accelerated instances can accelerate each step of AI and machine learning workflows within Amazon SageMaker, including data preparation, model training, and inference serving. To deploy AI models into production faster and lower inference costs, Amazon SageMaker has integrated NVIDIA Triton Inference Server™, enabling features like multi-framework support, dynamic batching, and concurrent model execution that maximize performance on both CPU and GPU instances on AWS.
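Dynamic batching and concurrent model execution are controlled through Triton's per-model configuration file. A minimal illustrative `config.pbtxt`, in which the model name, backend, and batch sizes are placeholders for your own model:

```
# Illustrative Triton model configuration; name, platform, and sizes
# below are placeholders, not values from the text above.
name: "demo_model"
platform: "onnxruntime_onnx"
max_batch_size: 32

# Coalesce individual requests into larger batches to raise GPU utilization.
dynamic_batching {
  preferred_batch_size: [ 8, 16 ]
  max_queue_delay_microseconds: 100
}

# Run two copies of the model concurrently on each GPU.
instance_group [
  { count: 2, kind: KIND_GPU }
]
```

When packaged into a SageMaker model repository alongside the model artifact, Triton applies these settings automatically at endpoint startup.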

Amazon Bedrock

Amazon Bedrock enables enterprises and startups to build agentic AI applications at production scale. The platform offers NVIDIA Nemotron models directly in the Amazon Bedrock model catalog, NVIDIA NIM microservices in the Amazon Bedrock Marketplace and SageMaker JumpStart, the NVIDIA NeMo Agent Toolkit integrated with Amazon Bedrock AgentCore for agent-driven, composable services, and GPU-accelerated serverless vector inferencing.
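A common way to call catalog models is the bedrock-runtime Converse API. The sketch below only builds the request arguments; the model ID is a hypothetical placeholder, and the actual Nemotron model ID should be taken from the Amazon Bedrock model catalog for your region:

```python
# Hypothetical model ID; look up the real Nemotron ID in the Bedrock catalog.
MODEL_ID = "nvidia.nemotron-example-model-id"

def build_converse_request(user_text: str, max_tokens: int = 512) -> dict:
    """Assemble keyword arguments for bedrock-runtime's converse() call."""
    return {
        "modelId": MODEL_ID,
        "messages": [
            # Converse expects content as a list of typed blocks.
            {"role": "user", "content": [{"text": user_text}]},
        ],
        "inferenceConfig": {"maxTokens": max_tokens, "temperature": 0.2},
    }

# With AWS credentials configured, the actual call would look like:
#   import boto3
#   client = boto3.client("bedrock-runtime")
#   resp = client.converse(**build_converse_request("Summarize our Q3 metrics."))
#   print(resp["output"]["message"]["content"][0]["text"])
kwargs = build_converse_request("Summarize our Q3 metrics.")
```

Because Converse is model-agnostic, the same request shape works across catalog models, so swapping in a Nemotron model is a one-line change to `modelId`.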

Developer Resources and Quick-Start Guides

NVIDIA Developer Program

Get access to an online space devoted to your needs, including advanced software tools, technical documentation, learning resources, and peer and domain expert help to accelerate your work in AI.

NVIDIA Deep Learning Institute (DLI)

Develop and master the skills you need to advance your knowledge in AI, accelerated computing, data science, graphics, simulation, and more with hands-on courses and expert-led training.

NVIDIA Inception for Startups

Join this free program designed to help AI startups evolve faster with advanced technology, opportunities to connect with investors, and access to the latest developer tools and technical resources from NVIDIA.

Access the Power of AWS and NVIDIA

Amazon EC2 Instances

NVIDIA AI Enterprise

NVIDIA Nemotron Models on AWS