Accelerate Innovation in the Cloud

Diagnosing cancer. Predicting hurricanes. Automating business operations. These are some of the breakthroughs possible when you use accelerated computing to unveil the insights hiding in vast volumes of data. Amazon Web Services (AWS) and NVIDIA have collaborated for over 13 years to deliver the most powerful and advanced GPU-accelerated cloud to help customers build a more intelligent future.

Power New Capabilities With AWS and NVIDIA

Healthcare

Deliver personalized medicine and accelerate breakthroughs in biomedical research with AWS and NVIDIA solutions.

Media and Entertainment

Realize the potential of cloud computing for digital content creation. Adapt your resources as your studio’s demands grow, and access the best creative talent across the globe.

Financial Services

Boost risk management, improve data-backed decisions and security, and enhance customer experiences with generative AI, deep learning, machine learning, and natural language processing (NLP) solutions.

Digital Twins and the Metaverse

Harness the power of large-scale simulation for industrial and scientific applications.

Enterprise AI and Machine Learning

Reduce development time, lower costs, improve accuracy and performance, and have more confidence in AI outcomes with NVIDIA solutions running on AWS.

High-Performance Computing

Learn how AWS and NVIDIA high-performance computing (HPC) solutions are optimized to work together, cost-effectively solving the world’s most complex problems.

Explore Customer Stories

Video Call Transcription

Software company Read.ai built their video call transcription platform on NVIDIA® Riva and reduced costs by 20–30 percent using Amazon EC2 G5 instances powered by NVIDIA A10G Tensor Core GPUs.

Machine Learning in Life Sciences

Life sciences company Paige is furthering cancer treatment with a hybrid machine learning workflow built using Amazon EC2 P4d instances powered by NVIDIA A100 Tensor Core GPUs.

VFX Studio in the Cloud

Netflix deployed their visual effects (VFX) studio to facilitate remote collaboration among a global workforce using Amazon EC2 G5 instances powered by NVIDIA A10G GPUs.

Generative AI for Content

Iternal Technologies used Amazon EC2 instances powered by NVIDIA GPUs to help their customers supercharge their marketing, improving ROI by 30X with generative AI. Because Iternal is part of NVIDIA Inception, they were among the first to gain access to NVIDIA Riva’s voice cloning capabilities, getting a proof-of-concept generative AI voice product up and running in two weeks.

HPC and Machine Learning for Retail

Automotive company Reezocar estimates vehicle repairs swiftly and accurately using AWS HPC and machine learning infrastructure powered by NVIDIA GPUs. With this infrastructure, the company can meticulously detect car dents and imperfections and estimate repair costs in milliseconds, helping to extend the serviceable life of vehicles.

Generative AI for Gaming

Codeway optimized price performance for their generative AI application, Wonder, using NVIDIA GPU-powered Amazon EC2 G5 instances, saving 48 percent on compute costs.

NVIDIA Accelerated Infrastructure—From Cloud to Edge—on AWS

Amazon Elastic Compute Cloud (Amazon EC2)

Access a broad range of NVIDIA GPU-accelerated instances on Amazon EC2 on demand to meet the diverse computational requirements of AI, machine learning, data analytics, graphics, cloud gaming, virtual desktops, and HPC applications. From single-GPU instances to thousands of GPUs in EC2 UltraClusters, AWS customers can provision right-sized GPU capacity to accelerate time to solution and reduce the total cost of running their cloud workloads.
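As a sketch of what provisioning a GPU instance looks like in practice, the helper below builds the parameters for an EC2 `run_instances` call targeting a single-GPU g5.xlarge instance. The AMI ID and key name are placeholders, not real values, and the tag scheme is illustrative only.

```python
# Sketch: parameters for launching a single NVIDIA A10G GPU instance
# (g5.xlarge) via the EC2 API. AMI ID and key name are placeholders.

def gpu_instance_request(instance_type="g5.xlarge",
                         ami_id="ami-0123456789abcdef0",
                         key_name="my-key"):
    """Return keyword arguments for boto3's ec2.run_instances."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "KeyName": key_name,
        "MinCount": 1,
        "MaxCount": 1,
        # Tag the instance so GPU workloads are easy to find later.
        "TagSpecifications": [{
            "ResourceType": "instance",
            "Tags": [{"Key": "workload", "Value": "gpu-inference"}],
        }],
    }

# With AWS credentials configured, the dict would be passed as:
#   boto3.client("ec2").run_instances(**gpu_instance_request())
params = gpu_instance_request()
print(params["InstanceType"])
```

Scaling up is largely a matter of changing `instance_type` (for example, to a p4d.24xlarge for eight A100 GPUs) and adjusting `MinCount`/`MaxCount`.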

Amazon EC2 G5 With NVIDIA A10G

Featuring NVIDIA A10G Tensor Core GPUs and support for NVIDIA RTX™ technology, EC2 G5 instances are ideal for graphics-intensive applications like video editing, rendering, 3D visualization, and photorealistic simulations. Additionally, they can be used to accelerate AI inference and single-GPU AI training workloads.

Amazon EC2 G5g With NVIDIA T4G

Featuring NVIDIA T4G Tensor Core GPUs and AWS Graviton2 processors, EC2 G5g instances are best suited for cloud game development and Android-in-the-cloud gaming services. They can also be used for cost-effective AI inference using Arm®-enabled software from the NVIDIA NGC™ catalog.

Amazon EC2 P4d With NVIDIA A100 40GB

Featuring eight NVIDIA A100 40GB Tensor Core GPUs, EC2 P4d instances deliver the highest performance for AI and HPC. For multi-node AI training and distributed HPC workloads, you can scale from a few to thousands of NVIDIA A100 GPUs in EC2 UltraClusters.

Amazon EC2 P5 With NVIDIA H100 80GB

Featuring eight NVIDIA H100 80GB Tensor Core GPUs, EC2 P5 instances deliver the highest performance in Amazon EC2 for deep learning and HPC applications. They help you accelerate your time to solution by up to 6X compared to previous-generation GPU-based EC2 instances and reduce the cost to train machine learning models by up to 40 percent.

AWS Hybrid Cloud and Edge Solutions

Leverage the power of NVIDIA-accelerated computing across a broad range of AWS hybrid cloud and edge solutions to meet the low-latency, real-time requirements of workloads like AI, machine learning, gaming, content creation, and augmented reality (AR) and virtual reality (VR) streaming. NVIDIA’s performance-optimized and cloud-native software stack ensures that you get the best performance for your applications, wherever they need to run—cloud to edge.

AWS Panorama

AWS Panorama is a collection of machine learning devices and an SDK that brings computer vision to on-premises internet protocol (IP) cameras. AWS Panorama edge devices are built on NVIDIA Jetson™ system on modules (SOMs) and use the NVIDIA JetPack™ SDK to accelerate AI at the edge for industrial inspection, traffic monitoring, and supply chain management use cases.

AWS Outposts

With NVIDIA T4 Tensor Core GPUs in AWS Outposts, you can meet security and latency requirements in a wide variety of AI and graphics applications in on-premises data centers. Combined with access to GPU-optimized software from NGC, you can derive insights from vast amounts of data orders of magnitude faster than with CPUs alone.

AWS Wavelength

AWS Wavelength brings the AWS cloud to the edge of the 5G mobile network to develop and deploy ultra-low-latency applications. AWS Wavelength zones offer access to NVIDIA GPU-accelerated instances to speed up applications such as game streaming, AR/VR, and AI inference at the edge.

AWS IoT Greengrass

AWS IoT Greengrass extends AWS services to edge devices, such as NVIDIA Jetson platforms, to develop AI models and deploy them at the edge to act locally on generated data. Combined with the NVIDIA DeepStream SDK, you can build and deploy high-throughput, low-latency vision AI applications at the edge.

Simplify Development and Maximize Performance With NVIDIA-Optimized Software

NVIDIA-Optimized Software on AWS

Access the computational power of NVIDIA GPU-accelerated instances on AWS to develop and deploy your applications at scale with fewer compute resources, accelerating time to solution and reducing TCO. To maximize performance and developer productivity, NVIDIA offers a wide range of GPU-optimized software for a broad range of workloads, including data science, data analytics, AI and machine learning training, AI and machine learning inference, HPC, and graphics.

NVIDIA NGC

NVIDIA NGC is the portal for enterprise services, software, management tools, and support for end-to-end AI and digital twin workflows. The NGC software catalog provides a range of resources that meet the needs of data scientists, developers, and researchers with varying levels of expertise, including containers, pretrained models, domain-specific SDKs, use case-based collections, and Helm charts for the fastest AI implementations. To take AI workloads to production with NGC software, you can access enterprise-grade support, training, and services with NVIDIA AI Enterprise.

NVIDIA AI Enterprise on AWS

NVIDIA AI Enterprise is a secure, end-to-end, cloud-native suite of AI software. It accelerates data science pipelines and streamlines the development, deployment, and management of predictive AI models to automate essential processes and deliver rapid insights from data. NVIDIA AI Enterprise includes an extensive library of full-stack software, including NVIDIA AI workflows, frameworks, pretrained models, and infrastructure optimization. Global enterprise support and regular security reviews ensure business continuity and that AI projects stay on track.

NVIDIA RTX Virtual Workstation

The NVIDIA RTX Virtual Workstation (RTX vWS) for GPU-accelerated graphics helps creative and technical professionals maximize their productivity from anywhere by providing access to the most demanding professional design and engineering applications from the cloud. Amazon EC2 G5 (NVIDIA A10G) and G4dn (NVIDIA T4) instances, combined with the RTX vWS Amazon Machine Image (AMI), enable the industry’s most advanced 3D graphics platform, including the latest real-time ray tracing with RTX technology in virtual machines.

NVIDIA-Accelerated AWS Services

NVIDIA and AWS collaborate closely on integrations to bring the power of NVIDIA-accelerated computing to a broad range of AWS services. Whether you provision and manage the NVIDIA GPU-accelerated instances on AWS yourself or leverage them in managed services like Amazon SageMaker or Amazon Elastic Kubernetes Service (EKS), you have the flexibility to choose the optimal level of abstraction you need.

Amazon EMR

Leverage the NVIDIA RAPIDS™ Accelerator for Apache Spark within Amazon EMR to accelerate Apache Spark 3.x data science pipelines without any code changes on NVIDIA GPU-accelerated AWS instances. This integration enables data scientists to run their extract, transform, and load (ETL), data processing, and machine learning pipelines at massive scale and lower cloud costs by getting more done in less time and with fewer cloud-based instances.
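As a rough illustration, enabling the RAPIDS Accelerator on an EMR cluster is done through EMR configuration classifications along the following lines. This is a hedged sketch based on the EMR configurations API; the exact properties supported depend on your EMR release, so check the release documentation before relying on these names.

```json
[
  {
    "Classification": "spark",
    "Properties": { "enableSparkRapids": "true" }
  },
  {
    "Classification": "spark-defaults",
    "Properties": {
      "spark.plugins": "com.nvidia.spark.SQLPlugin",
      "spark.rapids.sql.enabled": "true"
    }
  }
]
```

The key idea is that the RAPIDS SQL plugin is loaded via standard Spark configuration, which is why existing Spark 3.x jobs can move to GPUs without code changes.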

Amazon SageMaker

NVIDIA AI software and GPU-accelerated instances can accelerate each step of AI and machine learning workflows within Amazon SageMaker, including data preparation, model training, and inference serving. To deploy AI models into production faster and lower inference costs, Amazon SageMaker has integrated NVIDIA Triton™ Inference Server, enabling features like multi-framework support, dynamic batching, and concurrent model execution that maximize performance on both CPU and GPU instances on AWS.
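The dynamic batching and concurrent execution features mentioned above are controlled by Triton's per-model configuration file. The fragment below is a hedged sketch: the model name, platform, and batch sizes are placeholder values you would tune for your own model.

```
# Sketch of a Triton model configuration (config.pbtxt).
# Model name, platform, and sizes below are placeholders.
name: "example_model"
platform: "tensorrt_plan"
max_batch_size: 32
dynamic_batching {
  preferred_batch_size: [ 8, 16 ]
  max_queue_delay_microseconds: 100
}
# Two concurrent copies of the model on each GPU.
instance_group [
  { count: 2, kind: KIND_GPU }
]
```

`dynamic_batching` lets Triton merge incoming requests into larger batches for higher GPU utilization, while `instance_group` runs multiple model copies concurrently on one device.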

Amazon Titan

The team of experienced scientists and developers at AWS creating Amazon Titan foundation models for Amazon Bedrock, a generative AI service, uses NVIDIA NeMo™, an end-to-end, cloud-native framework for building, customizing, and deploying generative AI models anywhere.

The Elastic Fabric Adapter (EFA) from AWS provides customers with an UltraCluster networking infrastructure that can directly connect more than 10,000 GPUs, bypassing the operating system and CPU with NVIDIA GPUDirect®.

Developer Resources and Quick-Start Guides

MONAI Label Workshops

Learn how you can make use of MONAI—an open-source AI framework for healthcare—in your work. Join us for hands-on experience.

BioNeMo Now on AWS

Researchers and developers at leading pharmaceutical and techbio companies can now easily deploy NVIDIA Clara™ software and services, including NVIDIA BioNeMo™, for accelerated healthcare through AWS.

Accelerate Your Startup

Explore the program that provides cutting-edge startups around the world with critical access to go-to-market support, technical expertise, training, and funding opportunities.

AI Capabilities Using TensorRT-LLM

Creating detailed product listings once required significant time and effort from sellers; generative AI accelerated by NVIDIA TensorRT-LLM simplifies the process, giving them more time to focus on other tasks. NVIDIA TensorRT-LLM software is available today on GitHub and can be accessed through NVIDIA AI Enterprise, which offers enterprise-grade security, support, and reliability for production AI.

NVIDIA CloudXR

NVIDIA CloudXR™ is NVIDIA’s extended reality (XR) streaming technology, built on RTX and RTX Virtual Workstation software. By using CloudXR alongside Amazon NICE DCV streaming protocols, you can use on-demand compute resources for all aspects of your immersive application development.

NVIDIA Triton Inference Server in Amazon SageMaker

This blog provides an overview of NVIDIA Triton Inference Server and SageMaker, shows the benefits of using Triton Inference Server containers, and showcases how easy it is to deploy your own machine learning models. A sample notebook that supports the blog post is available for download.

NVIDIA Riva at Scale With Amazon EKS

This step-by-step guide shows you how to deploy and scale NVIDIA Riva speech skills on Amazon EKS with Traefik-based load balancing.
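As a hedged sketch of the load-balancing piece, a Traefik IngressRoute for Riva's gRPC endpoint might look like the fragment below. The Service name is a placeholder, 50051 is Riva's default gRPC port, and the `apiVersion` and entry point names depend on your Traefik version and deployment, so treat this as a starting point rather than a drop-in manifest.

```yaml
# Sketch: routing gRPC traffic to a Riva Service through Traefik.
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: riva-grpc
spec:
  entryPoints:
    - web
  routes:
    - match: PathPrefix(`/`)
      kind: Rule
      services:
        - name: riva-api      # placeholder Service name
          port: 50051         # Riva's default gRPC port
          scheme: h2c         # gRPC over cleartext HTTP/2
```

The `h2c` scheme matters because gRPC requires HTTP/2 end to end; without it, Traefik would downgrade the upstream connection and break Riva's streaming APIs.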

Amazon Music Uses SageMaker With NVIDIA to Optimize Machine Learning Training and Inference

Take a look inside the journey Amazon Music took to optimize performance and cost using SageMaker, NVIDIA Triton Inference Server, and NVIDIA TensorRT®. We show how the seemingly simple, yet intricate, search bar works, ensuring a seamless Amazon Music experience with little-to-zero typo delays and relevant real-time search results.

NVIDIA Clara Parabricks on AWS

Amazon.com, one of the most visited ecommerce websites in the world, uses an AI model that automatically corrects misspelled words in search queries to let customers more effortlessly shop. Amazon measures the success of their accelerated search results based on latency—how fast typos are corrected—and the number of successful sessions.

Access the Power of AWS and NVIDIA

Amazon EC2 P5 Instances

NVIDIA AI Enterprise

NVIDIA RTX Virtual Workstations