AI Storage Ecosystem for the Data Center
Rearchitecting inference storage for the next frontier of AI.
Overview
NVIDIA® CMX™ context memory storage is an AI‑native context tier for long‑context, multi‑turn, and agentic AI inference. Powered by the NVIDIA BlueField®‑4 storage processor, it extends GPU memory with a shared, pod‑level context tier optimized for ephemeral key-value (KV) cache. The platform provides a high‑bandwidth path that reduces latency, cost, and power overhead for large-scale inference workloads, helping deliver higher throughput and better power efficiency on NVIDIA Rubin platforms.
Products
From accelerated context memory and secure data movement to Ethernet fabrics and inference frameworks, NVIDIA CMX is the result of extreme co-design across compute, networking, storage, and software.
Product Benefits
NVIDIA CMX introduces a dedicated context tier that improves sustained throughput and power efficiency for KV‑cache-intensive, long‑context workloads compared with traditional storage approaches.
Scale AI services with a highly efficient, KV-cache-optimized storage tier that reduces storage power draw, freeing more of the data center power budget for GPUs instead of traditional storage.
Optimize data paths and reduce stalls by reusing precomputed KV cache from the CMX tier instead of recomputing it, boosting tokens per second for multi-turn, agentic inference. CMX reduces time to first token and time to last token, so answers stream sooner and finish faster, even as models, context windows, and concurrency grow.
Provide high-speed, pod-wide access to AI-native context to enable multi-turn agents to coordinate, share state, and scale seamlessly as workloads grow, while reducing duplication of KV cache and stranded capacity across nodes.
Deliver massive KV-cache capacity to support long-context reasoning, multi-agent workflows, and trillion-parameter models, sustaining long context windows for many simultaneous users.
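The core mechanism behind these benefits, reusing a precomputed KV cache rather than recomputing it, can be sketched in plain Python. This is an illustrative toy, not an NVIDIA CMX API: the `PrefixKVCache` class and `prefill` function are hypothetical names, and the cached KV tensors are stood in for by the token lists themselves. The sketch shows how a shared cache keyed by a token prefix lets a new conversation turn skip the prefill work for context already seen.

```python
import hashlib

class PrefixKVCache:
    """Illustrative shared cache: maps a token-prefix hash to its
    precomputed KV data (here, stood in for by the tokens themselves)."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(tokens):
        # Hash the token prefix to form a lookup key.
        return hashlib.sha256(str(tokens).encode("utf-8")).hexdigest()

    def put(self, tokens, kv):
        self._store[self._key(tokens)] = kv

    def get(self, tokens):
        return self._store.get(self._key(tokens))


def prefill(tokens, cache):
    """Return (kv, recomputed): KV for `tokens`, reusing the longest
    cached prefix so only the uncached suffix is recomputed."""
    for cut in range(len(tokens), 0, -1):
        hit = cache.get(tokens[:cut])
        if hit is not None:
            # Cache hit: only the suffix beyond the prefix needs prefill.
            return hit + list(tokens[cut:]), len(tokens) - cut
    # Cold start: the full sequence must be prefilled.
    return list(tokens), len(tokens)


cache = PrefixKVCache()
history = [1, 2, 3, 4]               # tokens from an earlier turn
cache.put(history, list(history))    # store that turn's "KV"
kv, recomputed = prefill(history + [5, 6], cache)
print(recomputed)                    # 2: only the new turn's tokens are prefilled
```

In a real serving stack the cached values are per-layer attention key/value tensors and the cache spans a pod rather than one process, but the saving is the same: a multi-turn request pays prefill cost only for tokens the tier has not already seen, which is what shortens time to first token.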
NVIDIA STX is a modular reference architecture for AI storage, co-designed with leading storage partners and built on NVIDIA accelerated compute, networking, and AI software. It provides the foundation for building a universal data engine that accelerates the full AI lifecycle, from training and analytics to real-time agentic inference.
Resources
Connect with the NVIDIA enterprise sales team or the right partner in the NVIDIA Partner Network (NPN) program to get started.