HPC-X ScalableHPC Toolkit

Increase Scalability and Performance of Messaging Communications

The NVIDIA Mellanox HPC-X® ScalableHPC toolkit is a comprehensive software package that includes MPI and SHMEM/PGAS communication libraries and various acceleration packages. This full-featured, tested, and packaged toolkit enables applications written with the MPI and SHMEM/PGAS programming models to achieve high performance, scalability, and efficiency, and ensures that the communication libraries are fully optimized for NVIDIA interconnect solutions.

Comprehensive Software

NVIDIA offers a comprehensive suite of software that takes advantage of NVIDIA hardware-based acceleration engines to maximize application performance. These engines reside in NVIDIA switches, which provide the SHARP acceleration engine, and in network adapters, which include the CORE-Direct engine, hardware tag matching, and more. This approach dramatically reduces the time spent in MPI operations, frees up valuable CPU resources, and decreases the amount of data traversing the network, allowing applications to scale to meet evolving performance demands.

Software and Acceleration Packages

HPC-X ScalableHPC

The NVIDIA HPC-X ScalableHPC Toolkit is a comprehensive MPI and SHMEM/PGAS software suite for high-performance computing environments. HPC-X provides enhancements that significantly increase the scalability and performance of message communications in the network.

Fabric Collective Accelerator

Fabric Collective Accelerator (FCA) technology is an MPI-integrated software package that uses CORE-Direct technology to implement MPI collective communications. FCA can be used with all major commercial and open-source MPI solutions used for high-performance applications.


Scalable Hierarchical Aggregation and Reduction Protocol

The NVIDIA Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) technology improves the performance of MPI operations by offloading them from the CPU to the switch network. This eliminates the need to send data multiple times, which decreases the amount of data traversing the network and dramatically reduces MPI operation time.
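To make the idea of hierarchical aggregation concrete, the following is a minimal toy model in plain Python. It is not the SHARP protocol or any HPC-X API; the function names, the `radix` parameter, and the tree shape are assumptions chosen for illustration. It sums per-host vectors level by level, the way a tree of aggregating switches might, and reports how many vectors reach the top of the tree.

```python
from typing import List, Tuple

Vector = List[float]


def add_vectors(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def aggregate_in_tree(host_data: List[Vector], radix: int) -> Tuple[Vector, int]:
    """Toy model of in-network reduction (hypothetical, not the SHARP API).

    Each group of up to `radix` inputs is summed by one simulated switch,
    which forwards a single partial result to the next level.
    Returns the reduced vector and the number of vectors that arrive at
    the top-level node.
    """
    if radix < 2:
        raise ValueError("radix must be at least 2")
    if not host_data:
        raise ValueError("host_data must not be empty")

    level = host_data
    # Keep collapsing levels until one switch can take all remaining inputs.
    while len(level) > radix:
        next_level = []
        for start in range(0, len(level), radix):
            group = level[start:start + radix]
            partial = group[0]
            for vec in group[1:]:
                partial = add_vectors(partial, vec)
            next_level.append(partial)  # one vector leaves each switch
        level = next_level

    arriving_at_top = len(level)
    result = level[0]
    for vec in level[1:]:
        result = add_vectors(result, vec)
    return result, arriving_at_top


if __name__ == "__main__":
    hosts = [[float(i), 1.0] for i in range(16)]
    total, arriving = aggregate_in_tree(hosts, radix=4)
    print(total, arriving)
```

In this model, 16 hosts behind radix-4 switches deliver only 4 partial vectors to the top node, whereas a reduction performed on a single host would have to receive a vector from every other host. Real switch topologies and SHARP behavior are more involved than this sketch.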

Unified Communication X

UCX is an open-source communication framework for data-centric and high-performance applications. Developed in collaboration between industry, laboratories, and academia, UCX provides a low-overhead communication path for near-native performance, with a unified, cross-platform API that supports various network host channel adapters (HCAs) and processor technologies (x86, ARM, and PowerPC).


Message Passing Interface

Message Passing Interface (MPI) is a standardized, language-independent specification for writing message-passing programs. HPC-X MPI is a high-performance implementation of Open MPI that has been optimized to take advantage of the additional Mellanox acceleration capabilities while integrating seamlessly with industry-leading commercial and open-source application software packages.


OpenSHMEM

The HPC-X OpenSHMEM programming library is a one-sided communications library that supports a unique set of parallel programming features, including point-to-point and collective routines, synchronizations, atomic operations, and a shared memory paradigm used between the processes of a parallel application.
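As a rough illustration of what one-sided communication means, here is a small single-process toy model in Python. It is not OpenSHMEM and does not use HPC-X; the class `ToySymmetricMemory` and its methods are hypothetical names that only loosely mirror the put, get, and atomic fetch-add families of routines in the OpenSHMEM specification. The point it shows is that the initiating processing element (PE) reads or writes another PE's memory without the target taking any action.

```python
from typing import List


class ToySymmetricMemory:
    """Hypothetical stand-in for a symmetric array that exists on every PE."""

    def __init__(self, num_pes: int, size: int) -> None:
        # One identically sized buffer per PE, all starting at zero.
        self.buffers: List[List[int]] = [[0] * size for _ in range(num_pes)]

    def put(self, target_pe: int, offset: int, values: List[int]) -> None:
        """Write values into the target PE's buffer; the target does nothing."""
        self.buffers[target_pe][offset:offset + len(values)] = values

    def get(self, source_pe: int, offset: int, count: int) -> List[int]:
        """Read count values from the source PE's buffer."""
        return self.buffers[source_pe][offset:offset + count]

    def atomic_fetch_add(self, target_pe: int, offset: int, value: int) -> int:
        """Add value to a remote element and return the previous contents."""
        old = self.buffers[target_pe][offset]
        self.buffers[target_pe][offset] = old + value
        return old


if __name__ == "__main__":
    num_pes = 4
    mem = ToySymmetricMemory(num_pes, size=2)
    for pe in range(num_pes):
        right = (pe + 1) % num_pes
        mem.put(right, 0, [pe])        # slot 0: id of the left neighbor
        mem.atomic_fetch_add(0, 1, 1)  # slot 1 on PE 0: shared counter
    print(mem.buffers)
```

In a real OpenSHMEM program each PE is a separate process and the runtime moves the data over the network; the toy above runs the PEs in a loop purely to show the access pattern.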

Messaging Accelerator

The NVIDIA Mellanox Messaging Accelerator (MXM) provides enhancements to parallel communication libraries by fully utilizing the underlying networking infrastructure provided by NVIDIA HCA/switch hardware.

