Modern HPC data centers are key to solving some of the world’s most important scientific and engineering challenges. NVIDIA Data Center GPUs fundamentally change the economics of the data center: they deliver breakthrough performance with dramatically fewer servers, lower power consumption, and reduced networking overhead, resulting in total cost savings of 5X-10X.

The number of CPU-only servers replaced by a single GPU-accelerated server is called the node replacement factor (NRF). To arrive at the NRF, we measure application performance on up to 8 CPU-only servers, then extrapolate linearly to estimate CPU-only performance beyond 8 servers. The NRF varies by application.
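The NRF procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions, not NVIDIA's actual methodology: the function name `node_replacement_factor`, the interpolation inside the measured range, and the proportional extrapolation past the last measured point are all assumptions, and the CPU scaling curve in the example is hypothetical rather than measured data.

```python
from bisect import bisect_left


def node_replacement_factor(cpu_throughput, gpu_throughput):
    """Estimate how many CPU-only servers match one GPU-accelerated server.

    cpu_throughput[i] is the measured throughput of (i + 1) CPU-only servers
    (higher is better, strictly increasing). gpu_throughput uses the same
    units. For time-based metrics, pass 1 / time for both arguments.
    """
    if not cpu_throughput:
        raise ValueError("need at least one CPU-only measurement")
    measured_nodes = len(cpu_throughput)
    top = cpu_throughput[-1]
    if gpu_throughput >= top:
        # Assumption: past the last measured point, CPU throughput grows
        # in proportion to the server count (linear scaling).
        return measured_nodes * gpu_throughput / top
    if gpu_throughput <= cpu_throughput[0]:
        return gpu_throughput / cpu_throughput[0]
    # Inside the measured range: interpolate between neighbouring points.
    i = bisect_left(cpu_throughput, gpu_throughput)
    lo, hi = cpu_throughput[i - 1], cpu_throughput[i]
    return i + (gpu_throughput - lo) / (hi - lo)


# Hypothetical CPU-only scaling curve for 1..8 servers (not measured data).
cpu_curve = [10.0, 19.0, 27.0, 34.0, 40.0, 45.0, 49.0, 52.0]
print(node_replacement_factor(cpu_curve, 104.0))  # extrapolated beyond 8 servers
print(node_replacement_factor(cpu_curve, 26.0))   # interpolated between 2 and 3 servers
```

If CPU scaling were perfectly linear, this estimate would reduce to the plain ratio of GPU to single-server CPU performance. When CPU scaling is sublinear, the estimated NRF comes out larger than that ratio, which is one possible reason the published NRF values do not always equal a simple division of the two measurements.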


Detailed H200 application performance data is located below in alphabetical order.

AMBER

Molecular Dynamics

Suite of programs to simulate molecular dynamics on biomolecules

VERSION

22.4-AT_23.4

ACCELERATED FEATURES

  • PMEMD Explicit Solvent and GB Implicit Solvent

SCALABILITY

Multi-GPU and Single Node

MORE INFORMATION

http://ambermd.org/GPUSupport.php

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200
AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 10.14 | 301 | 602 | 1,221 | 2,427
AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 30x | 59x | 120x | 239x
AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 10.44 | 306 | 612 | 1,246 | 2,504
AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 29x | 59x | 119x | 240x
AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 59.60 | 1,307 | 2,617 | 5,416 | 10,438
AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 22x | 44x | 91x | 175x
AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 55.29 | 1,332 | 2,666 | 5,377 | 10,684
AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 24x | 48x | 97x | 193x
AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 234.70 | 4,335 | 8,677 | 17,255 | 32,487
AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 18x | 37x | 74x | 138x
AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 232.72 | 4,533 | 8,916 | 17,733 | 33,431
AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 19x | 38x | 76x | 144x
AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 2.98 | 90 | 180 | 361 | 722
AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 30x | 61x | 121x | 242x
AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 25.62 | 203 | 406 | 812 | 1,624
AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 8x | 16x | 32x | 63x

AMBER is measured by running multiple independent instances using MPS


FUN3D

Engineering

Suite of tools actively developed at NASA for Aeronautics and Space Technology by modeling fluid flow

VERSION

14.0.1

ACCELERATED FEATURES

  • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://fun3d.larc.nasa.gov

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200
Fun3D [dpw_wbt0_crs-3.6Mn_5] | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 186 | 24 | 14 | 9 | 7
Fun3D [dpw_wbt0_crs-3.6Mn_5] | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 10x | 17x | 27x | 33x
Fun3D [waverider-5M] | Loop Time (Sec) | waverider-5M | no | 257 | 35 | 20 | 12 | 9
Fun3D [waverider-5M] | NRF | waverider-5M | yes | 1x | 11x | 20x | 33x | 43x
Fun3D [waverider-5M w/chemistry] | Loop Time (Sec) | waverider-5M w/chemistry | no | 735 | 100 | 54 | 32 | 21
Fun3D [waverider-5M w/chemistry] | NRF | waverider-5M w/chemistry | yes | 1x | 11x | 20x | 34x | 52x
Fun3D [waverider-20M] | Loop Time (Sec) | waverider-20M | no | 1,088 | - | - | 42 | 26
Fun3D [waverider-20M] | NRF | waverider-20M | yes | 1x | - | - | 35x | 56x
Fun3D [waverider-20M w/chemistry] | Loop Time (Sec) | waverider-20M w/chemistry | no | 3,117 | - | - | 123 | 72
Fun3D [waverider-20M w/chemistry] | NRF | waverider-20M w/chemistry | yes | 1x | - | - | 37x | 63x

GROMACS

Molecular Dynamics

Simulation of biochemical molecules with complicated bond interactions

VERSION

2024

ACCELERATED FEATURES

  • Implicit (5x), Explicit (2x) Solvent

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200
GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 189 | 769 | 1,581 | 2,609 | 5,233
GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 4x | 8x | 14x | 28x
GROMACS [STMV] | ns/day | STMV | yes | 14 | 44 | 73 | 123 | 182
GROMACS [STMV] | NRF | STMV | yes | 1x | 3x | 6x | 12x | 18x

GROMACS [ADH Dodec] is measured by running multiple independent instances using MPS


GTC

Physics

GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

VERSION

V4.5 updated

ACCELERATED FEATURES

  • Push, shift, and collision

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200
GTC | Mpush/Sec | mpi#proc.in | yes | 89 | 823 | 1,541 | 2,984 | 5,265
GTC | NRF | mpi#proc.in | yes | 1x | 10x | 19x | 36x | 63x

LAMMPS

Molecular Dynamics

Classical molecular dynamics package

VERSION

Stable_2Aug2023

ACCELERATED FEATURES

  • Lennard-Jones, Gay-Berne, Tersoff, many more potentials

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200
LAMMPS [LJ 2.5] | ATOM-Time Steps/s | LJ 2.5 | yes | 3.51E+08 | 1.39E+09 | 2.53E+09 | 4.52E+09 | 7.43E+09
LAMMPS [LJ 2.5] | NRF | LJ 2.5 | yes | 1x | 4x | 7x | 14x | 22x
LAMMPS [EAM] | ATOM-Time Steps/s | EAM | yes | 1.85E+08 | 5.55E+08 | 1.03E+09 | 1.86E+09 | 2.95E+09
LAMMPS [EAM] | NRF | EAM | yes | 1x | 3x | 6x | 11x | 17x
LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 1.37E+06 | 1.12E+07 | 2.01E+07 | 3.34E+07 | 5.00E+07
LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 12x | 22x | 36x | 54x
LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 5.09E+05 | 4.00E+06 | 7.95E+06 | 1.57E+07 | 3.00E+07
LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 9x | 18x | 36x | 68x
LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 1.03E+08 | 1.02E+09 | 1.79E+09 | 3.20E+09 | 5.79E+09
LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 12x | 20x | 37x | 66x

MILC

Physics

Lattice Quantum Chromodynamics (LQCD) codes simulate how elemental particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

VERSION

develop_a2f9e61

ACCELERATED FEATURES

  • Staggered fermions, Krylov solvers, Gauge-link fattening

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

https://ngc.nvidia.com/catalog/containers/hpc:milc

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200
MILC | Total Time (sec) | Apex Medium | no | 31,577 | 969 | 526 | 300 | 188
MILC | NRF | Apex Medium | yes | 1x | 29x | 53x | 94x | 149x

NAMD

Molecular Dynamics

Designed for high-performance simulation of large molecular systems

VERSION

3.b04

ACCELERATED FEATURES

  • Full electrostatics with PME and most simulation features

SCALABILITY

Up to 100M atom capable, multi-GPU, single node

MORE INFORMATION

http://www.ks.uiuc.edu/Research/namd/

https://ngc.nvidia.com/catalog/containers/hpc:namd

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200
NAMD [apoa1_npt_cuda] | ns/day | apoa1_npt_cuda | yes | 64.49 | 311 | 615 | 1,241 | 2,469
NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 5x | 10x | 19x | 38x
NAMD [apoa1_nptsr_cuda] | ns/day | apoa1_nptsr_cuda | yes | 65.19 | 321 | 635 | 1,266 | 2,525
NAMD [apoa1_nptsr_cuda] | NRF | apoa1_nptsr_cuda | yes | 1x | 5x | 10x | 19x | 39x
NAMD [apoa1_nve_cuda] | ns/day | apoa1_nve_cuda | yes | 71.14 | 395 | 794 | 1,559 | 3,018
NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 6x | 11x | 22x | 42x
NAMD [stmv_npt_cuda] | ns/day | stmv_npt_cuda | yes | 6.58 | 27 | 54 | 107 | 214
NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 4x | 8x | 16x | 32x
NAMD [stmv_nptsr_cuda] | ns/day | stmv_nptsr_cuda | yes | 6.71 | 27 | 55 | 109 | 219
NAMD [stmv_nptsr_cuda] | NRF | stmv_nptsr_cuda | yes | 1x | 4x | 8x | 16x | 33x
NAMD [stmv_nve_cuda] | ns/day | stmv_nve_cuda | yes | 6.97 | 32 | 65 | 129 | 258
NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 5x | 9x | 19x | 37x

NAMD is measured by running multiple independent instances using MPS


SPECFEM3D

Geoscience

Simulates seismic wave propagation

VERSION

devel_d2105bb

ACCELERATED FEATURES

  • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://geodynamics.org/cig/software/specfem3d/

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200
SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 386 | 38 | 21 | 12 | 9
SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 11x | 21x | 37x | 49x


Detailed GH200 96GB application performance data is located below in alphabetical order.

AMBER

Molecular Dynamics

Suite of programs to simulate molecular dynamics on biomolecules

VERSION

22.4-AT_23.4

ACCELERATED FEATURES

  • PMEMD Explicit Solvent and GB Implicit Solvent

SCALABILITY

Multi-GPU and Single Node

MORE INFORMATION

http://ambermd.org/GPUSupport.php

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB
AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 10.14 | 297 | 1,211
AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 29x | 119x
AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 10.44 | 302 | 1,259
AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 29x | 121x
AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 59.60 | 1,286 | 5,333
AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 22x | 89x
AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 55.29 | 1,313 | 5,502
AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 24x | 100x
AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 234.70 | 4,387 | 17,023
AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 19x | 73x
AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 232.72 | 4,496 | 17,315
AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 19x | 74x
AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 2.98 | 94 | 374
AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 31x | 126x
AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 25.62 | 202 | 807
AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 8x | 31x

AMBER is measured by running multiple independent instances using MPS


FUN3D

Engineering

Suite of tools actively developed at NASA for Aeronautics and Space Technology by modeling fluid flow

VERSION

14.0.1

ACCELERATED FEATURES

  • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://fun3d.larc.nasa.gov

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB
Fun3D [dpw_wbt0_crs-3.6Mn_5] | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 186 | 24 | 10
Fun3D [dpw_wbt0_crs-3.6Mn_5] | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 10x | 25x
Fun3D [waverider-5M] | Loop Time (Sec) | waverider-5M | no | 257 | 36 | 13
Fun3D [waverider-5M] | NRF | waverider-5M | yes | 1x | 11x | 30x
Fun3D [waverider-5M w/chemistry] | Loop Time (Sec) | waverider-5M w/chemistry | no | 735 | 105 | 38
Fun3D [waverider-5M w/chemistry] | NRF | waverider-5M w/chemistry | yes | 1x | 10x | 29x
Fun3D [waverider-20M] | Loop Time (Sec) | waverider-20M | no | 1,088 | - | 48
Fun3D [waverider-20M] | NRF | waverider-20M | yes | 1x | - | 30x
Fun3D [waverider-20M w/chemistry] | Loop Time (Sec) | waverider-20M w/chemistry | no | 3,117 | - | 138
Fun3D [waverider-20M w/chemistry] | NRF | waverider-20M w/chemistry | yes | 1x | - | 33x

GROMACS

Molecular Dynamics

Simulation of biochemical molecules with complicated bond interactions

VERSION

2024

ACCELERATED FEATURES

  • Implicit (5x), Explicit (2x) Solvent

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB
GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 189 | 845 | 2,865
GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 4x | 15x
GROMACS [STMV] | ns/day | STMV | yes | 14 | 48 | 126
GROMACS [STMV] | NRF | STMV | yes | 1x | 4x | 12x

GROMACS [ADH Dodec] is measured by running multiple independent instances using MPS


GTC

Physics

GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

VERSION

V4.5 updated

ACCELERATED FEATURES

  • Push, shift, and collision

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB
GTC | Mpush/Sec | mpi#proc.in | yes | 89 | 814 | 2,590
GTC | NRF | mpi#proc.in | yes | 1x | 10x | 31x

ICON

Weather and Climate

A global unified atmosphere model for numerical weather prediction and climate modeling research

VERSION

2.6.7_RC

ACCELERATED FEATURES

  • Full model of dynamics and physics

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

https://code.mpimet.mpg.de/projects/iconpublic

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB
ICON [SLAM 191 - 160KM - no radiation] | Integrate_nh (sec) | SLAM 191 levels 160 km resolution without radiation | no | 964 | 179 | 109
ICON [SLAM 191 - 160KM - no radiation] | NRF | SLAM 191 levels 160 km resolution without radiation | yes | 1x | 5x | 9x
ICON [QUBICC 160 km resolution] | Integrate_nh (sec) | QUBICC 160 km resolution | no | 798 | 172 | 95
ICON [QUBICC 160 km resolution] | NRF | QUBICC 160 km resolution | yes | 1x | 5x | 8x

LAMMPS

Molecular Dynamics

Classical molecular dynamics package

VERSION

Stable_2Aug2023

ACCELERATED FEATURES

  • Lennard-Jones, Gay-Berne, Tersoff, many more potentials

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x GH200 96GB
LAMMPS [LJ 2.5] | ATOM-Time Steps/s | LJ 2.5 | yes | 3.51E+08 | 1.52E+09
LAMMPS [LJ 2.5] | NRF | LJ 2.5 | yes | 1x | 4x
LAMMPS [EAM] | ATOM-Time Steps/s | EAM | yes | 1.85E+08 | 5.87E+08
LAMMPS [EAM] | NRF | EAM | yes | 1x | 3x
LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 1.37E+06 | 1.12E+07
LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 12x
LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 5.09E+05 | 4.01E+06
LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 9x
LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 1.03E+08 | 1.07E+09
LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 12x

MILC

Physics

Lattice Quantum Chromodynamics (LQCD) codes simulate how elemental particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

VERSION

develop_a2f9e61

ACCELERATED FEATURES

  • Staggered fermions, Krylov solvers, Gauge-link fattening

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

https://ngc.nvidia.com/catalog/containers/hpc:milc

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x GH200 96GB
MILC | Total Time (sec) | Apex Medium | no | 31,577 | 920
MILC | NRF | Apex Medium | yes | 1x | 31x

NAMD

Molecular Dynamics

Designed for high-performance simulation of large molecular systems

VERSION

3.b04

ACCELERATED FEATURES

  • Full electrostatics with PME and most simulation features

SCALABILITY

Up to 100M atom capable, multi-GPU, single node

MORE INFORMATION

http://www.ks.uiuc.edu/Research/namd/

https://ngc.nvidia.com/catalog/containers/hpc:namd

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB
NAMD [apoa1_npt_cuda] | ns/day | apoa1_npt_cuda | yes | 64.49 | 309 | 1,169
NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 5x | 18x
NAMD [apoa1_nptsr_cuda] | ns/day | apoa1_nptsr_cuda | yes | 65.19 | 321 | 1,225
NAMD [apoa1_nptsr_cuda] | NRF | apoa1_nptsr_cuda | yes | 1x | 5x | 19x
NAMD [apoa1_nve_cuda] | ns/day | apoa1_nve_cuda | yes | 71.14 | 392 | 1,424
NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 6x | 20x
NAMD [stmv_npt_cuda] | ns/day | stmv_npt_cuda | yes | 6.58 | 27 | 106
NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 4x | 16x
NAMD [stmv_nptsr_cuda] | ns/day | stmv_nptsr_cuda | yes | 6.71 | 27 | 108
NAMD [stmv_nptsr_cuda] | NRF | stmv_nptsr_cuda | yes | 1x | 4x | 16x
NAMD [stmv_nve_cuda] | ns/day | stmv_nve_cuda | yes | 6.97 | 32 | 127
NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 5x | 18x

NAMD is measured by running multiple independent instances using MPS


SPECFEM3D

Geoscience

Simulates seismic wave propagation

VERSION

devel_d2105bb

ACCELERATED FEATURES

  • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://geodynamics.org/cig/software/specfem3d/

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB
SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 386 | 41 | 12
SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 11x | 35x


Detailed H100 application performance data is located below in alphabetical order.

AMBER

Molecular Dynamics

Suite of programs to simulate molecular dynamics on biomolecules

VERSION

22.4-AT_23.4

ACCELERATED FEATURES

  • PMEMD Explicit Solvent and GB Implicit Solvent

SCALABILITY

Multi-GPU and Single Node

MORE INFORMATION

http://ambermd.org/GPUSupport.php

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM
AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 10.14 | 249 | 505 | 1,020 | 1,997 | 289 | 581 | 1,229 | 2,408
AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 25x | 50x | 101x | 197x | 28x | 57x | 121x | 238x
AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 10.44 | 253 | 513 | 1,029 | 2,113 | 292 | 593 | 1,239 | 2,747
AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 24x | 49x | 99x | 202x | 28x | 57x | 119x | 263x
AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 59.60 | 1,109 | 2,245 | 4,534 | 9,216 | 1,246 | 2,508 | 5,149 | 11,308
AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 19x | 38x | 76x | 155x | 21x | 42x | 86x | 190x
AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 55.29 | 1,132 | 2,293 | 4,632 | 9,443 | 1,292 | 2,588 | 5,263 | 11,507
AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 20x | 41x | 84x | 171x | 23x | 47x | 95x | 208x
AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 234.70 | 3,898 | 7,720 | 15,449 | 31,657 | 4,280 | 8,504 | 17,043 | 34,274
AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 17x | 33x | 66x | 135x | 18x | 36x | 73x | 146x
AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 232.72 | 3,990 | 7,956 | 15,571 | 31,360 | 4,344 | 8,831 | 17,888 | 36,412
AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 17x | 34x | 67x | 135x | 19x | 38x | 77x | 156x
AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 2.98 | 81 | 161 | 322 | 645 | 85 | 170 | 341 | 681
AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 27x | 54x | 108x | 216x | 29x | 57x | 114x | 229x
AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 25.62 | 179 | 357 | 715 | 1,430 | 195 | 391 | 781 | 1,563
AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 7x | 14x | 28x | 56x | 8x | 15x | 31x | 61x

AMBER is measured by running multiple independent instances using MPS


FUN3D

Engineering

Suite of tools actively developed at NASA for Aeronautics and Space Technology by modeling fluid flow

VERSION

14.0.1

ACCELERATED FEATURES

  • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://fun3d.larc.nasa.gov

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM
Fun3D [dpw_wbt0_crs-3.6Mn_5] | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 186 | 29 | 17 | 10 | 10 | 27 | 16 | 10 | 8
Fun3D [dpw_wbt0_crs-3.6Mn_5] | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 8x | 14x | 24x | 25x | 9x | 15x | 25x | 32x
Fun3D [waverider-5M] | Loop Time (Sec) | waverider-5M | no | 257 | 43 | 24 | 14 | 14 | 40 | 22 | 14 | 10
Fun3D [waverider-5M] | NRF | waverider-5M | yes | 1x | 9x | 17x | 29x | 28x | 10x | 18x | 30x | 42x
Fun3D [waverider-5M w/chemistry] | Loop Time (Sec) | waverider-5M w/chemistry | no | 735 | 127 | 67 | 38 | 31 | 116 | 62 | 36 | 23
Fun3D [waverider-5M w/chemistry] | NRF | waverider-5M w/chemistry | yes | 1x | 9x | 16x | 28x | 35x | 9x | 17x | 30x | 47x
Fun3D [waverider-20M] | Loop Time (Sec) | waverider-20M | no | 1,088 | - | - | 50 | 38 | - | - | 46 | 28
Fun3D [waverider-20M] | NRF | waverider-20M | yes | 1x | - | - | 29x | 38x | - | - | 31x | 52x
Fun3D [waverider-20M w/chemistry] | Loop Time (Sec) | waverider-20M w/chemistry | no | 3,117 | - | - | 151 | 98 | - | - | 140 | 80
Fun3D [waverider-20M w/chemistry] | NRF | waverider-20M w/chemistry | yes | 1x | - | - | 30x | 47x | - | - | 33x | 57x

GROMACS

Molecular Dynamics

Simulation of biochemical molecules with complicated bond interactions

VERSION

2024

ACCELERATED FEATURES

  • Implicit (5x), Explicit (2x) Solvent

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM
GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 189 | 691 | 1,321 | 2,572 | 5,117 | 776 | 1,440 | 2,596 | 5,164
GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 4x | 7x | 14x | 27x | 4x | 8x | 14x | 27x
GROMACS [STMV] | ns/day | STMV | yes | 14 | 40 | 67 | 100 | - | 43 | 72 | 120 | 177
GROMACS [STMV] | NRF | STMV | yes | 1x | 3x | 6x | 10x | - | 3x | 6x | 12x | 17x

GROMACS [ADH Dodec] is measured by running multiple independent instances using MPS


GTC

Physics

GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

VERSION

V4.5 updated

ACCELERATED FEATURES

  • Push, shift, and collision

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM
GTC | Mpush/Sec | mpi#proc.in | yes | 89 | 740 | 1,338 | 2,381 | 3,953 | 768 | 1,422 | 2,780 | 5,196
GTC | NRF | mpi#proc.in | yes | 1x | 9x | 16x | 29x | 48x | 9x | 17x | 33x | 62x

LAMMPS

Molecular Dynamics

Classical molecular dynamics package

VERSION

Stable_2Aug2023

ACCELERATED FEATURES

  • Lennard-Jones, Gay-Berne, Tersoff, many more potentials

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM
LAMMPS [LJ 2.5] | ATOM-Time Steps/s | LJ 2.5 | yes | 3.51E+08 | 1.07E+09 | 1.80E+09 | 3.36E+09 | 4.79E+09 | 1.29E+09 | 2.33E+09 | 4.19E+09 | 7.03E+09
LAMMPS [LJ 2.5] | NRF | LJ 2.5 | yes | 1x | 3x | 5x | 10x | 14x | 4x | 7x | 13x | 21x
LAMMPS [EAM] | ATOM-Time Steps/s | EAM | yes | 1.85E+08 | 4.74E+08 | - | 1.58E+09 | - | 5.19E+08 | 9.63E+08 | 1.74E+09 | 2.84E+09
LAMMPS [EAM] | NRF | EAM | yes | 1x | 3x | - | 9x | - | 3x | 5x | 10x | 16x
LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 1.37E+06 | 9.66E+06 | 1.68E+07 | 2.88E+07 | 3.94E+07 | 1.05E+07 | 1.91E+07 | 3.12E+07 | 4.75E+07
LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 11x | 18x | 31x | 43x | 11x | 21x | 34x | 52x
LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 5.09E+05 | 3.35E+06 | 6.06E+06 | 1.30E+07 | 2.55E+07 | 3.90E+06 | 7.76E+06 | 1.53E+07 | 2.91E+07
LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 9x | 14x | 29x | 58x | 9x | 18x | 35x | 66x
LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 1.03E+08 | 8.53E+08 | 1.43E+09 | 2.80E+09 | 4.18E+09 | 9.93E+08 | 1.80E+09 | 3.30E+09 | 5.53E+09
LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 10x | 16x | 32x | 48x | 11x | 21x | 38x | 63x

MILC

Physics

Lattice Quantum Chromodynamics (LQCD) codes simulate how elemental particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

VERSION

develop_a2f9e61

ACCELERATED FEATURES

  • Staggered fermions, Krylov solvers, Gauge-link fattening

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

https://ngc.nvidia.com/catalog/containers/hpc:milc

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM
MILC | Total Time (sec) | Apex Medium | no | 31,577 | 1,286 | 806 | 385 | 1,163 | 623 | 355 | 215
MILC | NRF | Apex Medium | yes | 1x | 22x | 35x | 73x | 24x | 45x | 79x | 130x

NAMD

Molecular Dynamics

Designed for high-performance simulation of large molecular systems

VERSION

3.b04

ACCELERATED FEATURES

  • Full electrostatics with PME and most simulation features

SCALABILITY

Up to 100M atom capable, multi-GPU, single node

MORE INFORMATION

http://www.ks.uiuc.edu/Research/namd/

https://ngc.nvidia.com/catalog/containers/hpc:namd

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM
NAMD [apoa1_npt_cuda] | ns/day | apoa1_npt_cuda | yes | 64.49 | 273 | 550 | 1,106 | 2,209 | 299 | 596 | 1,181 | 2,300
NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 4x | 9x | 17x | 34x | 5x | 9x | 18x | 36x
NAMD [apoa1_nptsr_cuda] | ns/day | apoa1_nptsr_cuda | yes | 65.19 | 280 | 566 | 1,136 | 2,266 | 306 | 612 | 1,212 | 2,364
NAMD [apoa1_nptsr_cuda] | NRF | apoa1_nptsr_cuda | yes | 1x | 4x | 9x | 17x | 35x | 5x | 9x | 19x | 36x
NAMD [apoa1_nve_cuda] | ns/day | apoa1_nve_cuda | yes | 71.14 | 339 | 684 | 1,377 | 2,738 | 377 | 759 | 1,490 | 2,910
NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 5x | 10x | 19x | 38x | 5x | 11x | 21x | 41x
NAMD [stmv_npt_cuda] | ns/day | stmv_npt_cuda | yes | 6.58 | 23 | 47 | 94 | 188 | 25 | 51 | 101 | 203
NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 4x | 7x | 14x | 29x | 4x | 8x | 15x | 31x
NAMD [stmv_nptsr_cuda] | ns/day | stmv_nptsr_cuda | yes | 6.71 | 24 | 48 | 96 | 193 | 26 | 52 | 104 | 208
NAMD [stmv_nptsr_cuda] | NRF | stmv_nptsr_cuda | yes | 1x | 4x | 7x | 14x | 29x | 4x | 8x | 16x | 31x
NAMD [stmv_nve_cuda] | ns/day | stmv_nve_cuda | yes | 6.97 | 27 | 55 | 110 | 222 | 31 | 61 | 123 | 245
NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 4x | 8x | 16x | 32x | 4x | 9x | 18x | 35x

NAMD is measured by running multiple independent instances using MPS


SPECFEM3D

Geoscience

Simulates seismic wave propagation

VERSION

devel_d2105bb

ACCELERATED FEATURES

  • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://geodynamics.org/cig/software/specfem3d/

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM
SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 386 | 52 | 27 | 14 | 10 | 46 | 24 | 14 | 10
SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 8x | 16x | 30x | 44x | 10x | 18x | 32x | 45x


Detailed L40S application performance data is located below in alphabetical order.

AMBER

Molecular Dynamics

Suite of programs to simulate molecular dynamics on biomolecules

VERSION

22.4-AT_23.4

ACCELERATED FEATURES

  • PMEMD Explicit Solvent and GB Implicit Solvent

SCALABILITY

Multi-GPU and Single Node

MORE INFORMATION

http://ambermd.org/GPUSupport.php

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x L40S | 2x L40S | 4x L40S | 8x L40S
AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 59.60 | 983 | 1,972 | 3,994 | 8,259
AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 16x | 33x | 67x | 139x
AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 55.29 | 1,007 | 2,009 | 4,058 | 8,532
AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 18x | 36x | 73x | 154x
AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 234.70 | 4,037 | 8,081 | 16,436 | 32,051
AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 17x | 34x | 70x | 137x
AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 232.72 | 4,065 | 8,259 | 16,728 | 33,529
AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 17x | 35x | 72x | 144x
AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 2.98 | 91 | 183 | 366 | 732
AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 31x | 61x | 123x | 246x
AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 25.62 | 191 | 381 | 763 | 1,526
AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 7x | 15x | 30x | 60x

AMBER is measured by running multiple independent instances using MPS


FUN3D

Engineering

Suite of tools actively developed at NASA for Aeronautics and Space Technology by modeling fluid flow

VERSION

14.0.1

ACCELERATED FEATURES

  • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://fun3d.larc.nasa.gov

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x L40S | 2x L40S | 4x L40S | 8x L40S
Fun3D [dpw_wbt0_crs-3.6Mn_5] | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 186 | - | 66 | 34 | 19
Fun3D [dpw_wbt0_crs-3.6Mn_5] | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | - | 4x | 7x | 13x
Fun3D [waverider-5M] | Loop Time (Sec) | waverider-5M | no | 257 | 166 | 85 | 45 | 25
Fun3D [waverider-5M] | NRF | waverider-5M | yes | 1x | 2x | 5x | 9x | 16x
Fun3D [waverider-5M w/chemistry] | Loop Time (Sec) | waverider-5M w/chemistry | no | 735 | - | 241 | 127 | 70
Fun3D [waverider-5M w/chemistry] | NRF | waverider-5M w/chemistry | yes | 1x | - | 4x | 9x | 15x
Fun3D [waverider-20M] | Loop Time (Sec) | waverider-20M | no | 1,088 | - | - | 178 | 97
Fun3D [waverider-20M] | NRF | waverider-20M | yes | 1x | - | - | 8x | 15x
Fun3D [waverider-20M w/chemistry] | Loop Time (Sec) | waverider-20M w/chemistry | no | 3,117 | - | - | - | 295
Fun3D [waverider-20M w/chemistry] | NRF | waverider-20M w/chemistry | yes | 1x | - | - | - | 15x

GROMACS

Molecular Dynamics

Simulation of biochemical molecules with complicated bond interactions

VERSION

2024

ACCELERATED FEATURES

  • Implicit (5x), Explicit (2x) Solvent

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x L40S | 2x L40S | 4x L40S | 8x L40S
GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 189 | 649 | 1,434 | 2,614 | 5,198
GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 3x | 8x | 14x | 27x
GROMACS [STMV] | ns/day | STMV | yes | 14 | 43 | 69 | 103 | -
GROMACS [STMV] | NRF | STMV | yes | 1x | 3x | 6x | 10x | -

GROMACS [ADH Dodec] is measured by running multiple independent instances using MPS


GTC

Physics

GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

VERSION

V4.5 updated

ACCELERATED FEATURES

  • Push, shift, and collision

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x L40S | 2x L40S | 4x L40S | 8x L40S
GTC | Mpush/Sec | mpi#proc.in | yes | 89 | 442 | 809 | 1,591 | 3,066
GTC | NRF | mpi#proc.in | yes | 1x | 5x | 10x | 19x | 37x

MILC

Physics

Lattice Quantum Chromodynamics (LQCD) codes simulate how elemental particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

VERSION

develop_a2f9e61

ACCELERATED FEATURES

  • Staggered fermions, Krylov solvers, Gauge-link fattening

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

https://ngc.nvidia.com/catalog/containers/hpc:milc

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x L40S | 2x L40S | 4x L40S
MILC | Total Time (sec) | Apex Medium | no | 31,577 | 4,047 | 2,051 | 1,343
MILC | NRF | Apex Medium | yes | 1x | 7x | 14x | 21x

NAMD

Molecular Dynamics

Designed for high-performance simulation of large molecular systems

VERSION

3.b04

ACCELERATED FEATURES

  • Full electrostatics with PME and most simulation features

SCALABILITY

Up to 100M atom capable, multi-GPU, single node

MORE INFORMATION

http://www.ks.uiuc.edu/Research/namd/

https://ngc.nvidia.com/catalog/containers/hpc:namd

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x L40S | 2x L40S | 4x L40S | 8x L40S
NAMD [apoa1_npt_cuda] | ns/day | apoa1_npt_cuda | yes | 64.49 | 230 | 457 | 900 | 1,816
NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 4x | 7x | 14x | 28x
NAMD [apoa1_nptsr_cuda] | ns/day | apoa1_nptsr_cuda | yes | 65.19 | 230 | 457 | 910 | 1,803
NAMD [apoa1_nptsr_cuda] | NRF | apoa1_nptsr_cuda | yes | 1x | 4x | 7x | 14x | 28x
NAMD [apoa1_nve_cuda] | ns/day | apoa1_nve_cuda | yes | 71.14 | 299 | 602 | 1,193 | 2,369
NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 4x | 8x | 17x | 33x
NAMD [stmv_npt_cuda] | ns/day | stmv_npt_cuda | yes | 6.58 | 17 | 34 | 67 | 135
NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 3x | 5x | 10x | 20x
NAMD [stmv_nptsr_cuda] | ns/day | stmv_nptsr_cuda | yes | 6.71 | 17 | 35 | 70 | 139
NAMD [stmv_nptsr_cuda] | NRF | stmv_nptsr_cuda | yes | 1x | 3x | 5x | 10x | 21x
NAMD [stmv_nve_cuda] | ns/day | stmv_nve_cuda | yes | 6.97 | 22 | 44 | 88 | 176
NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 3x | 6x | 13x | 25x

NAMD is measured by running multiple independent instances using MPS


SPECFEM3D

Geoscience

Simulates seismic wave propagation

VERSION

devel_d2105bb

ACCELERATED FEATURES

  • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://geodynamics.org/cig/software/specfem3d/

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x L40S | 2x L40S | 4x L40S | 8x L40S
SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 386 | 171 | 86 | 44 | 23
SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 2x | 4x | 10x | 19x


Detailed L4 application performance data is located below in alphabetical order.

AMBER

Molecular Dynamics

Suite of programs to simulate molecular dynamics on biomolecules

VERSION

22.4-AT_23.4

ACCELERATED FEATURES

  • PMEMD Explicit Solvent and GB Implicit Solvent

SCALABILITY

Multi-GPU and Single Node

MORE INFORMATION

http://ambermd.org/GPUSupport.php

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x L4 | 2x L4 | 4x L4 | 8x L4
AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 10.14 | 52 | 106 | 212 | 426
AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 5x | 10x | 21x | 42x
AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 10.44 | 54 | 108 | 215 | 433
AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 5x | 10x | 21x | 41x
AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 59.60 | 259 | 519 | 1,039 | 2,142
AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 4x | 9x | 17x | 36x
AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 55.29 | 265 | 533 | 1,066 | 2,132
AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 5x | 10x | 19x | 39x
AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 234.70 | 1,225 | 2,450 | 4,931 | 9,899
AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 5x | 10x | 21x | 42x
AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 232.72 | 1,241 | 2,481 | 5,018 | 10,161
AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 5x | 11x | 22x | 44x
AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 2.98 | 20 | 40 | 81 | 162
AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 7x | 14x | 27x | 54x
AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 25.62 | 114 | 227 | 455 | 910
AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 4x | 9x | 18x | 36x

AMBER is measured by running multiple independent instances using MPS


GTC

Physics

GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

VERSION

V4.5 updated

ACCELERATED FEATURES

  • Push, shift, and collision

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x L4 | 2x L4 | 4x L4 | 8x L4
GTC | Mpush/Sec | mpi#proc.in | yes | 89 | 184 | 334 | 664 | 1,234
GTC | NRF | mpi#proc.in | yes | 1x | 2x | 4x | 8x | 15x

MILC

Physics

Lattice Quantum Chromodynamics (LQCD) codes simulate how elemental particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

VERSION

develop_a2f9e61

ACCELERATED FEATURES

  • Staggered fermions, Krylov solvers, Gauge-link fattening

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

https://ngc.nvidia.com/catalog/containers/hpc:milc

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 2x L4 | 4x L4 | 8x L4
MILC | Total Time (sec) | Apex Medium | no | 31,577 | 5,875 | 3,002 | 1,587
MILC | NRF | Apex Medium | yes | 1x | 5x | 9x | 18x


Detailed A100 application performance data is located below in alphabetical order.

AMBER

Molecular Dynamics

Suite of programs to simulate molecular dynamics on biomolecules

VERSION

22.4-AT_23.4

ACCELERATED FEATURES

  • PMEMD Explicit Solvent and GB Implicit Solvent

SCALABILITY

Multi-GPU and Single Node

MORE INFORMATION

http://ambermd.org/GPUSupport.php

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 10.14 | 182 | 368 | 741 | 1,470 | 174 | 336 | 696 | 1,414
AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 18x | 36x | 73x | 145x | 17x | 33x | 69x | 139x
AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 10.44 | 186 | 370 | 750 | 1,485 | 177 | 343 | 712 | 1,460
AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 18x | 35x | 72x | 142x | 17x | 33x | 68x | 140x
AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 59.60 | 798 | 1,606 | 3,221 | 6,379 | 771 | 1,487 | 3,062 | 6,349
AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 13x | 27x | 54x | 107x | 13x | 25x | 51x | 107x
AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 55.29 | 817 | 1,630 | 3,246 | 6,559 | 790 | 1,577 | 3,181 | 6,655
AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 15x | 29x | 59x | 119x | 14x | 29x | 58x | 120x
AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 234.70 | 2,888 | 5,738 | 11,503 | 23,868 | 2,875 | 5,582 | 11,506 | 23,739
AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 12x | 24x | 49x | 102x | 12x | 24x | 49x | 101x
AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 232.72 | 2,966 | 5,877 | 11,721 | 23,515 | 2,866 | 5,912 | 11,891 | 24,101
AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 13x | 25x | 50x | 101x | 12x | 25x | 51x | 104x
AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 2.98 | 53 | 107 | 214 | 427 | 52 | 105 | 210 | 420
AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 18x | 36x | 72x | 143x | 18x | 35x | 70x | 141x
AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 25.62 | 134 | 269 | 537 | 1,074 | 135 | 270 | 539 | 1,078
AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 5x | 10x | 21x | 42x | 5x | 11x | 21x | 42x

AMBER is measured by running multiple independent instances using MPS


FUN3D

Engineering

Suite of fluid-flow modeling tools actively developed at NASA for aeronautics and space technology

VERSION

14.0.1

ACCELERATED FEATURES

  • Full range of Mach number regimes for the Reynolds-averaged Navier-Stokes (RANS) formulation

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://fun3d.larc.nasa.gov

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
Fun3D [dpw_wbt0_crs-3.6Mn_5] | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 186 | 48 | 26 | 16 | 11 | 48 | 26 | 15 | 12
Fun3D [dpw_wbt0_crs-3.6Mn_5] | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 5x | 9x | 15x | 21x | 5x | 9x | 16x | 21x
Fun3D [waverider-5M] | Loop Time (Sec) | waverider-5M | no | 257 | 72 | 40 | 23 | 16 | 72 | 39 | 22 | 16
Fun3D [waverider-5M] | NRF | waverider-5M | yes | 1x | 6x | 10x | 17x | 26x | 6x | 10x | 18x | 26x
Fun3D [waverider-5M w/chemistry] | Loop Time (Sec) | waverider-5M w/chemistry | no | 735 | 195 | 103 | 57 | 35 | 199 | 103 | 57 | 35
Fun3D [waverider-5M w/chemistry] | NRF | waverider-5M w/chemistry | yes | 1x | 6x | 11x | 19x | 31x | 5x | 10x | 19x | 31x
Fun3D [waverider-20M] | Loop Time (Sec) | waverider-20M | no | 1,088 | - | - | 81 | 46 | - | - | 81 | 47
Fun3D [waverider-20M] | NRF | waverider-20M | yes | 1x | - | - | 18x | 31x | - | - | 18x | 31x
Fun3D [waverider-20M w/chemistry] | Loop Time (Sec) | waverider-20M w/chemistry | no | 3,117 | - | - | 228 | 124 | - | - | - | 128
Fun3D [waverider-20M w/chemistry] | NRF | waverider-20M w/chemistry | yes | 1x | - | - | 20x | 37x | - | - | - | 36x

GROMACS

Molecular Dynamics

Simulation of biochemical molecules with complicated bond interactions

VERSION

2024

ACCELERATED FEATURES

  • Implicit (5x), Explicit (2x) Solvent

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 189 | 453 | 933 | 1,672 | - | 480 | 883 | 1,351 | 3,263
GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 2x | 5x | 9x | - | 3x | 5x | 7x | 17x
GROMACS [STMV] | ns/day | STMV | yes | 14 | 24 | 45 | 81 | 130 | 23 | 41 | 66 | -
GROMACS [STMV] | NRF | STMV | yes | 1x | 2x | 3x | 7x | 12x | 2x | 3x | 6x | -

GROMACS [ADH Dodec] is measured by running multiple independent instances using MPS


GTC

Physics

GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

VERSION

V4.5 updated

ACCELERATED FEATURES

  • Push, shift, and collision

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
GTC | Mpush/Sec | mpi#proc.in | yes | 89 | 489 | 906 | 1,787 | 3,327 | 490 | 888 | 1,726 | 2,796
GTC | NRF | mpi#proc.in | yes | 1x | 6x | 11x | 21x | 40x | 6x | 11x | 21x | 34x

LAMMPS

Molecular Dynamics

Classical molecular dynamics package

VERSION

Stable_2Aug2023

ACCELERATED FEATURES

  • Lennard-Jones, Gay-Berne, Tersoff, many more potentials

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
LAMMPS [LJ 2.5] | ATOM-Time Steps/s | LJ 2.5 | yes | 3.51E+08 | 6.96E+08 | 1.30E+09 | 2.34E+09 | 4.05E+09 | 6.65E+08 | 1.21E+09 | 1.99E+09 | -
LAMMPS [LJ 2.5] | NRF | LJ 2.5 | yes | 1x | 2x | 4x | 7x | 12x | 2x | 3x | 6x | -
LAMMPS [EAM] | ATOM-Time Steps/s | EAM | yes | 1.85E+08 | 3.01E+08 | 5.62E+08 | 1.01E+09 | 1.67E+09 | 2.92E+08 | 5.41E+08 | 9.28E+08 | -
LAMMPS [EAM] | NRF | EAM | yes | 1x | 2x | 3x | 6x | 10x | 2x | 3x | 5x | -
LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 1.37E+06 | 5.66E+06 | 1.05E+07 | 1.76E+07 | 2.76E+07 | 5.70E+06 | 1.01E+07 | 1.70E+07 | 1.95E+07
LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 5x | 11x | 19x | 30x | 5x | 11x | 18x | 21x
LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 5.09E+05 | 2.21E+06 | 4.40E+06 | 8.73E+06 | 1.66E+07 | 2.08E+06 | 4.21E+06 | 8.21E+06 | 1.58E+07
LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 6x | 10x | 20x | 38x | 5x | 10x | 19x | 36x
LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 1.03E+08 | 5.49E+08 | 1.02E+09 | 1.83E+09 | 3.16E+09 | 5.29E+08 | 9.28E+08 | 1.51E+09 | 1.58E+09
LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 6x | 12x | 21x | 36x | 5x | 11x | 17x | 18x

MILC

Physics

Lattice Quantum Chromodynamics (LQCD) codes simulate how elementary particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

VERSION

develop_a2f9e61

ACCELERATED FEATURES

  • Staggered fermions, Krylov solvers, Gauge-link fattening

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

https://ngc.nvidia.com/catalog/containers/hpc:milc

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB
MILC | Total Time (sec) | Apex Medium | no | 31,577 | 2,035 | 1,188 | 625 | 358 | 2,090 | 1,119 | 612
MILC | NRF | Apex Medium | yes | 1x | 14x | 24x | 45x | 78x | 13x | 25x | 46x

NAMD

Molecular Dynamics

Designed for high-performance simulation of large molecular systems

VERSION

3.b04

ACCELERATED FEATURES

  • Full electrostatics with PME and most simulation features

SCALABILITY

Up to 100M atom capable, multi-GPU, single node

MORE INFORMATION

http://www.ks.uiuc.edu/Research/namd/

https://ngc.nvidia.com/catalog/containers/hpc:namd

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
NAMD [apoa1_npt_cuda] | ns/day | apoa1_npt_cuda | yes | 64.49 | 176 | 350 | 687 | 1,381 | 173 | 340 | 687 | 1,370
NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 3x | 5x | 11x | 21x | 3x | 5x | 11x | 21x
NAMD [apoa1_nptsr_cuda] | ns/day | apoa1_nptsr_cuda | yes | 65.19 | 180 | 358 | 714 | 1,421 | 178 | 350 | 704 | 1,409
NAMD [apoa1_nptsr_cuda] | NRF | apoa1_nptsr_cuda | yes | 1x | 3x | 5x | 11x | 22x | 3x | 5x | 11x | 22x
NAMD [apoa1_nve_cuda] | ns/day | apoa1_nve_cuda | yes | 71.14 | 220 | 436 | 867 | 1,734 | 216 | 424 | 850 | 1,705
NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 3x | 6x | 12x | 24x | 3x | 6x | 12x | 24x
NAMD [stmv_npt_cuda] | ns/day | stmv_npt_cuda | yes | 6.58 | 15 | 29 | 58 | 116 | 14 | 28 | 57 | 114
NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 2x | 4x | 9x | 18x | 2x | 4x | 9x | 17x
NAMD [stmv_nptsr_cuda] | ns/day | stmv_nptsr_cuda | yes | 6.71 | 15 | 30 | 59 | 119 | 15 | 29 | 58 | 117
NAMD [stmv_nptsr_cuda] | NRF | stmv_nptsr_cuda | yes | 1x | 2x | 4x | 9x | 18x | 2x | 4x | 9x | 17x
NAMD [stmv_nve_cuda] | ns/day | stmv_nve_cuda | yes | 6.97 | 17 | 34 | 69 | 137 | 17 | 33 | 67 | 135
NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 2x | 5x | 10x | 20x | 2x | 5x | 10x | 19x

NAMD is measured by running multiple independent instances using MPS


SPECFEM3D

Geoscience

Simulates seismic wave propagation

VERSION

devel_d2105bb

ACCELERATED FEATURES

  • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://geodynamics.org/cig/software/specfem3d/

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 386 | 77 | 40 | 21 | 13 | 79 | 41 | 22 | 15
SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 4x | 11x | 20x | 33x | 4x | 11x | 20x | 30x


Detailed A30 application performance data is located below in alphabetical order.

AMBER

Molecular Dynamics

Suite of programs to simulate molecular dynamics on biomolecules

VERSION

22.4-AT_23.4

ACCELERATED FEATURES

  • PMEMD Explicit Solvent and GB Implicit Solvent

SCALABILITY

Multi-GPU and Single Node

MORE INFORMATION

http://ambermd.org/GPUSupport.php

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 10.14 | 89 | 180 | 356 | 732
AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 9x | 18x | 35x | 72x
AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 10.44 | 91 | 183 | 365 | 743
AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 9x | 18x | 35x | 71x
AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 59.60 | 408 | 822 | 1,625 | 3,334
AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 7x | 14x | 27x | 56x
AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 55.29 | 415 | 841 | 1,669 | 3,401
AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 8x | 15x | 30x | 62x
AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 234.70 | 1,514 | 3,005 | 5,974 | 12,461
AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 6x | 13x | 25x | 53x
AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 232.72 | 1,540 | 3,053 | 6,171 | 12,483
AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 7x | 13x | 27x | 54x
AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 2.98 | 29 | 58 | 116 | 231
AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 10x | 19x | 39x | 78x
AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 25.62 | 97 | 194 | 388 | 775
AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 4x | 8x | 15x | 30x

AMBER is measured by running multiple independent instances using MPS


FUN3D

Engineering

Suite of fluid-flow modeling tools actively developed at NASA for aeronautics and space technology

VERSION

14.0.1

ACCELERATED FEATURES

  • Full range of Mach number regimes for the Reynolds-averaged Navier-Stokes (RANS) formulation

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://fun3d.larc.nasa.gov

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
Fun3D [dpw_wbt0_crs-3.6Mn_5] | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 186 | 97 | 49 | 26 | 17
Fun3D [dpw_wbt0_crs-3.6Mn_5] | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 2x | 5x | 9x | 14x
Fun3D [waverider-5M] | Loop Time (Sec) | waverider-5M | no | 257 | 142 | 73 | 39 | 24
Fun3D [waverider-5M] | NRF | waverider-5M | yes | 1x | 2x | 6x | 10x | 17x
Fun3D [waverider-5M w/chemistry] | Loop Time (Sec) | waverider-5M w/chemistry | no | 735 | - | 201 | 106 | 59
Fun3D [waverider-5M w/chemistry] | NRF | waverider-5M w/chemistry | yes | 1x | - | 5x | 10x | 18x
Fun3D [waverider-20M] | Loop Time (Sec) | waverider-20M | no | 1,088 | - | - | 152 | 83
Fun3D [waverider-20M] | NRF | waverider-20M | yes | 1x | - | - | 10x | 18x
Fun3D [waverider-20M w/chemistry] | Loop Time (Sec) | waverider-20M w/chemistry | no | 3,117 | - | - | - | 235
Fun3D [waverider-20M w/chemistry] | NRF | waverider-20M w/chemistry | yes | 1x | - | - | - | 19x

GTC

Physics

GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

VERSION

V4.5 updated

ACCELERATED FEATURES

  • Push, shift, and collision

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
GTC | Mpush/Sec | mpi#proc.in | yes | 89 | 287 | 533 | 1,050 | 1,872
GTC | NRF | mpi#proc.in | yes | 1x | 3x | 6x | 13x | 22x

LAMMPS

Molecular Dynamics

Classical molecular dynamics package

VERSION

Stable_2Aug2023

ACCELERATED FEATURES

  • Lennard-Jones, Gay-Berne, Tersoff, many more potentials

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 1.37E+06 | 3.09E+06 | 5.86E+06 | 1.04E+07 | 1.40E+07
LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 3x | 5x | 11x | 15x
LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 5.09E+05 | 1.12E+06 | 2.23E+06 | 4.43E+06 | 8.55E+06
LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 2x | 6x | 10x | 19x
LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 1.03E+08 | 2.67E+08 | 4.97E+08 | 8.58E+08 | 1.10E+09
LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 3x | 5x | 10x | 13x

MILC

Physics

Lattice Quantum Chromodynamics (LQCD) codes simulate how elementary particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

VERSION

develop_a2f9e61

ACCELERATED FEATURES

  • Staggered fermions, Krylov solvers, Gauge-link fattening

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

https://ngc.nvidia.com/catalog/containers/hpc:milc

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
MILC | Total Time (sec) | Apex Medium | no | 31,577 | 4,701 | 2,030 | 1,084 | 713
MILC | NRF | Apex Medium | yes | 1x | 6x | 14x | 26x | 39x

NAMD

Molecular Dynamics

Designed for high-performance simulation of large molecular systems

VERSION

3.b04

ACCELERATED FEATURES

  • Full electrostatics with PME and most simulation features

SCALABILITY

Up to 100M atom capable, multi-GPU, single node

MORE INFORMATION

http://www.ks.uiuc.edu/Research/namd/

https://ngc.nvidia.com/catalog/containers/hpc:namd

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
NAMD [apoa1_npt_cuda] | ns/day | apoa1_npt_cuda | yes | 64.49 | - | 183 | 367 | 728
NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | - | 3x | 6x | 11x
NAMD [apoa1_nptsr_cuda] | ns/day | apoa1_nptsr_cuda | yes | 65.19 | - | 188 | 376 | 746
NAMD [apoa1_nptsr_cuda] | NRF | apoa1_nptsr_cuda | yes | 1x | - | 3x | 6x | 11x
NAMD [apoa1_nve_cuda] | ns/day | apoa1_nve_cuda | yes | 71.14 | 111 | 222 | 445 | 886
NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 2x | 3x | 6x | 12x
NAMD [stmv_nve_cuda] | ns/day | stmv_nve_cuda | yes | 6.97 | - | - | 34 | 69
NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | - | - | 5x | 10x

NAMD is measured by running multiple independent instances using MPS


SPECFEM3D

Geoscience

Simulates seismic wave propagation

VERSION

devel_d2105bb

ACCELERATED FEATURES

  • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://geodynamics.org/cig/software/specfem3d/

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 386 | 159 | 81 | 42 | 23
SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 2x | 4x | 10x | 19x


Detailed A40 application performance data is located below in alphabetical order.

AMBER

Molecular Dynamics

Suite of programs to simulate molecular dynamics on biomolecules

VERSION

22.4-AT_23.4

ACCELERATED FEATURES

  • PMEMD Explicit Solvent and GB Implicit Solvent

SCALABILITY

Multi-GPU and Single Node

MORE INFORMATION

http://ambermd.org/GPUSupport.php

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 10.14 | 97 | 197 | 397 | 819
AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 10x | 19x | 39x | 81x
AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 10.44 | 99 | 200 | 403 | 839
AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 9x | 19x | 39x | 80x
AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 59.60 | 491 | 996 | 1,988 | 4,028
AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 8x | 17x | 33x | 68x
AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 55.29 | 503 | 1,014 | 2,040 | 4,248
AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 9x | 18x | 37x | 77x
AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 234.70 | 1,925 | 3,884 | 7,769 | 16,230
AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 8x | 17x | 33x | 69x
AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 232.72 | 1,952 | 3,977 | 8,037 | 16,580
AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 8x | 17x | 35x | 71x
AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 2.98 | 32 | 63 | 127 | 254
AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 11x | 21x | 43x | 85x
AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 25.62 | 119 | 238 | 475 | 950
AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 5x | 9x | 19x | 37x

AMBER is measured by running multiple independent instances using MPS


GROMACS

Molecular Dynamics

Simulation of biochemical molecules with complicated bond interactions

VERSION

2024

ACCELERATED FEATURES

  • Implicit (5x), Explicit (2x) Solvent

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 189 | 314 | 625 | 1,113 | 2,534
GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 2x | 3x | 6x | 13x

GROMACS [ADH Dodec] is measured by running multiple independent instances using MPS


GTC

Physics

GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

VERSION

V4.5 updated

ACCELERATED FEATURES

  • Push, shift, and collision

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
GTC | Mpush/Sec | mpi#proc.in | yes | 89 | 303 | 553 | 1,088 | 1,926
GTC | NRF | mpi#proc.in | yes | 1x | 3x | 7x | 13x | 23x

MILC

Physics

Lattice Quantum Chromodynamics (LQCD) codes simulate how elementary particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

VERSION

develop_a2f9e61

ACCELERATED FEATURES

  • Staggered fermions, Krylov solvers, Gauge-link fattening

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

https://ngc.nvidia.com/catalog/containers/hpc:milc

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
MILC | Total Time (sec) | Apex Medium | no | 31,577 | 6,005 | 3,094 | 1,701 | 1,034
MILC | NRF | Apex Medium | yes | 1x | 5x | 9x | 17x | 27x

NAMD

Molecular Dynamics

Designed for high-performance simulation of large molecular systems

VERSION

3.b04

ACCELERATED FEATURES

  • Full electrostatics with PME and most simulation features

SCALABILITY

Up to 100M atom capable, multi-GPU, single node

MORE INFORMATION

http://www.ks.uiuc.edu/Research/namd/

https://ngc.nvidia.com/catalog/containers/hpc:namd

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
NAMD [apoa1_npt_cuda] | ns/day | apoa1_npt_cuda | yes | 64.49 | 105 | 211 | 423 | 845
NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 2x | 3x | 7x | 13x
NAMD [apoa1_nptsr_cuda] | ns/day | apoa1_nptsr_cuda | yes | 65.19 | 109 | 221 | 441 | 885
NAMD [apoa1_nptsr_cuda] | NRF | apoa1_nptsr_cuda | yes | 1x | 2x | 3x | 7x | 14x
NAMD [apoa1_nve_cuda] | ns/day | apoa1_nve_cuda | yes | 71.14 | 146 | 295 | 593 | 1,187
NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 2x | 4x | 8x | 17x
NAMD [stmv_nve_cuda] | ns/day | stmv_nve_cuda | yes | 6.97 | 11 | 21 | 42 | 85
NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 2x | 3x | 6x | 12x

NAMD is measured by running multiple independent instances using MPS


SPECFEM3D

Geoscience

Simulates seismic wave propagation

VERSION

devel_d2105bb

ACCELERATED FEATURES

  • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://geodynamics.org/cig/software/specfem3d/

Application | Metric | Test Modules | Bigger is better | Dual Sapphire Rapids 8480+ (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 386 | 203 | 103 | 53 | 34
SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 2x | 3x | 8x | 13x