GPU-Accelerated Abaqus

Get started today with this GPU Ready Apps Guide.

Abaqus Standard

Abaqus/Standard employs solution technology ideal for static and low-speed dynamic events where highly accurate stress solutions are critically important. Sealing pressure in a gasket joint, steady-state rolling of a tire, and crack propagation in a composite airplane fuselage are a few use cases. Abaqus makes it possible to analyze a model both in the time and frequency domain within a single simulation.

Abaqus/Standard runs up to 3.7X faster on NVIDIA GPU accelerated systems compared to CPU-only systems, enabling users to run more finite element simulations in a workday and increasing productivity.

Read the white paper on Accelerating Abaqus Simulations Using NVIDIA GPUs, and see how NVIDIA GPUs accelerate Dassault Systèmes SIMULIA's Abaqus/Standard FEA solver here.

Abaqus Runs Up To 3.7X Faster On GPUs

Installation

Download and Installation Instructions

Abaqus/Standard (version R2017x) can be downloaded from the 3DS SIMULIA website, and brief installation instructions are given below:

1. Download the "tar" files and extract the contents of all of these tar archives into a single directory. This should produce a single subdirectory named "AM_SIM_Abaqus_Extend.AllOS" containing the installation software.

2. Run the installer shell script and optionally install the documentation, license, and remaining applications in the proper order as described in the Installation Guide (a brief command-line sketch is shown below).
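A minimal command-line sketch of these two steps on Linux is shown below. The archive names and the installer script path are placeholders that vary by release; the extracted R2017x media typically exposes a StartTUI.sh (text mode) or StartGUI.sh (graphical) launcher:

# Extract all downloaded tar archives into a single directory
mkdir abaqus_2017_media
for f in 2017.AM_SIM_Abaqus_Extend.AllOS.*.tar; do tar -xvf "$f" -C abaqus_2017_media; done
# Launch the installer and follow the order given in the Installation Guide
cd abaqus_2017_media/AM_SIM_Abaqus_Extend.AllOS/1
./StartTUI.sh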

Abaqus Licensing Tokens — Cost Savings

Abaqus licensing is token-based, giving users the flexibility to run a variety of Abaqus analyses and to use Abaqus/CAE for building simulation models or viewing results. The number of tokens is calculated from the number of CPU cores used for the simulation run. Abaqus uses the following decaying function to determine the number of tokens, where N is the number of CPU cores.
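The commonly published form of this relationship, which is consistent with the 8-core and 16-core token counts discussed below, is:

Number of tokens = INT(5 × N^0.422)

For example, INT(5 × 8^0.422) = 12 tokens and INT(5 × 16^0.422) = 16 tokens.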

The graph below illustrates this equation, showing how the number of Abaqus licensing tokens increases as the number of cores increases.

Figure: Abaqus licensing tokens versus the number of cores (CPU-only and CPU+GPU).

The blue curve represents the CPU-only case, while the green curve represents the CPU-plus-GPU case. The staircase pattern in these curves shows how the increase in the required number of tokens decays as the number of CPU cores increases. When a GPU is included in the simulation, it is counted as a single CPU core for the purposes of calculating the number of required tokens. This way of counting a GPU is represented by the green staircase-patterned curve, with the corresponding CPU+GPU core count shown on the secondary X-axis at the top.

The cost benefit of using GPUs for simulations is illustrated by the two sets of computing configurations indicated by the dotted lines in the figure:

  • The first dotted line shown at the 8-core mark on the primary X-axis indicates that for 8 CPU cores, 12 tokens are required. If a GPU is included in the simulation run, the CPU core count is 9 but the number of tokens remains at 12, as shown by the single red dot.
  • The second dotted line shown at the 16-core mark on the primary X-axis indicates that for 16 CPU cores, 16 tokens are required. Adding 1 or 2 GPUs to 16 CPU cores increases the CPU core count to 17 or 18 respectively, but the number of tokens would remain at 16, as shown by the pair of red dots.

The staircase pattern shows wider steps with increasing cores for both curves, highlighting that adding cores becomes progressively more cost-effective. The benefit is even greater when GPUs are used instead of additional CPU cores.

Running Jobs

To run an Abaqus simulation on GPUs, the -gpus flag must be included in the command line. From release 6.14 onwards, hybrid DMP/SMP execution can be combined with GPU acceleration by adding the -threads flag or the -mp_host_split flag along with the -gpus flag.

$Abaqus_2017 -interactive -j $job_name -inp $input_file_name -cpus $no_of_cpu_cores -gpus $no_of_gpus_per_dmp -mp_host_split $no_of_dmp_per_node >& $output_file

Flags and Functions

1. -cpus: specifies the number of CPU cores for the job.

2. -gpus: specifies the number of GPUs per DMP process.

3. -mp_host_split: specifies the number of DMP processes per node.

4. -threads: can be used instead of the -mp_host_split flag and specifies the number of threads per DMP process (see the alternative example below).
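As a sketch of the -threads form, assuming a node with 16 CPU cores, the command below uses 8 threads per DMP process, which results in two DMP processes (the job, input, and output names are placeholders as above):

$Abaqus_2017 -interactive -j $job_name -inp $input_file_name -cpus 16 -gpus 1 -threads 8 >& $output_file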

From Abaqus version 2016 onwards, it is no longer necessary to set the GPUs in exclusive mode. However, it is good practice to check whether the GPUs are over-subscribed when multiple Abaqus jobs are running. If so, set the GPUs in exclusive mode so that the DMP processes go to separate GPUs. The GPUs are set in exclusive mode by running the following nvidia-smi command:

$nvidia-smi -c 3
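Before doing so, a quick way to check whether the GPUs are over-subscribed is to query their utilization and memory use with nvidia-smi, for example:

$nvidia-smi --query-gpu=index,name,utilization.gpu,memory.used --format=csv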

With a two-CPU-socket machine, create two DMP processes and use two GPUs, one GPU for each DMP process.
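A hypothetical command for such a machine with 16 CPU cores and two GPUs (job, input, and output file names are placeholders) would therefore be:

$Abaqus_2017 -interactive -j $job_name -inp $input_file_name -cpus 16 -gpus 1 -mp_host_split 2 >& $output_file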

In addition, place a local abaqus_v6.env file with the following contents in the project/run directory to override defaults and specify additional settings for improved performance.

# Overwrite files without prompting
ask_delete=OFF
# Modify the host list based on the number of compute nodes used
# and specify the CPU cores per node accordingly
# Set MPI CPU affinity mode at socket granularity
mp_mpirun_options = "-prot -aff:automatic:bandwidth:socket"
import os
os.environ['ABA_SRM_BALANCED']='ON'
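For multi-node runs, the host list referred to in the comment above is typically specified with the mp_host_list environment parameter; a hypothetical two-node example with 16 cores per node would look like:

mp_host_list=[['node1', 16], ['node2', 16]]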

Benchmarks

Each Tesla P100 card has a single Pascal GPU, and each Tesla K80 card has two Kepler GPUs. The runs were performed using variations of the command lines and environment settings for the different Abaqus versions. See the white paper on Abaqus computing with NVIDIA GPUs for the environment settings and other configurations.

Abaqus/Standard 2017 performance on CPU and NVIDIA GPU systems

Recommended System Configurations

Hardware Configuration

Workstation

  • CPU Architecture: x86
  • System Memory: 48 GB or more
  • Disk: Minimum 500 GB
  • CPUs: 2 CPU sockets (8+ cores, 2+ GHz)
  • GPU Model: Quadro GP100 (for double-precision compute)
  • GPUs: 1

Servers

  • CPU Architecture: x86
  • System Memory: 96-192 GB
  • CPUs/Node: 2 (8+ cores, 2+ GHz)
  • Total # of Nodes: 1-10+
  • GPU Model: Tesla P100
  • GPUs/Node: 1
  • Interconnect: InfiniBand

Build Your Ideal GPU Solution Today.