Google Cloud Platform

Run Parabricks on Google Cloud Platform (GCP)


Overview

  • Parabricks provides a VM that can be easily run on GCP.
  • This VM comes pre-installed with the Parabricks genomic pipeline and algorithms.
  • By the end of this tutorial, you will have a running instance of Parabricks deployed on GCP and should be able to SSH into the instance to run genomic analyses (for example, the germline pipeline).
  • This installation method works well for customers who want to run one or more servers on GCP with Parabricks installed. For advanced use cases, you also have the option of installing the software yourself on local servers (see the Server Installation section of the installation guide).

Obtain a license key

Running Parabricks on GCP requires a valid software license. Free trials are available so you can test the software without incurring licensing fees from Parabricks. You can contact support@parabricks.com if you'd like to request a license for a software pilot. (Note: GCP will charge its own fees for any computing resources that you consume.)


Set Required GCP Quotas

By default, GCP Compute Engine has resource quotas in place. Parabricks requires GPUs and a large number of CPUs per node, which the default quotas do not allow. To run Parabricks, you will need to increase the quotas listed below.

Quotas for regular (non-preemptible) Virtual Machine instances:

Quota Name          Minimum Per Parabricks Node
------------------  ---------------------------------------------------
NVIDIA V100 GPUs    4-8 when using V100 GPUs
NVIDIA P100 GPUs    4 when using P100 GPUs
CPUs                48 for nodes with 4 GPUs; 80 for nodes with 8 GPUs
Local SSD (GB)      (Optional) If you attach a data volume to the App in Google Marketplace, it must be smaller than this quota.


Quotas for preemptible Virtual Machine instances: The quotas for preemptible instances are the same as the ones above except that GCP adds the word "Preemptible" in front of them. For example, "CPUs" becomes "Preemptible CPUs" and "NVIDIA V100 GPUs" becomes "Preemptible NVIDIA V100 GPUs".
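The quota requirements above can be checked from a terminal. The following is a minimal sketch, not an official Parabricks tool: `min_cpu_quota` and `show_region_quotas` are hypothetical helper names, and the region in the usage comment is only a placeholder. It assumes `gcloud` is installed and authenticated for your project.

```shell
#!/bin/sh
# Hypothetical helper: minimum CPU quota for one Parabricks node,
# using the figures listed above (48 CPUs for 4 GPUs, 80 CPUs for 8 GPUs).
min_cpu_quota() {
    case "$1" in
        4) echo 48 ;;
        8) echo 80 ;;
        *) echo "unsupported GPU count: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: list each quota metric with its limit and current
# usage for one region, so you can compare them with the minimums above.
show_region_quotas() {
    gcloud compute regions describe "$1" \
        --flatten="quotas[]" \
        --format="table(quotas.metric,quotas.limit,quotas.usage)"
}

# Example usage (the region is a placeholder, pick your own):
#   min_cpu_quota 8
#   show_region_quotas us-central1
```

If a limit is below the minimum for the node size you plan to use, request an increase before deploying.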


Run a VM with Parabricks on GCP

Permissions Required. To deploy Parabricks to GCP using the Marketplace App, you will need permission to: create a new VM (including a new IP address and firewall rule), create deployments using Deployment Manager, access the GCP Marketplace, connect to your instance via SSH, and (possibly) increase quota limits so that the deployment does not exceed your project's quotas.


Step 1: Visit your GCP Console and choose the project in which you want to launch Parabricks. Find the Marketplace and click on it.


Step 2: Search for the "Parabricks" application in the Marketplace and select it. Choose "Launch".

Step 3: Verify that you are satisfied with the virtual machine settings and choose "Deploy". (The default options are suitable for many use cases, but most options are configurable in case you wish to change them to suit your needs.)

Step 4: The virtual machine may take 5-15 minutes to start up. Once the instance has started, you should be able to SSH into it to run genomic analyses using Parabricks.
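Rather than refreshing the Console while the instance starts, you can poll its status from a terminal. This is a minimal sketch under stated assumptions: `get_status` and `wait_for_running` are hypothetical helper names, the defaults of 30 attempts and 30 seconds are arbitrary, and `gcloud` is assumed to be installed and authenticated. A status of RUNNING only means the VM has booted; the deployment may still need a little longer before SSH works.

```shell
#!/bin/sh
# Hypothetical helper: print the instance status (for example RUNNING).
get_status() {
    gcloud compute instances describe "$1" --zone "$2" --format="value(status)"
}

# Hypothetical helper: poll until the instance reports RUNNING.
# Arguments: instance name, zone, max attempts (default 30),
# seconds to wait between attempts (default 30).
wait_for_running() {
    name=$1
    zone=$2
    attempts=${3:-30}
    delay=${4:-30}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if [ "$(get_status "$name" "$zone")" = "RUNNING" ]; then
            echo "$name is RUNNING"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "timed out waiting for $name" >&2
    return 1
}

# Example usage (placeholders, as in the rest of this guide):
#   wait_for_running <name-of-your-instance> <zone-of-your-instance>
```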


SSH into your instance & install your license

Copy your local license file up to the virtual machine, saving it as license.bin so that the name matches the copy step further down:

$ gcloud compute scp ~/your-parabricks-license.bin \
<your-instance>:~/license.bin

You can SSH into your instance directly from the GCP Console or from your local terminal (assuming you have gcloud installed and have permission to SSH into the instance):

$ gcloud compute --project <your-project> ssh --zone <zone-of-your-instance> \
<name-of-your-instance>

Copy your license from your home folder to the installation location:

$ sudo cp ~/license.bin /opt/parabricks

Now you are ready to run Parabricks on Google Cloud Platform.


Verifying Parabricks installation on your instance

# Verify your installation.
# This should display the Parabricks version number:
$ pbrun version
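If the version command fails, a quick check of the two prerequisites described above can help narrow down the cause. This is a minimal sketch, not part of Parabricks: `check_parabricks_install` is a hypothetical helper, and it assumes the license ends up at /opt/parabricks/license.bin (as the copy step above implies) and that `pbrun` is on your PATH.

```shell
#!/bin/sh
# Hypothetical helper: confirm the license file is in place and that
# pbrun can be found. The install directory defaults to /opt/parabricks
# (assumed from the license copy step above).
check_parabricks_install() {
    install_dir=${1:-/opt/parabricks}
    if [ ! -f "$install_dir/license.bin" ]; then
        echo "missing license: $install_dir/license.bin" >&2
        return 1
    fi
    if ! command -v pbrun >/dev/null 2>&1; then
        echo "pbrun not found on PATH" >&2
        return 1
    fi
    echo "ok"
}

# Example usage, run on the instance:
#   check_parabricks_install
```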