Sequanta Technologies

Healthcare and Life Sciences

Sequanta Technologies Accelerates Methylation Alignment 21x With NVIDIA Parabricks

Objective

Sequanta Technologies, one of the leading multi-omics and sequencing service providers in China, uses NVIDIA® Parabricks® to accelerate multi-omics analysis. With Parabricks, Sequanta Technologies reduced whole-genome sequencing (WGS) analysis time from about 7 hours on its CPU-accelerated pipeline to 31 minutes and 5 seconds, about 13.5x faster. For methylation alignment, Parabricks BWA-Meth ran 21x faster than the company’s traditional alignment method.

Customer

Sequanta Technologies

Use Case

Accelerated Computing Tools & Techniques
Data Science

Products

NVIDIA Parabricks
NVIDIA RAPIDS
NVIDIA T4

About Sequanta Technologies

By introducing cutting-edge multi-omics technologies to the China market and serving a wide commercial user base, Sequanta Technologies is one of the leading omics service providers in China. Sequanta Technologies provides next-generation sequencing (NGS) and multi-omics services, generating more than 1.5 petabytes of data every month from its distributed sequencing labs. As a result, an immense amount of valuable data is produced for downstream analysis.

Sequanta Technologies is at the forefront of sequencing technology supporting genomics, transcriptomics, proteomics, microbiomics, and multi-omics—revolutionizing life sciences through precision sequencing and multi-omics solutions.

As the first NGS company in China to use NVIDIA GPUs to accelerate segment analysis, Sequanta Technologies is an established leader and innovator in the space. It runs two major business units:

  1. FLASH-SEQ: A multi-city NGS platform in China. With 10 laboratories, it is one of the largest NGS service platforms in the country.
  2. Sequanta Technologies Multi-Omics: One of the largest multi-omics vendors in China, serving pharmaceutical companies at both the research and clinical stages. Since 2021, Sequanta Technologies has conducted more than 300 cohorts in China.

“We provide our customers with the total solution from the wet lab to the dry lab,” says George Fei, cofounder and CIO at Sequanta Technologies. “We saw the potential of GPUs to accelerate discovery in life sciences. We are doing a lot of revolutionary activities in our industry to digitize our productivity for NGS.”

Accelerating Analysis With NVIDIA

As one of the largest sequencing centers in China, Sequanta Technologies generates an immense amount of data from its sequencers. The company needed a solution that could accommodate large-scale datasets and streamline analysis. The team implemented NVIDIA Parabricks, a scalable genomics software suite for secondary analysis that provides GPU-accelerated versions of trusted, open-source tools.

“In 2021, we introduced NVIDIA Parabricks and GPUs to the team and implemented these technologies to accelerate the workload of the multi-omics analysis,” recalls Fei. “This collaboration is to leverage the Parabricks platform to accelerate multi-omics analysis. We are seeing the potential of AI to enable research and accelerate the progress of research to enable our customers.”

Genomic Processing Acceleration With Parabricks

Sequanta Technologies used Parabricks to accelerate analysis of both whole exome sequencing (WES) and whole genome sequencing (WGS) data. Before adopting it, the company experienced significant delays running the Genome Analysis Toolkit (GATK) on CPUs: WES took about 15 hours to complete and WGS about 50 hours. A CPU-accelerated pipeline brought WES down to about 2 hours and WGS to about 7 hours.

Parabricks delivered a much larger speedup on both assay types. WES took only 2 minutes and 37 seconds, almost 46x faster than the CPU-accelerated method and almost 344x faster than GATK. WGS took 31 minutes and 5 seconds, about 13.5x faster than the CPU-accelerated method and almost 97x faster than GATK.

That acceleration directly impacts patient outcomes, since data analysis was typically a bottleneck. “GATK workloads cost us more than 30–50 hours to get a single sample result from WGS data,” explains Jiawei Wang, bioinformatics director at Sequanta Technologies. “Using Parabricks, we can reduce the time to less than 1 hour.”

Assay        | Parabricks            | CPU Acceleration | GATK
WES (13 GB)  | 2 minutes, 37 seconds | ~2 hours         | ~15 hours
WGS (117 GB) | 31 minutes, 5 seconds | ~7 hours         | ~50 hours

Data and benchmarks provided by Sequanta Technologies.
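
The speedup figures quoted for WES and WGS can be reproduced from the benchmark runtimes with a few lines of Python. This is only an illustrative sketch: it takes the approximate "~" values in the table at face value, and the dictionary keys and the `speedup` helper are names chosen here, not part of any Sequanta Technologies or NVIDIA tooling.

```python
from datetime import timedelta

# Runtimes copied from the benchmark table; the "~" values are treated as exact.
RUNTIMES = {
    "WES": {
        "parabricks": timedelta(minutes=2, seconds=37),
        "cpu_accelerated": timedelta(hours=2),
        "gatk": timedelta(hours=15),
    },
    "WGS": {
        "parabricks": timedelta(minutes=31, seconds=5),
        "cpu_accelerated": timedelta(hours=7),
        "gatk": timedelta(hours=50),
    },
}


def speedup(baseline: timedelta, accelerated: timedelta) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    return baseline / accelerated  # timedelta / timedelta yields a float


for assay, t in RUNTIMES.items():
    vs_cpu = speedup(t["cpu_accelerated"], t["parabricks"])
    vs_gatk = speedup(t["gatk"], t["parabricks"])
    print(f"{assay}: {vs_cpu:.1f}x vs CPU-accelerated, {vs_gatk:.1f}x vs GATK")
```

Because the CPU-accelerated and GATK runtimes are rounded to whole hours, the resulting ratios are best read as approximate.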

Methylation Alignment With Parabricks

Not only are Parabricks’ results consistent with open-source tools, making it valuable for reproducibility and transparency, but it also provides significant acceleration to historically time-intensive analysis steps. In addition to WES and WGS acceleration, Sequanta Technologies also wanted to improve methylation alignment.

BWA-Meth aligns bisulfite-treated DNA sequencing reads (BS-Seq) so that DNA methylation can be detected. On a 110-gigabyte dataset, alignment took 21 hours to complete with Sequanta Technologies’ traditional Bismark pipeline. Using eight NVIDIA T4 GPUs and the GPU-accelerated version of BWA-Meth in Parabricks, alignment runtime dropped to 1 hour, a 21x speedup for methylation alignment over the traditional method.

Pipeline                                    | Data Volume | Alignment Runtime
Parabricks: PB-bwa-methyl (GPU accelerated) | 110 GB      | 1 hour
Sequanta Technologies Bismark Pipeline      | 110 GB      | 21 hours

Data and benchmarks provided by Sequanta Technologies.
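
For readers who want a sense of what a GPU-accelerated methylation alignment run might look like, the sketch below assembles a Parabricks command line in Python without executing it. Several things here are assumptions rather than details from Sequanta Technologies: that the "PB-bwa-methyl" pipeline corresponds to the `pbrun fq2bam_meth` tool, that the `--ref`, `--in-fq`, `--out-bam`, and `--num-gpus` options apply to the installed version, and all file paths, which are hypothetical. The helper name `build_meth_alignment_cmd` is also invented for illustration. Check the Parabricks documentation for the installed release before relying on any of it.

```python
import shlex


def build_meth_alignment_cmd(ref: str, fastq_r1: str, fastq_r2: str,
                             out_bam: str, num_gpus: int = 8) -> list[str]:
    """Assemble (but do not run) a Parabricks methylation-alignment command.

    Tool name and flags are assumptions; verify them against the
    documentation for the Parabricks version in use.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        "pbrun", "fq2bam_meth",             # assumed GPU-accelerated BWA-Meth tool
        "--ref", ref,                       # reference genome FASTA
        "--in-fq", fastq_r1, fastq_r2,      # paired-end bisulfite reads
        "--out-bam", out_bam,               # aligned output
        "--num-gpus", str(num_gpus),        # eight GPUs, as in the benchmark
    ]


# Hypothetical paths, for illustration only.
cmd = build_meth_alignment_cmd(
    ref="ref/genome.fa",
    fastq_r1="reads/sample_R1.fastq.gz",
    fastq_r2="reads/sample_R2.fastq.gz",
    out_bam="out/sample.meth.bam",
)
print(shlex.join(cmd))
```

Building the argument list separately from running it makes the command easy to log and review before it is handed to a scheduler or `subprocess.run`.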

Single-Cell Analysis With NVIDIA CUDA-X Data Science Libraries

Sequanta Technologies supports workloads from a wide range of use cases and applications—including single-cell analysis. However, single-cell data processing can be incredibly time-intensive, particularly as dataset sizes continue to increase. 

NVIDIA CUDA-X™ Data Science (RAPIDS) is an open-source suite of GPU-accelerated data science and AI libraries that improves performance across data pipelines. CUDA-X DS is often used in genomic applications for single-cell and tertiary analysis. On a dataset of 70,000 human lung cells, Sequanta Technologies saw a significant speedup in the preprocessing step using CUDA-X DS compared to Scanpy: preprocessing took 37 minutes with Scanpy and 22.35 seconds with CUDA-X DS, roughly a 100x speedup.

Analysis Step | NVIDIA CUDA-X DS (GPU) | Scanpy
Preprocessing | 22.35 seconds          | 37 minutes

Data and benchmarks provided by Sequanta Technologies.

Powering a Complete Solution

From methylation alignment to single-cell preprocessing, Sequanta Technologies uses a breadth of NVIDIA technology to reduce runtimes. Leveraging both NVIDIA hardware and software, including T4 GPUs, Parabricks, and CUDA-X DS, Sequanta Technologies accelerates traditionally time-intensive processes. As a result, NVIDIA powers a complete solution that enables Sequanta Technologies to address a wide range of use cases and deliver immediate value to its customers.

Learn more about NVIDIA solutions for genomics.
