NVIDIA Clara Parabricks Pipelines:Variant Processing: VQSR
Accelerated variant filtering using VQSR
Build a recalibration model to score variant quality and apply a score cutoff to filter variants.
Quick Start
CLI
$ pbrun vqsr --in-vcf sample.vcf \
--out-vcf output.vcf \
--out-recal output.recal \
--out-tranches output.tranches \
--resource omni,known=false,training=true,truth=true,prior=12.0:1000G_omni2.5.hg38.vcf \
--annotation QD --annotation MQ --annotation MQRankSum --annotation ReadPosRankSum
Compatible GATK4 Command
gatk VariantRecalibrator -V sample.vcf \
-O output.recal \
--tranches-file output.tranches \
--resource omni,known=false,training=true,truth=true,prior=12.0:1000G_omni2.5.hg38.vcf \
-an QD -an MQ -an MQRankSum -an ReadPosRankSum \
--mode BOTH
gatk ApplyVQSR -V sample.vcf \
--recal-file output.recal \
--tranches-file output.tranches \
-O output.vcf \
--mode BOTH
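Once either command above has written output.vcf, a quick sanity check is to tally how many records carry each FILTER value. The following is a minimal sketch, assuming an uncompressed VCF; the helper name count_filters is illustrative and not part of pbrun or GATK. It relies only on the VCF convention that FILTER is the seventh tab-separated column.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of pbrun or GATK): tally records per
# FILTER value in an uncompressed VCF.
count_filters() {
    local vcf="$1"
    # Skip header lines (those starting with '#') and count column 7 (FILTER).
    awk -F'\t' '!/^#/ { n[$7]++ } END { for (f in n) printf "%s\t%d\n", f, n[f] }' "$vcf" \
        | LC_ALL=C sort
}
```

Calling `count_filters output.vcf` prints one line per distinct FILTER value with its record count, which makes it easy to see how many variants passed and how many were filtered.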
Options
--in-vcf (required)
Path to the input vcf file.
--out-vcf (required)
Path to the output vcf file.
--out-recal (required)
Path to the output recal file.
--out-tranches (required)
Path to the output tranches file.
--resource (required)
Known, truth, and training sets. The format string is
<set name>,known=<boolean>,training=<boolean>,truth=<boolean>,prior=<float>:<path to the vcf file>. At least one resource must be a training set and at least one must be a truth set; a single resource can be both. This option can be used multiple times.
--annotation (required)
Annotation to use for the calculations. This option can be used multiple times.
--mode
Type of variants to include in the recalibration. Possible values are SNP, INDEL, or BOTH. Defaults to BOTH.
Max number of Gaussians for the positive model. Defaults to 8.
--truth-sensitivity-level
The truth sensitivity level at which to start filtering.
The VQSLOD score below which to start filtering.
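Because --resource and --annotation can each be given several times, it can help to assemble them programmatically. The following bash sketch composes a resource string in the documented format and checks the rule that at least one resource is a training set and at least one is a truth set. The helper names make_resource and check_resources are hypothetical and not part of pbrun; the resource and annotation values are the ones from the Quick Start.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of pbrun) for building vqsr arguments.

# Compose one --resource value in the documented format:
# <set name>,known=<bool>,training=<bool>,truth=<bool>,prior=<float>:<vcf path>
make_resource() {
    local name="$1" known="$2" training="$3" truth="$4" prior="$5" path="$6"
    printf '%s,known=%s,training=%s,truth=%s,prior=%s:%s\n' \
        "$name" "$known" "$training" "$truth" "$prior" "$path"
}

# Succeed only if at least one resource has training=true
# and at least one has truth=true (one resource may satisfy both).
check_resources() {
    local has_training=no has_truth=no r
    for r in "$@"; do
        case "$r" in *,training=true,*) has_training=yes ;; esac
        case "$r" in *,truth=true,*) has_truth=yes ;; esac
    done
    [ "$has_training" = yes ] && [ "$has_truth" = yes ]
}

# Values taken from the Quick Start example.
resources=("$(make_resource omni false true true 12.0 1000G_omni2.5.hg38.vcf)")
annotations=(QD MQ MQRankSum ReadPosRankSum)

# Repeat each flag once per value.
args=()
for r in "${resources[@]}"; do args+=(--resource "$r"); done
for a in "${annotations[@]}"; do args+=(--annotation "$a"); done
```

After `check_resources "${resources[@]}"` succeeds, the assembled flags can be appended to the command as `"${args[@]}"`, alongside the input and output paths.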