Healthcare and Life Sciences

MIT and Recursion Launch an Open‑Source Protein Co‑Folding Model Accelerated by NVIDIA

Overview

MIT’s Jameel Clinic and CSAIL researchers, partnering with Recursion, built Boltz‑2, the first fully open-source biomolecular foundation model that co-folds protein complexes and simultaneously predicts their binding affinities with accuracy rivaling physics-based free-energy perturbation (FEP) methods. Using the BioHive‑2 GPU supercomputer and cuEquivariance kernels, the team trained Boltz‑2 and released it as open source (code and weights under an MIT license). Together with a ready‑to‑deploy NIM™ microservice, this delivers accelerated structure‑and‑affinity predictions for broad, community‑scale drug‑discovery workflows.

Partner

Recursion

Audience

Executive

Use Case

Supercomputing, Model Development

Products

NVIDIA DGX SuperPOD with DGX H100
NVIDIA BioNeMo NIM microservice

Key Takeaways

  • cuEquivariance implementations accelerate triangle operations by up to 5x and reduce memory requirements.
  • Boltz-2 NIM accelerates protein structure prediction inference by up to 2x–3x over the open-source implementation.
  • Training of the Boltz-2 model was enabled by Recursion’s BioHive-2 Supercomputer.

From AlphaFold to an Open Foundation Model

Boltz‑2 marks a decisive inflection point for AI‑driven drug discovery. Developed by MIT’s Jameel Clinic and CSAIL in partnership with Recursion, the model unifies complex‑structure prediction and binding‑affinity estimation in a single, open‑source package. Trained on Recursion’s BioHive‑2 supercomputer and accelerated by NVIDIA cuEquivariance kernels, it approaches the chemical accuracy of physics‑based free‑energy perturbation (FEP) simulations while returning results in about 20 seconds on one A100 GPU. With Boltz-2 available (code, weights) under an MIT license, any organization can retrain, fine-tune, and deploy Boltz‑2 without legal or operational friction. Enterprises looking to scale inference can also deploy the accelerated NVIDIA Boltz-2 NIM for production use cases under an NVIDIA AI Enterprise license.

The success of DeepMind’s AlphaFold2 in 2020 commoditized single-protein structure prediction, with its successor, AlphaFold3, enabling the advanced modeling of multiple molecules simultaneously (DNA, RNA, protein, and small molecules). MIT released an open-source alternative to AlphaFold3 in late 2024 with Boltz‑1, the first open model capable of co-folding biomolecular complexes at near-AlphaFold3 accuracy. Six months later, the same team introduced Boltz‑2, adding molecule-target affinity prediction functionality and retraining on ~3 million assay‑labeled examples to predict both pose and potency in one forward pass.

Across a community benchmark for affinity predictions, Boltz‑2 achieves a mean Pearson correlation of roughly 0.62–0.66, much closer to physics-based FEP accuracy than previous comparable models, while running about 1,000 times faster than state-of-the-art physics pipelines. On CASP16 affinity data, the model ranked first overall, and in retrospective high-throughput screens (MF-PCBA), it doubled the average precision relative to docking and prior ML approaches. A single A100 processes a ligand‑protein pair in ~20 GPU‑seconds; scaling across BioHive‑2 lowers wall‑clock times for million‑compound libraries to hours, not weeks.
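To make the "hours, not weeks" claim concrete, here is a back-of-the-envelope sketch of the arithmetic. It is an idealized estimate, not a benchmark: it assumes perfect linear scaling, no I/O or scheduling overhead, and that every GPU matches the ~20 GPU-second A100 figure quoted above. The GPU count of 504 is an assumption based on 63 DGX H100 systems with 8 GPUs each; the function and variable names are illustrative only.

```python
def screening_wall_clock_hours(num_pairs, gpu_seconds_per_pair, num_gpus):
    """Ideal wall-clock hours for a screen, assuming perfect linear scaling."""
    total_gpu_seconds = num_pairs * gpu_seconds_per_pair
    return total_gpu_seconds / num_gpus / 3600.0


LIBRARY_SIZE = 1_000_000   # ligand-protein pairs in the screen
SECONDS_PER_PAIR = 20.0    # ~20 GPU-seconds per pair (A100 figure from the text)
ASSUMED_GPUS = 63 * 8      # assumption: 63 DGX H100 systems x 8 GPUs each

# Time on a single GPU, expressed in days.
single_gpu_days = screening_wall_clock_hours(LIBRARY_SIZE, SECONDS_PER_PAIR, 1) / 24

# Time when the same work is spread evenly over the assumed cluster.
cluster_hours = screening_wall_clock_hours(LIBRARY_SIZE, SECONDS_PER_PAIR, ASSUMED_GPUS)

print(f"1 GPU: about {single_gpu_days:.0f} days")
print(f"{ASSUMED_GPUS} GPUs: about {cluster_hours:.1f} hours")
```

Under these assumptions, a million-pair screen takes roughly 231 days on one GPU and roughly 11 hours on 504 GPUs. Real throughput will differ with GPU model, batch sizes, sequence lengths, and cluster utilization.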

Engineering Boltz-2 at Supercomputer Scale

Boltz‑2’s leap in capability rests on two engineering pillars.

First, training ran on BioHive‑2, which ranks #35 on the TOP500 list of the most powerful supercomputers in the world and delivers a four‑fold speedup versus its predecessor. BioHive-2 is an NVIDIA DGX SuperPOD™ AI supercomputer, powered by 63 DGX™ H100 systems with NVIDIA H100 Tensor Core GPUs interconnected by NVIDIA Quantum-2 InfiniBand networking. BioHive-2 builds on Recursion’s success with its first supercomputer, BioHive-1, and was designed to handle the immense workloads that Recursion puts it through, processing the 50 petabytes of biological, chemical, and patient data that the company has amassed. The goal is to direct Recursion scientists to wet-lab experiments that have the most potential for success.

“With AI in the loop today, we can get 80% of the value with 40% of the wet lab work, and that ratio will improve going forward,” said Recursion Chief Technology Officer Ben Mabey.

Second, scaling a model like Boltz-2 required highly efficient training and inference. NVIDIA engineers profiled the model pre‑release and pinpointed triangle attention and triangle multiplication as the key bottlenecks. Partnering closely with the Boltz‑2 research teams, NVIDIA built and validated custom cuEquivariance kernels that run triangle operations up to 5x faster and reduce memory usage. The kernels are designed to integrate into any structure-prediction model; in Boltz‑2 itself, they deliver up to 2x–3x training and inference speedups. These kernels now power the trunk and PairFormer layers, ensuring quadratic scaling across GPUs compared to cubic complexity in the original implementation. cuEquivariance is available for developers on GitHub and is already becoming widely adopted in the ecosystem.
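To illustrate why triangle operations become a bottleneck, the sketch below implements a stripped-down version of the "outgoing" triangle multiplication popularized by AlphaFold-style models, in plain Python. It is a teaching example under simplifying assumptions (a single channel, no gating, normalization, or learned projections); it is not Boltz‑2's code and does not use the cuEquivariance API. The function name and the operation counter are illustrative.

```python
def triangle_multiply_outgoing(a, b):
    """Naive outgoing triangle multiplication on N x N pair features.

    out[i][j] = sum over k of a[i][k] * b[j][k]
    Also returns the number of multiply-adds performed.
    """
    n = len(a)
    out = [[0.0] * n for _ in range(n)]
    ops = 0
    for i in range(n):          # every pair (i, j) ...
        for j in range(n):
            acc = 0.0
            for k in range(n):  # ... is combined through every third token k
                acc += a[i][k] * b[j][k]
                ops += 1
            out[i][j] = acc
    return out, ops


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
out, ops_small = triangle_multiply_outgoing(a, b)
print(out, ops_small)

# Doubling the number of tokens multiplies the work by eight (N cubed).
zeros4 = [[0.0] * 4 for _ in range(4)]
_, ops_large = triangle_multiply_outgoing(zeros4, zeros4)
print(ops_large // ops_small)
```

Because each of the N × N pairs is updated through all N intermediate tokens, the naive form costs N³ multiply-adds per channel, so large complexes quickly dominate both runtime and memory. Optimized kernels target exactly this step.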

“These kernels are long-awaited and will become an integral part of the Boltz family of models, helping address the bottlenecks in speed and memory consumption,” said Gabriele Corso, researcher at MIT.

Enterprise‑Ready NIM Deployment

Boltz‑2 NIM is NVIDIA’s day‑zero, production‑ready packaging of MIT and Recursion’s open‑source Boltz‑2 foundation model, delivered as an accelerated API microservice. The microservice ingests protein, DNA, RNA, or ligand sequences and returns high-confidence 3D complex structures and binding affinity predictions at up to 2x–3x acceleration over the open-source alternative.

For enterprise biopharma, this translates into higher throughput and materially lower compute spend, compressing structure-affinity design cycles and accelerating your R&D pipeline. While the Boltz-2 NIM can be tested in R&D environments at no cost, production-grade deployment requires NVIDIA AI Enterprise support.

Learn about the Boltz-2 NIM and the rest of the NVIDIA AI platform for healthcare and life sciences today.

Learn more about NVIDIA solutions for healthcare and life sciences.
