Healthcare & Life Sciences

A New Molecular Language for Generative AI in Small-Molecule Drug Discovery

Objective

Using their massive volumes of precise, experimental data, Terray Therapeutics leverages NVIDIA DGX™ Cloud to train foundation models for chemistry and generative AI to design small molecules.

Customer

Terray Therapeutics

Use Case

Generative AI

Products

NVIDIA Base Command Platform
NVIDIA DGX Cloud
NVIDIA AI Enterprise

The chemical compound space is functionally infinite, with over 10^60 (one novemdecillion) possible drug-like molecules. The goal of small-molecule drug discovery is to explore this vast chemical space in search of a handful of molecules that satisfy a multi-parameter optimization problem. Typical drug discovery programs are highly inefficient and fundamentally constrained, as they can only explore a few dozen to a few hundred compounds per week.

Terray Therapeutics’ goal is to change the way small-molecule therapeutics are discovered and developed. The company’s platform uniquely blends experimentation and computation to deliver on the promise of generative AI for small-molecule drug discovery—finding solutions to the toughest therapeutic challenges. Terray believes that high-quality, scaled data is the answer to unlocking generative AI for small molecules, and everything the company does is grounded in an iterative approach, producing massive amounts of precise, purpose-built data that enables generative optimization of small molecules. With an equal emphasis on novel wet lab science and AI, Terray improves human health by transforming the speed, cost, and success rate of small-molecule drug development.

Image courtesy of Terray Therapeutics

Scaling Model Development to Leverage Billions of Data Points

The Terray platform measures hundreds of millions of interactions between small molecules and biological targets daily, with a growing database of 50 billion experimental biophysical measurements. This precision enables novel solutions, parallel target screening, and rapid hit-to-lead efforts with millions of molecules.

The first step in using generative AI in drug discovery is having vast amounts of precise, experimental data, including many promising starting points for drug design. But equally important is the ability to compute on this data to design actionable molecules. To translate between the language of molecules and the language of computation (and vice versa), Terray has developed COATI, a multimodal encoder-decoder model for chemical space. The model converts chemical structures into useful numerical representations to process data more efficiently with AI. A molecule’s numerical representation can be used as input to “decode,” or generate, molecules with desired properties, enabling generative molecular design.
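To make the idea of translating between molecules and numbers concrete, here is a minimal sketch of the simplest possible round trip: turning a molecule written as a text string into a list of integers and back. It assumes molecules are written as SMILES strings, which this article does not state, and it uses a hypothetical, simplified token pattern. It is not COATI's tokenizer or architecture, and a learned encoder-decoder produces dense embeddings rather than token ids, but the encode/decode contract has the same shape.

```python
import re

# Hypothetical, simplified SMILES token pattern (an assumption for illustration):
# bracket atoms such as [NH3+], the two-letter halogens Br and Cl,
# then any other single character. This is not COATI's tokenizer.
TOKEN_PATTERN = re.compile(r"\[[^\]]+\]|Br|Cl|.")


def tokenize(smiles: str) -> list[str]:
    """Split a SMILES string into chemically meaningful tokens."""
    return TOKEN_PATTERN.findall(smiles)


def build_vocab(corpus: list[str]) -> dict[str, int]:
    """Assign a stable integer id to every token seen in the corpus."""
    tokens = sorted({tok for smiles in corpus for tok in tokenize(smiles)})
    return {tok: idx for idx, tok in enumerate(tokens)}


def encode(smiles: str, vocab: dict[str, int]) -> list[int]:
    """Molecule text -> numbers."""
    return [vocab[tok] for tok in tokenize(smiles)]


def decode(ids: list[int], vocab: dict[str, int]) -> str:
    """Numbers -> molecule text (the inverse of encode)."""
    inverse = {idx: tok for tok, idx in vocab.items()}
    return "".join(inverse[i] for i in ids)


corpus = ["CCO", "c1ccccc1Cl", "C[NH3+]"]
vocab = build_vocab(corpus)
# Every molecule in the corpus survives the round trip unchanged.
assert all(decode(encode(s, vocab), vocab) == s for s in corpus)
```

The round-trip check at the end is the property that matters: a representation is only useful for generative design if numbers can be turned back into valid molecules.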

When Terray first began developing COATI, the team used a mix of systems, including on-premises GPU-based servers and traditional cloud services. This infrastructure worked at first, but it strained as the models scaled up. As the models became bigger and more complex, provisioning and configuring distributed training runs became challenging.

"I would spend hours setting up training runs, and it was very tedious," said Edward Williams, machine learning engineer at Terray. "For distributed training, we utilize torchrun. As we scaled up our models, it became more and more difficult to allocate resources and ensure that training code was synchronized across all nodes. Tracking and handling failures was similarly tedious—if something failed, I'd learn after the fact rather than immediately. The time it took to just set up training runs, the manual process of propagating changes across nodes, coupled with the inability to know if I can get an additional node to run my experiments on, was hindering experimentation and our team's ability to scale our research efforts."
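For readers unfamiliar with torchrun, the sketch below illustrates why every node must run the same code. torchrun starts one process per GPU and exports the environment variables `RANK`, `LOCAL_RANK`, and `WORLD_SIZE`; each process then uses them to decide which slice of the work is its own. The helper names (`WorkerInfo`, `read_worker_info`, `shard_indices`) are hypothetical and this is not Terray's training code, only a minimal stdlib-only illustration of the pattern.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class WorkerInfo:
    rank: int        # global index of this process across all nodes
    local_rank: int  # index of this process on its own node (often the GPU id)
    world_size: int  # total number of processes in the job


def read_worker_info(env: Optional[Mapping[str, str]] = None) -> WorkerInfo:
    """Read the variables torchrun exports; fall back to a single-process run."""
    env = os.environ if env is None else env
    return WorkerInfo(
        rank=int(env.get("RANK", 0)),
        local_rank=int(env.get("LOCAL_RANK", 0)),
        world_size=int(env.get("WORLD_SIZE", 1)),
    )


def shard_indices(num_samples: int, info: WorkerInfo) -> range:
    """Strided split: worker r takes samples r, r + world_size, r + 2 * world_size, ..."""
    return range(info.rank, num_samples, info.world_size)


# Simulate worker 3 of 4 without launching a real distributed job.
info = read_worker_info({"RANK": "3", "LOCAL_RANK": "1", "WORLD_SIZE": "4"})
assert list(shard_indices(10, info)) == [3, 7]
```

A script like this would typically be launched on each node with something like `torchrun --nnodes=2 --nproc_per_node=8 --rdzv_backend=c10d --rdzv_endpoint=HOST:29500 train.py`, where `HOST` and `train.py` are placeholders. If one node runs an older copy of the script, its workers may shard or step differently from the rest, which is the synchronization problem described in the quote.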

  • Small-molecule drug discovery involves exploring a chemical space that's functionally infinite, with typical approaches only able to explore a few dozen to a few hundred compounds per week.
  • Terray Therapeutics pioneers generative AI for small-molecule drug discovery, driven by high-quality, scaled data and a blend of experimentation and computation.
  • Terray developed COATI, a foundation model for chemistry pretrained on a dataset of hundreds of millions of small molecules. COATI translates molecules into mathematical representations, enabling generative AI to design novel, optimized molecules.
  • NVIDIA DGX Cloud significantly improved the COATI development process, reducing model training from a week to just a day, and enabled more efficient experimentation with dedicated GPUs and on-demand resource scaling.

Image courtesy of Terray Therapeutics

NVIDIA DGX Cloud: Dedicated Multi-Node Training Platform for Generative AI

"Because we wanted to continuously improve our invertible representation of chemical space, we needed a platform that would enable rapid experimentation along with ease of management," said John Parkhill, director of machine learning at Terray. "DGX Cloud offered us a solution that worked seamlessly with the ease and simplicity of cloud. Its high-speed network, purpose-built for multi-node training, was particularly crucial for our needs. Because we’re dealing with datasets of terabytes or larger, we require significant computational resources to train our models effectively."

"Additionally, the ability to rapidly conduct trial-and-error experiments is highly valuable in our model development research, as identifying the most effective hyperparameters is often a challenging task. Fast job execution on DGX Cloud enabled us to quickly identify failures and make the necessary adjustments to the models. For instance, I could perform numerous ablation studies, such as disabling model features, to determine if, for example, altering elements of the transformer’s tokenizer is impactful or inconsequential," said Williams.
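An ablation study of the kind described here can be organized as a baseline run plus one variant per feature, each with exactly one feature switched off. The sketch below shows one way to generate such a set of run configurations. The configuration keys and the feature names (`feature_a`, `feature_b`, `feature_c`) are hypothetical placeholders, not COATI's actual options.

```python
from copy import deepcopy

# Hypothetical baseline configuration; the feature names are illustrative only.
BASELINE = {
    "learning_rate": 3e-4,
    "features": {"feature_a": True, "feature_b": True, "feature_c": True},
}


def ablation_configs(baseline: dict) -> dict[str, dict]:
    """Return the baseline plus one variant per enabled feature with that feature off."""
    runs = {"baseline": deepcopy(baseline)}
    for name, enabled in baseline["features"].items():
        if not enabled:
            continue  # already off in the baseline, so there is nothing to ablate
        variant = deepcopy(baseline)
        variant["features"][name] = False
        runs[f"no_{name}"] = variant
    return runs


runs = ablation_configs(BASELINE)
assert sorted(runs) == ["baseline", "no_feature_a", "no_feature_b", "no_feature_c"]
```

Because each variant differs from the baseline in exactly one setting, any change in the evaluation metric can be attributed to that setting. Fast job turnaround matters here because the number of runs grows with the number of features under test.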

"Our process for setting up training jobs went from the hassle of manually pushing code to remote machines and ensuring synchronization to the simplicity of pressing ‘run’ on DGX Cloud. We didn't even have to modify our existing code by much. With the Base Command Platform, the orchestration of multi-node training jobs was essentially automated for us. This enabled us to scale in a way that would have been impossible."

Having a fixed allocation of nodes on DGX Cloud also created greater efficiencies. "It's a very miserable experience constantly asking for GPU instances from traditional cloud services that they seem to be unable to make available. If I need a new node for an experiment I'm working on, I would not know if and when I would be able to get one. With DGX Cloud, I don't need to worry about that," said Williams.

"As a data scientist, my boundary is no longer a small GPU workstation; it's the entire cloud capacity of Terray. DGX Cloud with Base Command Platform lets me go from a single node to a 32-GPU cluster with push-button simplicity," Parkhill added. "DGX Cloud gives us the level of abstraction our developers need so they can focus on innovation instead of infrastructure."

Terray uses a hybrid approach: the company builds and trains its models on DGX Cloud, then deploys them and runs inference on its on-premises cluster with NVIDIA RTX™ A6000 GPUs. When workloads spike, DGX Cloud provides the elasticity to scale resources up.

"NVIDIA AI experts were key to our success," Williams said. "We had a dedicated expert inspecting our logs to ensure everything ran smoothly and identifying any issues. By identifying straightforward optimizations in PyTorch and CUDA® that we hadn't thought of, they significantly improved the efficiency of our workloads. Additionally, they assisted in developing scripts that provided valuable insights into telemetry data, allowing us to monitor memory activity and enhance performance. The support from NVIDIA AI experts allowed us to shift our focus from optimizing the process to conducting experiments, as this is primarily an R&D project."

“Our process for setting up training jobs went from the hassle of manually pushing code to remote machines and ensuring synchronization to the simplicity of pressing ‘run’ on DGX Cloud.”

Edward Williams
Machine Learning Engineer, Terray Therapeutics

“As a data scientist, my boundary is no longer a small GPU workstation; it's the entire cloud capacity of Terray. DGX Cloud with Base Command Platform lets me go from a single node to a 32-GPU cluster with push-button simplicity.”

John Parkhill
Director of Machine Learning, Terray Therapeutics

Fueling Experimentation and Model Optimization With 4X More Resource Utilization

Small-molecule research is an iterative process that involves the continuous cycle of designing, making, testing, analyzing, and refining compounds to achieve desired properties. Parkhill said, "The ease of use of DGX Cloud provided exceptional performance and helped us iterate faster in evaluating hyperparameters for COATI, enabling us to achieve 4X more utilization compared to alternative cloud services. It used to take us a week to train a model, and we were getting it done in a day."

Parkhill added, "We’re now able to easily explore the vast chemical space to find rare molecules with desired properties, like selectivity and potency. We can also instruct the model to generate candidates with specific properties for analysis or discover entirely new molecules that resemble known ones but have more optimal features."

Finding new molecules that resemble already-synthesized ones is important because the known compounds provide a starting point backed by existing knowledge of their chemical properties. This lets researchers predict behavior, including safety and efficacy, more reliably, which ultimately accelerates the drug development process.

"Our model gets better through time as we generate more and more molecules in the lab and do iterative training on DGX Cloud."

“DGX Cloud's ease of use and exceptional performance helped us iterate faster in finding target molecules, enabling us to achieve 4X more utilization compared to alternate cloud services.”

John Parkhill
Director of Machine Learning, Terray Therapeutics

Looking Ahead

The emerging field of generative molecular design and optimization has the potential to significantly improve the clinical success rate of small-molecule development. Terray's pioneering work is paving the way for industry-wide adoption of their groundbreaking model.

"The key to impactful generative AI is precise data at scale that can be iterated quickly, and we have that at Terray," said Narbe Mardirossian, chief technology officer at Terray. "Thanks to DGX Cloud, we were able to develop a molecular language that enabled efficient, constrained, generative optimization of molecules for programs in hit-to-lead and lead optimization. With these tools, we’re looking forward to bringing multiple new therapies to patients in need."

“It used to take us a week to train a model, and we were getting it done in a day.”

John Parkhill
Director of Machine Learning, Terray Therapeutics

Results

  • Improved infrastructure utilization by over 4X versus alternate cloud services
  • Reduced training time from a week to a day
  • Took less than one day to onboard onto DGX Cloud
  • Can train multiple COATI variants in parallel to find the optimal pretrained embedding

The fastest way to get started using the DGX platform is NVIDIA DGX Cloud, a serverless AI-training-as-a-service platform purpose-built for enterprises developing generative AI.