Healthcare and Life Sciences

Deloitte Builds Drug Discovery Pipelines With Generative AI in a Few Clicks


NVIDIA DGX™ Cloud on Oracle Cloud Infrastructure (OCI) is enabling Deloitte to accelerate drug discovery in its Quartz Atlas AI solution with generative AI. The team uses large language model (LLM)-powered knowledge graphs, scientific pipelines built with NVIDIA BioNeMo™, and custom models, and it trains its own chemistry language models (CLMs) and protein language models (PLMs) before deploying them at scale with NVIDIA NIM inference microservices.


Deloitte Consulting LLP

Use Case

Generative AI / LLMs


NVIDIA AI Enterprise

Accelerating Medical Breakthroughs Through AI Innovation

As the research powerhouse of the world's top consulting service provider¹, Deloitte's Center for Integrated Research is dedicated to exploring transformative opportunities across industries. With a strong emphasis on healthcare, the research team set out to leverage the potential of AI in accelerating the drug discovery process. Drug discovery is a lengthy and costly process, taking over 10–15 years and costing, on average, over $1–2 billion for each new drug to be approved for clinical use. Those costs are accompanied by a daunting 90 percent failure rate². Given this, Deloitte’s team recognized the need for enhanced preclinical models, rigorous target validation, and improved decision-making strategies before embarking on clinical trials. Their goal was to significantly lower the rate of unsuccessful trials, ultimately improving the drug development journey.

1 Deloitte. Deloitte Ranked No. 1 Consulting Service Provider Worldwide by Revenue in Gartner® Market Share Report. July 2023.

2 NIH National Library of Medicine. Why 90% of Clinical Drug Development Fails and How to Improve It? July 2022.

Quartz Atlas AI visually represents the connections between the Bet-v-1 birch allergen protein and associated entities. These connections originate from both wet-lab experiments and LLM-generated links sourced from PLMs or CLMs. Integrating experimental data with learned world models from PLMs and CLMs enriches scientists' understanding by providing a multimodal context.
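
The idea of keeping experimentally measured links and model-generated links side by side can be illustrated with a small in-memory graph. This is a minimal sketch under stated assumptions, not Deloitte's implementation: the `Edge` and `KnowledgeGraph` classes, the provenance labels, and every entity name other than Bet-v-1 are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    relation: str
    provenance: str  # "wet_lab" or "model_generated" (labels assumed for this sketch)


class KnowledgeGraph:
    """Tiny in-memory graph; a system at production scale would use a graph database."""

    def __init__(self):
        self._out = defaultdict(list)

    def add_edge(self, source, target, relation, provenance):
        edge = Edge(source, target, relation, provenance)
        self._out[source].append(edge)
        return edge

    def neighbors(self, node, provenance=None):
        """Return outgoing edges of `node`, optionally filtered by provenance."""
        edges = self._out.get(node, [])
        if provenance is None:
            return list(edges)
        return [e for e in edges if e.provenance == provenance]


# Hypothetical entries: only the Bet-v-1 name comes from the text above.
graph = KnowledgeGraph()
graph.add_edge("Bet-v-1", "antibody_A", "binds", "wet_lab")
graph.add_edge("Bet-v-1", "protein_B", "similar_fold", "model_generated")
graph.add_edge("protein_B", "compound_C", "predicted_ligand", "model_generated")

measured = graph.neighbors("Bet-v-1", provenance="wet_lab")
predicted = graph.neighbors("Bet-v-1", provenance="model_generated")
```

Tagging each edge with its origin lets a scientist view measured evidence alone, or overlay model predictions on top of it.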

Unraveling Insights From Vast Multimodal and Multi-Domain Datasets

Advancing a drug candidate to phase I clinical trials is a significant milestone for pharmaceutical companies. However, nine out of 10 candidates fail during phase I, II, or III trials, reflecting the challenges of drug discovery. This complex pipeline spans identifying disease-related targets, screening compounds for effectiveness, optimizing lead compounds for safety and efficacy, conducting preclinical testing, and progressing successful candidates through clinical trials. Data integration is a major challenge throughout, from combining diverse biological data sources during target identification to analyzing massive datasets during screening. Recognizing the importance of data integration in AI-powered drug discovery, Deloitte sought to use generative AI to streamline the process, aiming to save time and costs.

“As researchers, we often deal with multimodal data, from text to graphs and images, spanning various scientific domains. We read through patents and scour through research papers looking for information on antibodies and understanding the relationships between molecules,” said Dan Ferrante, AI leader for innovation and R&D at Deloitte Consulting LLP. “We wanted to harmonize this fragmented multimodal data coming from dozens of open-source datasets, including versions of archives like PubMed, the UniProt dataset for proteins, antibody datasets, small molecules datasets, etc. These resources play a crucial role in everyday decision-making regarding biologics and small molecules. The challenge was not only to input these large volumes of data into advanced deep learning models, but also to train them on custom large language models for both protein and chemoinformatics to analyze and learn patterns for accurate predictions. This research required a robust AI computing infrastructure and a highly optimized software stack.”

  • Running experiments on DGX Cloud boosted developer productivity by 50 percent, while streamlining multi-node training saved 7–10 months of setup time.
  • With BioNeMo from NVIDIA AI Enterprise and DGX Cloud, the work of assembling a pipeline that once took 4–6 weeks can now be accomplished with just a few clicks, allowing researchers to dive directly into projects.

Quartz Atlas AI displays an interactive knowledge graph that provides deep GenAI-enabled semantic enrichment (LLMs, PLMs, CLMs, etc.) of multimodal data through the connections and relationships between data points.

Rapid Experimentation With a Scalable Platform and Customizable Generative AI Models

Protein structure prediction aims to anticipate how a protein will fold into its natural shape, which is crucial for understanding its function in the body and identifying potential targets for drug therapies. Deloitte has developed Quartz Atlas AI, an AI drug discovery accelerator that analyzes a protein's amino acid sequence (amino acids are the building blocks of proteins) to determine the best folding method, which can be either a protein language model or a fold-style method. This process quickly generates 3D structures and predicts how drugs may bind to specific parts of the protein. A downstream generative AI model further refines the structure of the protein or molecule to pinpoint regions within it that are likely to interact with drugs (overlaying a heatmap of druggable hotspots), aiding in drug development efforts.

“To successfully bring data and scientific pipelines together, we combined NVIDIA’s BioNeMo microservices for optimized structure prediction and Deloitte’s proprietary generative AI models, which are trained with DGX Cloud on Oracle Cloud Infrastructure,” said Ferrante. “We created a robust generative AI-powered knowledge graph with Atlas AI, loading over a dozen datasets, which amounts to 12 million nodes and 97 million connection edge links, totalling 5 terabytes in raw volume, searchable within seconds. We’re able to feed this large amount of multimodal data into our models, map the solution space, and analyze patterns and make predictions. The ability to train against extensive datasets and scale efficiently was made possible by leveraging DGX Cloud and its capability to make multi-node jobs easy. DGX Cloud on OCI provided us access to the latest NVIDIA architecture and low-latency fabric that enabled workload scaling across interconnected clusters optimized for peak performance on our most demanding workloads.”

Deloitte is using NVIDIA BioNeMo models, available as NVIDIA NIM microservices, including AlphaFold2, OpenFold, and ESMFold for protein structure prediction, alongside MegaMolBART and MolMIM for molecule generation. By mapping the generated molecules into the solution space, the team can readily find similar molecules with corresponding properties, such as toxicity or solubility. This process is crucial in drug discovery: it supports efficient selection of potential candidates, accurate prediction of safety and efficacy, and exploration of diverse chemical spaces. To gain further insights, Deloitte fine-tuned a 15-billion-parameter ESM2 model on DGX Cloud to predict protein properties; a downstream model then used it to generate novel protein sequences with specific desired properties.
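
Finding similar molecules once they are mapped into a shared space can be pictured as a nearest-neighbor lookup over embedding vectors. The sketch below assumes cosine similarity as the distance measure and uses invented three-dimensional vectors and molecule names. It does not call any BioNeMo or NIM API, and the source does not state which similarity measure Deloitte uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, library, k=2):
    """Rank library entries by cosine similarity to the query embedding."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Hypothetical 3-dimensional embeddings; real model embeddings are far larger.
library = {
    "mol_a": [0.9, 0.1, 0.0],
    "mol_b": [0.8, 0.2, 0.1],
    "mol_c": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]
top = most_similar(query, library, k=2)
```

In practice, the properties attached to the nearest neighbors (for example, measured toxicity or solubility) would then inform the assessment of the query molecule.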

The NVIDIA BioNeMo framework delivers optimized model architectures and tooling for training protein and small-molecule LLMs.

A Boost in Developer Productivity, Alongside Unconstrained Model Size and Scale

Ferrante commented, “In the biology field, many professionals don’t want to deal with the intricacies of infrastructure and write code. However, leveraging the tools and software within DGX Cloud has streamlined this process. With just a few clicks, our developers can select a container and access a notebook, eliminating the need to Secure Shell into the nodes directly. By enabling us to easily run multiple experiments compared to our previous solution with great visibility into the job queue, DGX Cloud has boosted developer productivity by 50 percent.”

“Due to the scale of our datasets, multi-node training was crucial. Previously, orchestrating multi-node training was a manual process, and we had never attempted it on a cloud platform. With DGX Cloud, multi-node training is now as easy as clicking a button, saving us seven to 10 months of infrastructure and tooling work that included hardware setup, container creation, and workload distribution. As a result, our models are no longer constrained by size or data scale, and our training runs have been reduced from four weeks to just eight hours.”

“Previously, constructing the drug discovery pipeline was a laborious process, requiring us to meticulously reverse engineer and debug every line of code, while tracking changes and managing multiple versions. It used to take four to six weeks to assemble a pipeline, but now, with just a few clicks, we can dive straight into projects. Thanks to the scalability of BioNeMo models and the ease of deployment through NVIDIA NIM, R&D tasks have become much smoother. Fine-tuning foundation models from BioNeMo on DGX Cloud and implementing an inference loop have further strengthened the pipeline's robustness,” said Ferrante.

“With Atlas AI in place, Deloitte can provide users with scientific pipelines to obtain actionable insights by combining multiple models together. For example, instead of just folding a molecule or computing a property, it can provide a comprehensive report containing folded structures or properties, equipping users with all the information needed to make informed decisions about the viability of a solution. It can also show relationships between protein structures graphically and their connections, further aiding in understanding complex molecular interactions.”

Beyond a powerful platform, the one-stop team of experts from NVIDIA Enterprise Services was invaluable. “We benefited from NVIDIA's end-to-end support, ranging from platform assistance for multi-node training setup and container updates to application-level guidance, leveraging their extensive expertise in healthcare frameworks and models to optimize our AI models effectively,” said Ferrante.

“By enabling us to easily run more concurrent experiments compared to our previous solution with great visibility into the job queue, DGX Cloud has boosted developer productivity by 50 percent.”

Dan Ferrante
AI Leader for Innovation and R&D, Deloitte Consulting LLP

Looking Forward

“One of the direct applications of Atlas AI was the ability for us to use AI to take drugs that are FDA approved and design in silico a better, patentable version of the molecule. We are able to now load all the drugs that have been patented and all that have been approved by the FDA. Our trained model enables us to identify potential starting compounds with established target binding. Finding viable drugs is extremely challenging due to the vast number of potential compounds and the need for specific properties, making it akin to solving a complex optimization problem. MolMIM, part of NVIDIA BioNeMo and available as a NIM microservice, helps our researchers find molecules with the ideal properties for drug development by maximizing a user-defined scoring function. Using MolMIM, we generate novel compounds, which are optimized for various molecular aspects such as enhanced binding, intestinal permeability, solubility, and prolonged half-life,” Ferrante added.
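
Maximizing a user-defined scoring function can be illustrated with a toy search over a latent vector. The sketch below swaps in a simple elitist (1+λ) evolutionary search. It is not MolMIM's optimizer or API, and the scoring function, target point, and all parameter values are hypothetical. In a real workflow each latent vector would be decoded into a molecule and scored on properties such as binding or solubility.

```python
import random


def score(latent):
    """Hypothetical user-defined scoring function: higher is better.

    Peaks at the invented target point (0.5, -0.2, 0.8).
    """
    target = (0.5, -0.2, 0.8)
    return -sum((x - t) ** 2 for x, t in zip(latent, target))


def optimize(score_fn, start, steps=200, population=8, sigma=0.1, seed=0):
    """Elitist (1+lambda) search: sample children around the parent, keep the best."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    best, best_score = list(start), score_fn(start)
    for _ in range(steps):
        children = [
            [x + rng.gauss(0.0, sigma) for x in best] for _ in range(population)
        ]
        top_child = max(children, key=score_fn)
        top_score = score_fn(top_child)
        if top_score > best_score:  # the parent is replaced only by a better child
            best, best_score = top_child, top_score
    return best, best_score


start = [0.0, 0.0, 0.0]
best, best_score = optimize(score, start)
```

Because the parent is only ever replaced by a better child, the score never decreases from one step to the next. That property is what makes this simple scheme a convenient stand-in for more sophisticated optimizers.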

Deloitte plans to further enhance Atlas AI by integrating it into various healthcare and life sciences applications, such as precision medicine and voice-of-the-patient insight, to enhance patient engagement and optimize health outcomes. “Leveraging BioNeMo and DGX Cloud, we can seamlessly establish a standardized training pipeline for diverse domains, enabling us to fine-tune it for specific protein classes or antibody structure predictions effortlessly,” said Ferrante.

MolMIM performs controlled generation to find molecules with the right properties.


  • Improved developer productivity by 50 percent
  • Saved 7–10 months by eliminating manual setup for multi-node training
  • Reduced training from four weeks to eight hours
  • Reduced drug discovery pipeline development time from 4–6 weeks to just a few clicks
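
The headline figures above can be put on a common scale with a little arithmetic. The sketch below assumes that "four weeks" means continuous wall-clock time (7 days of 24 hours each) and that a 50 percent productivity gain corresponds to 1.5 times the previous throughput. Both are interpretations made here, not statements from the case study.

```python
HOURS_PER_WEEK = 7 * 24

# Figures quoted in the case study.
training_before_hours = 4 * HOURS_PER_WEEK  # "four weeks", assumed continuous
training_after_hours = 8                    # "eight hours"
productivity_gain = 0.50                    # "50 percent"

# Ratio of old to new training time, and the implied throughput multiplier.
training_speedup = training_before_hours / training_after_hours
throughput_multiplier = 1 + productivity_gain

print(f"Training speedup: about {training_speedup:.0f}x")
print(f"Developer throughput: {throughput_multiplier:.1f}x the previous baseline")
```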

The fastest place to start building generative AI applications is on DGX Cloud, an AI platform for developers.