Accelerating drug discovery with “paradigm shifting” AI model

2 Apr 2026

Written by Maddy Chapman (Digital Editor)

“It’s like a paradigm shift approach… to drive discovery”: a new machine-learning model predicts how molecules will influence gene expression and has been used to pick out promising drug candidates for two tough-to-treat diseases.

In a multi-institute collaboration led by Michigan State University (MI, USA), scientists have created a machine-learning-based drug discovery platform, guided by transcriptomic features, which can be used to screen large compound libraries and optimize lead molecules. Potential therapeutics identified in the study were subjected to experimental testing in human cell lines and animal models, ultimately yielding promising new drug candidates for two hard-to-treat diseases: hepatocellular carcinoma (HCC) – the third leading cause of cancer-related death worldwide – and the rare chronic lung disease idiopathic pulmonary fibrosis (IPF).

Searching for drugs that reverse the expression of disease-associated transcriptomic features has been widely explored as a route to drug repurposing candidates, but its potential for de novo drug discovery remains underexplored.
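To make the idea of "reversal" concrete, the sketch below scores how strongly a compound's expression profile opposes a disease signature. It is a minimal illustration only, not the scoring used in the study: the negative cosine similarity, the `reversal_score` function, the gene names and all numeric values are assumptions invented for this example.

```python
from math import sqrt


def reversal_score(disease_signature, compound_profile):
    """Return the negative cosine similarity over shared genes.

    Both arguments map gene name -> expression change (e.g. log fold change).
    A score near +1 means the compound pushes genes in the opposite direction
    to the disease; a score near -1 means it mimics the disease.
    """
    genes = sorted(set(disease_signature) & set(compound_profile))
    if not genes:
        raise ValueError("no genes in common")
    d = [disease_signature[g] for g in genes]
    c = [compound_profile[g] for g in genes]
    dot = sum(x * y for x, y in zip(d, c))
    norm = sqrt(sum(x * x for x in d)) * sqrt(sum(y * y for y in c))
    if norm == 0:
        return 0.0
    return -dot / norm


# Toy values, invented for illustration (not data from the study).
disease = {"GENE_A": 2.0, "GENE_B": -1.5, "GENE_C": 0.5}
candidates = {
    "compound_1": {"GENE_A": -1.8, "GENE_B": 1.2, "GENE_C": -0.4},
    "compound_2": {"GENE_A": 1.0, "GENE_B": -0.5, "GENE_C": 0.2},
}

# Rank candidates from strongest to weakest reversal.
ranked = sorted(
    candidates,
    key=lambda name: reversal_score(disease, candidates[name]),
    reverse=True,
)
print(ranked)
```

In this toy case, `compound_1` ranks first because it moves each gene in the direction opposite to the disease signature.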

To implement such an approach for screening ultra-large compound libraries, gene expression profiles of the compounds are required. Experimentally measured profiles can be used to train machine-learning models that then infer gene expression from chemical structure alone. Despite recent successes demonstrating the potential of this method in preclinical drug discovery, studies so far have only included commonly studied compounds; they have not yet investigated novel compounds or performed lead optimization, an essential step in early drug discovery.
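As a rough illustration of inferring expression from structure alone, the sketch below predicts a profile for an unmeasured compound by averaging the measured profiles of its most structurally similar neighbours. This is a simple nearest-neighbour baseline swapped in for clarity, not the deep-learning model described in the study. The fingerprints (sets of "on" bit indices), the `tanimoto` and `predict_profile` helpers, and all values are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict_profile(query_bits, training, k=2):
    """Predict expression changes as a similarity-weighted neighbour average.

    training is a list of (fingerprint_bits, profile) pairs, where every
    profile is a list of expression changes in the same gene order.
    """
    scored = sorted(
        ((tanimoto(query_bits, bits), profile) for bits, profile in training),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    n_genes = len(training[0][1])
    total = sum(sim for sim, _ in scored)
    if total == 0:
        # No structural overlap with any training compound: predict no change.
        return [0.0] * n_genes
    return [
        sum(sim * profile[i] for sim, profile in scored) / total
        for i in range(n_genes)
    ]


# Toy training set, invented for illustration (two genes per profile).
training = [
    ({1, 2, 3, 4}, [1.0, -1.0]),
    ({1, 2, 3, 5}, [0.8, -0.6]),
    ({7, 8, 9}, [-1.0, 1.0]),
]

predicted = predict_profile({1, 2, 3, 4}, training, k=2)
print(predicted)
```

A predicted profile of this kind could then be compared against a disease signature, which is what allows compounds with no measured expression data to be screened.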

To fill this gap in the literature, the researchers present ‘gene expression profile predictor on chemical structures’, or GPS – a drug discovery system for screening large compound libraries and designing new compounds that can reverse the disease-associated transcriptional phenotype.

First, they trained GPS on millions of published experimental measurements, covering more than 70 cell lines. This included gene expression change readings for 978 landmark genes in the four commonly studied cell lines that the team decided to focus on: MCF7, HepG2, PC3 and VCAP.

Then, they used the model to screen a large pool of compounds and to identify and validate promising candidates for multiple diseases, homing in on HCC and IPF, both of which urgently need new, effective therapeutics.

Using the human HCC cell lines Hep3B, HepG2 and Huh7, as well as HCC and IPF animal models and IPF human lung tissue samples, the researchers identified and validated therapeutic candidates that reverse disease-associated gene expression. For HCC, they discovered two unique compounds, while for IPF, they identified one repurposing candidate and one novel anti-fibrotic molecule.

In doing this, they have demonstrated the potential of applying a transcriptomics-based approach as a means of discovering new therapeutic targets for treating disease. Hoping to further future drug discovery efforts, the team has shared its code and developed a web portal to allow other researchers to use GPS for virtual compound screening.

“It’s like a paradigm shift approach for people to drive discovery,” Bin Chen, one of the study’s senior authors, declared. “I want more people to test this approach. But most importantly, I want people really to be able to use it to discover new therapeutics.”

“I think it already has been proved that this platform can be applied to two totally different diseases,” another senior author, Xiaopeng Li, added. “So this platform can be used for other diseases, to just unleash the potential.”

Source

Xing J, Tan M, Leshchiner D et al. Deep-learning-based de novo discovery and design of therapeutics that reverse disease-associated transcriptional phenotypes. Cell doi: 10.1016/j.cell.2026.02.016 (2026) (Epub ahead of print).
