Predicting sgRNA efficiency with quantum biology and AI

Written by Aisha Al-Janabi (Content Editor)

Researchers advance understanding of single guide RNA (sgRNA) design to optimize CRISPR/Cas9 cutting efficiency using an explainable AI model.

Researchers at Oak Ridge National Laboratory (TN, USA) used AI, quantum biology and bioengineering to improve sgRNA design for gene editing with CRISPR/Cas9, especially in microbes. This could have implications for drug development and modifying microbes for renewable fuels and chemical production.

CRISPR/Cas9 relies on an sgRNA to direct the Cas9 enzyme to bind and cleave the target site on the genome. Current computational models for predicting effective guide RNAs for CRISPR tools are built using data from only a few model species and perform inconsistently when applied to microbes. “Few have been geared towards microbes where the chromosomal structures and sizes are very different,” explained Carrie Eckert, a co-author of the paper. “We had observed that models for designing the CRISPR Cas9 machinery behave differently when working with microbes and this research validates what we’d known anecdotally.”


To improve sgRNA design, the researchers turned to quantum biology to examine what happens at the molecular level, investigating how electronic structure affects the chemical properties and interactions of nucleotides. Electron distribution influences the likelihood that the Cas9 enzyme–guide RNA complex will effectively bind with the microbe’s DNA.

The researchers built an explainable AI model called iterative Random Forest (iRF), trained on a dataset of around 50,000 guide RNAs that target the genome of Escherichia coli. Explainable AI models are designed to help scientists understand the underlying biological mechanisms, unlike black-box algorithms, which lack interpretability and keep their internal workings hidden from the user.
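
To give a rough sense of what training a tree-based model on guide RNAs involves, the sketch below turns a guide sequence into a numeric feature vector. It is an illustration only: the article does not describe the features the researchers used, so the position-by-nucleotide one-hot encoding, the GC-content feature and the function names (one_hot_encode, gc_content, featurize) are all assumptions, and the quantum chemical properties mentioned above are not modeled here.

```python
# Hypothetical featurization of a guide sequence for a tree-based model.
# The actual features used in the study are not described in this article.
NUCLEOTIDES = "ACGT"


def one_hot_encode(guide: str) -> list[int]:
    """Return a flat one-hot vector: four 0/1 values (A, C, G, T) per position."""
    features = []
    for base in guide.upper():
        if base not in NUCLEOTIDES:
            raise ValueError(f"unexpected base: {base!r}")
        features.extend(1 if base == n else 0 for n in NUCLEOTIDES)
    return features


def gc_content(guide: str) -> float:
    """Return the fraction of G and C bases in the guide."""
    if not guide:
        raise ValueError("guide sequence is empty")
    guide = guide.upper()
    return (guide.count("G") + guide.count("C")) / len(guide)


def featurize(guide: str) -> list[float]:
    """Combine the one-hot positions and GC content into one feature vector."""
    return [float(x) for x in one_hot_encode(guide)] + [gc_content(guide)]
```

A 20-nucleotide guide would yield 81 values under this scheme (20 positions times 4 nucleotides, plus GC content). Because each value corresponds to a named property of the sequence, a model trained on vectors like these can report which properties mattered most, which is the kind of interpretability the article attributes to explainable AI.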

The model revealed key features about nucleotides, which can improve the selection and design rules of sgRNAs for optimal cutting efficiency. “The model helped us identify clues about the molecular mechanisms that underpin the efficiency of our guide RNAs, giving us a rich library of molecular information that can help us improve CRISPR technology,” explained co-author Erica Prates.

The iRF model was validated using CRISPR/Cas9 cutting experiments on Escherichia coli with a group of guides selected by the model.

Improved guide RNAs will reduce costly ‘typos’ in an organism’s genetic code. “The better we understand the biological process at play and the more data we can feed into our predictions, the better our targets will be, improving the precision and speed of our research,” commented Eckert.