AlphaFold computational modeling in drug discovery: the good, the bad and the AI

Written by Michael Bell (Commissioning Editor)

A new study has used AlphaFold machine-learning software to search the Escherichia coli (E. coli) essential proteome for potential targets for novel antibacterial drugs.

A team of researchers from the Massachusetts Institute of Technology (MIT; MA, USA) has probed the abilities of AlphaFold in drug discovery. By screening E. coli’s essential proteins against known antibacterial compounds to identify docking interactions between the drugs and the proteins, the team obtained mixed results for AlphaFold’s capabilities. The findings provide critical information for the use of in silico screens in drug discovery.

Drug discovery is a long and costly process, during which a candidate drug molecule is custom made to bind to a specific biological target, such as a receptor or the active site of a protein. To do this, candidate drugs must be optimized to bind as readily and specifically to a target as possible. In silico experimentation has revolutionized the world of drug discovery, allowing chemists to visualize a world nearly impossible to capture through any microscope.

Computational modeling allows candidate drug molecules to be screened swiftly against a myriad of criteria. An extensive suite of molecular docking software exists, allowing the virtual assessment of everything from binding energies to toxicity. One limitation of computational modeling is that, when probing an interaction between a drug and a protein, the full structure of the protein must already be known.


Proteins, in the eyes of a chemist, are gargantuan molecules of bewildering size and complexity. Composed mainly of five elements (carbon, hydrogen, oxygen, nitrogen and sulfur) and assembled from discrete building blocks, amino acids, linked together by peptide bonds into a polypeptide chain, protein structures are simple in theory. However, that chain can stretch into the hundreds and even into the thousands of amino acid residues. Moreover, the amino acids in the chain interact with each other; intramolecular attractions occur and in certain circumstances covalent bonds form, folding and contorting the chain into a dynamic three-dimensional structure.

Determining the final structure of a protein can take a team of researchers years and requires advanced experimental techniques such as X-ray crystallography, two-dimensional nuclear magnetic resonance (NMR) spectroscopy and complex biological assays.

AlphaFold, machine-learning software developed by DeepMind, a subsidiary of Google, has demonstrated its ability to predict a full protein structure from its primary amino acid sequence. In the recent MIT study, led by James Collins, the team tested AlphaFold’s abilities using a screen of the 296 essential proteins of the bacterium E. coli and 218 known antibacterial compounds.

The team screened the docking interactions of the drugs and the proteins, generating predictions for potential drug–protein interactions from AlphaFold’s structures. However, when these predictions were compared with experimentally proven interactions between the drugs and 12 of the proteins, the false positive rate was found to be similar to the true positive rate.

The area under the receiver operating characteristic curve (auROC), a standard metric for evaluating the performance of computational models, confirmed the poor performance in this study. “Utilizing these standard molecular docking simulations, we obtained an auROC value of 0.5,” explained lead author of the study James Collins, “which basically says you’re doing no better than if you were randomly guessing.”
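To illustrate why an auROC of 0.5 amounts to guessing, the short Python sketch below computes the metric as the probability that a randomly chosen true interaction receives a higher score than a randomly chosen non-interaction, with ties counting as half. This is a minimal illustration and not the study's own pipeline: the function name `auroc`, the toy labels and scores, and the convention that a higher score means stronger predicted binding are all assumptions made for the example.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Return the area under the ROC curve via pairwise comparison.

    labels: 1 for an experimentally confirmed interaction, 0 otherwise.
    scores: predicted binding strength (assumed: higher = stronger binding).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # true interaction ranked above a non-interaction
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Toy data only; these are not values from the study.
labels = [1, 1, 0, 0]
perfect = auroc(labels, [0.9, 0.8, 0.3, 0.1])        # every true pair ranked first
uninformative = auroc(labels, [0.5, 0.5, 0.5, 0.5])  # scores carry no information
inverted = auroc(labels, [0.1, 0.3, 0.8, 0.9])       # every true pair ranked last
print(perfect, uninformative, inverted)
```

A model that ranks every true interaction above every non-interaction scores 1.0, one that ranks them all below scores 0.0, and one whose scores cannot separate the two groups lands at 0.5, which is the value Collins describes.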


Interestingly, the researchers found that there was little difference in performance when the simulations used experimentally determined protein structures instead of AlphaFold-predicted structures.

The team proposed that one shortfall of the modeling was treating the protein as a static object, which could not be further from the truth. Proteins exist in a state of flux and can shift their conformations. The team refined their study by creating four additional models, which allowed them to provide more data for their machine-learning models.

Overall, the team were hopeful yet realistic about the use of AI in drug discovery, stating that with further improvements and more detailed biophysical and biochemical data, the models will become more sophisticated and their predictions increasingly accurate. “We’re optimistic that with improvements to the modelling approaches and expansion of computing power, these techniques will become increasingly important in drug discovery,” said Collins. “However, we have a long way to go to achieve the full potential of in silico drug discovery.” Only time will tell if AlphaFold will truly revolutionize the world of computational modeling.