Characterizing and predicting protein modifications with a novel AI tool

Original story from Baylor College of Medicine (TX, USA).
An AI model that reveals how protein modifications link genetic mutations to disease has been developed.
Researchers at Baylor College of Medicine (TX, USA) have developed an AI model that reveals how protein modifications link genetic mutations to disease. The method, called DeepMVP, significantly outperforms previously published models and has implications for the development of novel therapeutics.
“Proteins are responsible for all the functions of the body, from growing tissues to regulating metabolism or fighting disease. Their functions are often regulated by modifications that take place after proteins are made through a process called post-translational modification (PTM),” explained corresponding author Bing Zhang, professor at the Lester and Sue Smith Breast Center and of molecular and human genetics at Baylor.
The modifications include the addition of chemical groups, such as phosphates or sugars, that influence how a protein behaves, where it goes in the cell or how long it lasts. When PTMs go wrong, the proteins may not perform as expected and may contribute to diseases like cancer, heart conditions or neurological disorders.
Understanding where PTMs happen can help predict how mutations in these locations may change a protein’s function in ways that affect a person’s health. For instance, PTMs can be disrupted by DNA mutations that can remove a PTM site in a protein, create a new site or affect nearby regions, altering the protein’s function.
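The three scenarios described above (a mutation removes a PTM site, creates a new one, or falls in a nearby region) can be illustrated with a short rule-based sketch. This is not how DeepMVP works: the model learns patterns from sequence data, whereas the function below only applies fixed rules. Phosphorylation is used purely as an example, and the names `Mutation`, `classify_mutation`, the toy sequence and the seven-residue `window` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass

# Residues that can carry a phosphate group. Phosphorylation is used
# here only as an example PTM; other PTMs target other residues.
MODIFIABLE = {"S", "T", "Y"}


@dataclass
class Mutation:
    position: int  # 1-based position in the protein sequence
    ref: str       # original amino acid (one-letter code)
    alt: str       # substituted amino acid (one-letter code)


def classify_mutation(sequence, ptm_sites, mutation, window=7):
    """Label a substitution relative to known PTM sites.

    ptm_sites is a set of 1-based positions known to be modified.
    window is an assumed flank size, not a value from the article.
    Rules are checked in order: loss, gain, proximity, otherwise distal.
    """
    if sequence[mutation.position - 1] != mutation.ref:
        raise ValueError("reference residue does not match the sequence")

    # Rule 1: the modified residue is replaced by one that cannot be modified.
    if mutation.position in ptm_sites and mutation.alt not in MODIFIABLE:
        return "removes_site"

    # Rule 2: a non-modifiable residue becomes a modifiable one.
    if mutation.ref not in MODIFIABLE and mutation.alt in MODIFIABLE:
        return "creates_candidate_site"

    # Rule 3: the change lies at or within `window` residues of a known site.
    if any(abs(site - mutation.position) <= window for site in ptm_sites):
        return "near_site"

    return "distal"


# Toy example: a made-up 16-residue sequence with one known site at S4.
sequence = "MKRSPTAYLLGGAAQE"
known_sites = {4}

for mut in (Mutation(4, "S", "A"), Mutation(13, "A", "S"),
            Mutation(5, "P", "L"), Mutation(16, "E", "K")):
    print(mut, "->", classify_mutation(sequence, known_sites, mut))
```

A rule set like this only flags where a mutation sits relative to a site; it says nothing about whether the modification actually increases or decreases, which is the kind of question a trained model is needed to answer.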
“We developed DeepMVP, a computational model to predict where in a protein PTMs happen and which mutations in those locations can affect PTMs,” commented co-first author Chenwei Wang, a postdoc in the Zhang lab. “To train the model to recognize patterns in protein sequences that indicate PTM sites, we created the PTMAtlas, a curated compendium of known 397,524 PTM sites generated through systematic reprocessing of 241 public datasets. We focused on six common PTMs.”
PTMAtlas includes nearly 400,000 PTM sites across thousands of human proteins and, compared to other databases, is more comprehensive and accurate. Trained on this resource, DeepMVP can predict PTM sites on all human proteins and even on viral proteins like those from SARS-CoV-2, which indicates that it is a powerful resource for studying protein modifications.
DeepMVP outperformed eight existing similar tools. Testing its ability to predict how mutations affect PTMs using a curated set of 235 known mutation-PTM pairs from scientific literature showed that DeepMVP correctly predicted the PTM site in 81% of cases and the direction of change (increase or decrease) in 97% of cases.
“We anticipate that DeepMVP can be applied to cancer, neurological conditions and cardiovascular diseases and accelerate discoveries in genetics, cancer biology and drug development,” Zhang concluded. “The tool is freely available to researchers worldwide at https://deepmvp.ptmax.org/.”