Biology has no wallflowers. Inside cells, it’s a party: DNA dances with RNA, enzymes hug small molecules, and proteins interact with proteins. Biologists, however, tend to study proteins one at a time. For example, Barry Honig’s group at Columbia University estimates that, as of 2010, individual structures were solved for about 10 percent of all yeast proteins. Meanwhile, structures were available for only 0.5 percent of protein complexes.
“Structural BLAST” is how Honig describes the PrePPI approach. The BLAST (basic local alignment search tool) analogy resonates with anyone who uses the public, online search tool to get nucleotide or amino acid sequence alignments. PrePPI users plug in protein queries and get search results for predicted PPIs. PrePPI (pronounced preppy) searches for local geometric relationships between proteins based on both experimentally determined and modeled protein structures. “We look broadly for any relationship between two structures because we can get information from these relationships and the models on which they are based, even if they are imperfect,” says Honig. PrePPI is meant to be used like BLAST: to kickstart hypotheses, generate research ideas, and plan experiments.
How PrePPI Works
PrePPI predicts interactions between a pair of query proteins using known structures for the proteins or—and this is the key—for their homologs. Although other PPI researchers have used homology modeling, Honig’s group is the first to apply the strategy on such a large scale. Any complex with a structure that involves the proteins, their homologs, or structural neighbors becomes the template for how the query proteins might interact.
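The template search described above can be pictured with a toy lookup. This is only an illustrative sketch, not PrePPI's actual code: the protein names, the NEIGHBORS table, the KNOWN_COMPLEXES set, and the find_templates function are all hypothetical, and the real method works on three-dimensional structures rather than name lookups.

```python
from itertools import product

# Hypothetical toy data: each query protein maps to itself plus its
# homologs and structural neighbors that have known structures.
NEIGHBORS = {
    "QueryA": {"QueryA", "HomologA1", "NeighborA2"},
    "QueryB": {"QueryB", "HomologB1"},
}

# Hypothetical set of solved complexes, each stored as an unordered pair.
KNOWN_COMPLEXES = {
    frozenset({"HomologA1", "HomologB1"}),
    frozenset({"NeighborA2", "Unrelated"}),
}


def find_templates(query_a, query_b, neighbors=NEIGHBORS, complexes=KNOWN_COMPLEXES):
    """Return stand-in pairs (one per query) that appear together in a solved complex."""
    templates = []
    # Try every combination of a stand-in for A with a stand-in for B.
    for stand_in_a, stand_in_b in product(sorted(neighbors[query_a]),
                                          sorted(neighbors[query_b])):
        if frozenset({stand_in_a, stand_in_b}) in complexes:
            templates.append((stand_in_a, stand_in_b))
    return templates


print(find_templates("QueryA", "QueryB"))
```

In this toy case, the only usable template is the complex formed by one homolog of each query protein. The complex involving NeighborA2 is ignored because its partner is not related to QueryB.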
Then PrePPI rates the template complex. It scores the structural fit between the query proteins and their stand-ins and asks whether the query proteins have the right amino acids in the right place to make the bonds of the interaction. PrePPI also checks if the query proteins have properties that make the interaction likely: similar expression profiles, functions, cellular location, and evolutionary patterns. Although each data point might be weak on its own, enough evidence adds up to a high-confidence interaction prediction.
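One common way to let weak clues add up is to treat each clue as a likelihood ratio and multiply the ratios together, a naive Bayes style of scoring. The sketch below assumes that scheme purely for illustration; the evidence names, the numeric ratios, and the prior odds are hypothetical values, not figures from PrePPI.

```python
import math


def combined_likelihood_ratio(ratios):
    """Multiply independent likelihood ratios into one overall ratio."""
    return math.prod(ratios)


def posterior_probability(ratios, prior_odds):
    """Convert prior odds plus evidence into a probability of interaction."""
    odds = prior_odds * combined_likelihood_ratio(ratios)
    return odds / (1.0 + odds)


# Hypothetical likelihood ratios for one protein pair (values are made up).
evidence = {
    "structural_fit": 8.0,
    "coexpression": 3.0,
    "shared_function": 4.0,
    "same_location": 2.0,
    "evolutionary_pattern": 2.5,
}

# Hypothetical prior: 1 true interaction per 100 non-interacting pairs.
PRIOR_ODDS = 0.01

strongest_alone = posterior_probability([evidence["structural_fit"]], PRIOR_ODDS)
all_together = posterior_probability(evidence.values(), PRIOR_ODDS)

print(f"strongest clue alone: {strongest_alone:.3f}")
print(f"all clues combined:   {all_together:.3f}")
```

With these made-up numbers, the strongest single clue leaves the pair unlikely to interact, while the five clues together push the probability above 0.8. The multiplication is only valid to the extent that the clues are independent of one another.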
Overall, PrePPI is too blunt an instrument for Sandor Vajda’s research on the structure of protein complexes. But Vajda, who self-identifies as a “classic protein docking guy” and is director of Boston University’s Biomolecular Engineering Research Center, says PrePPI adds to earlier bioinformatics efforts on PPIs. Although the protein interaction field has heaps of experimental data from mass spectrometry experiments and yeast two-hybrid screens, the reliability of these data can be low. So researchers need additional information to weed out false leads. “What Honig brings to the table,” says Vajda, “is adding structure to computational PPI predictions and a good scoring scheme that filters out many false positives.”
In fact, PrePPI performed at least as well as high-throughput experimental methods. PrePPI also showed that data on a distantly related protein can predict a protein’s structure. For example, PrePPI used a template of interacting proteins in the ubiquitin pathway to correctly predict an association between two kinases that were unrelated to the ubiquitin-pathway models.
You can see if your favorite proteins might interact at http://bhapp.c2b2.columbia.edu/PrePPI/. So far, PrePPI predicts yeast and human PPIs, with proteins from other organisms to come. Honig says, “Put in a protein name, and we’ll give you our best shot.”
From PPIs to New Drugs
Future versions of PrePPI might predict associations between proteins and membranes, nucleotides, or—crucial to drug development—small molecules. This is the realm of David Koes, a computational biologist at the University of Pittsburgh who works on drugs that target PPIs. An example is the PPI between the tumor suppressor protein p53 and its negative regulator Mdm2. Interrupting this interaction could combat cancer by freeing p53, and candidate drugs are currently in early trials. The challenge of PPI drug discovery, says Koes, is that “proteins in PPIs evolved to bind large molecules—other proteins—not small molecules like drugs.”
That’s why Koes and colleagues study PPI interfaces: the size, charges, and arrangements of interacting amino acids. The goal is to find small molecules that mimic the PPI interface and might wedge between proteins, blocking interactions “like a foot in a door,” says Koes. To find molecules that target a PPI, he has a package of tools. To begin with, Koes offers PocketQuery, a free online tool that, unlike PrePPI, requires known protein structures. Then based on information from PocketQuery about the PPI interface, additional search tools such as AnchorQuery and ZINCPharmer help him find small molecules that might modulate the PPI (2-3).
An eternal challenge for computer modelers is the complexity of biology, which stubbornly refuses to be reduced to a simple algorithm. However, Koes says, the PrePPI method of combining as much information as possible to create a meaningful model is “the heart of systems biology.” This is the future of personalized medicine—using large amounts of data about a single patient to make individualized predictions about disease, diagnosis, and treatment.
In the end, Koes, Honig, and other modelers also make up for the inevitable simplifications of computer modeling by adding the human element. “We generate interactive programs that incorporate the expert knowledge of the person doing the search,” Koes says. Fast, free, flexible modeling tools like PrePPI, PocketQuery, and others respect the user’s intelligence. The tools do not provide the ultimate answers. Instead, users get information that is a starting point for hypotheses, experiments, and further queries.
And Just For Fun…
More free tools and even contests for exploring PPIs abound. Sandor Vajda was an initiator of the Critical Assessment of Prediction of Interactions (CAPRI), which he describes as “an international experiment, but also a competition.” CAPRI is an ongoing contest, just for fun and glory, with the same give-it-a-try spirit as PrePPI.
Hosted by the European Bioinformatics Institute, CAPRI receives solved but still unpublished structures of protein-protein complexes from researchers who are willing to wait a few weeks before releasing them. CAPRI contestants get information about the proteins in the complex and have a month to submit interaction models. A CAPRI committee determines a “winner” by comparing submitted models to the known, but previously secret, structure. Boston University’s free, online PPI modeling program ClusPro (cluspro.bu.edu/) consistently performs well in CAPRI (4).
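Comparing a submitted model with the solved structure comes down to measuring how far the predicted atoms sit from the real ones. The sketch below uses plain root-mean-square deviation (RMSD) as a simplified stand-in; the actual CAPRI assessment relies on more elaborate criteria. The coordinates, team names, and function names are hypothetical, and the models are assumed to be already superimposed on the reference.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures need the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))


def rank_models(reference, models):
    """Sort submitted models from closest to farthest from the reference."""
    scored = ((name, rmsd(reference, coords)) for name, coords in models.items())
    return sorted(scored, key=lambda item: item[1])


# Hypothetical three-atom "structure" and two submissions.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
submissions = {
    "team_a": [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)],  # shifted by 1 along z
    "team_b": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],  # exact match
}

ranking = rank_models(reference, submissions)
for name, score in ranking:
    print(f"{name}: RMSD = {score:.2f}")
```

Lower values mean a closer match, so the submission at the top of the ranking would be the toy “winner.”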
All things considered, ClusPro, PrePPI, PocketQuery, and other PPI tools are giving researchers lots to do, on their desktops and at the bench. Vajda says that predicted interactions from PrePPI are a good starting point for exploring PPIs experimentally, validating predicted interactions and determining their physiological importance.
Barry Honig says that PrePPI will contribute to and benefit from experimental research: “Computational modeling starts with experimental data but drives wet lab work, too. It goes both ways.”
1. Zhang, Q. C., D. Petrey, L. Deng, L. Qiang, Y. Shi, C. A. Thu, B. Bisikirska, C. Lefebvre, D. Accili, T. Hunter, T. Maniatis, A. Califano, and B. Honig. 2012. Structure-based prediction of protein-protein interactions on a genome-wide scale. Nature 490(7421):556-560.
2. Koes, D., K. Khoury, Y. Huang, W. Wang, M. Bista, G. M. Popowicz, S. Wolf, T. A. Holak, A. Dömling, and C. J. Camacho. 2012. Enabling large-scale design, synthesis and validation of small molecule protein-protein antagonists. PLoS ONE 7(3):e32839.
3. Koes, D. R., and C. J. Camacho. 2012. ZINCPharmer: pharmacophore search of the ZINC database. Nucleic Acids Res. 40:W409-W414.
4. Kozakov, D., D. R. Hall, D. Beglov, R. Brenke, S. R. Comeau, Y. Shen, K. Li, J. Zheng, P. Vakili, I. Ch. Paschalidis, and S. Vajda. 2010. Achieving reliability and high accuracy in automated protein docking: ClusPro, PIPER, SDU, and stability analysis in CAPRI rounds 13-19. Proteins 78(15):3124-3130.