Degree Type


Date of Award


Degree Name

Doctor of Philosophy


Computer Science

First Advisor

Vasant Honavar


Protein-protein interactions play a central role in the formation of protein complexes and the biological pathways that orchestrate virtually all cellular processes. Reliable identification of the specific amino acid residues that form the interface of a protein with one or more other proteins is critical to understanding the structural and physico-chemical basis of protein interactions and their role in key cellular processes, predicting protein complexes, validating protein interactions predicted by high-throughput methods, and identifying and prioritizing drug targets in computational drug design. Because experimental characterization of interface residues is difficult and costly, there is an urgent need for computational methods that reliably predict protein-protein interface residues from the sequence and, when available, the structure of a query protein and, when known, its putative interacting partner.

Against this background, this thesis develops improved methods for predicting protein-protein interface residues and protein-protein interfaces from the three-dimensional structure of an unbound query protein, without considering information about its binding partner. Towards this end, we develop (i) ProtInDb, a database of protein-protein interface residues that facilitates (a) the generation of datasets of protein-protein interface residues that can be used to analyze interaction sites and to train and evaluate predictors of interface residues, and (b) the visualization of interaction sites between proteins in both the amino acid sequences and the 3D protein structures, among other applications; (ii) PoInterS, a method for predicting protein-protein interaction sites formed by spatially contiguous clusters of interface residues, based on the predictions generated by a protein interface residue predictor; and (iii) PrISE, a method for predicting protein-protein interface residues based on the similarity between the structural element formed by the query residue and its neighboring residues and the structural elements extracted from the interface and non-interface regions of proteins that are members of experimentally determined protein complexes.
PoInterS divides a protein surface into a series of patches, each composed of several surface residues, and uses the outputs of the interface residue predictor to rank and select a small set of patches that are the most likely to constitute the interaction sites. In PrISE, a structural element captures the atomic composition and solvent accessibility of a central residue and its closest neighbors in the protein structure. PrISE decomposes a query protein into a set of structural elements and searches for similar elements in a large set of proteins that belong to one or more experimentally determined complexes. The structural elements that are most similar to each structural element extracted from the query protein are then used to infer whether its central residue is or is not an interface residue. The results of our experiments using a variety of benchmark datasets show that PoInterS and PrISE generally outperform the state-of-the-art structure-based methods for predicting interaction patches and interface residues, respectively.
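
The two prediction ideas described in the abstract can be illustrated with a rough Python sketch. This is not the thesis implementation: the `Residue` type, the function names, the L1 distance between atom-count histograms, the nearest-neighbor vote, and the mean-score patch ranking are all assumptions chosen for illustration, and solvent accessibility is left out for brevity.

```python
from collections import Counter
from dataclasses import dataclass
from math import dist


@dataclass
class Residue:
    """Hypothetical residue record; field names are illustrative only."""
    index: int
    atoms: Counter   # atom-type counts, e.g. Counter({"C": 3, "N": 1})
    coord: tuple     # one representative 3D coordinate per residue


def structural_element(residues, center, k=2):
    """Sum the atom counts of the central residue and its k nearest neighbors."""
    # The center itself is at distance 0, so k + 1 items = center + k neighbors.
    nearest = sorted(residues, key=lambda r: dist(r.coord, center.coord))[: k + 1]
    total = Counter()
    for residue in nearest:
        total.update(residue.atoms)
    return total


def element_distance(a, b):
    """L1 distance between two atom-count histograms (an assumed measure)."""
    return sum(abs(a[key] - b[key]) for key in set(a) | set(b))


def interface_score(query_element, repository, n=3):
    """Fraction of the n most similar repository elements labeled as interface.

    `repository` is a list of (element, is_interface) pairs, standing in for
    elements extracted from experimentally determined complexes.
    """
    nearest = sorted(
        repository, key=lambda item: element_distance(query_element, item[0])
    )[:n]
    return sum(1 for _, is_interface in nearest if is_interface) / len(nearest)


def rank_patches(residues, scores, patch_size=3, top=1):
    """Build one patch per residue and rank patches by mean residue score.

    `scores` maps residue index to a per-residue interface score from any
    interface residue predictor.
    """
    patches = []
    for center in residues:
        members = sorted(residues, key=lambda r: dist(r.coord, center.coord))
        members = members[:patch_size]
        mean = sum(scores[r.index] for r in members) / len(members)
        patches.append((mean, sorted(r.index for r in members)))
    patches.sort(key=lambda patch: patch[0], reverse=True)
    return patches[:top]
```

In this sketch, `interface_score` plays the role of the similarity-based residue prediction, and `rank_patches` shows how per-residue scores could be aggregated over spatially contiguous groups of residues to select candidate interaction sites.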


Copyright Owner

Rafael Armando Jordan



Date Available


File Format


File Size

154 pages