Degree Type

Dissertation

Date of Award

2009

Degree Name

Doctor of Philosophy

Department

Computer Science

Major

Bioinformatics and Computational Biology

First Advisor

Vasant Honavar

Abstract

Protein-protein interactions play a pivotal role in biological metabolism, directing many cellular processes such as signal transduction, DNA replication, and RNA splicing. Identifying protein-protein interaction sites is important for determining protein function, improving protein-protein docking, and supporting rational drug design. Experimental methods for identifying these sites are time-consuming and costly, which motivates the use of computational methods in this area.

This research consists of three parts:

We have built a Protein-Protein Interface Database (PPIDB), which contains 71,486 protein-protein interfaces extracted from experimentally determined protein complex structures in the current version of the Protein Data Bank. It facilitates the construction of well-characterized datasets of protein-protein interface residues for computational analyses. The database is accessible through a web interface at http://ppidb.cs.iastate.edu and through a set of web services.

We have carried out a comprehensive analysis of dimeric protein-protein interfaces covering thirteen physicochemical properties. The results show that interface residues have side chains pointing inward; that interfaces are rougher, tend to be flat, moderately convex or concave, and protrude more than non-interface surface residues; and that interface residues tend to be surrounded by hydrophobic neighbors.

We have developed NB PPIPS, a Naive Bayes method for predicting protein-protein interaction sites on protein surfaces. Trained on a non-redundant data set of 2,383 proteins and given sequence, evolutionary, and structural properties as input, NB PPIPS achieves 60.7% recall and 34.6% precision in 10-fold cross-validation, a marked improvement over a baseline classifier that uses only protein sequence information. We also apply NB PPIPS in a two-stage prediction of protein-protein interfaces when only the protein sequence is known: modeled protein structures are generated via homology modeling and fed as input to NB PPIPS. The results show that good predictions are obtained only for well-modeled structures. NB PPIPS is implemented as an online server to facilitate its use and is accessible at http://watson.cs.iastate.edu/nb_ppips.
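To make the terms in the third part concrete, the following is a minimal sketch of a categorical Naive Bayes classifier together with the precision and recall measures quoted above. It is not the NB PPIPS implementation: the two toy features (residue type and surface exposure), the Laplace smoothing parameter `alpha`, the training rows, and the function names `train_naive_bayes`, `predict`, and `precision_recall` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels, alpha=1.0):
    """Fit a categorical Naive Bayes model with Laplace smoothing (alpha)."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    values = defaultdict(set)              # feature index -> values seen in training
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(i, label)][value] += 1
            values[i].add(value)
    return class_counts, feature_counts, values, alpha


def predict(model, row):
    """Return the label with the highest log posterior for one residue."""
    class_counts, feature_counts, values, alpha = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for i, value in enumerate(row):
            count = feature_counts[(i, label)][value]
            # Smoothed conditional probability of this feature value given the label.
            score += math.log((count + alpha) / (n + alpha * len(values[i])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def precision_recall(true_labels, predicted_labels, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical toy data: 1 = interface residue, 0 = non-interface residue.
train_rows = [
    ("hydrophobic", "exposed"),
    ("hydrophobic", "exposed"),
    ("polar", "exposed"),
    ("polar", "buried"),
    ("polar", "buried"),
    ("hydrophobic", "buried"),
]
train_labels = [1, 1, 0, 0, 0, 0]

model = train_naive_bayes(train_rows, train_labels)
predictions = [predict(model, row) for row in train_rows]
print(precision_recall(train_labels, predictions))
```

In this setting, recall is the fraction of true interface residues that the classifier recovers, and precision is the fraction of predicted interface residues that are correct.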

DOI

https://doi.org/10.31274/etd-180810-4321

Copyright Owner

Feihong Wu

Language

en

Date Available

2012-04-30

File Format

application/pdf

File Size

121 pages
