Degree Type

Dissertation

Date of Award

2005

Degree Name

Doctor of Philosophy

Department

Computer Science

Major

Bioinformatics and Computational Biology

First Advisor

Vasant Honavar

Second Advisor

Drena Dobbs

Abstract

Identifying the interface residues involved in protein-protein and protein-DNA interactions is critical for understanding the functions of biological systems. Because experimental identification of interface residues cannot keep pace with the rate at which protein sequences are determined, computational methods for identifying interface residues are urgently needed. In this study, we apply machine-learning methods to identify interface residues, focusing on methods that use amino acid sequence information alone. We have developed classifiers that identify the residues involved in protein-protein and protein-DNA interactions using a window of primary sequence as input. The classifiers were evaluated on both representative datasets and specific cases of interest, using multiple measures. The results demonstrate the feasibility of identifying interface residues from sequence. We have also explored information beyond primary sequence to improve the performance of sequence-based classifiers. The results show that performance improves when the solvent accessibility and sequence entropy of the target residue are used as additional inputs. We have developed a database of protein-protein interfaces that contains all the protein-protein interfaces derived from the Protein Data Bank. This database makes possible, for the first time, the quick and flexible retrieval of interface sets and various interface features. We have systematically analyzed the characteristics of interfaces using the largest dataset available. In particular, we compared interfaces with samples that had the same solvent accessibility as the interfaces. This strategy removes the effect of solvent accessibility on the distributions of residues, secondary structure, and sequence entropy.
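
As an illustration only, the two sequence-derived inputs named in the abstract (a window of primary sequence around a target residue, and the sequence entropy of a position) might be computed along the following lines. This is a minimal sketch and not the dissertation's implementation: the function names, the window half-width of 4, the "-" padding character, and the use of Shannon entropy in bits over one alignment column are all assumptions made for the example.

```python
import math
from collections import Counter


def sequence_windows(sequence, half_width=4, pad="-"):
    """Return one window of 2*half_width + 1 residues centred on each position.

    half_width and pad are illustrative assumptions; the abstract does not
    state the window size or how sequence ends are handled.
    """
    padded = pad * half_width + sequence + pad * half_width
    size = 2 * half_width + 1
    return [padded[i:i + size] for i in range(len(sequence))]


def column_entropy(column):
    """Shannon entropy (in bits) of the residue frequencies in one alignment column.

    Assumes sequence entropy is measured per position over aligned homologous
    sequences; a fully conserved column gives 0.
    """
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Each window would become the input for classifying its central residue.
windows = sequence_windows("MKTAYIAKQR", half_width=2)
entropy = column_entropy("AAGA")  # residues seen at one position across 4 sequences
```

Under these assumptions, each residue yields one fixed-length window, and the entropy value could be appended to that window's features as an additional input.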

DOI

https://doi.org/10.31274/rtd-180813-2240

Publisher

Digital Repository @ Iowa State University, http://lib.dr.iastate.edu/

Copyright Owner

Changhui Yan

Language

en

Proquest ID

AAI3200470

File Format

application/pdf

File Size

121 pages
