Degree Type

Dissertation

Date of Award

2011

Degree Name

Doctor of Philosophy

Department

Biochemistry, Biophysics and Molecular Biology

First Advisor

Robert L. Jernigan

Abstract

The physical characteristics of amino acids are responsible for the folding of protein sequences into their native structures. Understanding protein sequence-structure relationships is required to solve the folding problem, which is one of the most important problems in computational structural biology. Even though there are tens of thousands of protein structures in the Protein Data Bank, it is not understood why they adopt their particular structures or why they are limited to a few thousand folds. It is well known that protein structures are evolutionarily more conserved than sequences and that sequences with low sequence identity can often share the same fold. This leads to the concept of protein designability. The designability of a particular conformation is defined as the number of different sequences that fold to it as their unique minimum-energy state.
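
The definition of designability above can be illustrated with a toy calculation. The following is a minimal sketch, not the method used in the dissertation: it assumes a two-dimensional square lattice, a two-letter hydrophobic/polar (HP) alphabet, and an energy of -1 for every non-bonded H-H contact. The function names (conformations, contacts, designability) are hypothetical and chosen only for this illustration.

```python
from itertools import product

MOVES = ((1, 0), (-1, 0), (0, 1), (0, -1))

def conformations(n):
    """Self-avoiding walks of n >= 2 beads on the square lattice,
    one representative per rotation/reflection class."""
    results = []

    def extend(path, turned):
        if len(path) == n:
            results.append(tuple(path))
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt in path:
                continue  # site already occupied
            if not turned and dy == -1:
                continue  # the first turn must go "up" (removes mirror images)
            path.append(nxt)
            extend(path, turned or dy != 0)
            path.pop()

    # The first bond is fixed along +x, which removes the four rotations.
    extend([(0, 0), (1, 0)], False)
    return results

def contacts(path):
    """Pairs (i, j) of beads that are lattice neighbours but not chain neighbours."""
    index = {site: i for i, site in enumerate(path)}
    pairs = set()
    for i, (x, y) in enumerate(path):
        for dx, dy in MOVES:
            j = index.get((x + dx, y + dy))
            if j is not None and j > i + 1:
                pairs.add((i, j))
    return frozenset(pairs)

def designability(n):
    """For each conformation, count the HP sequences that have it as their
    unique lowest-energy state (assumed energy: -1 per H-H contact)."""
    confs = conformations(n)
    maps = [contacts(c) for c in confs]
    counts = [0] * len(confs)
    for seq in product("HP", repeat=n):
        energies = [-sum(seq[i] == "H" and seq[j] == "H" for i, j in m)
                    for m in maps]
        lowest = min(energies)
        ground = [k for k, e in enumerate(energies) if e == lowest]
        if len(ground) == 1:  # degenerate ground states design nothing
            counts[ground[0]] += 1
    return confs, counts

if __name__ == "__main__":
    confs, counts = designability(6)
    best = max(range(len(confs)), key=counts.__getitem__)
    print("most designable conformation:", confs[best], "count:", counts[best])
```

Exhaustive enumeration like this is only feasible for very short chains, since the number of walks and the number of sequences both grow exponentially with chain length.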

Graph features of contact diagrams are employed here to describe the topology of lattice models of proteins and of coarse-grained protein structures. The relationship between these graph features and the designability of structures is explored in several ways. A relationship is found between some simple geometric graph features and designability: highly designable structures can be distinguished from poorly designable ones on the basis of those features. This finding confirms that the topology of a protein structure, which gives rise to its residue-residue interaction network, is an important determinant of its designability.
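
As a rough illustration of what graph features of a contact diagram can look like, the sketch below builds the contact graph of a lattice conformation and computes a few simple quantities. The abstract does not list the features actually used, so the ones here (number of contacts, mean and maximum degree, mean sequence separation of contacts) are assumptions chosen for illustration, and the names contact_graph and graph_features are hypothetical.

```python
def contact_graph(coords):
    """Edges (i, j) between beads that sit on adjacent lattice sites
    but are not consecutive along the chain."""
    edges = []
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):
            # On an integer lattice, Manhattan distance 1 means adjacent sites.
            dist = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
            if dist == 1:
                edges.append((i, j))
    return edges

def graph_features(coords):
    """A few illustrative features of the contact graph (assumed, not
    taken from the dissertation)."""
    edges = contact_graph(coords)
    degree = [0] * len(coords)
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return {
        "n_contacts": len(edges),
        "mean_degree": sum(degree) / len(coords),
        "max_degree": max(degree),
        "mean_separation": (sum(j - i for i, j in edges) / len(edges)
                            if edges else 0.0),
    }

if __name__ == "__main__":
    # A compact 3 x 2 conformation of six beads.
    snake = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1)]
    print(graph_features(snake))
```

Because adjacency is tested with the Manhattan distance over whatever coordinates are supplied, the same functions accept two- or three-dimensional lattice coordinates.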

We find that the higher the designability of a structure, the more diverse its sequence space. However, some positions are conserved, most often as either polar or hydrophobic. The hydrophobic/polar profiles of highly and poorly designable sequences differ markedly, which makes the two classes clearly distinguishable. These profiles can be used to train machine learning algorithms to predict the designability of sequences.

Copyright Owner

Sumudu Pamoda Leelananda

Language

en

Date Available

2012-04-28

File Format

application/pdf

File Size

134 pages
