Date of Award: 2011
Doctor of Philosophy
Biochemistry, Biophysics and Molecular Biology
Robert L. Jernigan
The physical characteristics of amino acids are responsible for the folding of protein sequences into their native structures. Understanding protein sequence-structure relationships is required to solve the folding problem, one of the most important problems in computational structural biology. Even though there are tens of thousands of protein structures in the Protein Data Bank, it is not understood why proteins adopt their particular structures or why those structures are limited to a few thousand folds. It is well known that protein structures are evolutionarily more conserved than sequences and that sequences with low sequence identity can often share the same fold. This leads to the concept of protein designability. The designability of a particular conformation is defined as the number of different sequences that fold to it as their unique minimum-energy state.
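The definition of designability can be made concrete with a small sketch. The example below is an illustration under stated assumptions, not the procedure used in the thesis: it assumes a two-letter hydrophobic/polar (H/P) alphabet, compact walks on a 3 x 3 square lattice, contact energies of -2.3 (HH), -1 (HP) and 0 (PP), and it treats walks with identical contact maps as one structure. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Assumed contact energies, scaled by 10 so that ties are compared exactly
# in integers (HH = -2.3, HP = -1.0, PP = 0.0). Illustrative values only.
ENERGY = {("H", "H"): -23, ("H", "P"): -10, ("P", "H"): -10, ("P", "P"): 0}


def compact_walks(width, height):
    """Enumerate every directed self-avoiding walk that fills a width x height grid."""
    sites = {(x, y) for x in range(width) for y in range(height)}
    walks = []

    def extend(path, visited):
        if len(path) == len(sites):
            walks.append(tuple(path))
            return
        x, y = path[-1]
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt in sites and nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                extend(path, visited)
                path.pop()
                visited.remove(nxt)

    for start in sorted(sites):
        extend([start], {start})
    return walks


def contact_map(walk):
    """Return the set of non-bonded residue pairs (i, j) that are lattice neighbours."""
    index = {site: i for i, site in enumerate(walk)}
    contacts = set()
    for (x, y), i in index.items():
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index.get(neighbour)
            if j is not None and abs(i - j) > 1:
                contacts.add((min(i, j), max(i, j)))
    return frozenset(contacts)


def designability(width=3, height=3):
    """Count, per structure, the H/P sequences having it as unique ground state."""
    # Walks related by rotation or reflection share a contact map, so
    # collapsing to distinct contact maps removes those duplicates.
    structures = list({contact_map(w) for w in compact_walks(width, height)})
    counts = Counter({s: 0 for s in structures})
    for seq in product("HP", repeat=width * height):
        energies = [sum(ENERGY[seq[i], seq[j]] for i, j in s) for s in structures]
        best = min(energies)
        if energies.count(best) == 1:  # skip sequences with a degenerate ground state
            counts[structures[energies.index(best)]] += 1
    return structures, counts


structures, counts = designability(3, 3)
for s in structures:
    print(sorted(s), counts[s])
```

Each printed line gives a structure's contact map followed by the number of sequences that select it as their unique lowest-energy conformation. Sequences whose lowest energy is shared by several structures are not counted for any of them.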
Graph features of contact diagrams are employed here to describe the topology of lattice models of proteins and of coarse-grained protein structures. The relationship between these graph features and the designability of structures is explored in various ways. A relationship is found between some simple geometric graph features and designability: highly designable structures can be distinguished from poorly designable structures on the basis of those features. This finding confirms that the topology of a protein structure, which gives rise to its residue-residue interaction network, is an important determinant of its designability.
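As a rough illustration of what graph features of a contact diagram can look like, the sketch below builds the contact graph of a 2D lattice conformation and computes a few simple quantities. The feature set (contact count, contact degree, relative contact order) is an assumption chosen for illustration; the abstract does not specify which features the thesis uses, and the function name is hypothetical.

```python
def contact_graph_features(walk):
    """Compute simple contact-graph features for a lattice walk of (x, y) sites."""
    n = len(walk)
    index = {site: i for i, site in enumerate(walk)}

    # Contacts are lattice neighbours that are not consecutive along the chain.
    contacts = []
    for (x, y), i in index.items():
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index.get(neighbour)
            if j is not None and abs(i - j) > 1:
                contacts.append((min(i, j), max(i, j)))

    # Degree of each residue in the contact graph.
    degree = [0] * n
    for i, j in contacts:
        degree[i] += 1
        degree[j] += 1

    # Relative contact order: mean sequence separation of contacts, divided by n.
    total_separation = sum(j - i for i, j in contacts)
    rco = total_separation / (len(contacts) * n) if contacts else 0.0

    return {
        "n_contacts": len(contacts),
        "max_degree": max(degree),
        "mean_degree": sum(degree) / n,
        "relative_contact_order": rco,
    }


# Two compact conformations on a 3 x 3 lattice: a back-and-forth "snake"
# and an inward spiral that ends on the central site.
snake = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1), (0, 2), (1, 2), (2, 2)]
spiral = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1), (1, 1)]

print(contact_graph_features(snake))
print(contact_graph_features(spiral))
```

Both conformations have four contacts, but the spiral concentrates three of them on its central residue and has longer-range contacts on average, so the features separate the two topologies even on this tiny lattice.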
We find that the higher the designability of a structure, the more diverse its sequence space. However, some positions are conserved, most often as either polar or hydrophobic. The hydrophobic/polar profiles of highly and poorly designable sequences differ markedly, which makes the two groups clearly distinguishable. These profiles can be used to train machine learning algorithms to predict the designability of sequences.
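A hydrophobic/polar profile of the kind described above can be sketched as a per-position frequency of hydrophobic residues across the sequences that fold to one structure, with the Shannon entropy of each position as a measure of conservation. The sequences below are made-up toy data, and the function names are hypothetical; this is only an illustration of how such a profile could be turned into a feature vector, not the thesis's method.

```python
from math import log2


def hp_profile(sequences):
    """Fraction of sequences with a hydrophobic (H) residue at each position."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    return [sum(s[pos] == "H" for s in sequences) / len(sequences)
            for pos in range(length)]


def position_entropy(h_fraction):
    """Shannon entropy (bits) of a two-state H/P position; 0 means fully conserved."""
    return -sum(p * log2(p) for p in (h_fraction, 1.0 - h_fraction) if p > 0)


# Toy H/P sequences, assumed to fold to the same structure (illustrative only).
toy_sequences = ["HPHH", "HPPH", "HPHP", "HPPP"]

profile = hp_profile(toy_sequences)
entropies = [position_entropy(f) for f in profile]
print(profile)
print(entropies)
```

In this toy set the first position is always hydrophobic and the second always polar, so both have zero entropy, while the last two positions vary freely. A profile vector like this, one per structure, is the sort of input a classifier could be trained on.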
Sumudu Pamoda Leelananda
Leelananda, Sumudu Pamoda, "Protein sequence-structure relationships" (2011). Graduate Theses and Dissertations. 10344.