Hierarchical phylogeny construction

Das, Anindya

File

Das_iastate_0097E_18149.pdf (2.7 MB)

Date

2019-01-01

Authors

Das, Anindya

Advisor

Xiaoqiu Huang

Organizational Units

Organizational Unit

Computer Science

Computer Science (the theory, representation, processing, communication and use of information) is fundamentally transforming every aspect of human endeavor. The Department of Computer Science at Iowa State University advances computational and information sciences through: 1. educational and research programs within and beyond the university; 2. active engagement to help define national and international research and educational agendas; and 3. sustained commitment to graduating leaders for academia, industry and government.

History
The Computer Science Department was officially established in 1969, with Robert Stewart serving as the founding Department Chair. The faculty consisted of joint appointments with Mathematics, Statistics, and Electrical Engineering. In 1969, the building that now houses the Computer Science department, then simply called the Computer Science building, was completed. It was later named Atanasoff Hall. From the 1980s to the present, the department has expanded and developed its teaching and research agendas to cover many areas of computing.

Dates of Existence
1969-present

Related Units

College of Liberal Arts and Sciences (parent college)

Department

Computer Science

Abstract

Constructing a phylogenetic tree for a set of species from their genome sequences is essential for understanding the evolutionary history of those species. Rapid improvements in DNA sequencing technology have generated sequence data for a huge number of similar isolates with a wide range of single nucleotide polymorphism (SNP) rates, where the SNP rate among some isolates can be thousands of times lower than among others. Genome sequences of this kind are difficult for existing methods because subtrees (clades) consisting of species or isolates with very low SNP rates may have a very low level of resolution, so their evolutionary history may not be accurately represented. Identifying the informative columns in the alignment, those containing important variations in the genomes of these species, is important in constructing their evolutionary history. Here we describe a method for selecting informative regions for a set of isolates, based on the observation that the likelihoods of informative columns are sensitive to changes in the tree topology. We show that these informative columns increase the correctness of the phylogenies constructed for closely related isolates. We then address the generalized version of this problem by developing a hierarchical approach to phylogeny construction. In this method, construction is performed at multiple levels: at each level, groups of isolates with similar levels of similarity are identified and their phylogenetic trees are constructed. We also detect those multiple levels of similarity in an automated manner. Our results show that this new hierarchical approach is much more efficient, and sometimes more accurate, than existing approaches that build the phylogenetic tree by maximum likelihood from the whole alignment for all the isolates.
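The grouping idea in the abstract can be illustrated with a minimal Python sketch. It is not the method of the thesis: the function names, the toy alignment, the fixed threshold, and the single-linkage grouping rule are all assumptions made for illustration. In particular, the thesis selects informative columns by how sensitive their likelihood is to changes in tree topology, whereas the sketch uses plain column variability as a much simpler stand-in.

```python
from itertools import combinations

# Toy alignment (hypothetical data): isolate name -> aligned sequence.
alignment = {
    "A": "ACGTACGTAC",
    "B": "ACGTACGTAT",
    "C": "TGCAACGTAC",
    "D": "TGCAACGTAA",
}


def snp_rate(a, b):
    """Fraction of aligned columns where two sequences differ (gap columns skipped)."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def group_by_similarity(aln, threshold):
    """Single-linkage grouping: two isolates are joined if their SNP rate <= threshold.

    Assumption: a fixed threshold stands in for the automated detection of
    similarity levels described in the abstract.
    """
    parent = {name: name for name in aln}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for u, v in combinations(aln, 2):
        if snp_rate(aln[u], aln[v]) <= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for name in aln:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


def variable_columns(aln, names):
    """Indices of columns with more than one distinct base among the given isolates.

    Assumption: variability is only a crude proxy for the likelihood-based
    notion of an informative column used in the thesis.
    """
    length = len(aln[names[0]])
    return [
        i for i in range(length)
        if len({aln[n][i] for n in names} - {"-"}) > 1
    ]


if __name__ == "__main__":
    for group in group_by_similarity(alignment, threshold=0.2):
        print(group, variable_columns(alignment, group))
```

In a hierarchical setting, one would build a tree within each low-SNP group from its own variable columns and then relate the groups at a coarser level; the sketch stops at the grouping step.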

Copyright

2019-08-01

Collections

Theses and Dissertations
