Degree Type

Thesis

Date of Award

2020

Degree Name

Doctor of Philosophy

Department

Computer Science

Major

Computer Science

First Advisor

David Fernandez-Baca

Abstract

The tree compatibility problem is a basic special case of the supertree problem. A supertree method is a way to synthesize a collection of phylogenetic trees with partially overlapping taxon sets into a single supertree that represents the information in the input trees. The supertree approach, proposed in the early 90s [5, 6], has been used successfully to build large-scale phylogenies [7].

The original supertree methods were limited to input trees where only the leaves are labeled. We present a new graph-based approach to the following basic problem in phylogenetic tree construction. Let P = {T_1, ..., T_k} be a collection of rooted phylogenetic trees over various subsets of a set of species. The tree compatibility problem asks whether there is a phylogenetic tree T with the following property: for each i ∈ {1, ..., k}, T_i can be obtained from the restriction of T to the species set of T_i by contracting zero or more edges. If such a tree T exists, we say that P is compatible and that T displays P.
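As an illustration of the "displays" relation only, the following minimal Python sketch checks whether a candidate tree displays a single input tree. It is not the algorithm of the thesis: it assumes leaf-labeled rooted trees encoded as nested tuples of strings (a hypothetical encoding chosen for this example) and uses the standard cluster characterization, namely that contracting edges of the restricted tree can only remove clusters. The function names `clusters` and `displays` are likewise illustrative.

```python
def clusters(tree):
    """Return (leaf set, set of clusters) of a nested-tuple rooted tree.

    A leaf is a string; an internal node is a tuple of child subtrees.
    A cluster is the set of leaf labels below some node.
    """
    found = set()

    def visit(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    return visit(tree), found


def displays(candidate, input_tree):
    """Naive check that `candidate` displays `input_tree`.

    The input tree is displayed if each of its clusters is a cluster of
    the candidate restricted to the input tree's leaf set.
    """
    cand_leaves, cand_clusters = clusters(candidate)
    in_leaves, in_clusters = clusters(input_tree)
    if not in_leaves <= cand_leaves:
        return False
    # Clusters of the restriction are the candidate's clusters
    # intersected with the input tree's leaf set.
    restricted = {c & in_leaves for c in cand_clusters}
    return in_clusters <= restricted


T = ((("a", "b"), "c"), "d")
resolved = (("a", "b"), "c")      # agrees with T on {a, b, c}
unresolved = ("a", "b", "c")      # obtained by contracting one edge
conflicting = (("a", "c"), "b")   # groups a with c, unlike T
```

Under these assumptions, `displays(T, resolved)` and `displays(T, unresolved)` hold, while `displays(T, conflicting)` does not. A collection P is compatible when some single tree passes this check for every tree in P; finding such a tree efficiently is the problem the thesis addresses.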

Our approach leads to an O(M_P log² M_P) algorithm for the tree compatibility problem, where M_P is the total number of nodes and edges in P. Our algorithm either returns a tree that displays P or reports that P is incompatible. Unlike previous algorithms, the running time of our method does not depend on the degrees of the nodes in the input trees. Thus, our algorithm is equally fast on highly resolved and highly unresolved trees.

Semi-labeled trees are phylogenies whose internal nodes may be labeled by higher-order taxa. Thus, a leaf labeled Mus musculus could nest within a subtree whose root node is labeled Rodentia, which itself could nest within a subtree whose root is labeled Mammalia. Suppose we are given a collection P of semi-labeled trees over various subsets of a set of taxa. The ancestral compatibility problem asks whether there is a semi-labeled tree T that respects the clusterings and the ancestor/descendant relationships implied by the trees in P. We give an Õ(M_P) algorithm for the ancestral compatibility problem, where M_P is the total number of nodes and edges in the trees in P. Unlike the best previous algorithm, the running time of our method does not depend on the degrees of the nodes in the input trees.

DOI

https://doi.org/10.31274/etd-20200624-93

Copyright Owner

Yun Deng

Language

en

File Format

application/pdf

File Size

65 pages
