Campus Units

Ecology, Evolution and Organismal Biology

Document Type

Article

Publication Version

Submitted Manuscript

Publication Date

2-18-2021

Journal or Book Title

bioRxiv

DOI

10.1101/2021.02.18.431864

Abstract

With the rapid rise in availability of high-quality genomes for closely related species, methods for orthology inference that incorporate synteny are increasingly useful. Polyploidy perturbs the expected 1:1 frequencies of orthologs between two species, complicating the identification of orthologs. Here we present a method of ortholog inference, Ploidy-aware Syntenic Orthologous Networks Identified via Collinearity (pSONIC). We demonstrate the utility of pSONIC using four species in the cotton tribe (Gossypieae), including one allopolyploid, and place 75-90% of genes from each species into nearly 32,000 orthologous groups, 97% of which consist of at most singletons or tandemly duplicated genes, 58.8% more than comparable methods that do not incorporate synteny. We show that 99% of singleton gene groups follow the expected tree topology, and that our ploidy-aware algorithm recovers 97.5% identical groups when compared to splitting the allopolyploid into its two respective subgenomes and treating each as a separate “species”.

Comments

This preprint is made available through bioRxiv at doi: 10.1101/2021.02.18.431864.

Copyright Owner

The Authors

Language

en

File Format

application/pdf
