ParsEval: parallel comparison and analysis of gene structure annotations

Date
2012-08-01
Authors
Standage, Daniel
Brendel, Volker
Organizational Units
Organizational Unit
Genetics, Development and Cell Biology

The Department of Genetics, Development, and Cell Biology teaches subcellular and cellular processes, genome dynamics, cell structure and function, and molecular mechanisms of development, and offers a Major in Biology and a Major in Genetics.

History
The Department of Genetics, Development, and Cell Biology was founded in 2005.

Organizational Unit
Bioinformatics and Computational Biology
The Bioinformatics and Computational Biology (BCB) Program at Iowa State University is an interdepartmental graduate major offering outstanding opportunities for graduate study toward the Ph.D. degree in Bioinformatics and Computational Biology. The BCB program involves more than 80 nationally and internationally known faculty—biologists, computer scientists, mathematicians, statisticians, and physicists—who participate in a wide range of collaborative projects.
Department
Genetics, Development and Cell Biology
Bioinformatics and Computational Biology
Abstract

Background
Accurate gene structure annotation is a fundamental but somewhat elusive goal of genome projects, as witnessed by the fact that (model) genomes typically undergo several cycles of re-annotation. In many cases, it is not only different versions of annotations that need to be compared but also different sources of annotation of the same genome, derived from distinct gene prediction workflows. Such comparisons are of interest to annotation providers, prediction software developers, and end-users, who all need to assess what is common and what is different among distinct annotation sources. We developed ParsEval, a software application for pairwise comparison of sets of gene structure annotations. ParsEval calculates several statistics that highlight the similarities and differences between the two sets of annotations provided. These statistics are presented in an aggregate summary report, with additional details provided as individual reports specific to non-overlapping, gene-model-centric genomic loci. Genome browser styled graphics embedded in these reports help visualize the genomic context of the annotations. Output from ParsEval is both easily read and parsed, enabling systematic identification of problematic gene models for subsequent focused analysis.
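The idea of comparing two annotation sources locus by locus can be illustrated with a minimal sketch. This is not ParsEval's implementation (ParsEval is written in C and reports many more statistics); it only assumes that each gene is a (start, end) pair with 1-based inclusive coordinates on one sequence, groups overlapping genes from both sources into non-overlapping loci, and computes one simple nucleotide-level agreement score per locus. The names group_into_loci, covered_positions, and nucleotide_agreement are hypothetical and chosen for this illustration.

```python
from typing import Dict, List, Set, Tuple

# Assumption: a gene is a (start, end) pair, 1-based and inclusive.
Interval = Tuple[int, int]


def group_into_loci(reference: List[Interval],
                    prediction: List[Interval]) -> List[Dict]:
    """Merge overlapping genes from both sources into non-overlapping loci."""
    tagged = [(s, e, "ref") for s, e in reference]
    tagged += [(s, e, "pred") for s, e in prediction]
    tagged.sort()  # order by start coordinate, then end

    loci: List[Dict] = []
    for start, end, source in tagged:
        if loci and start <= loci[-1]["end"]:
            # Overlaps the current locus: extend it.
            locus = loci[-1]
            locus["end"] = max(locus["end"], end)
        else:
            # No overlap: open a new locus.
            locus = {"start": start, "end": end, "ref": [], "pred": []}
            loci.append(locus)
        locus[source].append((start, end))
    return loci


def covered_positions(intervals: List[Interval]) -> Set[int]:
    """Return the set of nucleotide positions covered by the intervals."""
    positions: Set[int] = set()
    for start, end in intervals:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_agreement(locus: Dict) -> float:
    """Shared positions divided by positions covered by either source."""
    ref = covered_positions(locus["ref"])
    pred = covered_positions(locus["pred"])
    return len(ref & pred) / len(ref | pred)


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    reference = [(100, 200), (500, 600)]
    prediction = [(150, 250), (900, 950)]
    for locus in group_into_loci(reference, prediction):
        print(locus["start"], locus["end"],
              round(nucleotide_agreement(locus), 3))
```

A locus annotated by only one source scores 0 under this measure, which is one simple way to flag gene models for closer inspection. The set-based position count keeps the sketch short; it would be too slow and memory-hungry for whole-genome input.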

Results
ParsEval is capable of analyzing annotations for large eukaryotic genomes on typical desktop or laptop hardware. In comparison to existing methods, ParsEval exhibits a considerable performance improvement, both in terms of runtime and memory consumption. Reports from ParsEval can provide relevant biological insights into the gene structure annotations being compared.

Conclusions
Implemented in C, ParsEval provides the quickest and most feature-rich solution for genome annotation comparison to date. The source code is freely available (under an ISC license) at http://parseval.sourceforge.net/.

Comments

This article is from BMC Bioinformatics 2012, 13:187, doi:10.1186/1471-2105-13-187. Posted with permission.

Copyright
2012