Campus Units

Computer Science

Document Type

Article

Publication Version

Published Version

Publication Date

2006

Journal or Book Title

Nucleic Acids Research

Volume

34

Issue

1

First Page

201

Last Page

205

DOI

10.1093/nar/gkj419

Abstract

We introduce a data structure called a superword array for quickly finding matches between DNA sequences. The superword array possesses some desirable features of both the lookup table and the suffix array. We describe simple algorithms for constructing and using a superword array to find pairs of sequences that share a unique superword. The algorithms are implemented in a genome assembly program called PCAP.REP for computation of overlaps between reads. Experimental results produced by PCAP.REP and PCAP on a whole-genome dataset show that PCAP.REP produced a more accurate and contiguous assembly than PCAP.
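
To make the idea in the abstract concrete, the following is a minimal sketch and not the PCAP.REP implementation. It assumes a simplified "superword" that is just a contiguous word of length k, and the names `word_at`, `build_word_array`, and `pairs_sharing_word` are hypothetical. Like a lookup table, it indexes fixed-length words; like a suffix array, it stores positions in sorted order so that equal words end up adjacent. It does not model how the article defines superwords or handles uniqueness.

```python
from itertools import combinations, groupby


def word_at(reads, entry, k):
    """Return the length-k word that starts at entry = (read index, offset)."""
    i, j = entry
    return reads[i][j:j + k]


def build_word_array(reads, k):
    """Collect every (read index, offset) pair and sort the pairs by their word.

    Assumption: a contiguous k-mer stands in for a superword.
    """
    entries = [
        (i, j)
        for i, read in enumerate(reads)
        for j in range(len(read) - k + 1)
    ]
    entries.sort(key=lambda e: word_at(reads, e, k))
    return entries


def pairs_sharing_word(reads, k):
    """Scan the sorted array; each run of equal words yields candidate read pairs."""
    pairs = set()
    entries = build_word_array(reads, k)
    for _, run in groupby(entries, key=lambda e: word_at(reads, e, k)):
        read_ids = sorted({i for i, _ in run})
        pairs.update(combinations(read_ids, 2))
    return pairs


reads = ["ACGTACGGA", "TTACGGATC", "GGGGCCCC"]
candidate_pairs = pairs_sharing_word(reads, 5)
```

In this toy input, the first two reads share the words TACGG and ACGGA, so they form the only candidate pair; a real assembler would then verify each candidate with an alignment step.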

Comments

This article is published as Huang, Xiaoqiu, Shiaw-Pyng Yang, Asif T. Chinwalla, LaDeana W. Hillier, Patrick Minx, Elaine R. Mardis, and Richard K. Wilson. "Application of a superword array in genome assembly." Nucleic Acids Research 34, no. 1 (2006): 201-205. doi: 10.1093/nar/gkj419.

Rights

The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact journals.permissions@oxfordjournals.org

Copyright Owner

The Authors

Language

en

File Format

application/pdf
