Campus Units

Electrical and Computer Engineering, Genetics, Development and Cell Biology, Statistics

Document Type

Conference Proceeding

Conference

IEEE Global Conference on Signal and Information Processing (GlobalSIP) 2013

Publication Version

Accepted Manuscript

Link to Published Version

https://doi.org/10.1109/ISIT.2013.6620502

Publication Date

2013

Journal or Book Title

IEEE Global Conference on Signal and Information Processing (GlobalSIP) 2013

DOI

10.1109/ISIT.2013.6620502

Conference Title

IEEE Global Conference on Signal and Information Processing (GlobalSIP) 2013

Conference Date

July 7-12, 2013

City

Istanbul, Turkey

Abstract

In this work we present a flexible, probabilistic, and reference-free method of error correction for high-throughput DNA sequencing data. The key is to exploit the high coverage of sequencing data and model short sequence outputs as independent realizations of a Hidden Markov Model (HMM). We pose the problem of error correction of reads as one of maximum likelihood sequence detection over this HMM. While time and memory considerations rule out an implementation of the optimal Baum-Welch algorithm (for parameter estimation) and the optimal Viterbi algorithm (for error correction), we propose low-complexity approximate versions of both. Specifically, we propose an approximate Viterbi algorithm and a sequential-decoding-based algorithm for error correction. Our results show that, compared with Reptile, a state-of-the-art error correction method, our methods consistently achieve superior performance on both simulated and real data sets.
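To illustrate what maximum likelihood sequence detection over an HMM means for a read, here is a minimal sketch of the exact textbook Viterbi algorithm on a toy model. It is not the approximate Viterbi or sequential decoding algorithm proposed in the paper. The state space (one hidden state per true base), the names START, TRANS, EMIT and viterbi, and all probability values are assumptions made up for this illustration, not parameters from the paper.

```python
import math

BASES = "ACGT"

# Hypothetical toy parameters (assumptions for illustration only).
# Uniform start distribution over the four bases.
START = {b: 0.25 for b in BASES}
# The toy "genome" strongly prefers the cycle A -> C -> G -> T -> A.
TRANS = {
    p: {s: (0.91 if s == BASES[(i + 1) % 4] else 0.03) for s in BASES}
    for i, p in enumerate(BASES)
}
# Substitution-error model: a base is read correctly with probability 0.97.
EMIT = {s: {o: (0.97 if o == s else 0.01) for o in BASES} for s in BASES}


def viterbi(read, start_p, trans_p, emit_p):
    """Return the most likely hidden base sequence for an observed read."""
    # Log probabilities avoid numerical underflow on long reads.
    score = {
        s: math.log(start_p[s]) + math.log(emit_p[s][read[0]]) for s in BASES
    }
    backpointers = []
    for obs in read[1:]:
        new_score, pointers = {}, {}
        for s in BASES:
            # Best predecessor state for ending in state s at this position.
            prev = max(BASES, key=lambda p: score[p] + math.log(trans_p[p][s]))
            new_score[s] = (
                score[prev]
                + math.log(trans_p[prev][s])
                + math.log(emit_p[s][obs])
            )
            pointers[s] = prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state to recover the full path.
    state = max(BASES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


corrected = viterbi("ACGTAGGT", START, TRANS, EMIT)
print(corrected)
```

In this toy setting the read ACGTAGGT contains one substitution relative to the preferred cycle, and the decoder restores the sixth base because one emission error is more likely than two unlikely transitions. Exact Viterbi scales with the square of the number of states at each position, which is the cost the abstract refers to when it motivates approximate versions.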

Comments

This is a manuscript of a proceeding from the IEEE Global Conference on Signal and Information Processing 2013: 73, doi:10.1109/ISIT.2013.6620502. Posted with permission.

Rights

Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Copyright Owner

IEEE

Language

en

File Format

application/pdf
