Campus Units

Genetics, Development and Cell Biology, Bioinformatics and Computational Biology, Center for Metabolic Biology, Genome Informatics Facility

Document Type

Article

Publication Version

Submitted Manuscript

Publication Date

3-8-2020

Journal or Book Title

bioRxiv

DOI

10.1101/2020.03.04.925818

Abstract

Implementing RNA-Seq analysis pipelines is challenging as data gets bigger and more complex. With the availability of terabytes of RNA-Seq data and the continuous development of analysis tools, there is a pressing need for frameworks that allow fast and efficient development, modification, sharing and reuse of workflows. Scripting is often used, but it has many challenges and drawbacks. We have developed a Python package, python RNA-Seq Pipeliner (pyrpipe), that enables straightforward development of flexible, reproducible and easy-to-debug computational pipelines purely in Python, in an object-oriented manner. pyrpipe provides high-level APIs to popular RNA-Seq tools. Pipelines can be customized by integrating new Python code, third-party programs, or Python libraries. Researchers can create checkpoints in the pipeline or integrate pyrpipe into a workflow management system, thus allowing execution in multiple computing environments. pyrpipe produces detailed analysis and benchmark reports, which can be shared or included in publications. pyrpipe is implemented in Python and is compatible with Python versions 3.6 and higher. All source code is available at https://github.com/urmi-21/pyrpipe; the package can be installed from source or from PyPI (https://pypi.org/project/pyrpipe). Documentation is available on Read the Docs (http://pyrpipe.rtfd.io).
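
The idea of an object-oriented pipeline with checkpoints, as described in the abstract, can be illustrated with a minimal pure-Python sketch. This is not pyrpipe's API: the `Step` and `Pipeline` classes, the `trim` and `count` functions, and the JSON checkpoint layout are all hypothetical stand-ins chosen for illustration, and the "reads" are toy strings rather than real sequencing data.

```python
import json
import os
import tempfile


class Step:
    """One pipeline stage wrapping a plain callable (hypothetical, not pyrpipe's API)."""

    def __init__(self, name, func):
        self.name = name
        self.func = func


class Pipeline:
    """Runs steps in order and checkpoints each step's output as JSON."""

    def __init__(self, checkpoint_dir):
        self.checkpoint_dir = checkpoint_dir
        self.steps = []
        self.executed = []  # names of steps that were run rather than restored

    def add(self, name, func):
        self.steps.append(Step(name, func))
        return self  # returning self allows chained add() calls

    def run(self, data):
        for step in self.steps:
            path = os.path.join(self.checkpoint_dir, step.name + ".json")
            if os.path.exists(path):
                # Checkpoint found: restore the saved output instead of recomputing.
                # Assumption: checkpoints are keyed by step name only, so changed
                # inputs are not detected in this sketch.
                with open(path) as fh:
                    data = json.load(fh)
            else:
                data = step.func(data)
                with open(path, "w") as fh:
                    json.dump(data, fh)
                self.executed.append(step.name)
        return data


def trim(reads):
    # Stand-in for a read-trimming tool: drop the last base of each read.
    return [read[:-1] for read in reads]


def count(reads):
    # Stand-in for quantification: count how often each trimmed read occurs.
    counts = {}
    for read in reads:
        counts[read] = counts.get(read, 0) + 1
    return counts


workdir = tempfile.mkdtemp()
reads = ["ACGTA", "ACGTC", "TTTTG"]

# First run executes every step and writes a checkpoint after each one.
first = Pipeline(workdir).add("trim", trim).add("count", count)
result = first.run(reads)

# A second pipeline pointed at the same directory restores both checkpoints.
second = Pipeline(workdir).add("trim", trim).add("count", count)
rerun_result = second.run(reads)
```

In this sketch the second run executes no steps, because each step finds its checkpoint file and loads the stored result. A real RNA-Seq pipeline would wrap external tools and track input files rather than in-memory lists.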

Comments

This preprint is made available through bioRxiv, doi: 10.1101/2020.03.04.925818.

Copyright Owner

The Authors

Language

en

File Format

application/pdf
