TransAtlasDB: an integrated database connecting expression data, metadata and variants

Thumbnail Image
Date
2018-02-23
Authors
Adetunji, Modupeore
Lamont, Susan
Schmidt, Carl
Major Professor
Advisor
Committee Member
Journal Title
Journal ISSN
Volume Title
Publisher
Authors
Person
Lamont, Susan
Distinguished Professor
Research Projects
Organizational Units
Organizational Unit
Journal Issue
Is Version Of
Versions
Series
Department
Animal Science
Abstract

High-throughput transcriptome sequencing (RNAseq) is the universally applied method for target-free transcript identification and gene expression quantification, generating huge amounts of data. The constraint of accessing such data and interpreting results can be a major impediment in postulating suitable hypothesis, thus an innovative storage solution that addresses these limitations, such as hard disk storage requirements, efficiency and reproducibility are paramount. By offering a uniform data storage and retrieval mechanism, various data can be compared and easily investigated. We present a sophisticated system, TransAtlasDB, which incorporates a hybrid architecture of both relational and NoSQL databases for fast and efficient data storage, processing and querying of large datasets from transcript expression analysis with corresponding metadata, as well as gene-associated variants (such as SNPs) and their predicted gene effects. TransAtlasDB provides the data model of accurate storage of the large amount of data derived from RNAseq analysis and also methods of interacting with the database, either via the command-line data management workflows, written in Perl, with useful functionalities that simplifies the complexity of data storage and possibly manipulation of the massive amounts of data generated from RNAseq analysis or through the web interface. The database application is currently modeled to handle analyses data from agricultural species, and will be expanded to include more species groups. Overall TransAtlasDB aims to serve as an accessible repository for the large complex results data files derived from RNAseq gene expression profiling and variant analysis.

Database URL: https://modupeore.github.io/TransAtlasDB/

Comments

This article is published as Adetunji, Modupeore O., Susan J. Lamont, and Carl J. Schmidt. "TransAtlasDB: an integrated database connecting expression data, metadata and variants." Database 2018 (2018). DOI: 10.1093/database/bay014. Posted with permission.

Description
Keywords
Citation
DOI
Copyright
Mon Jan 01 00:00:00 UTC 2018
Collections