Campus Units

Genetics, Development and Cell Biology, Bioinformatics and Computational Biology

Document Type

Article

Publication Version

Accepted Manuscript

Publication Date

5-2012

Journal or Book Title

Chemistry & Biodiversity

Volume

9

Issue

5

First Page

868

Last Page

887

DOI

10.1002/cbdv.201100355

Abstract

Network-based analysis is indispensable in analyzing high-throughput biological data. Based on the assumption that the variation of gene interactions under given biological conditions could be better interpreted in the context of a large and varied set of developmental, tissue, and disease conditions, we leverage the large quantity of publicly available transcriptomic data (>40,000 HG U133A Affymetrix microarray chips) stored in ArrayExpress (http://www.ebi.ac.uk/arrayexpress/) using MetaOmGraph (http://metnet.vrac.iastate.edu/MetNet_MetaOmGraph.htm). From these data, 18,637 chips encompassing over 500 experiments containing high-quality data (18637Hu-dataset) were used to create a globally stable gene co-expression network (18637Hu-co-expression-network). Regulons, groups of highly and consistently co-expressed genes, were obtained by partitioning the 18637Hu-co-expression-network using an MCL clustering algorithm. The regulons were demonstrated to be statistically significant using a gene ontology (GO) term overrepresentation test combined with evaluation of the effects of gene permutations. The regulons include approximately 12% of human genes, interconnected by 31,471 correlations. All network data and metadata are publicly available (http://metnet.vrac.iastate.edu/MetNet_MetaOmGraph.htm). Text mining of these metadata, GO term overrepresentation analysis, and statistical analysis of transcriptomic experiments across multiple environmental, tissue, and disease conditions have revealed novel fingerprints distinguishing central nervous system (CNS)-related conditions. This study demonstrates the value of mega-scale network-based analysis for biologists to further refine transcriptomic data derived from a particular condition, to study the global relationships between genes and diseases, and to develop hypotheses that can inform future research.
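
The partitioning step named in the abstract (a correlation network clustered with MCL) can be illustrated with a minimal sketch. This is not the authors' pipeline: the use of Pearson correlation, the 0.9 cutoff, the inflation value of 2.0, the function names, and the toy expression matrix are all assumptions chosen for illustration, and the MCL loop is a bare-bones NumPy version of the expansion/inflation iteration rather than the reference implementation.

```python
import numpy as np


def coexpression_adjacency(expr, threshold=0.9):
    """Link genes whose Pearson correlation meets the cutoff.

    expr is a genes x samples matrix. The Pearson measure and the
    threshold are assumptions, not values taken from the article.
    """
    corr = np.corrcoef(expr)            # rows are treated as variables (genes)
    adj = (corr >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)          # no self-edges in the network itself
    return adj


def mcl(adj, inflation=2.0, max_iter=100, tol=1e-6):
    """Bare-bones Markov clustering on an undirected adjacency matrix."""
    n = adj.shape[0]
    m = adj + np.eye(n)                 # self-loops keep every column non-empty
    m = m / m.sum(axis=0, keepdims=True)
    for _ in range(max_iter):
        prev = m
        m = m @ m                       # expansion: two-step random walk
        m = m ** inflation              # inflation: strengthen strong flows
        m = m / m.sum(axis=0, keepdims=True)
        if np.abs(m - prev).max() < tol:
            break
    # Each non-empty row is an attractor; its positive columns form a cluster.
    clusters = set()
    for row in m:
        members = tuple(int(i) for i in np.flatnonzero(row > 1e-6))
        if members:
            clusters.add(members)
    return sorted(clusters)


# Toy data (hypothetical): genes 0-2 share one profile, genes 3-4 another.
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
b = np.array([6.0, 1.0, 5.0, 2.0, 4.0, 3.0])
noise = np.array([0.1, -0.1, 0.1, -0.1, 0.1, -0.1])
expr = np.vstack([a, 2 * a + 1, a + noise, b, 3 * b])

regulons = mcl(coexpression_adjacency(expr))
print(regulons)
```

On the toy matrix the two groups of co-varying genes fall into separate clusters, which is the sense in which the abstract uses the term regulon. A real analysis would also need the significance checks the abstract mentions (GO term overrepresentation and gene permutation), which this sketch omits.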

Comments

This is the peer-reviewed version of the following article: Feng, Y., Hurst, J., Almeida-De-Macedo, M., Chen, X., Li, L., Ransom, N. and Wurtele, E. S. (2012), Massive Human Co-Expression Network and Its Medical Applications. Chemistry & Biodiversity, 9: 868–887, which has been published in final form at doi:10.1002/cbdv.201100355. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for self-archiving.

Copyright Owner

Verlag Helvetica Chimica Acta AG, Zürich

Language

en

File Format

application/pdf
