Campus Units

Animal Science

Document Type

Article

Publication Version

Published Version

Publication Date

2016

Journal or Book Title

BMC Genomics

Volume

17

First Page

846

DOI

10.1186/s12864-016-3176-2

Abstract

Background: Genome sequencing and subsequent gene annotation of genomes has led to the elucidation of many genes, but in vertebrates the actual number of protein coding genes are very consistent across species (~20,000). Seven years after sequencing the cattle genome, there are still genes that have limited annotation and the function of many genes are still not understood, or partly understood at best. Based on the assumption that genes with similar patterns of expression across a vast array of tissues and experimental conditions are likely to encode proteins with related functions or participate within a given pathway, we constructed a genome-wide Cattle Gene Co-expression Network (CGCN) using 72 microarray datasets that contained a total of 1470 Affymetrix Genechip Bovine Genome Arrays that were retrieved from either NCBI GEO or EBI ArrayExpress.

Results: The total of 16,607 probe sets, which represented 11,397 genes, with unique Entrez ID were consolidated into 32 co-expression modules that contained between 29 and 2569 probe sets. All of the identified modules showed strong functional enrichment for gene ontology (GO) terms and Reactome pathways. For example, modules with important biological functions such as response to virus, response to bacteria, energy metabolism, cell signaling and cell cycle have been identified. Moreover, gene co-expression networks using “guilt-byassociation” principle have been used to predict the potential function of 132 genes with no functional annotation. Four unknown Hub genes were identified in modules highly enriched for GO terms related to leukocyte activation (LOC509513), RNA processing (LOC100848208), nucleic acid metabolic process (LOC100850151) and organic-acid metabolic process (MGC137211). Such highly connected genes should be investigated more closely as they likely to have key regulatory roles.

Conclusions: We have demonstrated that the CGCN and its corresponding regulons provides rich information for experimental biologists to design experiments, interpret experimental results, and develop novel hypothesis on gene function in this poorly annotated genome. The network is publicly accessible at http://www.animalgenome. org/cgi-bin/host/reecylab/d.

Comments

This article is published as Beiki, Hamid, Ardeshir Nejati-Javaremi, Abbas Pakdel, Ali Masoudi-Nejad, Zhi-Liang Hu, and James M. Reecy. "Large-scale gene co-expression network as a source of functional annotation for cattle genes." BMC genomics 17 (2016): 846. doi: 10.1186/s12864-016-3176-2.

Creative Commons License

Creative Commons Attribution 4.0 License
This work is licensed under a Creative Commons Attribution 4.0 License.

Copyright Owner

The Authors

Language

en

File Format

application/pdf

Share

COinS