Joshua A. Udall, Iowa State University
Jordan M. Swanson, Iowa State University
Karl Haller, University of Arizona
Ryan A. Rapp, Iowa State University
Michael Edward Sparks, Iowa State University
Jamie Hatfield, University of Arizona
Yeisoo Yu, University of Arizona
Yingru Wu, CSIRO Plant Industry
Catriona Dowd, CSIRO Plant Industry
Aladdin B. Arpat, University of California - Davis
Brad A. Sickler, University of California - Davis
Thea A. Wilkins, University of California - Davis
Yin Ying Guo, Shanghai Institutes for Biological Sciences
Xiao Ya Chen, Shanghai Institutes for Biological Sciences
Jodi Scheffler, United States Department of Agriculture
Earl Taliercio, United States Department of Agriculture
Ricky Turley, United States Department of Agriculture
Helen McFadden, CSIRO Plant Industry
Paxton Payton, United States Department of Agriculture
Natalya Klueva, Texas Tech University
Randell Allen, Texas Tech University
Deshui Zhang, North Carolina State University
Candace Haigler, North Carolina State University
Curtis Wilkerson, Michigan State University
Jinfeng Suo, Institute of Genetics and Developmental Biology
Stefan R. Schulze, University of Georgia
Margaret L. Pierce, Oklahoma State University
Margaret Essenberg, Oklahoma State University
HyeRan Kim, University of Arizona
Danny J. Llewellyn, CSIRO Plant Industry
Elizabeth S. Dennis, CSIRO Plant Industry
Rod Wing, University of Arizona
Andrew H. Paterson, University of Georgia
Cari Soderlund, University of Arizona
Jonathan F. Wendel, Iowa State University

Publication Version

Published Version

Journal or Book Title

Genome Research



Approximately 185,000 Gossypium EST sequences comprising >94,800,000 nucleotides were amassed from 30 cDNA libraries constructed from a variety of tissues and organs under a range of conditions, including drought stress and pathogen challenges. These libraries were derived from allopolyploid cotton (Gossypium hirsutum; AT and DT genomes) as well as its two diploid progenitors, Gossypium arboreum (A genome) and Gossypium raimondii (D genome). ESTs were assembled using the Program for Assembling and Viewing ESTs (PAVE), resulting in 22,030 contigs and 29,077 singletons (51,107 unigenes). Further comparisons among the singletons and contigs led to recognition of 33,665 exemplar sequences that represent a nonredundant set of putative Gossypium genes containing partial or full-length coding regions and usually one or two UTRs. The assembly, along with its UniProt BLASTX hits, GO annotation, and Pfam analysis results, is freely accessible as a public resource for cotton genomics. Because ESTs from diploid and allotetraploid Gossypium were combined in a single assembly, we were in many cases able to bioinformatically distinguish duplicated genes in allotetraploid cotton and assign them to either the A or D genome. The assembly and associated information provide a framework for future investigation of cotton functional and evolutionary genomics.
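
The genome assignment mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes, purely for illustration, that an allotetraploid EST has already been aligned column by column to one A-genome (G. arboreum) sequence and one D-genome (G. raimondii) sequence, and it classifies the EST by counting the positions at which the two diploids differ. The function name `assign_genome`, the `min_sites` threshold, and the toy sequences are hypothetical.

```python
def assign_genome(est, a_ref, d_ref, min_sites=2):
    """Classify a pre-aligned allotetraploid EST as 'A', 'D', or 'ambiguous'.

    Hypothetical helper: all three strings must already be aligned to the
    same length; '-' marks a gap and 'N' an unknown base.
    """
    if not (len(est) == len(a_ref) == len(d_ref)):
        raise ValueError("sequences must be aligned to the same length")

    a_like = d_like = 0
    for e, a, d in zip(est.upper(), a_ref.upper(), d_ref.upper()):
        # Only positions where the two diploids differ are informative.
        if a == d or "-" in (e, a, d) or "N" in (e, a, d):
            continue
        if e == a:
            a_like += 1
        elif e == d:
            d_like += 1

    # Require enough informative sites and no conflicting signal.
    if a_like >= min_sites and d_like == 0:
        return "A"
    if d_like >= min_sites and a_like == 0:
        return "D"
    return "ambiguous"


# Toy sequences (invented); the references differ at two positions.
a_ref = "ATGGCTAGCTTGACCGTA"
d_ref = "ATGGCAAGCTTGACTGTA"
print(assign_genome("ATGGCTAGCTTGACCGTA", a_ref, d_ref))
print(assign_genome("ATGGCAAGCTTGACTGTA", a_ref, d_ref))
print(assign_genome("ATGGCTAGCTTGACTGTA", a_ref, d_ref))
```

With these toy inputs the three calls are classified as A, D, and ambiguous, respectively. The last read matches the A reference at one diagnostic site and the D reference at the other, so the sketch declines to assign it. A real analysis would need a proper alignment step and a treatment of sequencing error, both of which are outside this sketch.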


This article is from Genome Research 16 (2006): 441, doi:10.1101/gr.4602906.


Works produced by employees of the U.S. Government as part of their official duties are not copyrighted within the U.S. The content of this document is not copyrighted.


