An extended data mining method for identifying differentially expressed assay-specific signatures in functional genomic studies

Rollins, Derrick; Teh, Ai-ling; Rollins, Derrick K

Date

2010-01-01

Authors

Rollins, Derrick

Teh, Ai-ling

Rollins, Derrick K

Authors

Person

Rollins, Derrick K

University Professor Emeritus

Organizational Units

Organizational Unit

Chemical and Biological Engineering

The function of the Department of Chemical and Biological Engineering has been to prepare students for the study and application of chemistry in industry. This focus has included preparation for employment in various industries as well as the development, design, and operation of equipment and processes within industry. Through the CBE Department, Iowa State University is nationally recognized for its initiatives in bioinformatics, biomaterials, bioproducts, metabolic/tissue engineering, multiphase computational fluid dynamics, advanced polymeric materials and nanostructured materials.

History
The Department of Chemical Engineering was founded in 1913 under the Department of Physics and Illuminating Engineering. From 1915 to 1931 it was jointly administered by the Divisions of Industrial Science and Engineering, and from 1931 onward it has been under the Division/College of Engineering. In 1928 it merged with Mining Engineering, and from 1973 to 1979 it was merged with Nuclear Engineering. It became Chemical and Biological Engineering in 2005.

Dates of Existence
1913–present

Historical Names

Department of Chemical Engineering (1913–1928)
Department of Chemical and Mining Engineering (1928–1957)
Department of Chemical Engineering (1957–1973, 1979–2005)
Department of Chemical and Biological Engineering (2005–present)

Related Units

College of Engineering (parent college)

Abstract

Background: Microarray data sets provide relative expression levels for thousands of genes across a comparatively small number of experimental conditions, called assays. Data mining techniques are used to extract specific information about genes as they relate to the assays. The multivariate statistical technique of principal component analysis (PCA) has proven useful in providing effective data mining methods. This article extends the PCA approach of Rollins et al. to rank the genes of a microarray data set that are expressed most differently between two biologically different groupings of assays. The method is evaluated on real and simulated data and compared to a current approach on the basis of false discovery rate (FDR) and statistical power (SP), which is the ability to correctly identify important genes.

Results: This work developed and evaluated two new test statistics based on PCA and compared them to a popular method that is not PCA based. Both test statistics were found to be effective as evaluated in three case studies: (i) exposing E. coli cells to two different ethanol levels; (ii) application of myostatin to two groups of mice; and (iii) a simulated data study derived from the properties of (ii). The proposed method (PM) effectively identified critical genes in these studies based on comparison with the current method (CM). The simulation study supports higher identification accuracy for PM over CM for both proposed test statistics when the gene variance is constant, and for one of the test statistics when the gene variance is non-constant.

Conclusions: PM compares quite favorably to CM, with lower FDR and much higher SP. PM can therefore be quite effective in producing accurate signatures from large microarray data sets for differential expression between assay groups identified in a preliminary step of the PCA procedure, and it is recommended for use in these applications.
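
The two evaluation criteria named in the abstract can be illustrated with a small sketch. This record does not give the article's exact calculations, so the code below assumes the standard empirical definitions for a simulation in which the truly differentially expressed genes are known: FDR is the fraction of selected genes that are not truly different, and SP is the fraction of truly different genes that were selected. The function name and the gene labels are hypothetical, chosen only for illustration.

```python
def fdr_and_power(selected, truly_different):
    """Empirical FDR and statistical power for one gene selection.

    Assumed definitions (not taken from the article itself):
      FDR   = false positives / all selected genes
      power = true positives  / all truly different genes
    """
    selected = set(selected)
    truly_different = set(truly_different)

    true_pos = len(selected & truly_different)    # correctly selected
    false_pos = len(selected - truly_different)   # selected in error
    false_neg = len(truly_different - selected)   # missed genes

    # Guard against empty sets so the ratios are always defined.
    fdr = false_pos / (true_pos + false_pos) if selected else 0.0
    power = true_pos / (true_pos + false_neg) if truly_different else 0.0
    return fdr, power


# Hypothetical example: four genes selected, three genes truly different.
fdr, power = fdr_and_power(
    selected=["g1", "g2", "g3", "g4"],
    truly_different=["g1", "g2", "g5"],
)
print(f"FDR = {fdr:.2f}, SP = {power:.2f}")
```

In a comparison such as PM versus CM, each method's selected gene list would be passed to the same function against the same known set, so that a lower FDR and a higher SP can be read off side by side.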

Comments

This article is from BioData Mining 3 (2010): article no. 11, doi: 10.1186/1756-0381-3-11.

Copyright

2010

Collections

Publications
