Application of order restricted statistical inference and hidden Markov modeling to problems in biology and genomics

Date
2012-01-01
Authors
Wang, Heng
Major Professor
Advisor
Dan Nettleton
Department
Statistics
Abstract

Statistics is a powerful tool across scientific fields, supporting experimental design, data processing, and statistical inference. In this thesis, we conduct theoretical and methodological statistical research with applications in biology and genomics.

In Chapter 2, we study statistical testing problems with an order-restricted null hypothesis, where the null parameter space is a union of two disjoint convex cones. We derive the likelihood ratio test and the intersection-union test, and show that the likelihood ratio test is uniformly more powerful than the intersection-union test. We also demonstrate the situation in which the uniformly more powerful tests are constructed, and discuss the applicability of these tests to real data analyses.
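
The intersection-union principle named above can be illustrated with a minimal sketch. It assumes that a p-value is already available for each of the component null hypotheses (here, one per cone); the function name and its inputs are hypothetical, and the sketch shows only the general decision rule, not the thesis's derivation of either test.

```python
def intersection_union_test(component_p_values, alpha=0.05):
    """Generic intersection-union decision rule (illustrative only).

    The null hypothesis is a union of component nulls. It is rejected
    at level alpha only if every component null is rejected at level
    alpha, i.e. if the largest component p-value is at most alpha.
    """
    if not component_p_values:
        raise ValueError("need at least one component p-value")
    return max(component_p_values) <= alpha


# Hypothetical p-values for the two component nulls.
reject = intersection_union_test([0.01, 0.03], alpha=0.05)
```

Because every component must be rejected, this rule can be conservative, which is consistent with the abstract's statement that the likelihood ratio test is uniformly more powerful in the setting studied.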

In Chapter 3, we propose four testing procedures for detecting monotonic changes in multivariate gene expression distributions. We consider cases in which the treatment factor is ordinal and can be naturally ordered. The proposed procedures focus their detection power on genes with monotonic departures from mean equality. The proposed methods can also handle small sample sizes and high-dimensional distributions.

In Chapter 4, we propose a new methodology, based on a hidden Markov model with a mixture emission distribution, to detect copy number variations between different genomes using next-generation sequencing read counts. This method improves on existing methods. We use it to identify copy number variations between two maize genotypes, and the results are concordant with previous genomic studies using microarray data.
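
As a rough illustration of how a hidden Markov model with a mixture emission distribution can segment read counts, the sketch below decodes the most likely state path with the Viterbi algorithm. Everything specific here is an assumption made for illustration and is not taken from the thesis: the emission is a Poisson distribution mixed with a small uniform outlier component, and the state means, transition probabilities, and function names are hypothetical.

```python
import math


def log_poisson_pmf(k, lam):
    # Log of the Poisson probability mass function at count k with mean lam.
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def log_emission(k, lam, outlier_weight=0.01, max_count=1000):
    # Assumed mixture emission: mostly Poisson(lam), plus a small uniform
    # component on 0..max_count that absorbs outlying counts.
    log_main = math.log(1.0 - outlier_weight) + log_poisson_pmf(k, lam)
    log_out = math.log(outlier_weight) - math.log(max_count + 1)
    m = max(log_main, log_out)
    # Log-sum-exp of the two mixture components.
    return m + math.log(math.exp(log_main - m) + math.exp(log_out - m))


def viterbi(counts, state_means, log_init, log_trans):
    # Most likely hidden state path for a sequence of read counts.
    n_states = len(state_means)
    score = [log_init[s] + log_emission(counts[0], state_means[s])
             for s in range(n_states)]
    back = []
    for k in counts[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor state for landing in state s at this position.
            best_prev = max(range(n_states),
                            key=lambda r: score[r] + log_trans[r][s])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + log_trans[best_prev][s]
                             + log_emission(k, state_means[s]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical example: states 0, 1, 2 stand for loss, normal, and gain.
state_means = [5.0, 20.0, 40.0]
log_init = [math.log(1.0 / 3.0)] * 3
log_trans = [[math.log(0.9 if r == s else 0.05) for s in range(3)]
             for r in range(3)]
counts = [20, 19, 22, 21, 5, 4, 6, 5, 21, 20]
path = viterbi(counts, state_means, log_init, log_trans)
```

Working in log space avoids numerical underflow on long count sequences, and the uniform component keeps a single extreme count from forcing a state change.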

The thesis concludes with Chapter 5, which discusses future research directions.

Copyright
2012-01-01