Degree Name

Doctor of Philosophy




Statistical methods are developed for comparing vectors of proportions among several subpopulations when the data are obtained from complex sampling schemes. For example, for two-stage cluster sampling, an independent sample of clusters is obtained from each of the subpopulations and individuals are randomly selected from each of the sampled clusters. The true proportions of individuals belonging to the various categories may vary among clusters in the same subpopulation, and this variation must be incorporated into the test for the equality of the vectors of proportions. Several alternative testing procedures are considered. Wald statistics provide a general method of obtaining approximate chi-square tests, but the most general Wald statistic requires the estimation of covariance matrices for the specific sampling scheme used to obtain the data. This variance estimation generally requires a large number of clusters. Specifying a model for the population is a method of reducing the sample size required for variance estimation. The Dirichlet-Multinomial model is considered as a model for two-stage cluster sampling. The Dirichlet-Multinomial model assumes that the covariance matrix under two-stage cluster sampling is a multiple of the covariance matrix under simple random sampling. Several methods are considered for estimating the parameters of the Dirichlet-Multinomial and for testing the fit of the model. In special cases, the chi-square test for the equality of the vectors of proportions is shown to be, essentially, a linear combination of Pearson chi-square statistics. Various testing and estimation methods are compared through the analysis of several data sets.
For two-stage cluster sampling, the numbers of clusters sampled from the subpopulations have a substantial impact on the performance of the test statistics. The examples also illustrate the fact that the incorrect application of the Pearson chi-square statistic based on simple random sampling can produce misleading results when the frequencies are obtained from a more complex sampling scheme.
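
The effect of the covariance inflation described in the abstract can be illustrated with a small sketch. It is not taken from the thesis. It assumes equal cluster sizes and a single inflation factor shared by all subpopulations, and it uses the standard Dirichlet-Multinomial result that the covariance matrix of the cluster counts is C times the multinomial covariance matrix, with C = (m + A) / (1 + A), where m is the cluster size and A is the sum of the Dirichlet parameters. The function names, the example table, and the values of m and A are hypothetical.

```python
# Illustrative sketch only: names, counts, and parameter values are assumptions,
# not quantities reported in the thesis.

def dm_inflation_factor(cluster_size, alpha_sum):
    """Variance inflation C for a Dirichlet-Multinomial cluster total.

    Cov(counts) = C * m * (diag(p) - p p^T), with C = (m + A) / (1 + A),
    where m is the cluster size and A is the sum of the Dirichlet parameters.
    """
    return (cluster_size + alpha_sum) / (1.0 + alpha_sum)


def pearson_chi_square(table):
    """Pearson chi-square statistic for homogeneity of the rows of a count table.

    Rows are subpopulations, columns are categories. This is the statistic
    that would be appropriate under simple random sampling.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def adjusted_chi_square(table, cluster_size, alpha_sum):
    """Pearson statistic divided by the assumed common inflation factor."""
    return pearson_chi_square(table) / dm_inflation_factor(cluster_size, alpha_sum)


if __name__ == "__main__":
    # Hypothetical counts: 2 subpopulations by 2 categories.
    counts = [[30, 20], [20, 30]]
    naive = pearson_chi_square(counts)
    adjusted = adjusted_chi_square(counts, cluster_size=10, alpha_sum=9.0)
    print(f"naive Pearson statistic: {naive:.3f}")
    print(f"adjusted statistic:      {adjusted:.3f}")
```

With these hypothetical inputs the unadjusted statistic is 4.0 and the inflation factor is 1.9, so the adjusted statistic is noticeably smaller. Ignoring the clustering in this way overstates the evidence against equality of the proportion vectors.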



Digital Repository @ Iowa State University

Copyright Owner

Jeffrey Rupert Wilson



File Size

249 pages