Estimating standard errors for importance sampling estimators with multiple Markov chains

Roy, Vivekananda; Tan, Aixin; Flegal, James

Date

2015-06-25

Department

Statistics

Abstract

The naive importance sampling estimator, based on samples from a single importance density, can be extremely numerically unstable. We consider multiple-distribution importance sampling estimators, in which samples from more than one probability distribution are combined to consistently estimate means with respect to given target distributions. These generalized importance sampling estimators are more stable than the naive importance sampling estimator. Importance sampling estimators can also be used in the Markov chain Monte Carlo (MCMC) context, that is, where iid samples are replaced with positive Harris Markov chains with invariant importance distributions. If these Markov chains converge to their respective target distributions at a geometric rate, then under two finite moment conditions a central limit theorem (CLT) holds for the importance sampling estimators. To calculate valid asymptotic standard errors, the asymptotic variance in the CLT must be estimated consistently. Recently, Tan, Doss and Hobert (2015) developed an approach based on regenerative simulation for obtaining consistent estimators of the asymptotic variance. It is well known that in practice it is often difficult to construct a useful minorization condition, which is required by Tan, Doss and Hobert's (2015) regenerative simulation method. We provide an alternative estimator for these standard errors based on the easy-to-implement batch means method. The multi-chain importance sampling estimators depend on Geyer's (1994) reverse logistic estimator (of ratios of normalizing constants), which has wide applications in its own right in both frequentist and Bayesian inference. We also provide a batch means estimator for calculating asymptotically valid standard errors of Geyer's (1994) reverse logistic estimator. We illustrate the method with an application to Bayesian variable selection in linear regression. In particular, the multi-chain importance sampling estimator is used to perform empirical Bayes variable selection, and the batch means estimator is used to obtain standard errors in the large p situation, where the regenerative method is not applicable.
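
To make the batch means idea mentioned in the abstract concrete, the following is a minimal sketch of the generic single-chain version: it estimates the standard error of a Markov chain average by splitting the chain into consecutive batches and using the spread of the batch means. It is not the multi-chain importance sampling estimator developed in the paper. The function names `batch_means_se` and `ar1_chain`, the default batch size of about the square root of the chain length, and the AR(1) test chain are all assumptions made for illustration.

```python
import math
import random


def batch_means_se(draws, batch_size=None):
    """Batch means standard error for the mean of a correlated sequence.

    Illustrative helper (hypothetical name); not the paper's multi-chain estimator.
    """
    n = len(draws)
    if batch_size is None:
        batch_size = max(1, math.isqrt(n))  # assumed default: about sqrt(n)
    num_batches = n // batch_size
    if num_batches < 2:
        raise ValueError("need at least two full batches")
    # Drop leftover draws from the start so every batch has the same length.
    used = draws[n - num_batches * batch_size:]
    overall = sum(used) / len(used)
    means = [
        sum(used[k * batch_size:(k + 1) * batch_size]) / batch_size
        for k in range(num_batches)
    ]
    # batch_size times the sample variance of the batch means estimates
    # the asymptotic variance appearing in the Markov chain CLT.
    sigma2_hat = batch_size * sum((m - overall) ** 2 for m in means) / (num_batches - 1)
    return math.sqrt(sigma2_hat / len(used))


def ar1_chain(n, rho, seed):
    """Simulate x_t = rho * x_{t-1} + N(0, 1) noise as a stand-in for MCMC output."""
    rng = random.Random(seed)
    x = 0.0
    out = []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


chain = ar1_chain(40000, rho=0.5, seed=1)
se = batch_means_se(chain)
# For this AR(1) chain the asymptotic variance of the mean is 1 / (1 - rho)^2.
theory = math.sqrt(1.0 / (1 - 0.5) ** 2 / len(chain))
print(f"batch means SE: {se:.4f}, theoretical SE: {theory:.4f}")
```

The AR(1) chain is used only because its asymptotic variance is known in closed form, so the batch means estimate can be compared against a reference value. For real MCMC output, the batch size governs a bias-variance trade-off and would need to grow with the chain length.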