Sampling techniques for big data analysis in finite population inference

Date
2018-01-29
Authors
Kim, Jae Kwang
Wang, Zhonglei
Authors
Person
Kim, Jae Kwang
Professor
Department
Statistics
Abstract

In analyzing big data for finite population inference, it is critical to adjust for the selection bias in the big data. In this paper, we propose two methods of reducing the selection bias associated with the big data sample. The first method uses a version of inverse sampling by incorporating auxiliary information from external sources, and the second one borrows the idea of data integration by combining the big data sample with an independent probability sample. Two simulation studies show that the proposed methods are unbiased and have better coverage rates than their alternatives. In addition, the proposed methods are easy to implement in practice.

Comments

This is a manuscript that has been accepted for publication in International Statistical Review: https://arxiv.org/abs/1801.09728.
