Improving the precision of estimates of the frequency of rare events

Date
2005-01-01
Authors
Dixon, Philip
Ellison, Aaron
Gotelli, Nicholas
Department
Statistics
Abstract

The probability of a rare event is usually estimated directly as the number of times the event occurs divided by the total sample size. Unfortunately, the precision of this estimate is low. For typical sample sizes of N < 100 in ecological studies, the coefficient of variation (cv) of this estimate of the probability of a rare event can exceed 300%. Sample sizes on the order of 10³–10⁴ observations are needed to reduce the cv to below 10%. If it is impractical or impossible to increase the sample size, auxiliary data can be used to improve the precision of the estimate. We describe four approaches for using auxiliary data to improve the precision of estimates of the probability of a rare event: (1) Bayesian analysis that includes prior information about the probability; (2) stratification that incorporates information on the heterogeneity in the population; (3) regression models that account for information correlated with the probability; and (4) inclusion of aggregated data collected at larger spatial or temporal scales. These approaches are illustrated using data on the probability of capture of vespulid wasps by the insectivorous plant Darlingtonia californica. All four methods increase the precision of the estimate relative to the simple frequency-based estimate (absolute precision = 1.26, relative precision [cv] = 70%): stratification (absolute precision = 1.10, cv = 62%); regression models (absolute precision = 1.59, cv = 55%); Bayesian analysis with an informative prior probability distribution (absolute precision = 4.28, cv = 47%); and using temporally aggregated data (absolute precision = 6.75, cv = 36%). When informative auxiliary data is available, we recommend including it when estimating the probability of rare events.
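The sample-size figures in the abstract can be explored with a short sketch. It assumes a simple binomial model, under which the cv of the frequency-based estimate is sqrt((1 - p) / (N p)), and it illustrates approach (1) with a conjugate Beta prior. The function names, the example probabilities, and the prior parameters are all hypothetical choices made for illustration; none of them come from the article or its wasp-capture data.

```python
import math


def cv_of_proportion(p, n):
    """cv of the estimate k/n under a binomial model (assumed here)."""
    return math.sqrt((1.0 - p) / (n * p))


def n_for_target_cv(p, target_cv):
    """Approximate sample size at which the binomial cv reaches target_cv."""
    return math.ceil((1.0 - p) / (p * target_cv ** 2))


def beta_posterior(k, n, a, b):
    """Posterior mean and sd of p for k events in n trials, Beta(a, b) prior."""
    a_post = a + k
    b_post = b + n - k
    total = a_post + b_post
    mean = a_post / total
    var = a_post * b_post / (total ** 2 * (total + 1))
    return mean, math.sqrt(var)


if __name__ == "__main__":
    # Hypothetical rare-event probabilities, not values from the article.
    for p in (0.001, 0.01, 0.1):
        print(p, cv_of_proportion(p, 100), n_for_target_cv(p, 0.10))

    # Hypothetical data: 2 events in 100 trials.
    # Beta(1, 1) is a flat prior; Beta(3, 150) stands in for an informative one.
    for a, b in ((1, 1), (3, 150)):
        mean, sd = beta_posterior(2, 100, a, b)
        print((a, b), mean, sd / mean)
```

Under these assumptions, the cv depends on both N and p, so the rarer the event, the larger the sample needed to reach a given relative precision. The Beta prior acts like additional pseudo-observations, which is one way to see why informative auxiliary data can tighten the estimate without enlarging the sample.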

Comments

This is an article from Ecology 86 (2005): 1114, doi:10.1890/04-0601. Posted with permission.
