Degree Type
Dissertation
Date of Award
2020
Degree Name
Doctor of Philosophy
Department
Statistics
Major
Statistics
First Advisor
Heike Hofmann
Abstract
The analysis of data can be conceptualized as a process of sequential steps or actions applied to data in order to achieve a quantitative result. An important aspect of this process is ensuring that it is reproducible. Reproducibility, as it applies to statistics research, involves both statistical reproducibility and computational reproducibility. Achieving reproducibility is not trivial, particularly if the problem is complex or involves data from non-standard sources. Automated bullet evidence comparison, as proposed by Hare et al. (2017), involves both a complex data analysis and a non-standard form of data. Here, it serves as a large-scale motivating example to help us study the impact of decision-making on the statistical and computational reproducibility of a quantitative result. We first present a method for data pre-processing and assess its impact on bullet land engraved area (LEA) matching accuracy. This is followed by a large user variability study of the high-resolution bullet LEA scanning process and the development of an extended Gauge Repeatability and Reproducibility framework. Finally, we propose a framework for adaptive computational reproducibility in a changing landscape of R packages and present software tools to facilitate the study and management of computational reproducibility in R.
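To give a concrete sense of what managing computational reproducibility in R can involve, the sketch below records the R version and loaded package versions of an analysis session so a result can later be checked against the same environment. This is a minimal illustration of the general idea using only base R, not the dissertation's actual software tools; the function name and output file are hypothetical.

```r
# Minimal sketch: snapshot the computational environment of an R session.
# snapshot_environment() and "environment-snapshot.txt" are illustrative
# names, not part of the dissertation's tooling.
snapshot_environment <- function(path = "environment-snapshot.txt") {
  info <- sessionInfo()
  # Packages attached to the search path plus those loaded via namespace.
  pkgs <- c(names(info$otherPkgs), names(info$loadedOnly))
  versions <- vapply(
    pkgs,
    function(p) as.character(utils::packageVersion(p)),
    character(1)
  )
  lines <- c(
    paste("R version:", getRversion()),
    paste(pkgs, versions)
  )
  writeLines(lines, path)
  invisible(lines)
}

snapshot_environment()
```

In practice, dedicated tools such as the renv package automate this kind of version capture and restoration; the point here is only that a reproducible R result depends on the package versions in effect when it was produced.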
DOI
https://doi.org/10.31274/etd-20200902-126
Copyright Owner
Kiegan Rice
Copyright Date
2020-08
Language
en
File Format
application/pdf
Number of Pages
176
Recommended Citation
Rice, Kiegan, "A framework for statistical and computational reproducibility in large-scale data analysis projects with a focus on automated forensic bullet evidence comparison" (2020). Graduate Theses and Dissertations. 18207.
https://lib.dr.iastate.edu/etd/18207