An Experiment in Demonstrating and Mitigating Bias in Image Classification

Date
2021-01-01
Authors
Williams, Samantha
Major Professor
Anthony Townsend
Department
Information Systems and Business Analytics
Abstract

As artificial intelligence becomes more widely used across industries, companies must be mindful of potential bias in their models. Bias in datasets and algorithms can cause disparities in model output and can negatively impact minority groups. This paper describes the potential adverse impacts of bias, its sources, and techniques for removing it from machine learning models. The document's final sections examine an experiment in which a machine learning model for image classification was trained on a biased dataset, along with the techniques explored to remove that bias.

I trained the model on an intentionally biased dataset of dog and cat images to demonstrate the impact of bias. After establishing baseline results, I tested several bias mitigation techniques on the model to examine their ability to increase fairness in the output. Two methods directly addressed bias within the data, and the other two addressed bias within the model. Ultimately, this experiment found that specifying class weights in TensorFlow Keras provided the best fairness results, minimizing the difference between the false negative rate and the false positive rate of the testing-dataset predictions. However, this technique also reduced the model's accuracy. In industry, the tradeoff between accuracy and fairness should be analyzed and weighed according to the potential harm of each measure.
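For illustration, the sketch below shows how class weights can be passed to TensorFlow Keras' model.fit and how the fairness gap between the false negative rate and the false positive rate can be computed. The model architecture, the weight values, and the fairness_gap helper are assumptions for demonstration only, not the code used in the experiment.

    # Minimal sketch of class-weight mitigation in TensorFlow Keras.
    # Architecture, weights, and helper names are illustrative assumptions,
    # not the thesis author's actual implementation.
    import numpy as np
    import tensorflow as tf

    # A small binary classifier for 64x64 RGB images (architecture assumed).
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(64, 64, 3)),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])

    # class_weight scales each sample's loss by its class weight, so the
    # under-represented class contributes proportionally more to training.
    # Example values assume class 1 is four times rarer in the biased data.
    class_weight = {0: 1.0, 1: 4.0}

    # model.fit(train_images, train_labels, epochs=10,
    #           class_weight=class_weight)   # placeholder data names

    def fairness_gap(y_true, y_pred):
        """Absolute gap between false negative rate and false positive rate,
        the fairness measure the experiment minimizes (helper name assumed)."""
        y_true = np.asarray(y_true).astype(bool)
        y_pred = np.asarray(y_pred).astype(bool)
        fnr = np.mean(~y_pred[y_true])    # positives predicted negative
        fpr = np.mean(y_pred[~y_true])    # negatives predicted positive
        return abs(fnr - fpr)

A smaller fairness_gap on the test-set predictions indicates more balanced errors across the two classes, which is the sense in which class weighting improved fairness here, at some cost in overall accuracy.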

Copyright
2021