An Experiment in Demonstrating and Mitigating Bias in Image Classification
Abstract
As artificial intelligence becomes more widely used across industries, companies must be mindful of potential bias in their models. Bias in datasets and algorithms can skew model output and negatively impact minority groups. This paper describes the potential adverse impacts of bias, its sources, and techniques for removing it from machine learning models. The final sections examine an experiment in which a machine learning model for image classification was trained on a biased dataset and several techniques for removing that bias were explored.
I trained the model on an intentionally biased dataset of dog and cat images to demonstrate the impact of bias. After establishing baseline results, I tested several bias mitigation techniques to examine their ability to increase fairness in the output. Two methods addressed bias directly within the data; the other two addressed bias within the model. Ultimately, the experiment found that specifying class weights in TensorFlow Keras provided the best fairness results, minimizing the difference between the false negative rate and the false positive rate of the predictions on the testing dataset. However, this technique also reduced the model's accuracy. In industry, the tradeoff between accuracy and fairness should be analyzed and assessed based on the potential harm of each measure.
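To illustrate the class-weight technique mentioned above, the sketch below computes inverse-frequency weights for an imbalanced dataset of the kind described. This is a minimal, hypothetical example (the 900/100 dog-vs-cat split and the `compute_class_weights` helper are assumptions, not the paper's actual data or code); the weighting heuristic matches the "balanced" scheme commonly used with Keras' `class_weight` argument.

```python
from collections import Counter

def compute_class_weights(labels):
    """Inverse-frequency class weights in the dict form accepted by
    Keras' model.fit(class_weight=...).

    Weight for class c = n_samples / (n_classes * n_samples_in_c),
    so under-represented classes receive proportionally larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}

# Hypothetical biased split: 900 dog images (class 0) vs. 100 cat images (class 1)
labels = [0] * 900 + [1] * 100
weights = compute_class_weights(labels)
# weights[1] > weights[0]: errors on the minority class cost more during training,
# e.g. model.fit(x_train, y_train, class_weight=weights)
```

Passing these weights to `model.fit` scales the loss contribution of each class, which is how the technique trades a little overall accuracy for more balanced error rates between classes.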