Information Systems and Business Analytics
Master of Science (MS)
As artificial intelligence becomes more widely adopted across industries, companies must be mindful of potential bias in their models. Bias in datasets and algorithms can cause disparities in model output and can negatively impact minority groups. This paper describes the potential adverse impacts of bias, the sources of bias, and techniques for removing bias from machine learning models. The document's final sections describe an experiment in which a machine learning model for image classification was trained on a biased dataset, and evaluate techniques for mitigating that bias.
I trained the model on an intentionally biased dataset of dog and cat images to demonstrate the impact of bias. After establishing baseline results, I tested several bias mitigation techniques on the model to examine their ability to increase fairness in the output. Two methods directly addressed bias within the data, and the other two addressed bias within the model. Ultimately, this experiment found that specifying class weights in TensorFlow Keras provided the best fairness results, minimizing the difference between the false negative rate and the false positive rate of the predictions on the testing dataset. However, this technique also reduced the accuracy of the model. In industry, the tradeoff between accuracy and fairness should be analyzed and assessed based on the potential harm associated with each measure.
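The two ideas above, per-class weights (which Keras consumes via the `class_weight` argument to `model.fit`) and the false-negative-rate versus false-positive-rate gap used as the fairness measure, can be sketched in plain Python. This is an illustrative sketch, not the author's actual code: the function names and the 80/20 cat/dog split are hypothetical, and the weight formula shown is the common inverse-frequency heuristic (n_samples / (n_classes * class_count)).

```python
# Hypothetical sketch: computing per-class weights (the kind of dict
# Keras accepts as model.fit(..., class_weight=weights)) and the
# |FNR - FPR| fairness gap described in the abstract.
from collections import Counter

def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}

def fnr_fpr_gap(y_true, y_pred, positive=1):
    """Absolute difference between false negative and false positive rates."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    pos = sum(1 for t in y_true if t == positive)
    neg = len(y_true) - pos
    fnr = fn / pos if pos else 0.0
    fpr = fp / neg if neg else 0.0
    return abs(fnr - fpr)

# Example: an intentionally imbalanced set of 80 cats (0) and 20 dogs (1).
labels = [0] * 80 + [1] * 20
weights = balanced_class_weights(labels)
# The minority class (dogs) receives the larger weight, so its errors
# contribute more to the training loss:
# model.fit(x, y, class_weight=weights)
```

A smaller |FNR - FPR| gap means the model's error rates are more evenly distributed between the two classes, which is the sense in which the class-weight run was judged fairest, even though its overall accuracy dropped.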
Williams, Samantha, "An Experiment in Demonstrating and Mitigating Bias in Image Classification" (2021). Creative Components. 826.