Optimizing fairness in machine learning systems through random repair of a biased dataset.

dc.contributor.advisor: Heskes, T.M.
dc.contributor.author: Philips, V.W.
dc.date.issued: 2019-11-25
dc.description.abstract: Machine learning systems are trained on datasets that may contain biases. These biases can lead to unwanted discrimination in decision making against groups defined by sensitive attributes such as race, gender, and sexual orientation [5][7]. A possible way to address this problem is a preprocessing method called random repair, introduced in [3] in 2018. In this thesis we provide insights into the effects that random repair has on fairness and classifier accuracy in machine learning systems. We applied random repair to the adult income dataset and to the COMPAS recidivism dataset, both of which are known to be biased [5][13]. For each dataset, we compared the fairness and accuracy of a logistic regression classifier and a random forest classifier before and after preprocessing with random repair. We measure fairness in the sense of demographic parity by calculating the disparate impact index. Our research shows that increasing fairness through random repair achieves the desired level of fairness for both classifiers on both datasets. However, increasing fairness through random repair also decreases the accuracy of both classifiers.
dc.embargo.lift: 10000-01-01
dc.embargo.type: Permanent embargo
dc.identifier.uri: https://theses.ubn.ru.nl/handle/123456789/12582
dc.language.iso: en
dc.thesis.faculty: Faculteit der Sociale Wetenschappen
dc.thesis.specialisation: Bachelor Artificial Intelligence
dc.thesis.studyprogramme: Artificial Intelligence
dc.thesis.type: Bachelor
dc.title: Optimizing fairness in machine learning systems through random repair of a biased dataset.
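
Note on the metric named in the abstract: the disparate impact index is the ratio of positive-decision rates between the unprivileged and the privileged group, so a value near 1 indicates demographic parity. The thesis code itself is under embargo and not shown here; the following is a minimal illustrative sketch in Python assuming the standard ratio definition. The function name, the toy data, and the group encoding are hypothetical, not taken from the thesis.

import numpy as np

def disparate_impact(y_pred, sensitive, unprivileged, privileged):
    # Ratio of positive-decision rates:
    #   P(Y_hat = 1 | S = unprivileged) / P(Y_hat = 1 | S = privileged).
    # A value near 1 indicates demographic parity; the common "80% rule"
    # flags values below 0.8 as evidence of disparate impact.
    y_pred = np.asarray(y_pred)
    sensitive = np.asarray(sensitive)
    rate_unpriv = y_pred[sensitive == unprivileged].mean()
    rate_priv = y_pred[sensitive == privileged].mean()
    return rate_unpriv / rate_priv

# Hypothetical toy predictions: 1 = positive decision
# (e.g. "income > 50K" in the adult income dataset).
y_hat = [1, 0, 1, 1, 0, 1, 0, 0]
sex = ["f", "f", "m", "m", "f", "m", "f", "m"]
print(disparate_impact(y_hat, sex, unprivileged="f", privileged="m"))
# 0.25 / 0.75 = 0.33..., well below the 0.8 threshold

In an experiment like the one the abstract describes, random repair would be applied to the training features before fitting each classifier, moving this ratio toward the desired value at the cost of some accuracy.
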
Files
Original bundle
Name: 4288149 Philips.pdf
Size: 508.7 KB
Format: Adobe Portable Document Format