Examining IBM's AIF360 AI fairness platform on performance and relevance utilising ProPublica's investigation into the COMPAS recidivism dataset

The last decade has seen a vast increase in the usage and development of Machine Learning (ML) techniques, paired with growing academic, societal, and corporate interest in AI fairness (Gebru et al., 2018). IBM has developed the AI fairness platform AIF360 (Bellamy et al., 2018) to facilitate the transition of fairness research algorithms into mainstream business practice. I show that the algorithms included in the platform perform well on bias detection when contrasted against ProPublica's investigation into the COMPAS recidivism dataset, arriving at similar but weaker results. I further show that the algorithms also perform well on bias mitigation, both relative to each other and to the original untouched dataset, when tested against various fairness metrics. Lastly, I argue that AIF360 falls flat on its own goal of easing the burden on ML and AI developers specifically to deal with questions of fairness (Bellamy et al., 2018). The absence of ways to manage both the delayed impact of 'fair' algorithms and the ambiguity of defining the interdisciplinary term of fairness is glaring. Furthermore, its simplification of AI fairness research algorithms and metrics is not at the request of the ML fairness expert community (Charrington, 2019), and worse still might invoke harm by facilitating parameter policing. I conclude that AIF360 as a whole seems to be a positive platform for professionals in the field to collaborate and pool their resources, but is neutral at best and a danger at worst when considering its relevance and impact for mainstream usage in business.
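The kind of group-fairness metrics AIF360 reports for COMPAS-style data can be sketched by hand. The following is a minimal illustration using synthetic, hypothetical outcome data (not the COMPAS dataset itself, and not AIF360's own API); the two metric definitions — statistical parity difference and disparate impact — do, however, match the standard definitions that AIF360 implements:

```python
# Illustrative sketch of two group-fairness metrics, computed by hand on
# synthetic data. Group labels and outcome counts are hypothetical.

def favorable_rate(labels, groups, group):
    """Fraction of favorable outcomes (label == 1) within one group."""
    members = [lab for lab, g in zip(labels, groups) if g == group]
    return sum(members) / len(members)

def statistical_parity_difference(labels, groups):
    """P(favorable | unprivileged) - P(favorable | privileged); 0 means parity."""
    return (favorable_rate(labels, groups, "unpriv")
            - favorable_rate(labels, groups, "priv"))

def disparate_impact(labels, groups):
    """Ratio of favorable rates; values below 0.8 often flag bias ('80% rule')."""
    return (favorable_rate(labels, groups, "unpriv")
            / favorable_rate(labels, groups, "priv"))

# Synthetic outcomes: privileged group favored 8/10 times, unprivileged 4/10.
labels = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 6
groups = ["priv"] * 10 + ["unpriv"] * 10

print(statistical_parity_difference(labels, groups))  # -0.4
print(disparate_impact(labels, groups))               # 0.5
```

A pre-processing mitigation algorithm such as Reweighing would then adjust instance weights so that, after reweighting, these metrics move toward 0 and 1 respectively — which is the kind of before/after comparison reported in the thesis.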
Faculteit der Sociale Wetenschappen