Men Also Do Laundry: Multi-Attribute Bias Amplification

Dora Zhao, Jerone Andrews, Alice Xiang
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:42000-42017, 2023.

Abstract

The phenomenon of $\textit{bias amplification}$ occurs when models amplify training set biases at test time. Existing metrics measure bias amplification with respect to single annotated attributes (e.g., $\texttt{computer}$). However, large-scale datasets typically consist of instances with multiple attribute annotations (e.g., $\{\texttt{computer}, \texttt{keyboard}\}$). We demonstrate models can learn to exploit correlations with respect to multiple attributes, which are not accounted for by current metrics. Moreover, we show that current metrics can give the erroneous impression that little to no bias amplification has occurred as they aggregate positive and negative bias scores. Further, these metrics lack an ideal value, making them difficult to interpret. To address these shortcomings, we propose a new metric: $\textit{Multi-Attribute Bias Amplification}$. We validate our metric’s utility through a bias amplification analysis on the COCO, imSitu, and CelebA datasets. Finally, we benchmark bias mitigation methods using our proposed metric, suggesting possible avenues for future bias mitigation efforts.
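The aggregation issue mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's metric: the per-attribute scores below are made-up numbers, and the helper names `signed_mean` and `absolute_mean` are hypothetical. The abstract does not define the proposed metric, so the sketch shows only how averaging signed scores can hide amplification that an average of magnitudes would reveal.

```python
from statistics import mean

# Hypothetical per-attribute bias amplification scores.
# Illustrative values only; they are not results from the paper.
scores = {
    "computer": 0.08,
    "keyboard": -0.08,
    "oven": 0.05,
    "laundry": -0.05,
}


def signed_mean(values):
    """Average the signed scores; positive and negative values cancel."""
    return mean(values)


def absolute_mean(values):
    """Average the magnitudes; scores of opposite sign cannot cancel."""
    return mean(abs(v) for v in values)


signed = signed_mean(scores.values())
magnitude = absolute_mean(scores.values())

print(f"signed mean:   {signed:+.3f}")
print(f"absolute mean: {magnitude:.3f}")
```

Under these assumed scores the signed average is zero even though every attribute shifts, which is the kind of misleading summary the abstract describes.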

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-zhao23a,
  title = {Men Also Do Laundry: Multi-Attribute Bias Amplification},
  author = {Zhao, Dora and Andrews, Jerone and Xiang, Alice},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages = {42000--42017},
  year = {2023},
  editor = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = {202},
  series = {Proceedings of Machine Learning Research},
  month = {23--29 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v202/zhao23a/zhao23a.pdf},
  url = {https://proceedings.mlr.press/v202/zhao23a.html},
  abstract = {The phenomenon of $\textit{bias amplification}$ occurs when models amplify training set biases at test time. Existing metrics measure bias amplification with respect to single annotated attributes (e.g., $\texttt{computer}$). However, large-scale datasets typically consist of instances with multiple attribute annotations (e.g., $\{\texttt{computer}, \texttt{keyboard}\}$). We demonstrate models can learn to exploit correlations with respect to multiple attributes, which are not accounted for by current metrics. Moreover, we show that current metrics can give the erroneous impression that little to no bias amplification has occurred as they aggregate positive and negative bias scores. Further, these metrics lack an ideal value, making them difficult to interpret. To address these shortcomings, we propose a new metric: $\textit{Multi-Attribute Bias Amplification}$. We validate our metric’s utility through a bias amplification analysis on the COCO, imSitu, and CelebA datasets. Finally, we benchmark bias mitigation methods using our proposed metric, suggesting possible avenues for future bias mitigation efforts.}
}
Endnote
%0 Conference Paper
%T Men Also Do Laundry: Multi-Attribute Bias Amplification
%A Dora Zhao
%A Jerone Andrews
%A Alice Xiang
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-zhao23a
%I PMLR
%P 42000--42017
%U https://proceedings.mlr.press/v202/zhao23a.html
%V 202
%X The phenomenon of $\textit{bias amplification}$ occurs when models amplify training set biases at test time. Existing metrics measure bias amplification with respect to single annotated attributes (e.g., $\texttt{computer}$). However, large-scale datasets typically consist of instances with multiple attribute annotations (e.g., $\{\texttt{computer}, \texttt{keyboard}\}$). We demonstrate models can learn to exploit correlations with respect to multiple attributes, which are not accounted for by current metrics. Moreover, we show that current metrics can give the erroneous impression that little to no bias amplification has occurred as they aggregate positive and negative bias scores. Further, these metrics lack an ideal value, making them difficult to interpret. To address these shortcomings, we propose a new metric: $\textit{Multi-Attribute Bias Amplification}$. We validate our metric’s utility through a bias amplification analysis on the COCO, imSitu, and CelebA datasets. Finally, we benchmark bias mitigation methods using our proposed metric, suggesting possible avenues for future bias mitigation efforts.
APA
Zhao, D., Andrews, J. & Xiang, A. (2023). Men Also Do Laundry: Multi-Attribute Bias Amplification. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:42000-42017. Available from https://proceedings.mlr.press/v202/zhao23a.html.
