Understanding the Origins of Bias in Word Embeddings

Marc-Etienne Brunet, Colleen Alkalay-Houlihan, Ashton Anderson, Richard Zemel
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:803-811, 2019.

Abstract

Popular word embedding algorithms exhibit stereotypical biases, such as gender bias. The widespread use of these algorithms in machine learning systems can amplify stereotypes in important contexts. Although some methods have been developed to mitigate this problem, how word embedding biases arise during training is poorly understood. In this work we develop a technique to address this question. Given a word embedding, our method reveals how perturbing the training corpus would affect the resulting embedding bias. By tracing the origins of word embedding bias back to the original training documents, one can identify subsets of documents whose removal would most reduce bias. We demonstrate our methodology on Wikipedia and New York Times corpora, and find it to be very accurate.
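
To illustrate the leave-one-out idea the abstract describes, here is a minimal, hypothetical sketch (not the authors' influence-function approximation): it measures embedding bias with a WEAT-style effect size and ranks documents by how much retraining the embedding without each one would reduce that bias. The word sets and the train_embedding function are placeholders, and the brute-force retraining loop stands in for the paper's more efficient estimate of the same quantity.

import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def weat_effect_size(emb, X, Y, A, B):
    # WEAT-style effect size: differential association of target word sets
    # X and Y with attribute word sets A and B (e.g. career vs. family).
    def assoc(w):
        return (np.mean([cosine(emb[w], emb[a]) for a in A])
                - np.mean([cosine(emb[w], emb[b]) for b in B]))
    x_assoc = [assoc(x) for x in X]
    y_assoc = [assoc(y) for y in Y]
    pooled = np.std(x_assoc + y_assoc, ddof=1)
    return (np.mean(x_assoc) - np.mean(y_assoc)) / pooled

def rank_documents_by_differential_bias(corpus, X, Y, A, B, train_embedding):
    # For each document, retrain without it and record the change in bias.
    # `train_embedding` is a placeholder that maps a list of documents to a
    # dict of word -> vector; the paper approximates this effect instead of
    # retraining from scratch.
    base = weat_effect_size(train_embedding(corpus), X, Y, A, B)
    deltas = []
    for i in range(len(corpus)):
        perturbed = corpus[:i] + corpus[i + 1:]
        bias = weat_effect_size(train_embedding(perturbed), X, Y, A, B)
        deltas.append((i, base - bias))  # positive: removing document i reduces bias
    return sorted(deltas, key=lambda t: t[1], reverse=True)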

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-brunet19a,
  title     = {Understanding the Origins of Bias in Word Embeddings},
  author    = {Brunet, Marc-Etienne and Alkalay-Houlihan, Colleen and Anderson, Ashton and Zemel, Richard},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {803--811},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/brunet19a/brunet19a.pdf},
  url       = {https://proceedings.mlr.press/v97/brunet19a.html},
  abstract  = {Popular word embedding algorithms exhibit stereotypical biases, such as gender bias. The widespread use of these algorithms in machine learning systems can amplify stereotypes in important contexts. Although some methods have been developed to mitigate this problem, how word embedding biases arise during training is poorly understood. In this work we develop a technique to address this question. Given a word embedding, our method reveals how perturbing the training corpus would affect the resulting embedding bias. By tracing the origins of word embedding bias back to the original training documents, one can identify subsets of documents whose removal would most reduce bias. We demonstrate our methodology on Wikipedia and New York Times corpora, and find it to be very accurate.}
}
Endnote
%0 Conference Paper
%T Understanding the Origins of Bias in Word Embeddings
%A Marc-Etienne Brunet
%A Colleen Alkalay-Houlihan
%A Ashton Anderson
%A Richard Zemel
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-brunet19a
%I PMLR
%P 803--811
%U https://proceedings.mlr.press/v97/brunet19a.html
%V 97
%X Popular word embedding algorithms exhibit stereotypical biases, such as gender bias. The widespread use of these algorithms in machine learning systems can amplify stereotypes in important contexts. Although some methods have been developed to mitigate this problem, how word embedding biases arise during training is poorly understood. In this work we develop a technique to address this question. Given a word embedding, our method reveals how perturbing the training corpus would affect the resulting embedding bias. By tracing the origins of word embedding bias back to the original training documents, one can identify subsets of documents whose removal would most reduce bias. We demonstrate our methodology on Wikipedia and New York Times corpora, and find it to be very accurate.
APA
Brunet, M., Alkalay-Houlihan, C., Anderson, A. & Zemel, R. (2019). Understanding the Origins of Bias in Word Embeddings. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:803-811. Available from https://proceedings.mlr.press/v97/brunet19a.html.
