An Interactive Visual Demo of Bias Mitigation Techniques for Word Representations From a Geometric Perspective
Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, PMLR 176:330-335, 2022.
Language representations are known to encode and propagate biases, i.e., stereotypical associations between words or groups of words that may cause representational harm. In this demo, we utilize interactive visualization to increase the interpretability of a number of state-of-the-art techniques that are designed to identify, mitigate, and attenuate these biases in word representations, in particular, from a geometric perspective. We provide an open source web-based visualization tool and offer hands-on experience in exploring the effects of these debiasing techniques on the geometry of high-dimensional word vectors. To help understand how various debiasing techniques change the underlying geometry, we decompose each technique into modular and interpretable sequences of primitive operations, and study their effect on the word vectors using dimensionality reduction and interactive visual exploration. This demo is primarily designed to aid natural language processing (NLP) practitioners and researchers working with fairness and ethics of machine learning systems. It can also be used to educate NLP novices in understanding the existence of and then mitigating biases in word embeddings.