Constrained Density Matching and Modeling for Cross-lingual Alignment of Contextualized Representations
Proceedings of The 14th Asian Conference on Machine
Learning, PMLR 189:1245-1260, 2023.
Abstract
Multilingual representations pre-trained with
monolingual data exhibit considerably unequal task
performance across languages. Previous studies
address this challenge with resource-intensive
contextualized alignment, which assumes the
availability of large parallel data, thereby leaving
under-represented language communities behind. In
this work, we attribute the data hunger of
previous alignment techniques to two limitations:
(i) the inability to sufficiently leverage data and
(ii) the lack of a principled training procedure. To
address these issues, we introduce supervised and
unsupervised density-based alignment approaches,
Real-NVP and GAN-Real-NVP, driven by normalizing
flows; both decompose the alignment of multilingual
subspaces into density matching and density
modeling. We complement these approaches
with our validation criteria in order to guide the
training process. Our experiments encompass 16
alignment methods, including our approaches,
evaluated across 6 language pairs, synthetic data,
and 5 NLP tasks. We demonstrate the effectiveness of our
approaches in the scenarios of limited and no
parallel data. First, our supervised approach
trained on 20k parallel sentences mostly
surpasses Joint-Align and InfoXLM trained on over
100k parallel sentences. Second, parallel data can
be removed entirely without sacrificing performance
when our unsupervised approach is integrated into our
bootstrapping procedure, which is theoretically
motivated to enforce equality of multilingual
subspaces. Moreover, we demonstrate the advantages
of validation criteria over validation data for
guiding supervised training.
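
The Real-NVP building block named in the abstract is, at its core, an affine coupling layer with an exactly invertible transform and a cheap Jacobian log-determinant. The following is a minimal illustrative sketch, not the authors' code; the toy weight matrices `W_s` and `W_t` stand in for the coupling networks s(.) and t(.):

```python
import numpy as np

rng = np.random.default_rng(0)

# Minimal Real-NVP affine coupling layer (illustrative sketch).
# Split the D-dim input in half: the first half passes through unchanged
# and parameterizes an affine transform (log-scale s, shift t) of the
# second half.
D = 4
W_s = rng.normal(scale=0.1, size=(D // 2, D // 2))  # toy stand-in for s(.)
W_t = rng.normal(scale=0.1, size=(D // 2, D // 2))  # toy stand-in for t(.)

def coupling_forward(x):
    x1, x2 = x[:, : D // 2], x[:, D // 2 :]
    s = np.tanh(x1 @ W_s)        # log-scale, bounded for stability
    t = x1 @ W_t                 # shift
    y2 = x2 * np.exp(s) + t      # affine transform of the second half
    log_det = s.sum(axis=1)      # log|det J| is just the sum of s
    return np.concatenate([x1, y2], axis=1), log_det

def coupling_inverse(y):
    y1, y2 = y[:, : D // 2], y[:, D // 2 :]
    s = np.tanh(y1 @ W_s)
    t = y1 @ W_t
    x2 = (y2 - t) * np.exp(-s)   # exact inverse, no numerical solve needed
    return np.concatenate([y1, x2], axis=1)

x = rng.normal(size=(8, D))
y, log_det = coupling_forward(x)
x_rec = coupling_inverse(y)
print(np.allclose(x, x_rec))     # the coupling is exactly invertible
```

The tractable log-determinant is what makes density matching and density modeling with such flows feasible: the model density can be evaluated in closed form via the change-of-variables formula.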