Evaluating Machine Translation Quality with Conformal Predictive Distributions
Proceedings of the Twelfth Symposium on Conformal
and Probabilistic Prediction with Applications, PMLR 204:413-429, 2023.
Abstract
This paper presents a new approach for assessing
uncertainty in machine translation by simultaneously
evaluating translation quality and providing a
reliable confidence score. Our approach utilizes
conformal predictive distributions to produce
prediction intervals with guaranteed coverage,
meaning that for any given significance level
$\epsilon$, we can expect the true quality score of
a translation to fall within the interval at a rate
of $1 - \epsilon$. In this paper, we demonstrate how
our method outperforms a simple but effective
baseline on six different language pairs in terms of
coverage and sharpness. Furthermore, we validate
that the data exchangeability assumption must hold
for our approach to perform optimally.
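
To make the coverage guarantee above concrete, the following minimal Python sketch shows how a prediction interval with the stated $1 - \epsilon$ coverage can be built via split conformal regression. This is an illustrative simplification under the exchangeability assumption, not the paper's actual procedure (which uses full conformal predictive distributions); the function name split_conformal_interval and the arguments cal_scores, cal_preds, test_pred and epsilon are hypothetical placeholders.

import numpy as np

def split_conformal_interval(cal_scores, cal_preds, test_pred, epsilon):
    """Prediction interval for one test point via split conformal regression.

    cal_scores : true quality scores on a held-out calibration set
    cal_preds  : point predictions for the same calibration examples
    test_pred  : point prediction for a new translation
    epsilon    : significance level, i.e. the tolerated miscoverage rate
    """
    # Nonconformity scores: absolute residuals on the calibration set.
    residuals = np.abs(np.asarray(cal_scores) - np.asarray(cal_preds))
    n = residuals.size
    # Conformal quantile with the (n + 1) finite-sample correction; under
    # exchangeability the resulting interval covers the true score with
    # probability at least 1 - epsilon.
    level = min(np.ceil((n + 1) * (1 - epsilon)) / n, 1.0)
    q = np.quantile(residuals, level, method="higher")
    return test_pred - q, test_pred + q

# Usage (illustrative): given calibration arrays y_cal, yhat_cal and a new
# point prediction yhat_new, a 90% interval is obtained with
#   lo, hi = split_conformal_interval(y_cal, yhat_cal, yhat_new, epsilon=0.1)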