Analyzing Uncertainty in Neural Machine Translation

Myle Ott, Michael Auli, David Grangier, Marc’Aurelio Ranzato
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:3956-3965, 2018.

Abstract

Machine translation is a popular test bed for research in neural sequence-to-sequence models but despite much recent research, there is still a lack of understanding of these models. Practitioners report performance degradation with large beams, the under-estimation of rare words and a lack of diversity in the final translations. Our study relates some of these issues to the inherent uncertainty of the task, due to the existence of multiple valid translations for a single source sentence, and to the extrinsic uncertainty caused by noisy training data. We propose tools and metrics to assess how uncertainty in the data is captured by the model distribution and how it affects search strategies that generate translations. Our results show that search works remarkably well but that the models tend to spread too much probability mass over the hypothesis space. Next, we propose tools to assess model calibration and show how to easily fix some shortcomings of current models. We release both code and multiple human reference translations for two popular benchmarks.

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-ott18a,
  title = {Analyzing Uncertainty in Neural Machine Translation},
  author = {Ott, Myle and Auli, Michael and Grangier, David and Ranzato, Marc'Aurelio},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages = {3956--3965},
  year = {2018},
  editor = {Dy, Jennifer and Krause, Andreas},
  volume = {80},
  series = {Proceedings of Machine Learning Research},
  month = {10--15 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v80/ott18a/ott18a.pdf},
  url = {http://proceedings.mlr.press/v80/ott18a.html},
  abstract = {Machine translation is a popular test bed for research in neural sequence-to-sequence models but despite much recent research, there is still a lack of understanding of these models. Practitioners report performance degradation with large beams, the under-estimation of rare words and a lack of diversity in the final translations. Our study relates some of these issues to the inherent uncertainty of the task, due to the existence of multiple valid translations for a single source sentence, and to the extrinsic uncertainty caused by noisy training data. We propose tools and metrics to assess how uncertainty in the data is captured by the model distribution and how it affects search strategies that generate translations. Our results show that search works remarkably well but that the models tend to spread too much probability mass over the hypothesis space. Next, we propose tools to assess model calibration and show how to easily fix some shortcomings of current models. We release both code and multiple human reference translations for two popular benchmarks.}
}
Endnote
%0 Conference Paper
%T Analyzing Uncertainty in Neural Machine Translation
%A Myle Ott
%A Michael Auli
%A David Grangier
%A Marc’Aurelio Ranzato
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-ott18a
%I PMLR
%P 3956--3965
%U http://proceedings.mlr.press/v80/ott18a.html
%V 80
%X Machine translation is a popular test bed for research in neural sequence-to-sequence models but despite much recent research, there is still a lack of understanding of these models. Practitioners report performance degradation with large beams, the under-estimation of rare words and a lack of diversity in the final translations. Our study relates some of these issues to the inherent uncertainty of the task, due to the existence of multiple valid translations for a single source sentence, and to the extrinsic uncertainty caused by noisy training data. We propose tools and metrics to assess how uncertainty in the data is captured by the model distribution and how it affects search strategies that generate translations. Our results show that search works remarkably well but that the models tend to spread too much probability mass over the hypothesis space. Next, we propose tools to assess model calibration and show how to easily fix some shortcomings of current models. We release both code and multiple human reference translations for two popular benchmarks.
APA
Ott, M., Auli, M., Grangier, D. & Ranzato, M. (2018). Analyzing Uncertainty in Neural Machine Translation. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:3956-3965. Available from http://proceedings.mlr.press/v80/ott18a.html.

Related Material