Imputer: Sequence Modelling via Imputation and Dynamic Programming

William Chan, Chitwan Saharia, Geoffrey Hinton, Mohammad Norouzi, Navdeep Jaitly
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:1403-1413, 2020.

Abstract

This paper presents the Imputer, a neural sequence model that generates output sequences iteratively via imputations. The Imputer is an iterative generation model, requiring only a constant number of generation steps independent of the number of input or output tokens. The Imputer can be trained to approximately marginalize over all possible alignments between the input and output sequences, and all possible generation orders. We present a tractable dynamic programming training algorithm, which yields a lower bound on the log marginal likelihood. When applied to end-to-end speech recognition, the Imputer outperforms prior non-autoregressive models and achieves results competitive with autoregressive models. On LibriSpeech test-other, the Imputer achieves 11.1 WER, outperforming CTC at 13.0 WER and seq2seq at 12.5 WER.
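
The constant-step claim corresponds to block decoding: the alignment (one slot per input frame) is split into blocks of size B, and each of B parallel refinement passes commits one more token in every block, so the number of decoding steps depends on B rather than on the input or output length. The sketch below illustrates that loop under stated assumptions; score_fn (a stand-in for a model forward pass), the MASK sentinel, and the greedy per-block confidence rule are illustrative choices, not the authors' released code.

import numpy as np

MASK = -1  # sentinel for an as-yet-unfilled alignment slot (assumption of this sketch)

def imputer_style_decode(score_fn, T, block_size):
    """Fill an alignment of length T in `block_size` decoding steps.

    score_fn(alignment) -> (T, V) array of per-position log-probabilities,
    conditioned on the input and on the tokens committed so far; it stands
    in for a forward pass of the model and is an assumption of this sketch.
    """
    alignment = np.full(T, MASK, dtype=np.int64)
    for _ in range(block_size):              # constant number of generation steps
        log_probs = score_fn(alignment)      # one parallel forward pass over all positions
        for start in range(0, T, block_size):
            block = range(start, min(start + block_size, T))
            masked = [i for i in block if alignment[i] == MASK]
            if not masked:
                continue
            # Commit the single most confident prediction inside this block.
            best = max(masked, key=lambda i: log_probs[i].max())
            alignment[best] = int(log_probs[best].argmax())
    return alignment

Collapsing the completed alignment (dropping blank/padding symbols, as in CTC-style models) then yields the output sequence.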

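The "tractable dynamic programming training algorithm" marginalizes over alignments between input frames and output tokens. As a point of reference, the sketch below implements the standard CTC forward recursion, which sums log p(y | x) over all monotonic alignments; the Imputer's objective builds on this kind of recursion but additionally conditions on a partially observed alignment and marginalizes over generation orders, which is omitted here. Function and variable names are illustrative assumptions, not the paper's code.

import numpy as np

NEG_INF = -1e30

def logsumexp(*xs):
    m = max(xs)
    if m <= NEG_INF:
        return NEG_INF
    return m + np.log(sum(np.exp(x - m) for x in xs))

def ctc_log_likelihood(log_probs, target, blank=0):
    """log p(target | x), marginalized over all monotonic alignments.

    log_probs: (T, V) per-frame log-probabilities (frames x vocabulary, blank included).
    target:    list of token ids (no blanks), length U.
    """
    T = log_probs.shape[0]
    # Interleave blanks: y' = [blank, y_1, blank, y_2, ..., y_U, blank].
    ext = [blank]
    for tok in target:
        ext += [tok, blank]
    S = len(ext)

    # alpha[t, s] = log-prob of all alignments of the first t+1 frames
    # that end in state s of the extended target.
    alpha = np.full((T, S), NEG_INF)
    alpha[0, 0] = log_probs[0, ext[0]]
    if S > 1:
        alpha[0, 1] = log_probs[0, ext[1]]

    for t in range(1, T):
        for s in range(S):
            a = alpha[t - 1, s]                        # stay in the same state
            if s >= 1:
                a = logsumexp(a, alpha[t - 1, s - 1])  # advance by one state
            # Skip a blank only when it would not merge repeated tokens.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                a = logsumexp(a, alpha[t - 1, s - 2])
            alpha[t, s] = a + log_probs[t, ext[s]]

    # Valid alignments end on the final token or the trailing blank.
    return logsumexp(alpha[T - 1, S - 1], alpha[T - 1, S - 2]) if S > 1 else alpha[T - 1, 0]

Negating this quantity gives a CTC-style training loss; per the abstract, the paper's full objective instead yields a lower bound on the log marginal likelihood by also accounting for which alignment positions are already visible to the model.
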
Cite this Paper


BibTeX
@InProceedings{pmlr-v119-chan20b,
  title     = {Imputer: Sequence Modelling via Imputation and Dynamic Programming},
  author    = {Chan, William and Saharia, Chitwan and Hinton, Geoffrey and Norouzi, Mohammad and Jaitly, Navdeep},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {1403--1413},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/chan20b/chan20b.pdf},
  url       = {https://proceedings.mlr.press/v119/chan20b.html},
  abstract  = {This paper presents the Imputer, a neural sequence model that generates output sequences iteratively via imputations. The Imputer is an iterative generation model, requiring only a constant number of generation steps independent of the number of input or output tokens. The Imputer can be trained to approximately marginalize over all possible alignments between the input and output sequences, and all possible generation orders. We present a tractable dynamic programming training algorithm, which yields a lower bound on the log marginal likelihood. When applied to end-to-end speech recognition, the Imputer outperforms prior non-autoregressive models and achieves competitive results to autoregressive models. On LibriSpeech test-other, the Imputer achieves 11.1 WER, outperforming CTC at 13.0 WER and seq2seq at 12.5 WER.}
}
Endnote
%0 Conference Paper
%T Imputer: Sequence Modelling via Imputation and Dynamic Programming
%A William Chan
%A Chitwan Saharia
%A Geoffrey Hinton
%A Mohammad Norouzi
%A Navdeep Jaitly
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-chan20b
%I PMLR
%P 1403--1413
%U https://proceedings.mlr.press/v119/chan20b.html
%V 119
%X This paper presents the Imputer, a neural sequence model that generates output sequences iteratively via imputations. The Imputer is an iterative generation model, requiring only a constant number of generation steps independent of the number of input or output tokens. The Imputer can be trained to approximately marginalize over all possible alignments between the input and output sequences, and all possible generation orders. We present a tractable dynamic programming training algorithm, which yields a lower bound on the log marginal likelihood. When applied to end-to-end speech recognition, the Imputer outperforms prior non-autoregressive models and achieves competitive results to autoregressive models. On LibriSpeech test-other, the Imputer achieves 11.1 WER, outperforming CTC at 13.0 WER and seq2seq at 12.5 WER.
APA
Chan, W., Saharia, C., Hinton, G., Norouzi, M. & Jaitly, N. (2020). Imputer: Sequence Modelling via Imputation and Dynamic Programming. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:1403-1413. Available from https://proceedings.mlr.press/v119/chan20b.html.
