Authors
Marjan Ghazvininejad, Omer Levy, Yinhan Liu, Luke Zettlemoyer
Publication date
2019/4/19
Journal
arXiv preprint arXiv:1904.09324
Description
Most machine translation systems generate text autoregressively from left to right. We, instead, use a masked language modeling objective to train a model to predict any subset of the target words, conditioned on both the input text and a partially masked target translation. This approach allows for efficient iterative decoding, where we first predict all of the target words non-autoregressively, and then repeatedly mask out and regenerate the subset of words that the model is least confident about. By applying this strategy for a constant number of iterations, our model improves state-of-the-art performance levels for non-autoregressive and parallel decoding translation models by over 4 BLEU on average. It is also able to reach within about 1 BLEU point of a typical left-to-right transformer model, while decoding significantly faster.
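The description above outlines the mask-predict decoding loop concretely enough to sketch in code. The fragment below is a minimal Python/PyTorch illustration only, not the authors' implementation: model.predict, MASK_ID, and the fixed tgt_len (which the paper actually predicts with a separate length model) are assumptions introduced here for clarity.

import torch

MASK_ID = 0  # hypothetical id of the [MASK] token in the target vocabulary

def mask_predict(model, src, tgt_len, iterations=10):
    # Iteration 0: every target position is masked; predict all tokens in parallel.
    tokens = torch.full((tgt_len,), MASK_ID, dtype=torch.long)
    tokens, probs = model.predict(src, tokens)  # assumed interface, not the paper's code
    for t in range(1, iterations):
        # Number of tokens to re-mask decays linearly over the iterations.
        n = int(tgt_len * (iterations - t) / iterations)
        if n == 0:
            break
        # Re-mask the n positions the model is least confident about.
        worst = probs.topk(n, largest=False).indices
        tokens[worst] = MASK_ID
        # Re-predict the masked positions, conditioned on the source and the kept tokens.
        new_tokens, new_probs = model.predict(src, tokens)
        tokens[worst] = new_tokens[worst]
        probs[worst] = new_probs[worst]
    return tokens

Because the number of re-masked tokens shrinks each round and the iteration count is constant, decoding cost stays independent of sentence length in the sequential dimension, which is where the speedup over left-to-right decoding comes from.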
Total citations
[Citations-per-year chart, 2019–2024]
Scholar articles
M Ghazvininejad, O Levy, Y Liu, L Zettlemoyer - arXiv preprint arXiv:1904.09324, 2019