1. Paper title
ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation
2. link
https://www.aclweb.org/anthology/2020.acl-main.251.pdf
3. 摘要
We propose to train a non-autoregressive machine translation model to minimize the energy defined by a pretrained autoregressive model. In particular, we view our non-autoregressive translation system as an inference network (Tu and Gimpel, 2018) trained to minimize the autoregressive teacher energy. This contrasts with the popular approach of training a non-autoregressive model on a distilled corpus consisting of the beam-searched outputs of such a teacher model. Our approach, which we call ENGINE (ENerGy-based Inference NEtworks), achieves state-of-the-art non-autoregressive results on the IWSLT 2014 DE-EN and WMT 2016 RO-EN datasets, approaching the performance of autoregressive models.1
[?] inference network
[?] teacher energy
4. 要解决什么问题
非自回归模型的传统做法:
先由teacher model的beam-search outputs生成distilled语料库
然后由NAT在此基础上训练
5. 作者的主要贡献
本文提供另一个NAT训练方法:
先由AT定义energy,然后由NAT最小化预训练的回归模型的energy
6. 得到了什么结果
2014 DE-EN and WMT 2016 RO-EN上得到SOTA非自回归结果
性能优于不带优化的原始NAT