1. Paper title

ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation

2. Link

https://www.aclweb.org/anthology/2020.acl-main.251.pdf

3. Abstract

We propose to train a non-autoregressive machine translation model to minimize the energy defined by a pretrained autoregressive model. In particular, we view our non-autoregressive translation system as an inference network (Tu and Gimpel, 2018) trained to minimize the autoregressive teacher energy. This contrasts with the popular approach of training a non-autoregressive model on a distilled corpus consisting of the beam-searched outputs of such a teacher model. Our approach, which we call ENGINE (ENerGy-based Inference NEtworks), achieves state-of-the-art non-autoregressive results on the IWSLT 2014 DE-EN and WMT 2016 RO-EN datasets, approaching the performance of autoregressive models.

[?] inference network
[?] teacher energy

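4. What problem does it address

The conventional way to train a non-autoregressive (NAT) model:
First, a distilled corpus is built from the beam-search outputs of a teacher model
Then the NAT model is trained on that distilled corpus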
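
The two steps above can be sketched as follows. This is only an illustration: `build_distilled_corpus` and `toy_teacher` are hypothetical names, and the word-by-word lookup merely stands in for beam search with a real pretrained autoregressive teacher.

```python
from typing import Callable, List, Tuple


def build_distilled_corpus(
    sources: List[str],
    teacher_translate: Callable[[str], str],
) -> List[Tuple[str, str]]:
    """Pair each source sentence with the teacher's output, not the human reference."""
    return [(src, teacher_translate(src)) for src in sources]


def toy_teacher(src: str) -> str:
    """Hypothetical stand-in for beam search with an autoregressive teacher."""
    lexicon = {"guten": "good", "morgen": "morning"}
    return " ".join(lexicon.get(tok, tok) for tok in src.split())


# The NAT model would then be trained on these (source, teacher output) pairs.
distilled = build_distilled_corpus(["guten morgen"], toy_teacher)
```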

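5. Main contributions of the authors

The paper proposes a different way to train a NAT model:
A pretrained autoregressive (AT) model defines an energy, and the NAT model is trained to minimize that energy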
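
A minimal sketch of the idea, under strong simplifying assumptions that are not from the paper: a toy bigram table stands in for the pretrained AT teacher, free per-position logits stand in for the NAT model, and finite differences replace backpropagation. The energy here is taken to be the negative log-likelihood that the teacher assigns to a relaxed (soft) output sequence.

```python
import numpy as np


def log_softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))


def softmax(z):
    return np.exp(log_softmax(z))


def teacher_energy(y_soft, start_logits, bigram_logits):
    """Negative log-likelihood of a soft sequence under a toy bigram teacher.

    y_soft: (T, V), each row a distribution over the vocabulary.
    start_logits: (V,) teacher logits for the first token.
    bigram_logits: (V, V) teacher logits for the next token given the previous one.
    """
    # Step 1 uses the start logits; later steps condition on the previous soft token.
    step_logits = np.vstack([start_logits[None, :], y_soft[:-1] @ bigram_logits])
    return float(-(y_soft * log_softmax(step_logits)).sum())


def numerical_grad(f, theta, eps=1e-5):
    """Central finite-difference gradient (assumption: replaces backprop for brevity)."""
    grad = np.zeros_like(theta)
    for idx in np.ndindex(*theta.shape):
        bump = np.zeros_like(theta)
        bump[idx] = eps
        grad[idx] = (f(theta + bump) - f(theta - bump)) / (2 * eps)
    return grad


V = 3
start_logits = np.array([4.0, 0.0, 0.0])              # teacher prefers token 0 first
bigram_logits = 4.0 * np.roll(np.eye(V), 1, axis=1)   # teacher prefers 0->1, 1->2, 2->0

good = np.eye(V)[[0, 1, 2]]  # one-hot sequence the teacher likes
bad = np.eye(V)[[2, 1, 0]]   # one-hot sequence the teacher dislikes


def energy_of(theta):
    return teacher_energy(softmax(theta), start_logits, bigram_logits)


# All positions are produced in parallel from theta, then pushed toward low energy.
theta = np.zeros((3, V))
initial = energy_of(theta)
for _ in range(300):
    theta -= 0.1 * numerical_grad(energy_of, theta)
final = energy_of(theta)
```

In this toy setting the sequence the teacher prefers has lower energy than the one it dislikes, and the gradient steps lower the energy of the parallel output. The teacher's parameters are never updated; only the stand-in for the NAT output changes.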

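6. What results were obtained

State-of-the-art non-autoregressive results on IWSLT 2014 DE-EN and WMT 2016 RO-EN
Performance is better than that of the original NAT model without this optimization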

7. Keywords
