1. Paper title

Language-aware Interlingua for Multilingual Neural Machine Translation

2. link

https://www.aclweb.org/anthology/2020.acl-main.150.pdf

3. 摘要

Multilingual neural machine translation (NMT) has led to impressive accuracy improvements in low-resource scenarios by sharing common linguistic information across languages. However, the traditional multilingual model fails to capture the diversity and specificity of different languages, resulting in inferior performance compared with individual models that are sufficiently trained. In this paper, we incorporate a language-aware interlingua into the Encoder-Decoder architecture. The interlingual network enables the model to learn a language-independent representation from the semantic spaces of different languages, while still allowing for language-specific specialization of a particular language-pair. Experiments show that our proposed method achieves remarkable improvements over state-of-the-art multilingual NMT baselines and produces comparable performance with strong individual models.

4. 要解决什么问题

多语言NMT是指在一个模型中handle多种语言对。

多语言NMT的优点:

  1. 减少online serving的成本
  2. 减少offline training的成本
  3. 提升low-resource语言对的翻译性能。

多语言NMT的challenge:语言多样性和模型能力限制的矛盾

多语言模型的改进:

  1. 提升模型容量
  2. 引入multiple encoder & decoder
  3. 改进注意力机制,关注language specific signals
  4. model the specifity of different languages
  5. [?] 增加pre-designed tokens

多语言模型无法捕获不同语言的多样性和特殊性。

5. 作者的主要贡献

引入language-aware的中介语模块,将interlingua(中介语)整合到Encoder-Decoder结构中,用于学习与语言无关的表示,同时允许特定语言的特殊性。
中介语模块的作用:

  1. model the shared specific space
  2. 作为encoder和decoder之间的桥梁

本文的过程和方法:

  1. language embedding,用于表达每个语言的特征
  2. interlingua embedding,捕获所有语言的通用语义
  3. 注意力机制:把encoder映射到共享语义空间
  4. reconstruction loss和语义一致性损失函数:保证语义一致
  5. language-aware positional embedding:作为初始状态

6. 得到了什么结果

性能优于STOA多语言NMT

7. 关键字

results matching ""

    No results matching ""