cambridgeltl / ecnmt

Emergent Communication Pretraining for Few-Shot Machine Translation

Home Page: https://github.com/cambridgeltl/ECNMT

Languages: Shell 0.20%, Python 99.80%
Topics: emergent-communication, pretraining, machine-translation, pytorch, inductive-biases, nlp

ecnmt's Introduction

ECNMT: Emergent Communication Pretraining for Few-Shot Machine Translation

This repository is the official PyTorch implementation of the following paper:

Yaoyiran Li, Edoardo Maria Ponti, Ivan Vulić, and Anna Korhonen. 2020. Emergent Communication Pretraining for Few-Shot Machine Translation. In Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020). LINK

This method is a form of unsupervised knowledge transfer in the absence of linguistic data, where a model is first pre-trained on artificial languages emerging from referential games and then fine-tuned on few-shot downstream tasks like neural machine translation.
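To make the pretraining stage concrete, below is a minimal, self-contained PyTorch sketch of a referential game of the kind used for EC pretraining: a speaker encodes the target image features into a short discrete message (via straight-through Gumbel-softmax in this sketch), and a listener has to pick the target out of a set of candidate images. All names, sizes, and hyperparameters below are illustrative placeholders, not the repository's actual architecture or settings.

# Toy referential game sketch; NOT the repository's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, MSG_LEN, FEAT, HID = 32, 5, 2048, 256  # toy sizes, not the paper's settings

class Speaker(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(FEAT, HID)
        self.rnn = nn.GRUCell(VOCAB, HID)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, target_feat):
        h = torch.tanh(self.proj(target_feat))         # init hidden state from the target image
        sym = torch.zeros(target_feat.size(0), VOCAB)   # "start" symbol (all zeros)
        message = []
        for _ in range(MSG_LEN):
            h = self.rnn(sym, h)
            # Straight-through Gumbel-softmax keeps the message (almost) discrete
            # while letting gradients flow back to the speaker.
            sym = F.gumbel_softmax(self.out(h), tau=1.0, hard=True)
            message.append(sym)
        return torch.stack(message, dim=1)              # (batch, MSG_LEN, VOCAB)

class Listener(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(VOCAB, HID, batch_first=True)
        self.img = nn.Linear(FEAT, HID)

    def forward(self, message, candidate_feats):
        _, h = self.rnn(message)                        # final hidden state: (1, batch, HID)
        cands = self.img(candidate_feats)               # (batch, n_candidates, HID)
        return torch.bmm(cands, h[-1].unsqueeze(2)).squeeze(2)  # similarity scores per candidate

speaker, listener = Speaker(), Listener()
opt = torch.optim.Adam(list(speaker.parameters()) + list(listener.parameters()), lr=1e-3)

for step in range(100):                                 # toy loop on random "image features"
    feats = torch.randn(8, 4, FEAT)                     # 8 games per batch, 4 candidates each
    target = torch.randint(0, 4, (8,))
    msg = speaker(feats[torch.arange(8), target])       # speaker only sees the target image
    scores = listener(msg, feats)                       # listener ranks all candidates
    loss = F.cross_entropy(scores, target)
    opt.zero_grad(); loss.backward(); opt.step()

In the actual pipeline it is the EC-pretrained model (rather than a toy like this) that the NMT fine-tuning step below starts from.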

Emergent Communication and Machine Translation

Dependencies

  • PyTorch 1.3.1
  • Python 3.6
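A quick way to confirm these versions before running the scripts (this snippet is not part of the repository; other PyTorch versions are not guaranteed to reproduce the results, as discussed in the issues below):

import sys
import torch

# Expected per the list above: Python 3.6.x and PyTorch 1.3.1.
print("python :", sys.version.split()[0])
print("pytorch:", torch.__version__)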

Data

COCO image features are available in the sub-folder half_feats here. Preprocessed EN-DE (DE-EN) data for translation are available in the sub-folder task1 here. Both are obtained from Translagent.

Translation data for the other language pairs (EN-CS, EN-RO, and EN-FR) can be found via the links below.

Dictionaries Train Sentence Pairs Reference Translations
EN-CS & CS-EN EN-CS & CS-EN EN-CS & CS-EN
EN-RO & RO-EN EN-RO & RO-EN EN-RO & RO-EN
EN-FR & FR-EN EN-FR & FR-EN EN-FR & FR-EN

Pretrained Models for Emergent Communication

Source / Target Target / Source
EN DE
EN CS
EN RO
EN FR

Experiments

Step 1: run EC pretraining (or skip to Step 2 and use a pretrained model).

cd ./ECPRETRAIN
sh run_training.sh

Step 2: run NMT fine-tuning (please modify the paths to the training data, the pretrained model, and the save directory in the script beforehand). An illustrative sketch of the kind of weight transfer this step performs is given after the commands.

cd ./NMT
sh run_training.sh
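The run_training.sh script handles loading the EC checkpoint via the paths mentioned above. Purely as an illustration, warm-starting an NMT model from a pretrained checkpoint can look like the hypothetical helper below; the helper name, the path argument, and the assumption that the checkpoint is a flat state_dict with matching parameter names are placeholders, not the repository's actual logic.

# Hypothetical warm-start helper; NOT the repository's loading code.
import torch
import torch.nn as nn

def warm_start_from_ec(nmt_model: nn.Module, ckpt_path: str) -> None:
    # Copy every tensor from the EC checkpoint whose name and shape match a
    # tensor in the NMT model; leave the remaining parameters at their
    # random initialization for fine-tuning.
    ec_state = torch.load(ckpt_path, map_location="cpu")
    own_state = nmt_model.state_dict()
    matched = {k: v for k, v in ec_state.items()
               if k in own_state and v.shape == own_state[k].shape}
    own_state.update(matched)
    nmt_model.load_state_dict(own_state)
    print(f"warm-started {len(matched)}/{len(own_state)} tensors from {ckpt_path}")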

Optional: run the NMT baseline (trained without EC pretraining).

cd ./BASELINENMT
sh run_training.sh

Citation

@inproceedings{YL:2020,
  author    = {Yaoyiran Li and Edoardo Maria Ponti and Ivan Vulić and Anna Korhonen},
  title     = {Emergent Communication Pretraining for Few-Shot Machine Translation},
  year      = {2020},
  booktitle = {Proceedings of the 28th International Conference on Computational Linguistics},
}

Acknowledgements

Part of the code is based on Translagent.

The datasets used in our experiments include MS COCO for emergent communication pretraining, and Multi30k Task 1 and Europarl for NMT fine-tuning. Text preprocessing is based on Moses and Subword-NMT.

Please cite these resources accordingly.

ecnmt's People

Contributors

ducdauge, yaoyiran


ecnmt's Issues

EC pretraining model generates short sentences and has poor accuracy

Hi, I am re-running the ECPRETRAIN script and trying to reproduce the results.

However, I found that the model tends to generate very short sentences as training goes on (screenshot omitted), which does not seem to match the description in the paper.

Besides, it seems that a model is only saved once it reaches 99% accuracy, but none of the training epochs reaches that accuracy in my experiment.

I tried to reproduce the environment, but PyTorch 1.3.1 no longer seems to be available on PyTorch's official webpage, so I used PyTorch 1.6 instead. I do not know whether this could be the reason. Any insight would be helpful.

EC pretraining prediction accuracy collapses after achieving 99+% accuracy.

Hi,
I am trying to run the ECPRETRAIN script with the default hyperparameters. The model is able to reach 99% accuracy in 4k epochs, but after that the prediction accuracy collapses. I am unable to find any bug in the code that might be causing this. I have attached the training logs.

Reaching 99%: [screenshot omitted]

Collapse to 0%: [screenshot omitted]

Full log: log.log

It would be great if you could help resolve this issue.
Thanks.
