cambridgeltl / ecnmt

Emergent Communication Pretraining for Few-Shot Machine Translation

Home Page: https://github.com/cambridgeltl/ECNMT

Languages: Shell 0.20%, Python 99.80%
Topics: emergent-communication, pretraining, machine-translation, pytorch, inductive-biases, nlp

ecnmt's Introduction

ECNMT: Emergent Communication Pretraining for Few-Shot Machine Translation

This repository is the official PyTorch implementation of the following paper:

Yaoyiran Li, Edoardo Maria Ponti, Ivan Vulić, and Anna Korhonen. 2020. Emergent Communication Pretraining for Few-Shot Machine Translation. In Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020). LINK

This method is a form of unsupervised knowledge transfer in the absence of linguistic data, where a model is first pre-trained on artificial languages emerging from referential games and then fine-tuned on few-shot downstream tasks like neural machine translation.
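To make the pretraining stage concrete, below is a minimal, self-contained PyTorch sketch of a referential game of the kind used for EC pretraining: a speaker encodes the target image features into a short discrete message (via straight-through Gumbel-softmax in this sketch), and a listener has to pick the target out of a set of candidate images. All names, sizes, and hyperparameters below are illustrative placeholders, not the repository's actual architecture or settings.

# Toy referential game sketch; NOT the repository's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, MSG_LEN, FEAT, HID = 32, 5, 2048, 256  # toy sizes, not the paper's settings

class Speaker(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(FEAT, HID)
        self.rnn = nn.GRUCell(VOCAB, HID)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, target_feat):
        h = torch.tanh(self.proj(target_feat))         # init hidden state from the target image
        sym = torch.zeros(target_feat.size(0), VOCAB)   # "start" symbol (all zeros)
        message = []
        for _ in range(MSG_LEN):
            h = self.rnn(sym, h)
            # Straight-through Gumbel-softmax keeps the message (almost) discrete
            # while letting gradients flow back to the speaker.
            sym = F.gumbel_softmax(self.out(h), tau=1.0, hard=True)
            message.append(sym)
        return torch.stack(message, dim=1)              # (batch, MSG_LEN, VOCAB)

class Listener(nn.Module):
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(VOCAB, HID, batch_first=True)
        self.img = nn.Linear(FEAT, HID)

    def forward(self, message, candidate_feats):
        _, h = self.rnn(message)                        # final hidden state: (1, batch, HID)
        cands = self.img(candidate_feats)               # (batch, n_candidates, HID)
        return torch.bmm(cands, h[-1].unsqueeze(2)).squeeze(2)  # similarity scores per candidate

speaker, listener = Speaker(), Listener()
opt = torch.optim.Adam(list(speaker.parameters()) + list(listener.parameters()), lr=1e-3)

for step in range(100):                                 # toy loop on random "image features"
    feats = torch.randn(8, 4, FEAT)                     # 8 games per batch, 4 candidates each
    target = torch.randint(0, 4, (8,))
    msg = speaker(feats[torch.arange(8), target])       # speaker only sees the target image
    scores = listener(msg, feats)                       # listener ranks all candidates
    loss = F.cross_entropy(scores, target)
    opt.zero_grad(); loss.backward(); opt.step()

In the actual pipeline it is the EC-pretrained model (rather than a toy like this) that the NMT fine-tuning step below starts from.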

Emergent Communication and Machine Translation

Dependencies

  • PyTorch 1.3.1
  • Python 3.6
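A quick way to confirm these versions before running the scripts (this snippet is not part of the repository; other PyTorch versions are not guaranteed to reproduce the results, as discussed in the issues below):

import sys
import torch

# Expected per the list above: Python 3.6.x and PyTorch 1.3.1.
print("python :", sys.version.split()[0])
print("pytorch:", torch.__version__)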

Data

COCO image features are available in the sub-folder half_feats here. Preprocessed EN-DE (DE-EN) data for translation are available in the sub-folder task1 here. Both are obtained from Translagent.

Translation data for the other language pairs (EN-CS, EN-RO, and EN-FR) can be found via the links below.

Dictionaries Train Sentence Pairs Reference Translations
EN-CS & CS-EN EN-CS & CS-EN EN-CS & CS-EN
EN-RO & RO-EN EN-RO & RO-EN EN-RO & RO-EN
EN-FR & FR-EN EN-FR & FR-EN EN-FR & FR-EN

Pretrained Models for Emergent Communication

Source / Target Target / Source
EN DE
EN CS
EN RO
EN FR

Experiments

Step 1: run EC pretraining (or skip to Step 2 and use a pretrained model).

cd ./ECPRETRAIN
sh run_training.sh

Step 2: run NMT fine-tuning (please modify the paths to the training data, the pretrained model, and the save directory in the script beforehand). An illustrative sketch of the kind of weight transfer this step performs is given after the commands.

cd ./NMT
sh run_training.sh
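The run_training.sh script handles loading the EC checkpoint via the paths mentioned above. Purely as an illustration, warm-starting an NMT model from a pretrained checkpoint can look like the hypothetical helper below; the helper name, the path argument, and the assumption that the checkpoint is a flat state_dict with matching parameter names are placeholders, not the repository's actual logic.

# Hypothetical warm-start helper; NOT the repository's loading code.
import torch
import torch.nn as nn

def warm_start_from_ec(nmt_model: nn.Module, ckpt_path: str) -> None:
    # Copy every tensor from the EC checkpoint whose name and shape match a
    # tensor in the NMT model; leave the remaining parameters at their
    # random initialization for fine-tuning.
    ec_state = torch.load(ckpt_path, map_location="cpu")
    own_state = nmt_model.state_dict()
    matched = {k: v for k, v in ec_state.items()
               if k in own_state and v.shape == own_state[k].shape}
    own_state.update(matched)
    nmt_model.load_state_dict(own_state)
    print(f"warm-started {len(matched)}/{len(own_state)} tensors from {ckpt_path}")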

Optional: run the NMT baseline (trained without EC pretraining).

cd ./BASELINENMT
sh run_training.sh

Citation

@inproceedings{YL:2020,
  author    = {Yaoyiran Li and Edoardo Maria Ponti and Ivan Vulić and Anna Korhonen},
  title     = {Emergent Communication Pretraining for Few-Shot Machine Translation},
  year      = {2020},
  booktitle = {Proceedings of the 28th International Conference on Computational Linguistics},
}

Acknowledgements

Part of the code is based on Translagent.

The datasets used in our experiments include MS COCO for emergent communication pretraining, and Multi30k Task 1 and Europarl for NMT fine-tuning. Text preprocessing is based on Moses and Subword-NMT.

Please cite these resources accordingly.

ecnmt's People

Contributors

ducdauge, yaoyiran


ecnmt's Issues

EC pretraining model generates short sentences and has poor accuracy

Hi, I am re-running the ECPRETRAIN script and trying to reproduce the results.

However, I found that the model tends to generate very short sentences as training goes on (screenshot omitted), which does not seem to match the description in the paper.

Besides, it seems that a model is only saved once it reaches 99% accuracy, but none of the training epochs reaches that accuracy in my experiment.

I tried to reproduce the environment, but PyTorch 1.3.1 no longer seems to be available on PyTorch's official webpage, so I used PyTorch 1.6 instead. I do not know whether this could be the reason. Any insight would be helpful.

EC pretraining prediction accuracy collapses after achieving 99+% accuracy.

Hi,
I am trying to run the ECPRETRAIN script with the default hyperparameters. The model is able to reach 99% accuracy in 4k epochs, but after that the prediction accuracy collapses. I am unable to find any bug in the code that might be causing this. I have attached the training logs.

Reaching 99%: [screenshot omitted]

Collapse to 0%: [screenshot omitted]

Full log: log.log

It would be great if you could help resolve this issue.
Thanks.
