Implementation of our paper in EMNLP 2022, focused on the relationship between parent and child in transfer learning for low-resource NMT

ConsistTL: Modeling Consistency in Transfer Learning for Low-Resource Neural Machine Translation

Implementation of our paper published at EMNLP 2022

Brief Introduction

Framework of consistency-based transfer learning.

Transfer learning is a simple and powerful method to boost the performance of low-resource neural machine translation (NMT). Existing transfer learning methods for NMT are static: they transfer knowledge from a parent model to a child model once and for all via parameter initialization. In this paper, we instead propose a novel transfer learning method for NMT, namely ConsistTL, which continuously transfers parent knowledge throughout the training of the child model. Specifically, for each training instance of the child model, ConsistTL constructs a semantically-equivalent instance for the parent model and encourages prediction consistency between the parent and child on this instance, which is equivalent to the child model learning each instance under the guidance of the parent model.
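To make "prediction consistency" concrete, here is a minimal sketch in plain Python. It is not the training code in this repository. It assumes that consistency is measured as a KL divergence between the parent's and the child's output distributions over the shared English target vocabulary, and that this term is added to the child's usual loss with a weight `alpha`. The function names and `alpha` are illustrative only.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two distributions over the same vocabulary."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def consistency_term(parent_logits, child_logits):
    """Average per-position KL between parent and child predictions.

    Both arguments are lists with one logit vector per target position.
    The parent sees the auxiliary (e.g. German) source, the child sees
    the original (e.g. Turkish) source, and both predict the same
    English target tokens.
    """
    assert len(parent_logits) == len(child_logits)
    kls = [kl_divergence(softmax(p), softmax(c))
           for p, c in zip(parent_logits, child_logits)]
    return sum(kls) / len(kls)

def total_loss(child_nll, parent_logits, child_logits, alpha=1.0):
    """Child translation loss plus the weighted consistency term (assumed form)."""
    return child_nll + alpha * consistency_term(parent_logits, child_logits)
```

When the two models agree at every position the consistency term is zero and the child is trained on its translation loss alone. The more their predictions diverge, the larger the extra penalty becomes.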

Preparation 1: Install fairseq

cd ConsisTL
pip install --editable .
cd ..
# Requires python>=3.7
# PyTorch does not need to be installed separately.

Preparation 2: Download and binarize data

# download and preprocess student data
mkdir tr_en
cd tr_en
# download tr-en from https://drive.google.com/file/d/1B23gkfQ3O430KSGVRCqTLyjPO01A5e6L/view?usp=sharing
# raw tr-en can be downloaded from https://opus.nlpl.eu/download.php?f=SETIMES/v2/moses/en-tr.txt.zip
cd ..
fairseq-preprocess -s tr -t en --trainpref tr_en/pack_clean/train --validpref tr_en/pack_clean/valid --testpref tr_en/pack_clean/test --srcdict tr_en/dict.tr.txt --tgtdict dict.en.txt --workers 10 --destdir ${STUDENT_DATA}
# download and preprocess teacher data
mkdir de_en
cd de_en
# download de-en from https://drive.google.com/file/d/15CXWVj0NIMjDjxEfPCw2WktoYADUuX8O/view?usp=sharing
cd ..
fairseq-preprocess -s de -t en --trainpref de_en/pack_clean/train --validpref de_en/pack_clean/valid --testpref de_en/pack_clean/test --joined-dictionary --destdir ${TEACHER_DATA} --workers 10

Step 1: Train two parent models

cd full_process_scripts

# train two parent models
## train for en-de
### path of binarized parent model training data
BIN_TEACHER_DATA=${BIN_TEACHER_DATA}
bash train_parent.sh en de $BIN_TEACHER_DATA
## train for de-en
bash train_parent.sh de en $BIN_TEACHER_DATA

Step 2: Generate semantically-equivalent sentences

# generate synthetic de-en data for tr-en
## English sentences in child data
CHILD_EN=${CHILD_EN}
## path of trained reversed teacher checkpoint
REVERSED_TEACHER_CHECKPOINT=${REVERSED_TEACHER_CHECKPOINT}
## auxiliary source
AUX_SRC_BIN=${AUX_SRC_BIN}
bash gen.sh $CHILD_EN $BIN_TEACHER_DATA $REVERSED_TEACHER_CHECKPOINT $AUX_SRC_BIN
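The idea behind this step can be sketched as follows, assuming that `gen.sh` translates the English side of the child data into German with the reversed (en-de) teacher, so that every child pair gets an auxiliary German source with the same meaning. The `back_translate` callable is a hypothetical stand-in for that model, not part of this repository.

```python
def build_consistency_triples(child_pairs, back_translate):
    """Attach an auxiliary parent-language source to each child pair.

    child_pairs    -- iterable of (child_source, english_target) tuples
    back_translate -- hypothetical callable mapping an English sentence
                      to a sentence in the parent source language
    """
    triples = []
    for child_src, en in child_pairs:
        triples.append({
            "child_src": child_src,         # e.g. Turkish, fed to the child
            "aux_src": back_translate(en),  # e.g. synthetic German, fed to the parent
            "target": en,                   # shared English target
        })
    return triples


if __name__ == "__main__":
    # Toy stand-in for the reversed teacher: a lookup table.
    toy_en_de = {"good morning": "guten Morgen"}
    pairs = [("günaydın", "good morning")]
    for triple in build_consistency_triples(pairs, lambda en: toy_en_de[en]):
        print(triple)
```

Because the auxiliary source is generated from the child's own English target, the parent and child instances stay aligned line by line.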

Step 3: Exploit Token Matching (TM) for initialization

# switch checkpoint
## path of initialized checkpoint
INIT_CHECKPOINT=${INIT_CHECKPOINT}
## path of student data
BIN_STUDENT_DATA=${BIN_STUDENT_DATA}
## path of teacher checkpoint
TEACHER_CHECKPOINT=${TEACHER_CHECKPOINT}
python ../ConsisTL/preprocessing_scripts/TM.py --checkpoint $TEACHER_CHECKPOINT --output $INIT_CHECKPOINT --parent-dict $BIN_TEACHER_DATA/dict.de.txt --child-dict $BIN_STUDENT_DATA/dict.tr.txt --switch-dict src
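As a rough illustration of token matching, the sketch below builds child source embeddings from parent ones. It does not reproduce `TM.py`. It assumes that tokens present in both dictionaries reuse the parent's embedding row and that all other child tokens get a fresh random row. The helper names are hypothetical. The dictionary files are assumed to follow the usual fairseq layout of one `token count` pair per line.

```python
import random

def read_dict_tokens(lines):
    """Extract tokens from fairseq-style dictionary lines ("token count")."""
    return [line.rstrip("\n").rsplit(" ", 1)[0] for line in lines if line.strip()]

def token_matching_init(parent_vocab, parent_emb, child_vocab, dim, seed=0):
    """Initialize child embeddings from the parent via token matching.

    Returns the child embedding rows and the number of matched tokens.
    Unmatched tokens are drawn from N(0, dim ** -0.5), which is an
    assumption rather than the repository's documented choice.
    """
    rng = random.Random(seed)
    parent_index = {tok: i for i, tok in enumerate(parent_vocab)}
    child_emb, matched = [], 0
    for tok in child_vocab:
        if tok in parent_index:
            # Shared token: copy the parent's embedding row.
            child_emb.append(list(parent_emb[parent_index[tok]]))
            matched += 1
        else:
            # Child-only token: start from a random vector.
            child_emb.append([rng.gauss(0.0, dim ** -0.5) for _ in range(dim)])
    return child_emb, matched
```

With `--switch-dict src`, only the source-side dictionary changes from German to Turkish. The English target side is shared between parent and child, so its parameters can carry over directly.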

Step 4: Train child model(s)

# train for TM-TL
bash train.sh $STUDENT_DATA $INIT_CHECKPOINT

# train for ConsistTL
bash ConsisTL.sh $PREFIX-bin $TEACHER_CHECKPOINT $TEACHER_DATA $STUDENT_DATA $INIT_CHECKPOINT
