wangqi1996 / generation_order

License: MIT License

NJUNMT-pytorch


NJUNMT-pytorch is an open-source toolkit for neural machine translation (NMT). It is research-oriented and includes several common baseline models:

  • DL4MT-tutorial: An RNN-based NMT model widely used as a baseline. To our knowledge, this is the only PyTorch implementation that is exactly the same as the original model (nmtpytorch is another PyTorch implementation, but with minor structural differences).

  • Attention is all you need: A strong NMT model introduced by Google that relies only on the attention mechanism.

Requirements

  • python 3.5+
  • pytorch 0.4.0+
  • tqdm
  • tensorboardX
  • sacrebleu

Usage

0. Quick Start

We provide push-button scripts to set up training and inference of a Transformer model on the NIST Chinese-English corpus (available only on NJUNLP's server). From the root directory of this repo, run

bash ./scripts/train.sh

for training and

# 3 means decoding on NIST 2003. This value
# can also be 4, 5, or 6, which represent NIST 2004, 2005, and 2006 respectively.
bash ./scripts/translate.sh 3

1. Build Vocabulary

First, generate vocabulary files for both the source and target languages. We provide a script, ./scripts/build_dictionary.py, that builds them in JSON format.

See how to use this script by running:

python ./scripts/build_dictionary.py --help

We strongly recommend not limiting the number of words when building the vocabulary. Control the vocabulary size through the config file at training time instead.
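The recommendation above can be illustrated with a minimal sketch: build the full frequency-ranked vocabulary once, store it as JSON, and apply the size limit only when the vocabulary is loaded. The function names `build_vocab` and `limit_vocab`, the `max_n_words` parameter, and the `token -> [id, frequency]` layout are assumptions made for illustration. They are not necessarily what build_dictionary.py produces or what the config files expect.

```python
import json
from collections import Counter


def build_vocab(lines):
    """Count whitespace-separated tokens and rank them by frequency."""
    counts = Counter(tok for line in lines for tok in line.split())
    # Sort by descending frequency, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Hypothetical layout: token -> [id, frequency]
    return {tok: [idx, freq] for idx, (tok, freq) in enumerate(ranked)}


def limit_vocab(vocab, max_n_words):
    """Keep only the max_n_words most frequent entries (a negative value keeps all)."""
    if max_n_words < 0:
        return dict(vocab)
    return {tok: entry for tok, entry in vocab.items() if entry[0] < max_n_words}


corpus = ["the cat sat", "the dog sat", "the end"]

# Build and serialize the full vocabulary without any size limit.
full_vocab = build_vocab(corpus)
vocab_json = json.dumps(full_vocab, ensure_ascii=False)

# Apply the limit later, when the vocabulary is loaded for training.
small_vocab = limit_vocab(json.loads(vocab_json), max_n_words=2)
```

Because ids follow frequency rank, the same JSON file can serve experiments with different vocabulary sizes without being rebuilt.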

2. Write Configuration File

See the ./configs folder, which provides several examples:

  • dl4mt_nist_zh2en.yaml: runs a DL4MT model on NIST Chinese to English
  • transformer_nist_zh2en.yaml: runs a Transformer model on NIST Chinese to English
  • transformer_nist_zh2en_bpe.yaml: runs a Transformer model on NIST Chinese to English using BPE
  • transformer_wmt14_en2de.yaml: runs a Transformer model on WMT14 English to German

To learn more about configuring an NMT training task, see this wiki page.

3. Training

We can set up a training task by running

export CUDA_VISIBLE_DEVICES=0
python -m src.bin.train \
    --model_name <your-model-name> \
    --reload \
    --config_path <your-config-path> \
    --log_path <your-log-path> \
    --saveto <path-to-save-checkpoints> \
    --valid_path <path-to-save-validation-translation> \
    --use_gpu

See the detailed options by running python -m src.bin.train --help.
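When launching several runs from a script, it can help to assemble the same command programmatically. The sketch below uses only the flags shown in the command above. The helper names `build_train_command` and `build_env`, and the log, checkpoint, and validation paths, are hypothetical placeholders.

```python
import os
import shlex


def build_train_command(model_name, config_path, log_path, saveto, valid_path,
                        reload=True, use_gpu=True):
    """Return the argument list for `python -m src.bin.train`."""
    cmd = ["python", "-m", "src.bin.train",
           "--model_name", model_name,
           "--config_path", config_path,
           "--log_path", log_path,
           "--saveto", saveto,
           "--valid_path", valid_path]
    if reload:
        cmd.append("--reload")
    if use_gpu:
        cmd.append("--use_gpu")
    return cmd


def build_env(gpu_id=0):
    """Copy the current environment and select the visible GPU."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


train_cmd = build_train_command(
    model_name="MyModel",
    config_path="./configs/transformer_nist_zh2en.yaml",
    log_path="./log",      # hypothetical path
    saveto="./save",       # hypothetical path
    valid_path="./valid",  # hypothetical path
)

# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in train_cmd))
```

To actually start training from the repository root, the list can be passed to `subprocess.run(train_cmd, env=build_env(0), check=True)`.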

During training, checkpoints and best models are saved under the directory specified by the --saveto option. If the model name is "MyModel", that directory will contain the following files:

  • MyModel.ckpt: A text file recording the names of all kept checkpoints.

  • MyModel.ckpt.xxxx: The checkpoint stored at step xxxx.

  • MyModel.best: A text file recording the names of all kept best checkpoints.

  • MyModel.best.xxxx: The best checkpoint stored at step xxxx.

  • MyModel.best.final: The final best model, i.e., the model that achieved the best performance on the validation set. Only model parameters are kept in it.
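The naming scheme above can be navigated with a short helper. This is a sketch that assumes xxxx is a decimal step number. The function names `final_best_path` and `list_step_checkpoints` are hypothetical and not part of the toolkit. The demo creates empty placeholder files in a temporary directory instead of reading real checkpoints.

```python
import os
import tempfile


def final_best_path(saveto, model_name):
    """Path of the final best model, e.g. <saveto>/MyModel.best.final."""
    return os.path.join(saveto, model_name + ".best.final")


def list_step_checkpoints(saveto, model_name, kind="ckpt"):
    """Return (step, path) pairs for files named <model_name>.<kind>.<step>, sorted by step."""
    prefix = "{}.{}.".format(model_name, kind)
    found = []
    for name in os.listdir(saveto):
        suffix = name[len(prefix):]
        # Skip the index file (no suffix) and non-numeric suffixes such as "final".
        if name.startswith(prefix) and suffix.isdigit():
            found.append((int(suffix), os.path.join(saveto, name)))
    return sorted(found)


with tempfile.TemporaryDirectory() as saveto:
    names = ["MyModel.ckpt", "MyModel.ckpt.1000", "MyModel.ckpt.2000",
             "MyModel.best", "MyModel.best.2000", "MyModel.best.final"]
    for name in names:
        open(os.path.join(saveto, name), "w").close()

    ckpt_steps = [step for step, _ in list_step_checkpoints(saveto, "MyModel", "ckpt")]
    best_steps = [step for step, _ in list_step_checkpoints(saveto, "MyModel", "best")]
    has_final = os.path.exists(final_best_path(saveto, "MyModel"))
```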

4. Translation

When training finishes, the code automatically saves the best model. In most cases you can translate with the final best model, named xxxx.best.final, which achieved the best performance on the validation set.

We can translate any text by running:

export CUDA_VISIBLE_DEVICES=0
python -m src.bin.translate \
    --model_name <your-model-name> \
    --source_path <path-to-source-text> \
    --model_path <path-to-model> \
    --config_path <path-to-configuration> \
    --batch_size <your-batch-size> \
    --beam_size <your-beam-size> \
    --alpha <your-length-penalty> \
    --use_gpu

See the detailed options by running python -m src.bin.translate --help.

The code also supports ensemble decoding. See the available options by running python -m src.bin.ensemble_translate --help.
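For readers unfamiliar with the idea, one common form of ensemble decoding averages the next-token probabilities predicted by several models at each decoding step. The sketch below shows that averaging in plain Python. It is a generic illustration with a hypothetical function name and toy numbers, and it is not necessarily how src.bin.ensemble_translate combines models.

```python
import math


def ensemble_log_probs(per_model_log_probs):
    """Average next-token probabilities across models and return log-probabilities."""
    n_models = len(per_model_log_probs)
    vocab_size = len(per_model_log_probs[0])
    averaged = []
    for tok in range(vocab_size):
        # Average in probability space, then convert back to log space.
        mean_prob = sum(math.exp(lp[tok]) for lp in per_model_log_probs) / n_models
        averaged.append(math.log(mean_prob))
    return averaged


# Toy next-token distributions from two models over a three-word vocabulary.
model_a = [math.log(p) for p in (0.7, 0.2, 0.1)]
model_b = [math.log(p) for p in (0.3, 0.6, 0.1)]

combined = ensemble_log_probs([model_a, model_b])
best_token = max(range(len(combined)), key=combined.__getitem__)
```

In a beam search, the combined log-probabilities would replace the single-model scores when candidate hypotheses are expanded.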

Benchmark

See BENCHMARK.md

Contact

If you have any questions, please contact [email protected]
