
Simplification Fork

This fork contains the code used in our paper, Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification (NAACL 2019). Please cite the following paper, along with Sockeye:

@inproceedings{kriz2019complexity,
  title={Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification},
  author={Reno Kriz and Jo{\~a}o Sedoc and Marianna Apidianaki and Carolina Zheng and Gaurav Kumar and Eleni Miltsakaki and Chris Callison-Burch},
  booktitle={Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019)},
  year={2019}
}

Our updates to sockeye-recipes can be found in sockeye-recipes/egs/pretrained_embeddings/. Other additional scripts can be found in sockeye-recipes/new_scripts.

The outputs of our baseline models, models for our ablation study, and our final best model can be found in sockeye-recipes/outputs/.

sockeye-recipes

Training scripts and recipes for the Sockeye Neural Machine Translation (NMT) toolkit

  • The original Sockeye codebase is at AWS Labs
  • This version is based on a stable fork. The Sockeye version that sockeye-recipes is currently built on is 1.18.15.

This package contains scripts that make it easy to run NMT experiments. To use it, specify your settings in a file like "hyperparams.txt", then run the following scripts:

  • scripts/preprocess-bpe.sh: Preprocess bitext via BPE segmentation
  • scripts/train.sh: Train the NMT model given bitext
  • scripts/translate.sh: Translate a tokenized input file using an existing model
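
For example, a typical end-to-end run might look like the following. This is only an illustrative sketch: the hyperparameter file name and the exact flags each script accepts are assumptions here, so check the script headers or the egs/quickstart recipe for the authoritative invocations.

# preprocess, train, and translate, all driven by one settings file (names hypothetical)
bash scripts/preprocess-bpe.sh hyperparams.txt
bash scripts/train.sh -p hyperparams.txt -e sockeye_gpu
bash scripts/translate.sh -p hyperparams.txt -i test.tok -o test.translated -e sockeye_gpu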

Installation

First, clone this package:

git clone https://github.com/kevinduh/sockeye-recipes.git sockeye-recipes

We assume Anaconda is available on the system for managing Python virtual environments. Run the following to install Sockeye in two Anaconda environments, sockeye_cpu and sockeye_gpu:

cd path/to/sockeye-recipes
bash ./install/install_sockeye_cpu.sh
bash ./install/install_sockeye_gpu.sh
bash ./install/install_tools.sh

The training scripts and recipes will activate either the sockeye_cpu or sockeye_gpu environment, depending on whether CPU or GPU is specified. Currently we assume CUDA 9.0 is available for GPU mode; this can be changed if needed. The third script, install_tools.sh, simply installs some helper tools, such as the BPE preprocessor.

Re-Install

When the sockeye version is updated, it is recommended to re-run the installation scripts in a clean conda environment:

conda remove --name sockeye_gpu --all
conda remove --name sockeye_cpu --all
bash ./install/install_sockeye_cpu.sh
bash ./install/install_sockeye_gpu.sh

Environment Setup

Depending on your computer setup, you may want to add the following configurations to your ~/.bashrc file.

Configure CUDA and CuDNN for the GPU version of Sockeye:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda/lib64

Set up a clean UTF-8 environment to avoid encoding errors:

export LANG=en_US.UTF-8

Recipes

The egs subdirectory contains recipes for various datasets.

  • egs/quickstart: For first-time users, this recipe explains how sockeye-recipes works.

  • egs/ted: Recipes for training various NMT models, using a TED Talks dataset consisting of 20 different languages.

  • egs/wmt14-en-de: Recipe for training a baseline comparable to the Luong EMNLP 2015 paper.

  • egs/curriculum: Recipe for curriculum learning. Also explains how to use sockeye-recipes in conjunction with a custom sockeye installation.

  • egs/optimizers: Example of training with different optimizers (e.g. ADAM, EVE, Nesterov ADAM, SGD, ...)

The hpm subdirectory contains hyperparameter (hpm) file templates. Besides the NMT hyperparameters, the most important variables to set in this file are listed below, followed by an illustrative excerpt:

  • rootdir: location of your sockeye-recipes installation, used for finding relevant scripts (i.e. the current directory, where this README file is located)

  • modeldir: directory for storing a single Sockeye model training process

  • workdir: directory for placing various modeldirs (i.e. a suite of experiments with different hyperparameters) corresponding to the same dataset

  • train_tok and valid_tok: prefixes of the tokenized training and validation bitext file paths

  • train_bpe_{src,trg} and valid_bpe_{src,trg}: alternatively, prefixes of the above training and validation files already processed by BPE
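
To make these concrete, here is a sketch of an hpm file. All paths and values are hypothetical placeholders; see the templates in hpm/ for the actual variable set.

rootdir=/path/to/sockeye-recipes       # this repository's checkout
workdir=/path/to/wmt17-experiments     # one suite of experiments on one dataset
modeldir=$workdir/model-bpe30k         # one Sockeye training run
train_tok=/path/to/data/train.tok      # prefix of tokenized training bitext
valid_tok=/path/to/data/valid.tok      # prefix of tokenized validation bitext
bpe_symbols_src=30000                  # BPE vocabulary sizes used by preprocess-bpe.sh
bpe_symbols_trg=30000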

Auto-Tuning

This package also provides tools for auto-tuning, where one can specify the hyperparameters to search over and a meta-optimizer automatically attempts to try different configurations that it believes will be promising. For more information, see the auto-tuning folder.

Design Principles and Suggested Usage

Building NMT systems can be a tedious process involving lengthy experimentation with hyperparameters. The goal of sockeye-recipes is to make it easy to try many different configurations and to record best practices as example recipes. The suggested usage is as follows:

  • Prepare your training and validation bitext beforehand with the necessary preprocessing (e.g. data consolidation, tokenization, lower/true-casing). Sockeye-recipes simply assumes pairs of train_tok and valid_tok files.
  • Set the working directory to correspond to a single suite of experiments on the same dataset (e.g. WMT17-German-English)
  • The only preprocessing handled here is BPE. Run preprocess-bpe.sh with different BPE vocabulary sizes (bpe_symbols_src, bpe_symbols_trg). The results can all be saved to the same datadir.
  • train.sh is the main training script. Specify a new modeldir for each train.sh run. The hyperparams.txt file used in training will be saved in modeldir for future reference.
  • At the end, your workdir will have a single datadir containing multiple BPE'ed versions of the bitext, and multiple modeldirs. You can run tensorboard on all these modeldirs concurrently to compare learning curves, as in the sketch below.
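
A sketch of the resulting layout, with hypothetical directory names:

workdir/
  data/             # shared datadir with multiple BPE'ed versions of the bitext
  model-bpe10k/     # one modeldir per train.sh run
  model-bpe30k/

# compare learning curves across runs (launched from inside workdir):
tensorboard --logdir .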

There are many options in Sockeye. Currently not all of them are used in sockeye-recipes; more will be added. See sockeye/arguments.py for detailed explanations.

Alternatively, directly call sockeye with the help option as below. Note that sockeye-recipes hyperparameters have the same names as sockeye hyperparameters, except that sockeye-recipes replaces the hyphen with an underscore (e.g. --num-embed in sockeye becomes $num_embed in sockeye-recipes):

source activate sockeye_cpu
python -m sockeye.train --help
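
As an illustration of the naming convention (the value here is hypothetical):

# sockeye command line:        python -m sockeye.train --num-embed 512 ...
# equivalent hpm file line:    num_embed=512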
