kevin3314 / fairseq

This project forked from facebookresearch/fairseq

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

License: MIT License

Python 97.73% C++ 0.64% Lua 0.15% Shell 0.03% Cuda 1.44%

fairseq's Introduction





Diff

These changes are mainly for reusing (recycling) the encoder and for training BERT on MLM.

  • Implement a custom dictionary for (NICT-)BERT.
    • Align the indices of the special symbols in the encoder and decoder dictionaries.
    • use-bert-dict -> If used, specify the languages as in ja,en (any order)
  • Set the default value of share_all_embeddings in bart to False.
    • Embeddings are not shared between the encoder and the decoder.
  • Add options for token type embeddings.
    • add-token-type-embeddings -> Whether to add token type ids (model)
    • dataset-add-token-type-ids -> Whether to add token type ids (dataset)
    • type-vocab-size -> Vocabulary size of the type tokens
  • Add an option for loading BERT weights.
    • load_bert_path -> Path to the BERT weights to load
    • Note: at test time, the pre-trained weights are loaded after the model is built.
  • Add options for the length of the positional embeddings.
    • max-source-positions, max-target-positions
  • Add a script for converting the transformer encoder's weights back to BERT.
    • scripts/load_fairseq_weight_to_bert.py
  • Add an option for freezing the encoder.
    • freeze-encoder: If set, the encoder's weights are frozen.
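
The requirement to align special-symbol indices between the encoder and decoder dictionaries can be checked before training. The following is a minimal sketch, assuming each vocabulary is available as a list of symbols in index order (for example, the lines of a BERT-style vocab.txt). The helper names and the list of special symbols are illustrative assumptions, not part of this fork.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Assumed BERT-style special symbols; adjust to the vocabularies in use.
SPECIAL_SYMBOLS = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]"]


def build_index(symbols: Iterable[str]) -> Dict[str, int]:
    """Map each symbol to its position in the vocabulary."""
    return {sym: idx for idx, sym in enumerate(symbols)}


def misaligned_specials(
    src_vocab: List[str],
    tgt_vocab: List[str],
    specials: Iterable[str] = SPECIAL_SYMBOLS,
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Return {symbol: (src_index, tgt_index)} for specials whose indices differ.

    A symbol that is missing from a vocabulary is reported with index None.
    """
    src_index = build_index(src_vocab)
    tgt_index = build_index(tgt_vocab)
    problems = {}
    for sym in specials:
        pair = (src_index.get(sym), tgt_index.get(sym))
        if pair[0] is None or pair[0] != pair[1]:
            problems[sym] = pair
    return problems


# Hypothetical toy vocabularies: [SEP] and [MASK] are swapped on the target side.
ja_vocab = ["[PAD]", "[UNK]", "[CLS]", "[SEP]", "[MASK]", "inu"]
en_vocab = ["[PAD]", "[UNK]", "[CLS]", "[MASK]", "[SEP]", "dog"]
print(misaligned_specials(ja_vocab, en_vocab))
# {'[SEP]': (3, 4), '[MASK]': (4, 3)}
```

An empty result means the special symbols share the same indices on both sides.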

Options for translation

  • Set the default value of left_pad_source to False.
    • for BERT compatibility
  • Option: prepend_bos_to_src
    • Prepend bos only to the source.

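
The effect of these two options on a batch can be illustrated without fairseq. The following is a rough sketch in plain Python. The function names and the PAD/BOS indices are hypothetical, and the real collation code in fairseq works on tensors rather than lists.

```python
from typing import List


def prepend_bos(tokens: List[int], bos_idx: int) -> List[int]:
    """Put the beginning-of-sentence index in front of a source sentence."""
    return [bos_idx] + tokens


def pad_batch(batch: List[List[int]], pad_idx: int, left_pad: bool) -> List[List[int]]:
    """Pad every sentence to the longest length, on the left or on the right."""
    width = max(len(seq) for seq in batch)
    padded = []
    for seq in batch:
        filler = [pad_idx] * (width - len(seq))
        padded.append(filler + seq if left_pad else seq + filler)
    return padded


PAD, BOS = 0, 2  # hypothetical indices
source = [[11, 12, 13], [21]]
with_bos = [prepend_bos(seq, BOS) for seq in source]

print(pad_batch(with_bos, PAD, left_pad=False))  # [[2, 11, 12, 13], [2, 21, 0, 0]]
print(pad_batch(with_bos, PAD, left_pad=True))   # [[2, 11, 12, 13], [0, 0, 2, 21]]
```

With right padding (left_pad=False) the bos token stays at position 0 in every row, which is the layout a BERT-style encoder with absolute position embeddings usually expects.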
Options for RoBERTa, base

  • Add an option for num_segments.
    • for BERT compatibility (set to 2 for BERT)
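
With two segments, every token is assigned segment id 0 or 1. The following is a minimal sketch of how such token type ids are commonly derived for a BERT-style sentence pair. The function name and the "[SEP]" separator string are assumptions for illustration, not this fork's implementation.

```python
from typing import List


def token_type_ids(tokens: List[str], sep: str = "[SEP]", num_segments: int = 2) -> List[int]:
    """Assign a segment id to each token; the id increases after every separator."""
    ids = []
    segment = 0
    for tok in tokens:
        if segment >= num_segments:
            raise ValueError("input has more segments than num_segments allows")
        ids.append(segment)
        if tok == sep:
            # The separator itself belongs to the segment it closes.
            segment += 1
    return ids


pair = ["[CLS]", "how", "are", "you", "[SEP]", "fine", "[SEP]"]
print(token_type_ids(pair))  # [0, 0, 0, 0, 0, 1, 1]
```

The segment embedding table needs one row per segment id, so a value of 2 covers the sentence-pair inputs BERT was trained on.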

Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling and other text generation tasks.

We provide reference implementations of various sequence modeling papers:

List of implemented papers

What's New:

Previous updates

Features:

We also provide pre-trained models for translation and language modeling with a convenient torch.hub interface:

import torch

# Download the pre-trained WMT'19 English-German model through torch.hub
en2de = torch.hub.load('pytorch/fairseq', 'transformer.wmt19.en-de.single_model')
en2de.translate('Hello world', beam=5)
# 'Hallo Welt'

See the PyTorch Hub tutorials for translation and RoBERTa for more examples.

Requirements and Installation

  • PyTorch version >= 1.5.0
  • Python version >= 3.6
  • For training new models, you'll also need an NVIDIA GPU and NCCL
  • To install fairseq and develop locally:
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable ./

# on MacOS:
# CFLAGS="-stdlib=libc++" pip install --editable ./

# to install the latest stable release (0.10.x)
# pip install fairseq
  • For faster training install NVIDIA's apex library:
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" \
  --global-option="--deprecated_fused_adam" --global-option="--xentropy" \
  --global-option="--fast_multihead_attn" ./
  • For large datasets install PyArrow: pip install pyarrow
  • If you use Docker, make sure to increase the shared memory size by passing either --ipc=host or --shm-size as a command line option to nvidia-docker run.
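
The version requirements above can be checked from Python before installing. This is only a sketch: the helper names are hypothetical, and the parsing keeps just the leading integer of each dotted component, so suffixes such as "+cu101" are ignored.

```python
import re
import sys
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a string such as '1.5.0+cu101' into a tuple of integers, here (1, 5, 0)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(found: str, minimum: Tuple[int, ...]) -> bool:
    """Compare a version string against a minimum, padding short versions with zeros."""
    found_parts = parse_version(found)
    found_parts += (0,) * (len(minimum) - len(found_parts))
    return found_parts >= minimum


print("Python >= 3.6:", sys.version_info >= (3, 6))

try:
    import torch  # only needed for the PyTorch check
    print("PyTorch >= 1.5.0:", meets_minimum(torch.__version__, (1, 5, 0)))
except ImportError:
    print("PyTorch is not installed")
```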

Getting Started

The full documentation contains instructions for getting started, training new models and extending fairseq with new model types and tasks.

Pre-trained models and examples

We provide pre-trained models and pre-processed, binarized test sets for several tasks listed below, as well as example training and evaluation commands.

We also have more detailed READMEs to reproduce results from specific papers:

Join the fairseq community

License

fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.

Citation

Please cite as:

@inproceedings{ott2019fairseq,
  title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
  year = {2019},
}
