trisongz / gpt2-text-generation

This project forked from minimalist-nlp/gpt2-text-generation

Minimalist implementation of a GPT2 with Language Model Head with PyTorch Lightning, Transformers and PyTorch-NLP.

Minimalist Implementation of a GPT 2 with Language Model Head

This repo is a minimalist implementation of a GPT 2 with Language Model Head. It uses the following libraries as its main building blocks: PyTorch Lightning, Transformers and PyTorch-NLP.

You can also check this minimalist implementation for text classification: Minimalist Implementation of a BERT Sentence Classifier.

Requirements:

This project uses Python 3.7.

Create a virtual env with (outside the project folder):

virtualenv -p python3 gpt2-env
source gpt2-env/bin/activate

Install the requirements (inside the project folder):

pip install -r requirements.txt

Getting Started:

Train:

python training.py

Available commands:

Training arguments:

optional arguments:
  --seed                      Training seed.
  --distributed_backend       Distributed backend to use (e.g. dp).
  --use_16bit                 If true, uses 16-bit precision.
  --batch_size                Batch size to be used.
  --accumulate_grad_batches   Accumulated gradients runs K small batches of \
                              size N before doing a backwards pass.
  --log_gpu_memory            Uses the output of nvidia-smi to log GPU usage. \
                              Might slow performance.
  --val_percent_check         If you don't want to use the entire dev set, set \
                              how much of the dev set you want to use with this flag.

Early Stopping/Checkpoint arguments:

optional arguments:
  --metric_mode             Whether to minimize (min) or maximize (max) the monitored quantity.
  --min_epochs              Limits training to a minimum number of epochs.
  --max_epochs              Limits training to a maximum number of epochs.
  --save_top_k              The best k models according to the quantity \
                            monitored will be saved.
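
To make the interaction between --metric_mode and --save_top_k concrete, here is a minimal sketch of the selection they describe. It is not the checkpointing code used by this repo or by PyTorch Lightning; the function name keep_top_k and the validation losses are hypothetical and exist only for illustration.

```python
def keep_top_k(scores, k, mode="min"):
    """Return the k best (epoch, score) pairs, best first.

    scores: dict mapping epoch -> monitored quantity (hypothetical values).
    mode:   "min" if lower is better (e.g. a loss), "max" if higher is better.
    """
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=(mode == "max"))
    return ranked[:k]


# Hypothetical validation losses for four epochs.
val_loss = {0: 3.2, 1: 2.9, 2: 3.0, 3: 2.7}
print(keep_top_k(val_loss, k=2, mode="min"))  # [(3, 2.7), (1, 2.9)]
```

With mode="min" the two lowest-loss epochs are kept; switching to mode="max" would keep the two highest-scoring ones instead, which is what you would want for a metric such as accuracy.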

Model arguments:

optional arguments:
  --learning_rate             Learning rate.
  --train_csv                 Path to the file containing the train data.
  --dev_csv                   Path to the file containing the dev data.
  --test_csv                  Path to the file containing the test data.
  --loader_workers            How many subprocesses to use for data loading.

Training command example:

python training.py \
    --gpus 1 \
    --distributed_backend dp \
    --batch_size 6 \
    --accumulate_grad_batches 2 \
    --loader_workers 4
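
As a quick check on what the example command implies, the short sketch below computes how many examples contribute to each optimizer step when gradients are accumulated. The helper name effective_batch_size is hypothetical; the numbers are the ones from the command above.

```python
def effective_batch_size(batch_size: int, accumulate_grad_batches: int) -> int:
    """Examples contributing to one optimizer step.

    Gradients from `accumulate_grad_batches` consecutive batches of
    `batch_size` examples are summed before the weights are updated.
    """
    return batch_size * accumulate_grad_batches


# Values from the training command example: --batch_size 6 --accumulate_grad_batches 2
print(effective_batch_size(6, 2))  # 12
```

In other words, the example trains with small batches of 6 that fit in memory while updating the weights as if the batch held 12 examples.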

You can generate sentences with the model using (you may change the sampling parameters in the generate function in gpt2_lm.py):

python interact.py --experiment experiments/lightning_logs/version_{date}
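
This README does not list which sampling parameters the generate function exposes. Temperature and top-k are common choices for GPT-2 style generation, so the sketch below assumes them purely for illustration. It is a toy, standard-library version of one sampling step, not the code in gpt2_lm.py; the function name sample_top_k and the logits are hypothetical.

```python
import math
import random


def sample_top_k(logits, k, temperature=1.0, rng=None):
    """Sample one token id from the k highest-scoring entries of `logits`."""
    rng = rng or random.Random()
    # Keep only the ids of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Temperature below 1 sharpens the distribution, above 1 flattens it.
    scaled = [logits[i] / temperature for i in top]
    # Softmax weights (shifted by the max for numerical stability).
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]


# Hypothetical next-token scores for a 4-token vocabulary.
logits = [2.0, 0.5, 1.5, -1.0]
token = sample_top_k(logits, k=2, temperature=0.7, rng=random.Random(0))
print(token)  # one of the two highest-scoring ids: 0 or 2
```

Setting k=1 reduces this to greedy decoding, since only the single highest-scoring token can be drawn.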

Tensorboard:

Launch tensorboard with:

tensorboard --logdir="experiments/lightning_logs/"

Code Style:

To make sure all the code follows the same style, we use Black.

Contributors:

nunonmg, trisongz
