Dual-Transformer

Dual-Transformer is a framework that takes a scenery image as input and outputs a Vietnamese six-eight poem. The generated poem relates to the input image by mentioning objects that appear in it.
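To make the output format concrete, here is a minimal sketch of a six-eight (lục bát) shape check: lines alternate between six and eight syllables, and written Vietnamese separates syllables with spaces. The function names are hypothetical and are not part of this repository. The check assumes complete six-eight couplets and ignores the rhyme and tone rules of the form.

```python
from typing import List


def syllable_counts(poem: str) -> List[int]:
    """Count whitespace-separated syllables on each non-empty line."""
    return [len(line.split()) for line in poem.splitlines() if line.strip()]


def is_six_eight(poem: str) -> bool:
    """Return True if lines alternate 6 and 8 syllables, starting with 6.

    Assumption: the poem consists of complete six-eight couplets, so the
    number of lines must be even. Rhyme and tone rules are not checked.
    """
    counts = syllable_counts(poem)
    if not counts or len(counts) % 2 != 0:
        return False
    expected = [6 if i % 2 == 0 else 8 for i in range(len(counts))]
    return counts == expected
```

A check like this could serve as a cheap filter on generated poems before showing them to the user.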

Installation

  1. Clone the repository:
    git clone https://github.com/chauminhnguyen/Dual-Transformer.git
  2. Install the requirements:
  • Install CUDA.

  • Install the Python requirements.

    pip install -r requirements.txt
  3. Modify the paths:
  • In config.json, set the paths to the Query2Label and GPT-2 models.

  • In Query2Label's config.json (default: models/Query2labels/config.json), set the path to the pretrained model.

  4. Start the app:
    streamlit run app.py
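Because the app depends on the paths set in the config files, it can help to confirm they exist before starting it. The sketch below makes no assumption about key names: it walks any nested JSON config and reports string values that look like file paths but are missing on disk. The helper names are hypothetical, and the "contains a slash" test is only a heuristic, so URL-like values would also be flagged.

```python
import json
import os
from typing import Any, Iterator, List, Tuple


def iter_strings(node: Any, prefix: str = "") -> Iterator[Tuple[str, str]]:
    """Yield (dotted_key, value) for every string value in a nested config."""
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{prefix}.{key}" if prefix else str(key)
            yield from iter_strings(value, child)
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_strings(value, f"{prefix}[{index}]")
    elif isinstance(node, str):
        yield prefix, node


def missing_paths(config: Any) -> List[Tuple[str, str]]:
    """Return string entries that look like paths but do not exist on disk."""
    missing = []
    for key, value in iter_strings(config):
        # Heuristic (assumption): a value containing a separator is a path.
        looks_like_path = "/" in value or os.sep in value
        if looks_like_path and not os.path.exists(value):
            missing.append((key, value))
    return missing


def check_config_file(path: str) -> List[Tuple[str, str]]:
    """Load a JSON config file and return its missing path entries."""
    with open(path, encoding="utf-8") as handle:
        return missing_paths(json.load(handle))
```

For example, `check_config_file("config.json")` returns an empty list when every path-like value in the file exists.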

Overview of the Img2Poem website

Infer an image

Train Model

Train Image-to-Keywords Model

Link data.

I used Query2Label for the Image-to-Keywords model. The command below trains it on my Image-to-Keywords dataset.

    python main_mlc.py \
                --dataset_dir './data' --backbone resnet101 --dataname coco14 \
                --batch-size 1 --print-freq 100 --output "./output" --world-size 1 --rank 0 \
                --dist-url tcp://127.0.0.1:3717 --gamma_pos 0 --gamma_neg 2 --dtgfl --epochs 40 \
                --lr 1e-4 --optim AdamW --pretrained --num_class 76 --img_size 448 \
                --weight-decay 1e-2 --cutout --n_holes 1 --cut_fact 0.5 --hidden_dim 2048 \
                --dim_feedforward 4096 --enc_layers 1 --dec_layers 2 --nheads 4 --early-stop \
                --amp --workers 2
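The `--gamma_pos` and `--gamma_neg` flags appear to be the focusing exponents of an asymmetric focal loss for multi-label classification, which is the kind of loss the Query2Label reference code is understood to train with. The sketch below is a simplified, per-label illustration under that assumption. It is not code from this repository, it omits refinements such as probability clipping, and the function name is hypothetical.

```python
import math


def asymmetric_loss(prob: float, target: int,
                    gamma_pos: float = 0.0, gamma_neg: float = 2.0,
                    eps: float = 1e-8) -> float:
    """Per-label asymmetric focal loss for one predicted probability.

    prob   -- predicted probability that the label is present (0..1)
    target -- 1 if the label is present in the image, else 0
    """
    if target == 1:
        # Positive label: scale by (1 - p)^gamma_pos.
        # With gamma_pos = 0 this is plain cross-entropy.
        return -((1.0 - prob) ** gamma_pos) * math.log(max(prob, eps))
    # Negative label: scale by p^gamma_neg, so confident (easy)
    # negatives contribute very little to the total loss.
    return -(prob ** gamma_neg) * math.log(max(1.0 - prob, eps))
```

With `gamma_neg = 2`, a negative label predicted at probability 0.1 contributes 0.1² = 0.01 times its plain cross-entropy loss. This matters when most of the 76 classes are absent from any given image.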

Train Keywords-to-Poem Model

Data is located in /data.

I used GPT-2 for the Keywords-to-Poem model.

    python trainKw2Poem.py \
                --train_dir './data/1ext_balanced_rkw_4sen_87609_test_kw2poem_dataset.csv' \
                --epoch 100 --step 10000 --batch_size 8

Pretrained Models

Model name     Link
GPT-2          link
Query2Label    link

Acknowledgement

We thank the authors of Query2Label and GPT-2, whose work made this framework possible. We also thank FPT for its help with the dataset-building process.
