Dual-Transformer

Dual-Transformer is a framework that takes a scenery image as input and generates a Vietnamese six-eight (lục bát) poem as output. The generated poem relates to the input image by mentioning objects that appear in it.
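For readers unfamiliar with the six-eight form, the sketch below checks only its line-length pattern: lines alternate between six and eight syllables. It is an illustration and not part of this repository. The function names are hypothetical, it assumes Vietnamese syllables are separated by whitespace and that lines come in complete 6/8 pairs, and it does not check the rhyme or tone rules of the form.

```python
def syllable_counts(poem: str) -> list[int]:
    """Count whitespace-separated syllables on each non-empty line."""
    return [len(line.split()) for line in poem.splitlines() if line.strip()]


def is_six_eight(poem: str) -> bool:
    """Return True if the lines alternate 6, 8, 6, 8, ... syllables.

    Assumption: a valid poem consists of complete 6/8 line pairs.
    Rhyme and tone constraints are deliberately not checked.
    """
    counts = syllable_counts(poem)
    if not counts or len(counts) % 2 != 0:
        return False
    return all(n == (6 if i % 2 == 0 else 8) for i, n in enumerate(counts))


# Opening couplet of "Truyện Kiều", a well-known six-eight poem.
sample = "Trăm năm trong cõi người ta,\nChữ tài chữ mệnh khéo là ghét nhau."
print(syllable_counts(sample), is_six_eight(sample))
```

A check like this could serve as a quick sanity filter on generated output before looking at content quality.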

Installation

Clone the repository:

    git clone https://github.com/chauminhnguyen/Dual-Transformer.git

Install the requirements

Install CUDA.
Install the Python requirements:

    pip install -r requirements.txt

Modify the path

Edit config.json to set the paths to the Query2Label and GPT-2 models.
Edit Query2Label's config.json (default: models/Query2labels/config.json) to set the path to the pretrained model.
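After editing the config files, it can help to confirm that every path in them actually exists before starting the app. The helper below is a minimal sketch and not part of this repository. Because the key names inside config.json are not documented here, it assumes nothing about them: it walks the whole JSON structure and treats any string containing a path separator as a path (a heuristic), skipping URLs.

```python
import json
import os


def find_missing_paths(config_file: str) -> list[str]:
    """Return 'key: value' entries for path-like strings that do not exist on disk."""
    with open(config_file, encoding="utf-8") as f:
        config = json.load(f)

    missing = []

    def walk(node, key_path):
        # Recurse through nested dicts and lists, tracking the key path for reporting.
        if isinstance(node, dict):
            for key, value in node.items():
                walk(value, f"{key_path}.{key}" if key_path else key)
        elif isinstance(node, list):
            for index, value in enumerate(node):
                walk(value, f"{key_path}[{index}]")
        elif isinstance(node, str) and ("/" in node or os.sep in node):
            # Heuristic: a string with a separator is a path; skip URLs such as tcp://...
            if "://" not in node and not os.path.exists(node):
                missing.append(f"{key_path}: {node}")

    walk(config, "")
    return missing
```

Calling `find_missing_paths("config.json")` and `find_missing_paths("models/Query2labels/config.json")` from the repository root should return an empty list once both files point at existing model files.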

Start the model

    streamlit run app.py

Train Model

Train Image-to-Keywords Model

Link data.

I used Query2Label for the Image-to-Keywords model. The command below trains it on my Image-to-Keywords dataset.

    python main_mlc.py \
                --dataset_dir './data' --backbone resnet101 --dataname coco14 \
                --batch-size 1 --print-freq 100 --output "./output" --world-size 1 --rank 0 \
                --dist-url tcp://127.0.0.1:3717 --gamma_pos 0 --gamma_neg 2 --dtgfl --epochs 40 \
                --lr 1e-4 --optim AdamW --pretrained --num_class 76 --img_size 448 \
                --weight-decay 1e-2 --cutout --n_holes 1 --cut_fact 0.5 --hidden_dim 2048 \
                --dim_feedforward 4096 --enc_layers 1 --dec_layers 2 --nheads 4 --early-stop \
                --amp --workers 2

Train Keywords-to-Poem Model

Data is located in /data.

I used GPT-2 for the Keywords-to-Poem model. The command below trains it on the Keywords-to-Poem dataset.

    python trainKw2Poem.py \
                --train_dir './data/1ext_balanced_rkw_4sen_87609_test_kw2poem_dataset.csv' \
                --epoch 100 --step 10000 --batch_size 8
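Before launching a long training run, it may be worth confirming that the CSV passed to `--train_dir` loads cleanly. The sketch below is not part of this repository and its function name is hypothetical. It assumes the file is UTF-8 encoded and has a header row, and it makes no assumption about the column names, which are not documented here.

```python
import csv


def summarize_csv(path: str) -> tuple[list[str], int]:
    """Return the header row and the number of non-empty data rows in a CSV file."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # empty list if the file has no rows at all
        # csv.reader handles quoted fields that span several lines (e.g. poems),
        # so each record is counted once regardless of embedded newlines.
        row_count = sum(1 for row in reader if row)
    return header, row_count
```

For example, `summarize_csv("./data/1ext_balanced_rkw_4sen_87609_test_kw2poem_dataset.csv")` returns the column names and the record count, which can be compared against the expected dataset size.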

Pretrained Models

Model name	Link
GPT-2	link
Query2label	link

Acknowledgement

We thank the authors of Query2Label and GPT-2, whose work made this framework possible. We also thank FPT for their support in building the dataset.
