MUREL (CVPR 2019), a multimodal relational reasoning module for VQA

Home Page: https://arxiv.org/abs/1902.09487

License: BSD 3-Clause "New" or "Revised" License

Python 96.78% Shell 3.22%

murel.bootstrap.pytorch's Introduction

MUREL: Multimodal Relational Reasoning for Visual Question Answering

The MuRel network is a machine learning model trained end-to-end to answer questions about images. It relies on the object bounding boxes extracted from the image to build a completely connected graph where each node corresponds to an object or region. The MuRel network contains a MuRel cell over which it iterates to fuse the question representation with local region features, progressively refining visual and question interactions. Finally, after a global aggregation of local representations, it answers the question using a bilinear model. Interestingly, the MuRel network doesn't include an explicit attention mechanism, usually at the core of state-of-the-art models. Its rich vectorial representation of the scene can even be leveraged to visualize the reasoning process at each step.

The MuRel cell is a novel reasoning module which models interactions between question and image regions. Its pairwise relational component enriches the multimodal representations of each node by taking their context into account in the modeling.
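As a rough illustration of the data flow described above, here is a minimal pure-Python sketch. It is not the code of this repository: the element-wise product stands in for the learned fusion operator, the element-wise max is an assumed choice for aggregating pairwise and global information, `n_steps` is an arbitrary default, and the names `fuse`, `murel_cell_step` and `murel_forward` are hypothetical.

```python
from typing import List

Vector = List[float]


def fuse(a: Vector, b: Vector) -> Vector:
    """Stand-in fusion: element-wise product (the real model learns this operator)."""
    return [x * y for x, y in zip(a, b)]


def elementwise_max(vectors: List[Vector]) -> Vector:
    """Aggregate a set of vectors by taking the maximum of each dimension."""
    return [max(dims) for dims in zip(*vectors)]


def murel_cell_step(question: Vector, regions: List[Vector]) -> List[Vector]:
    """One illustrative cell iteration over a fully connected region graph."""
    # 1. Fuse the question with every local region feature.
    fused = [fuse(question, r) for r in regions]
    updated = []
    for i, (region, s_i) in enumerate(zip(regions, fused)):
        # 2. Pairwise relations between node i and every other node j.
        relations = [fuse(s_i, s_j) for j, s_j in enumerate(fused) if j != i]
        context = elementwise_max(relations) if relations else [0.0] * len(s_i)
        # 3. Residual update: keep the input and add the context-aware signal.
        updated.append([v + s + c for v, s, c in zip(region, s_i, context)])
    return updated


def murel_forward(question: Vector, regions: List[Vector], n_steps: int = 3) -> Vector:
    """Iterate the cell, then pool the local representations into one scene vector."""
    for _ in range(n_steps):
        regions = murel_cell_step(question, regions)
    return elementwise_max(regions)
```

The pooled vector returned by `murel_forward` plays the role of the globally aggregated representation that the final classifier would consume.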

In this repo, we make our datasets and models available via pip install. Also, we provide pretrained models and all the code needed to reproduce the experiments from our CVPR 2019 paper.

Installation

1. Python 3 & Anaconda

We don't provide support for python 2. We advise you to install python 3 with Anaconda. Then, you can create an environment.

2. As standalone project

conda create --name murel python=3.7
source activate murel
git clone --recursive https://github.com/Cadene/murel.bootstrap.pytorch.git
cd murel.bootstrap.pytorch
pip install -r requirements.txt

3. Download datasets

Download annotations, images and features for VQA experiments:

bash murel/datasets/scripts/download_vqa2.sh
bash murel/datasets/scripts/download_vgenome.sh
bash murel/datasets/scripts/download_tdiuc.sh
bash murel/datasets/scripts/download_vqacp2.sh

Note: The features have been extracted from a pretrained Faster-RCNN with caffe. We don't provide the code for pretraining or extracting features for now.

(2. As a python library)

By importing the murel python module, you can access datasets and models in a simple way:

from murel.datasets.vqacp2 import VQACP2
from murel.models.networks.murel_net import MurelNet
from murel.models.networks.murel_cell import MurelCell
from murel.models.networks.pairwise import Pairwise

To be able to do so, you can use pip:

pip install murel.bootstrap.pytorch

Or install from source:

git clone https://github.com/Cadene/murel.bootstrap.pytorch.git
python setup.py install

Note: This repo is built on top of block.bootstrap.pytorch. We import VQA2, TDIUC, VGenome from the latter.

Quick start

Train a model

The bootstrap/run.py file loads the options contained in a yaml file, creates the corresponding experiment directory and starts the training procedure. For instance, you can train our best model on VQA2 by running:

python -m bootstrap.run -o murel/options/vqa2/murel.yaml

Then, several files are going to be created in logs/vqa2/murel:

  • options.yaml (copy of options)
  • logs.txt (history of print)
  • logs.json (batch and epoch statistics)
  • view.html (learning curves)
  • ckpt_last_engine.pth.tar (checkpoints of last epoch)
  • ckpt_last_model.pth.tar
  • ckpt_last_optimizer.pth.tar
  • ckpt_best_eval_epoch.accuracy_top1_engine.pth.tar (checkpoints of best epoch)
  • ckpt_best_eval_epoch.accuracy_top1_model.pth.tar
  • ckpt_best_eval_epoch.accuracy_top1_optimizer.pth.tar
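The checkpoint names listed above appear to follow the pattern `ckpt_<name>_<part>.pth.tar`. The sketch below builds the expected paths from that pattern and reports which files are missing. The pattern is inferred only from the listing above, and the helpers `checkpoint_paths` and `missing_parts` are hypothetical, not part of the framework.

```python
from pathlib import Path
from typing import Dict, List

# The three parts saved for each checkpoint, as seen in the file listing.
PARTS = ("engine", "model", "optimizer")


def checkpoint_paths(exp_dir: str, name: str = "last") -> Dict[str, Path]:
    """Map each checkpoint part to its expected path inside an experiment directory."""
    root = Path(exp_dir)
    return {part: root / f"ckpt_{name}_{part}.pth.tar" for part in PARTS}


def missing_parts(exp_dir: str, name: str = "last") -> List[str]:
    """Return the parts whose checkpoint file is not on disk."""
    return [part for part, path in checkpoint_paths(exp_dir, name).items() if not path.exists()]
```

For example, `missing_parts("logs/vqa2/murel", "best_eval_epoch.accuracy_top1")` can be used as a quick check before resuming from the best checkpoint.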

Many options are available in the options directory.

Evaluate a model

At the end of the training procedure, you can evaluate your model on the testing set. In this example, bootstrap/run.py loads the options from your experiment directory, resumes the best checkpoint on the validation set and starts an evaluation on the testing set instead of the validation set while skipping the training set (train_split is empty). Thanks to --misc.logs_name, the logs will be written in the new logs_test.txt and logs_test.json files, instead of being appended to the logs.txt and logs.json files.

python -m bootstrap.run \
-o logs/vqa2/murel/options.yaml \
--exp.resume best_eval_epoch.accuracy_top1 \
--dataset.train_split \
--dataset.eval_split test \
--misc.logs_name test

Reproduce results

VQA2 dataset

Training and evaluation (train/val)

We use this simple setup to tune our hyperparameters on the valset.

python -m bootstrap.run \
-o murel/options/vqa2/murel.yaml \
--exp.dir logs/vqa2/murel

Training and evaluation (train+val/val/test)

This heavier setup allows us to train a model on 95% of the concatenation of the train and val sets, and to evaluate it on the remaining 5%. Then we extract the predictions of our best checkpoint on the testset. Finally, we submit a json file to the EvalAI web site.

python -m bootstrap.run \
-o murel/options/vqa2/murel.yaml \
--exp.dir logs/vqa2/murel_trainval \
--dataset.proc_split trainval

python -m bootstrap.run \
-o logs/vqa2/murel_trainval/options.yaml \
--exp.resume best_eval_epoch.accuracy_top1 \
--dataset.train_split \
--dataset.eval_split test \
--misc.logs_name test
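The 95%/5% split used by the trainval setup above can be pictured with the following sketch. It only illustrates the idea and is not the repository's implementation: the function name, the shuffling strategy and the seed are assumptions.

```python
import random
from typing import List, Sequence, Tuple


def split_trainval(items: Sequence, train_ratio: float = 0.95, seed: int = 0) -> Tuple[List, List]:
    """Shuffle a copy of `items` reproducibly and cut it into train and held-out parts."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    shuffled = list(items)
    # A dedicated Random instance keeps the split independent of global random state.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

Fixing the seed matters here: the held-out 5% must stay the same across runs, otherwise validation scores of different experiments are not comparable.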

Training and evaluation (train+val+vg/val/test)

Same, but we add pairs from the VisualGenome dataset.

python -m bootstrap.run \
-o murel/options/vqa2/murel.yaml \
--exp.dir logs/vqa2/murel_trainval_vg \
--dataset.proc_split trainval \
--dataset.vg True

python -m bootstrap.run \
-o logs/vqa2/murel_trainval_vg/options.yaml \
--exp.resume best_eval_epoch.accuracy_top1 \
--dataset.train_split \
--dataset.eval_split test \
--misc.logs_name test

Compare experiments on valset

You can compare experiments by displaying their best metrics on the valset.

python -m murel.compare_vqa_val -d logs/vqa2/murel logs/vqa2/attention
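For a custom comparison, the best epoch of each experiment can also be read directly from its logs.json. The sketch below assumes that logs.json maps a metric name to a list of per-epoch values and that the validation accuracy is stored under `eval_epoch.accuracy_top1`; both are assumptions to verify against your own log files, and this is not how `murel.compare_vqa_val` is implemented.

```python
import json
from pathlib import Path
from typing import Dict, List, Tuple


def best_epoch(values: List[float]) -> Tuple[int, float]:
    """Return (epoch_index, value) of the highest value in a per-epoch metric list."""
    index = max(range(len(values)), key=values.__getitem__)
    return index, values[index]


def compare(exp_dirs: List[str], metric: str = "eval_epoch.accuracy_top1") -> Dict[str, Tuple[int, float]]:
    """Read logs.json in each experiment directory and report the best epoch per experiment."""
    results = {}
    for exp_dir in exp_dirs:
        # Assumed layout: {"<metric name>": [value_epoch_0, value_epoch_1, ...], ...}
        logs = json.loads((Path(exp_dir) / "logs.json").read_text())
        results[exp_dir] = best_epoch(logs[metric])
    return results
```

Calling `compare(["logs/vqa2/murel", "logs/vqa2/attention"])` would then return one `(epoch, value)` pair per experiment directory.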

Submit predictions on EvalAI

It is not possible to automatically compute the accuracies on the testset. You need to submit a json file to the EvalAI platform. The evaluation step on the testset creates the json file that contains the predictions of your model on the full testset. For instance: logs/vqa2/murel_trainval_vg/results/test/epoch,19/OpenEnded_mscoco_test2015_model_results.json. To get the accuracies on the testdev or test sets, you must submit this file.
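Before uploading, it can help to sanity-check the results file. The sketch below assumes the usual VQA challenge results layout, a JSON list of objects that each carry an integer `question_id` and a string `answer`. The helper names are hypothetical, and the checks are a minimal guess at what a submission needs, not the platform's actual validation.

```python
import json
from typing import Any, List


def check_vqa_results(entries: Any) -> List[str]:
    """Return a list of problems found in a VQA-style results payload (empty if none)."""
    if not isinstance(entries, list):
        return ["top-level JSON value must be a list"]
    problems = []
    seen = set()
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict) or not {"question_id", "answer"} <= set(entry):
            problems.append(f"entry {i}: missing 'question_id' or 'answer'")
            continue
        if not isinstance(entry["question_id"], int):
            problems.append(f"entry {i}: question_id must be an integer")
        elif entry["question_id"] in seen:
            problems.append(f"entry {i}: duplicate question_id {entry['question_id']}")
        else:
            seen.add(entry["question_id"])
        if not isinstance(entry["answer"], str):
            problems.append(f"entry {i}: answer must be a string")
    return problems


def check_vqa_results_file(path: str) -> List[str]:
    """Load a results file from disk and validate it."""
    with open(path, encoding="utf-8") as handle:
        return check_vqa_results(json.load(handle))
```

An empty list from `check_vqa_results_file(...)` only means the file is structurally plausible; it says nothing about whether the checkpoint that produced it was properly trained.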

VQACP2 dataset

Training and evaluation (train/val)

python -m bootstrap.run \
-o murel/options/vqacp2/murel.yaml \
--exp.dir logs/vqacp2/murel

Compare experiments on valset

python -m murel.compare_vqa_val -d logs/vqacp2/murel logs/vqacp2/attention

TDIUC dataset

Training and evaluation (train/val/test)

The full training set is split into a trainset and a valset. At the end of the training, we evaluate our best checkpoint on the testset. The TDIUC metrics are computed and displayed at the end of each epoch. They are also stored in logs.json and logs_test.json.

python -m bootstrap.run \
-o murel/options/tdiuc/murel.yaml \
--exp.dir logs/tdiuc/murel

python -m bootstrap.run \
-o logs/tdiuc/murel/options.yaml \
--exp.resume best_eval_epoch.accuracy_top1 \
--dataset.train_split \
--dataset.eval_split test \
--misc.logs_name test

Compare experiments

You can compare experiments by displaying their best metrics on the valset or testset.

python -m murel.compare_tdiuc_val -d logs/tdiuc/murel logs/tdiuc/attention
python -m murel.compare_tdiuc_test -d logs/tdiuc/murel logs/tdiuc/attention

Pretrained models

TODO

Useful commands

Use tensorboard instead of plotly

Instead of creating a view.html file, a tensorboard file will be created:

python -m bootstrap.run -o murel/options/vqa2/murel.yaml \
--view.name tensorboard
tensorboard --logdir=logs/vqa2

You can use plotly and tensorboard at the same time by updating the yaml file like this one.

Use a specific GPU

For a specific experiment:

CUDA_VISIBLE_DEVICES=0 python -m bootstrap.run -o murel/options/vqa2/murel.yaml

For the current terminal session:

export CUDA_VISIBLE_DEVICES=0
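The same restriction can be applied from inside a Python script, provided the variable is set before CUDA is initialized (in practice, before importing torch). This is a small sketch; the helper name `select_gpus` is hypothetical.

```python
import os


def select_gpus(*indices: int) -> str:
    """Expose only the given GPU indices to libraries that are imported afterwards."""
    value = ",".join(str(i) for i in indices)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Call this before importing torch so that the CUDA runtime sees the restriction.
select_gpus(0)
```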

Overwrite an option

The bootstrap.pytorch framework makes it easy to overwrite a hyperparameter. In this example, we run an experiment with a non-default learning rate. Thus, we also overwrite the experiment directory path:

python -m bootstrap.run -o murel/options/vqa2/murel.yaml \
--optimizer.lr 0.0003 \
--exp.dir logs/vqa2/murel_lr,0.0003
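Conceptually, each dotted flag such as `--optimizer.lr` addresses one entry of the nested options loaded from the yaml file. The sketch below illustrates that idea on a plain dictionary. It is not the framework's own parser, the helper `override` is hypothetical, and the default values are placeholders rather than the repository's settings.

```python
import copy
from typing import Any, Dict


def override(options: Dict[str, Any], dotted_key: str, value: Any) -> Dict[str, Any]:
    """Return a copy of `options` with the nested entry named by `dotted_key` replaced."""
    result = copy.deepcopy(options)
    node = result
    *parents, leaf = dotted_key.split(".")
    for key in parents:
        # Create intermediate sections on demand so new options can be added too.
        node = node.setdefault(key, {})
    node[leaf] = value
    return result


# Values below are placeholders, not the repository defaults.
defaults = {"exp": {"dir": "logs/vqa2/murel"}, "optimizer": {"lr": 0.001}}
options = override(defaults, "optimizer.lr", 0.0003)
options = override(options, "exp.dir", "logs/vqa2/murel_lr,0.0003")
```

Returning a copy keeps the defaults untouched, which mirrors the way each experiment directory keeps its own options.yaml.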

Resume training

If a problem occurs, it is easy to resume from the last epoch by specifying the options file from the experiment directory while overwriting the exp.resume option (default is None):

python -m bootstrap.run -o logs/vqa2/murel/options.yaml \
--exp.resume last

Web API

TODO

Extract your own image features

TODO

Citation

@InProceedings{Cadene_2019_CVPR,
    author = {Cadene, Remi and Ben-Younes, Hedi and Thome, Nicolas and Cord, Matthieu},
    title = {MUREL: {M}ultimodal {R}elational {R}easoning for {V}isual {Q}uestion {A}nswering},
    booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition {CVPR}},
    year = {2019},
    url = {http://remicadene.com/pdfs/paper_cvpr2019.pdf}
}

Poster

TODO

Authors

This code was made available by Hedi Ben-Younes (Sorbonne-Heuritech), Remi Cadene (Sorbonne), Matthieu Cord (Sorbonne) and Nicolas Thome (CNAM).

Acknowledgment

Special thanks to the authors of VQA2, TDIUC, VisualGenome and VQACP2, the datasets used in this research project.

murel.bootstrap.pytorch's Issues

Access to Murel predictions on test set

Hi,

Thanks for releasing your code. I would like to have access to Murel's prediction on the TDIUC test set. Would you be able to share them? That would be really handy for an error analysis I'm doing.

Thanks,
Alessandro

SSL Certificate expired

It seems that the SSL certificate of the website used to download the data has expired. It is possible to use --no-check-certificate, but that is a riskier option.

FileNotFoundError when loading image features

When I run python -m bootstrap.run -o murel/options/vqa2/murel.yaml, I get the error below.
FileNotFoundError: [Errno 2] No such file or directory: 'data/vqa/coco/extract_rcnn/2018-04-27_bottom-up-attention_fixed_36/COCO_train2014_000000010083.jpg.pth'
I then found that there are 37,695 files in the 'extract_rcnn' folder.
Is the feature file incomplete?

KeyError: 'norm_coord' c = batch['norm_coord']

When I run the following command line, I get the error below.

python -m bootstrap.run -o murel/options/vqa2/murel.yaml

c = batch['norm_coord']
KeyError: 'norm_coord'

Here are the keys of the batch:
dict_keys(['index', 'question_id', 'question', 'lengths', 'image_name', 'visual', 'nb_regions', 'answer_id', 'class_id', 'answer', 'question_type'])

Thank you for your amazing code.

Overfitting!!!

Hi, when training with your default parameters on the VQA task (dropout_input: 0.1, dropout_pre_lin: 0.0, dropout_output: 0.0), isn't the model overfitting? When I train the model, the val score is only 0.24. Which parameters do you use? Please provide the details, thanks a lot!

result reproduce

Hello, if I want to reproduce your results on VQA-CP v2, should I run these commands to download the dataset?

bash murel/datasets/scripts/download_vqa2.sh
bash murel/datasets/scripts/download_vqacp2.sh

Hello

To reproduce the results on VQA-CP2, which dataset should I download?

Evaluation on test set

Hi Cadene,

I used the following command to generate a JSON for the test set.
python -m bootstrap.run \
-o logs/vqa2/murel/options.yaml \
--exp.resume best_accuracy_top1 \
--dataset.train_split \
--dataset.eval_split test \
--misc.logs_name test
The JSON was generated in folder logs/vqa2/murel/results/test/epoch,1/. But when I upload this in EvalAI http://evalai.cloudcv.org/web/challenges/challenge-page/163/submission, it says failed (I did upload OpenEnded_mscoco_test-dev2015_model_results.json)! Please let me know if I am doing something wrong.

I have attached the log file (I couldn't attach the JSON as github doesn't support sharing JSON files, update: here is the link to the JSON file I uploaded in EvalAI: https://evalai.s3.amazonaws.com/media/submission_files/submission_25691/ab81c6df-954e-4436-80d2-12826b8ffa0c.json). Thanks for the help!

logs_test.txt

inference

Hi Cadene,
How can I pass my own pictures and questions to the trained model?
I can't find the corresponding code.

VQACP2 test

Hi,

Thanks for releasing your code. I would like to have access to Murel's prediction on the VQACP2 test set. Would you be able to share them? That would be really handy for an error analysis I'm doing.

Thanks,
xywhat

VQA-CPv2

Hi! I have some questions about the files preprocessed from the VQA-CP v2 dataset and used in your work. Since VQA-CP v2 only has training and test questions and annotations, how are the files trainset.pth and valset.pth generated? Is it based on the 'coco_split'?

Training on subset of classes

Hello,

I tried adapting the code to train on just binary yes/no questions, but while loading a batch, it throws the error shown below:

Screen Shot 2019-04-16 at 4 01 40 PM

I changed the following:
Edited trainset.pth and validset.pth to contain only yes/no questions and annotations.
In options/vqa2/murel.yaml, changed the output_dim to 2 for the final layer.
Changed the batch size to 8.

After googling, we found that the error could be caused by labels exceeding n_classes (2), but we couldn't figure out why. :(

FileNotFoundError

I downloaded the vqacp2 dataset only. When I try to run '/murel/options/vqacp2/murel.yaml', I get this FileNotFoundError: 'data/vqa/coco/extract_rcnn/2018-04-27_bottom-up-attention_fixed_36/COCO_train2014_000000465824.jpg.pth'.

Where do I find this file? Do I need any other dataset besides the vqacp2 dataset?

code question

When I run your code, there is an error.
The plotly.plotly module is deprecated,
please install the chart-studio package and use the
chart_studio.plotly module instead.

Key Error 'visual'

I got a KeyError 'visual' while executing the code on the VQA-CP2 dataset. Could you help me?

Answer vocabulary

Hi,
How is the answer vocabulary selected (for VQAv2, VQA-CPv2 and TDIUC)?
Thanks

Error while trying to train or validate

Hi, thanks for the code.

I am trying to re-train MuRel. I got the following error: Object arrays cannot be loaded when allow_pickle=False

Attached is the output log. Please let me know if I'm doing something incorrectly.

Command: CUDA_VISIBLE_DEVICES=4,5,6,7 python -m bootstrap.run -o murel/options/vqa2/murel.yaml 2>&1 | tee murel_output.txt

murel_output.txt

About the `2018-04-27_bottom-up-attention_fixed_36` features

Hi! @Cadene @Hediby
The .pth file includes many attributes of the image, like:
'cls_scores', 'rois', 'pooled_feat', 'cls', 'norm_rois'

Hoping for your response!
