This project forked from aioz-ai/miccai19-medvqa

AIOZ AI - Overcoming Data Limitation in Medical Visual Question Answering

License: MIT License

Mixture of Enhanced Visual Features (MEVF)

This repository contains the implementation of MEVF for visual question answering in the medical domain. Our model achieves 43.9 on open-ended questions and 75.1 on close-ended questions on the VQA-RAD dataset. For details, please refer to link.

This repository is based on and inspired by @Jin-Hwa Kim's work. We sincerely thank them for sharing their code.

Overview of bilinear attention networks

Prerequisites

Please install the required packages by running the following command:

pip install -r requirements.txt

Preprocessing

All data should be downloaded via link. Extract the downloaded file into the data_RAD/ directory.

Training

Train the MEVF model with a Stacked Attention Network (SAN):

$ python3 main.py --model SAN --use_RAD --RAD_dir data_RAD --maml --autoencoder --output saved_models/SAN_MEVF

Train the MEVF model with a Bilinear Attention Network (BAN):

$ python3 main.py --model BAN --use_RAD --RAD_dir data_RAD --maml --autoencoder --output saved_models/BAN_MEVF

The training scores are printed after every epoch.
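The two training commands differ only in the model name and the output directory. As a minimal sketch, the helper below assembles either command from the flags shown above. The function name `build_train_command` and its parameters are hypothetical and not part of this repository; only the flags themselves come from the commands above.

```python
import shlex

# Flags are copied from the training commands above. The helper name and
# its parameters are illustrative assumptions, not part of the repository.
def build_train_command(model, rad_dir="data_RAD", output_root="saved_models"):
    """Return the argument list for training MEVF with the given attention model."""
    if model not in ("SAN", "BAN"):
        raise ValueError(f"unsupported model: {model!r}")
    return [
        "python3", "main.py",
        "--model", model,
        "--use_RAD",
        "--RAD_dir", rad_dir,
        "--maml",
        "--autoencoder",
        "--output", f"{output_root}/{model}_MEVF",
    ]

if __name__ == "__main__":
    for model in ("SAN", "BAN"):
        # Print the command line; to launch it, pass the list to
        # subprocess.run(..., check=True) from the repository root.
        print(shlex.join(build_train_command(model)))
```

Building the command as a list rather than a string avoids shell quoting problems if the data or output directory contains spaces.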

              SAN+proposal   BAN+proposal
Open-ended    40.7           43.9
Close-ended   74.1           75.1

Pretrained models and Testing

This repo includes the pre-trained weights of MAML and CDAE, which are used to initialize the feature extraction modules.

The MAML model data_RAD/pretrained_maml.weights was trained using the official source code link.

The CDAE model data_RAD/pretrained_ae.pth was trained with the code provided in train_cdae.py. To reproduce the pretrained model, please follow the instructions provided in that file.
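Because both weight files are expected under data_RAD/, it can help to confirm they are in place before starting a run. The following is a small sketch of such a check; the helper `missing_pretrained` is hypothetical and not part of the repository, while the two file names are taken from the text above.

```python
from pathlib import Path

# File names come from the text above; the helper itself is a
# hypothetical convenience and does not exist in the repository.
PRETRAINED_FILES = ("pretrained_maml.weights", "pretrained_ae.pth")

def missing_pretrained(rad_dir="data_RAD"):
    """Return the expected pretrained files that are absent from rad_dir."""
    root = Path(rad_dir)
    return [name for name in PRETRAINED_FILES if not (root / name).is_file()]

if __name__ == "__main__":
    missing = missing_pretrained()
    if missing:
        print("Missing from data_RAD/:", ", ".join(missing))
    else:
        print("Both pretrained weight files are present.")
```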

We also provide the pretrained models reported as the best single model in the paper.

For the SAN_MEVF pretrained model, please download it via link and move it to saved_models/SAN_MEVF/. The trained SAN_MEVF model can be tested on the VQA-RAD test set via:

$ python3 test.py --model SAN --use_RAD --RAD_dir data_RAD --maml --autoencoder --input saved_models/SAN_MEVF --epoch 19 --output results/SAN_MEVF

For the BAN_MEVF pretrained model, please download it via link and move it to saved_models/BAN_MEVF/. The trained BAN_MEVF model can be tested on the VQA-RAD test set via:

$ python3 test.py --model BAN --use_RAD --RAD_dir data_RAD --maml --autoencoder --input saved_models/BAN_MEVF --epoch 19 --output results/BAN_MEVF

The resulting JSON file can be found in the results/ directory.
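The name and schema of the result file are not described here, so the sketch below makes no assumption about either. It searches results/ recursively for JSON files and reports only the top-level container type and length of each one. The function `summarize_results` is a hypothetical helper, not part of the repository.

```python
import json
from pathlib import Path

def summarize_results(results_dir="results"):
    """Map each JSON file under results_dir to its top-level type and length."""
    summary = {}
    for path in sorted(Path(results_dir).rglob("*.json")):
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        # No schema is assumed: report only the container type and its length.
        size = len(data) if isinstance(data, (list, dict)) else None
        summary[str(path)] = (type(data).__name__, size)
    return summary

if __name__ == "__main__":
    for name, (kind, size) in summarize_results().items():
        print(f"{name}: {kind} with {size} entries")
```

Inspecting the loaded object interactively is the quickest way to learn the actual field names before writing any further analysis code.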

Citation

Please cite this paper in your publications if it helps your research:

@inproceedings{aioz_mevf_miccai19,
  author={Binh D. Nguyen and Thanh-Toan Do and Binh X. Nguyen and Tuong Do and Erman Tjiputra and Quang D. Tran},
  title={Overcoming Data Limitation in Medical Visual Question Answering},
  booktitle = {MICCAI},
  year={2019}
}

License

MIT License

More information

AIOZ AI Homepage: https://ai.aioz.io
