This project forked from jinhong-ni/deqfusion

PyTorch Implementation of Deep Equilibrium Multimodal Fusion

Deep Equilibrium Multimodal Fusion

PyTorch implementation of the paper: Deep Equilibrium Multimodal Fusion [arXiv].

Installation

Please clone this repo and set up the environment with the following commands, adjusting the CUDA version to match your GPU:

conda create -n deqfusion python==3.8 pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=10.2 -c pytorch
conda activate deqfusion
pip install -r requirements.txt

Usage

BRCA

The code is modified from the official implementation of MM-Dynamics.

Please follow the command below to train a model on BRCA:

cd experiments/BRCA
python main.py --mode=train -mrna -dna -mirna --f_thres=105 --b_thres=106

Please run python main.py -h for more details.

MM-IMDB

The code is modified from MultiBench.

Please first download the MM-IMDB dataset from here.

If using ResNet152, please also download the raw MM-IMDB dataset from here.

There are several example scripts for running the experiments with different fusion strategies. To train a model with our DEQ fusion on MM-IMDB with default settings (Word2vec+VGGNet), set $FILE_PATH to the path of multimodal_imdb.hdf5, then run:

cd experiments/MM-IMDB
python examples/multimedia/mmimdb_deq.py -p $FILE_PATH

Alternatively, you can train a model with our DEQ fusion using BERT+ResNet152. First set $FILE_PATH to the path of the raw data directory mmimdb, then run the following commands:

cd experiments/MM-IMDB
python examples/multimedia/mmimdb_deq_bert_resnet152.py -p $FILE_PATH
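The two MM-IMDB scripts above expect different things in $FILE_PATH: the default script takes the multimodal_imdb.hdf5 file, while the BERT+ResNet152 script takes the raw mmimdb directory. The following is a minimal pre-flight sketch for catching a mix-up before launching a run. The helper name check_mmimdb_path and its raw flag are hypothetical and not part of this repo.

```python
import os


def check_mmimdb_path(file_path, raw=False):
    """Return an error message if file_path looks wrong, otherwise None.

    Hypothetical helper, not part of this repo.
    raw=False: expect the multimodal_imdb.hdf5 file (Word2vec+VGGNet script).
    raw=True:  expect the raw mmimdb data directory (BERT+ResNet152 script).
    """
    if raw:
        # The BERT+ResNet152 script is given a directory of raw data.
        if not os.path.isdir(file_path):
            return f"expected the raw mmimdb directory, got: {file_path}"
    else:
        # The default script is given a single HDF5 file.
        if not os.path.isfile(file_path):
            return f"expected the multimodal_imdb.hdf5 file, got: {file_path}"
    return None
```

For example, calling check_mmimdb_path(os.environ["FILE_PATH"], raw=True) before running mmimdb_deq_bert_resnet152.py returns None when the path is a directory and a short message otherwise.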

CMU-MOSI

The code is modified from the official implementation of Cross-Modal BERT.

First download the pre-trained BERT model from Google Drive, then use the following command:

wget https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip
unzip uncased_L-12_H-768_A-12.zip

Alternatively, you may download both files from Baidu Netdisk with the extraction code fuse.

Run the experiments by:

cd experiments/CMU-MOSI
python run_classifier.py

Custom Dataset

To use DEQ fusion on your own dataset, copy DEQ_fusion.py, solver.py, and jacobian.py into your repo. Then import the module with from DEQ_fusion import DEQFusion and use DEQFusion for multimodal fusion.
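For readers new to the idea, the sketch below illustrates the general principle behind equilibrium-style fusion: the fused representation is taken to be a fixed point of a fusion function applied to the modality features. It is a toy in plain Python and does not use or reproduce the DEQFusion class, its constructor arguments, or the solvers in solver.py. The names fuse_step and fixed_point_fusion, the tanh-based fusion map, and all parameters are assumptions made for illustration.

```python
import math


def fuse_step(z, features, weight=0.5):
    """One application of a toy fusion map (hypothetical, for illustration).

    Mixes the current fused state z with the mean of the modality features
    and squashes the result with tanh. With weight < 1 the map is a
    contraction in z, so repeated application converges.
    """
    n = len(features)
    return [
        math.tanh(weight * z[i] + sum(f[i] for f in features) / n)
        for i in range(len(z))
    ]


def fixed_point_fusion(features, max_iter=100, tol=1e-6):
    """Iterate fuse_step from zero until the state stops changing.

    Returns the fused vector and the number of iterations used.
    """
    z = [0.0] * len(features[0])
    steps = 0
    for steps in range(1, max_iter + 1):
        z_next = fuse_step(z, features)
        # Largest per-coordinate change between successive iterates.
        residual = max(abs(a - b) for a, b in zip(z_next, z))
        z = z_next
        if residual < tol:
            break
    return z, steps


# Two toy "modalities", each a 2-dimensional feature vector.
modal_features = [[0.2, -0.4], [0.6, 0.0]]
fused, n_steps = fixed_point_fusion(modal_features)
```

Plain iteration is used here only because it is the simplest way to reach a fixed point; it should not be read as a description of how the files above solve for the equilibrium.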

Acknowledgement

Our work benefits greatly from DEQ, MDEQ, MM-Dynamics, MultiBench, and Cross-Modal BERT.
