SNLI-VE

Experiments with multi-modal entailment using an early fusion model and an attention model over words and image objects. https://github.com/CpuKnows/SNLI-VE

SNLI-VE corpus compiled by Xie et al. (2018)

Data

Setup

For full setup instructions see INSTALL.md

SNLI-VE Models

Fasttext hypothesis only baseline

Run scripts/create_fasttext_datasets.py to generate files for fasttext. Run scripts/create_snli_hard.py to create hard dataset splits.
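fastText's supervised mode reads one example per line, with the class written as a `__label__` prefix followed by the text. The following is a minimal sketch of that conversion for a hypothesis-only dataset, not the repository's actual script. The field names `sentence2` and `gold_label` are assumptions borrowed from the original SNLI format and may differ in the SNLI-VE files.

```python
import json

def to_fasttext_line(example, text_key="sentence2", label_key="gold_label"):
    """Format one example as a fastText supervised line using only the hypothesis."""
    # Collapse whitespace so each example stays on a single line.
    text = " ".join(str(example[text_key]).split())
    return f"__label__{example[label_key]} {text}"

def convert(jsonl_lines, **keys):
    """Convert an iterable of JSON-lines strings to fastText lines, skipping blanks."""
    return [to_fasttext_line(json.loads(line), **keys)
            for line in jsonl_lines if line.strip()]

# Hypothetical example record; the key names are assumptions.
sample = ['{"sentence2": "A man is  outdoors.", "gold_label": "entailment"}']
print(convert(sample)[0])  # __label__entailment A man is outdoors.
```

Because the premise image is never read, any accuracy above chance from this data reflects cues in the hypothesis text alone.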

Train fasttext model and make predictions:

fasttext supervised -input fasttext_train.txt -output fasttext_hyp_only -wordNgrams 2
fasttext predict fasttext_hyp_only.bin fasttext_<split>.txt 1 > prediction_<split>.txt 
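With `k = 1`, `fasttext predict` writes one predicted label per input line, so the prediction file can be scored against the labels at the start of each line of the input file. The sketch below shows one way to do that. The helper names are illustrative and are not part of the repository.

```python
def first_label(line):
    """Return the first whitespace-separated token, i.e. the __label__ token."""
    tokens = line.split()
    return tokens[0] if tokens else ""

def accuracy(gold_lines, predicted_lines):
    """Percentage of lines whose predicted label matches the gold label."""
    gold = [first_label(line) for line in gold_lines]
    pred = [first_label(line) for line in predicted_lines]
    if len(gold) != len(pred):
        raise ValueError("gold and prediction files differ in length")
    if not gold:
        return 0.0
    correct = sum(g == p for g, p in zip(gold, pred))
    return 100.0 * correct / len(gold)

# In-memory stand-ins for fasttext_<split>.txt and prediction_<split>.txt.
gold = ["__label__entailment A dog runs.", "__label__neutral A cat sleeps."]
pred = ["__label__entailment", "__label__contradiction"]
print(accuracy(gold, pred))  # 50.0
```

To score real files, pass the lists returned by `readlines()` on each file in place of the in-memory lists.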

Detectron bounding boxes for ROI Attention models

Run inference for bounding boxes:

DETECTRON=/path/to/detectron
SNLIVE=/path/to/SNLI-VE
python $DETECTRON/tools/infer_snlive.py \
    --cfg $DETECTRON/configs/12_2017_baselines/e2e_mask_rcnn_R-50-FPN_2x.yaml \
    --output-dir $SNLIVE/data/detectron \
    --output-ext json \
    --image-ext jpg \
    --wts $DETECTRON/weights/e2e_mask_rcnn_R-50-FPN_2x_model.pkl \
    $SNLIVE/data/flickr30k-images

The custom detection script is provided in scripts/infer_snlive.py; the command above expects a copy of it at $DETECTRON/tools/infer_snlive.py.

SNLI-VE training and inference

To create smaller data subsets for training runs, use scripts/subset_snli_ve_data.py.

Training:

allennlp train experiments/<EXPERIMENT_NAME>.json \
    --serialization-dir models/<EXPERIMENT_NAME> \
    --include-package snli_ve

Evaluation for fusion models:

allennlp predict \
    --output-file data/predictions/<OUTPUT>.json \
    --silent \
    --cuda-device -1 \
    --predictor snlive_fusion_predictor \
    --include-package snli_ve \
    models/<EXPERIMENT_NAME>/model.tar.gz \
    data/snli_ve_<SPLIT>.jsonl

Evaluation for ROI Attention models:

allennlp predict \
    --output-file data/predictions/<OUTPUT>.json \
    --silent \
    --cuda-device -1 \
    --predictor snlive_roi_predictor \
    --include-package snli_ve \
    models/<EXPERIMENT_NAME>/model.tar.gz \
    data/snli_ve_<SPLIT>.jsonl
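The prediction files can be turned into overall and per-class scores with a small amount of post-processing. The sketch below assumes that each line of the prediction file is a JSON object with a `label` key holding the predicted class, and that the gold class is stored under `gold_label` in the input file. Both key names are assumptions and should be checked against the predictor's actual output. It also assumes the per-class figures are accuracies over the examples of each gold class.

```python
import json
from collections import Counter

def per_class_accuracy(gold_labels, predicted_labels):
    """Return accuracy (in percent) per gold class plus an 'overall' entry."""
    total, correct = Counter(), Counter()
    for gold, pred in zip(gold_labels, predicted_labels):
        total[gold] += 1
        correct[gold] += int(gold == pred)
    scores = {label: 100.0 * correct[label] / total[label] for label in total}
    n = sum(total.values())
    scores["overall"] = 100.0 * sum(correct.values()) / n if n else 0.0
    return scores

def read_field(jsonl_lines, key):
    """Extract one field from each non-blank JSON line."""
    return [json.loads(line)[key] for line in jsonl_lines if line.strip()]

# Hypothetical records; the key names are assumptions.
gold = read_field(['{"gold_label": "entailment"}',
                   '{"gold_label": "neutral"}',
                   '{"gold_label": "neutral"}'], "gold_label")
pred = read_field(['{"label": "entailment"}',
                   '{"label": "neutral"}',
                   '{"label": "contradiction"}'], "label")
scores = per_class_accuracy(gold, pred)
print(scores["entailment"], scores["neutral"])  # 100.0 50.0
```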

Results

Total dataset

                        Validation set                          Test set
Model                   Overall  Entailed  Neutral  Contradict  Overall  Entailed  Neutral  Contradict
Hypothesis only         64.50    -         -        -           64.20    -         -        -
Early fusion            62.86    68.97     64.61    54.96       63.09    69.31     65.38    54.56
Early fusion with ELMo  67.05    70.15     62.23    68.78       67.07    69.36     62.63    69.23
ROI Attention           63.34    70.46     64.85    54.69       63.47    69.98     65.64    54.76

Hard dataset

                        Validation set                          Test set
Model                   Overall  Entailed  Neutral  Contradict  Overall  Entailed  Neutral  Contradict
Hypothesis only         -        -         -        -           -        -         -        -
Early fusion            21.97    26.36     27.45    12.24       21.89    25.50     27.75    12.47
Early fusion with ELMo  32.19    33.42     27.19    36.48       32.09    31.16     27.40    37.86
ROI Attention           19.49    25.83     23.65    9.49        19.70    24.99     23.79    10.62

Citations

Ning Xie, Farley Lai, Derek Doran, and Asim Kadav. "Visual Entailment Task for Visually-Grounded Language Learning." arXiv preprint arXiv:1811.10582 (2018).
