
Detecting Unseen Visual Relations Using Analogies

Created by Julia Peyre at INRIA, Paris.

Introduction

This is the code for the paper:

Julia Peyre, Ivan Laptev, Cordelia Schmid, Josef Sivic, Detecting Unseen Visual Relations Using Analogies, ICCV19.

The webpage for this project is available here, with a link to the paper.

This code is available for research purposes (MIT License).

Contents

  1. Installation
  2. Data
  3. Train
  4. Test
  5. Evaluation
  6. Erratum

Installation

This code was tested on Python 2.7, PyTorch 0.4.0, and CUDA 8.0. Install the dependencies with:

pip install -r requirements.txt
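
Because the tested versions are fairly old, it can help to see what the active environment provides before going further. The snippet below is a minimal sketch and not part of the released code: the helper name `environment_report` is hypothetical, and it only reports versions next to the ones listed above instead of enforcing them.

```python
import sys

# Versions this README reports as tested.
TESTED_PYTHON = (2, 7)
TESTED_PYTORCH = "0.4.0"
TESTED_CUDA = "8.0"


def environment_report():
    """Return the Python, PyTorch and CUDA versions visible to this interpreter.

    Hypothetical helper for illustration; PyTorch and CUDA are reported as
    None when PyTorch cannot be imported.
    """
    python_version = tuple(sys.version_info[:2])
    try:
        import torch  # imported only to read the installed version
    except ImportError:
        return {"python": python_version, "pytorch": None, "cuda": None}
    return {
        "python": python_version,
        "pytorch": torch.__version__,
        # None for CPU-only builds of PyTorch
        "cuda": getattr(torch.version, "cuda", None),
    }


if __name__ == "__main__":
    report = environment_report()
    print("Python  %s.%s (tested on %s.%s)" % (report["python"] + TESTED_PYTHON))
    print("PyTorch %s (tested on %s)" % (report["pytorch"], TESTED_PYTORCH))
    print("CUDA    %s (tested on %s)" % (report["cuda"], TESTED_CUDA))
```

A mismatch does not necessarily mean the code will fail, only that the setup differs from the one the code was tested on.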

Data

We release data and pre-trained models for HICO-DET. To set up the directories, please follow these steps:

  1. Download the pre-computed data
wget https://www.rocq.inria.fr/cluster-willow/jpeyre/analogy/data.tar.gz
tar zxvf data.tar.gz

This should unpack into the ./data folder.
It contains the object detections, visual features, and database objects needed to run our code on HICO-DET.

  2. Download HICO images
    Download the HICO images and place them in ./data/hico/images.

  3. Link to COCO API
    Download the COCO API into a new directory ./data/coco and run make.

  4. Download pre-computed models and detections

wget https://www.rocq.inria.fr/cluster-willow/jpeyre/analogy/runs.tar.gz
tar zxvf runs.tar.gz

This should unpack into the ./runs folder.
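
Once the steps above are done, the repository root should contain ./data, ./data/hico/images, ./data/coco and ./runs. The following is a small sketch for checking that layout before training or testing. The helper `missing_directories` is a hypothetical name, and the check only looks for the directories themselves, not for their contents.

```python
import os

# Directories named in the setup steps, relative to the repository root.
EXPECTED_DIRS = [
    "data",
    os.path.join("data", "hico", "images"),
    os.path.join("data", "coco"),
    "runs",
]


def missing_directories(root="."):
    """Return the expected directories that do not exist under `root`."""
    return [d for d in EXPECTED_DIRS if not os.path.isdir(os.path.join(root, d))]


if __name__ == "__main__":
    missing = missing_directories()
    if missing:
        print("Missing directories: %s" % ", ".join(missing))
    else:
        print("Directory layout looks complete.")
```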

Train

You can re-train our model by running:

python train.py --config_path $CONFIG_PATH

We provide config files in the ./configs directory.
Feel free to edit the config options to train variants of our model.
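
To train several variants in a row, the command above can be repeated for each config file. The sketch below assumes the config files are the *.yaml files directly inside ./configs and that train.py is run from the repository root. The helper `train_commands` is a hypothetical name and is not part of the repository.

```python
import glob
import os
import subprocess
import sys


def train_commands(config_dir="configs", script="train.py"):
    """Build one `python train.py --config_path <file>` command per YAML config."""
    pattern = os.path.join(config_dir, "*.yaml")
    return [
        [sys.executable, script, "--config_path", path]
        for path in sorted(glob.glob(pattern))
    ]


if __name__ == "__main__":
    for command in train_commands():
        print("Running: %s" % " ".join(command))
        subprocess.check_call(command)  # raises if a run exits with an error
```

The runs are sequential, so a failing config stops the loop instead of being skipped silently.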

Test

You can extract the detections by running:

python eval_hico.py --config_path $CONFIG_PATH

To extract the detections using our analogy model, you can run:

python eval_hico_analogy.py --config_path $CONFIG_PATH

Evaluation

We use the official evaluation code to evaluate performance on HICO-DET.

Erratum

Please note that the numerical results in the paper were obtained using a slightly different version of the analogy transformation than the one described in Eq. (6) of the paper. This variant computes the analogy transformation from the embeddings of the target subject, predicate and object in the unigram spaces and the embeddings of the source subject, predicate and object in the visual phrase space.

You can choose between the two versions through the option --analogy_type. The default option, described above, is called 'hybrid'. To run the variant described in the paper, set --analogy_type='vp' in the config file, as in './configs/hico_trainvalzeroshot_analogy_vp.yaml'.
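
If you want to derive a 'vp' config from an existing 'hybrid' one, the switch can be scripted. The sketch below is an illustration under a stated assumption: that the option is stored in the config file on its own line as `analogy_type: <value>`. Please check the provided config files for the actual layout. The helper `set_analogy_type` is a hypothetical name and is not part of the repository.

```python
import re

# The two variants named in this erratum.
VALID_ANALOGY_TYPES = ("hybrid", "vp")

# Assumed layout: the option sits on its own line as `analogy_type: <value>`.
_ANALOGY_LINE = re.compile(r"^([ \t]*analogy_type[ \t]*:[ \t]*).*$", re.MULTILINE)


def set_analogy_type(config_text, analogy_type):
    """Return `config_text` with its analogy_type entry set to `analogy_type`."""
    if analogy_type not in VALID_ANALOGY_TYPES:
        raise ValueError("analogy_type must be one of %s" % (VALID_ANALOGY_TYPES,))
    new_text, count = _ANALOGY_LINE.subn(
        lambda match: match.group(1) + "'%s'" % analogy_type, config_text
    )
    if count == 0:
        raise KeyError("no analogy_type entry found in the config text")
    return new_text
```

The returned text can be written to a new file in ./configs and passed through --config_path, which leaves the original config untouched.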

The 'vp' variant results in a performance drop of about 1 point compared to the results in the paper (Table 2, s+o+vp+transfer (deep): 28.6 -> 27.5). The corresponding model is released in the runs/ directory. We are still investigating why the 'hybrid' version performs better than the 'vp' one.

We would like to thank Kenneth Wong from the Institute of Computing Technology, Chinese Academy of Sciences, for his careful code review and for pointing out this inconsistency.

We apologize for this inconvenience. Also, please do not hesitate to contact the first author for further clarifications.

Cite

If you find this code useful in your research, please consider citing our paper:

@InProceedings{Peyre19,
  author    = "Peyre, Julia and Laptev, Ivan and Schmid, Cordelia and Sivic, Josef",
  title     = "Detecting Unseen Visual Relations Using Analogies",
  booktitle = "ICCV",
  year      = "2019"
}

Questions

If you have any questions, please contact the first author.
