Coder Social home page Coder Social logo

ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes [ECCV 2020 (Oral)]

Website Badge License: MIT

Created by: Panos Achlioptas, Ahmed Abdelreheem, Fei Xia, Mohamed Elhoseiny, Leonidas Guibas

Introduction

This work is based on our ECCV-2020 paper. There, we proposed the novel task of identifying a 3D object in a real-world scene given discriminative language, created two relevant datasets (Nr3D and Sr3D) and proposed a 3D neural listener (ReferIt3DNet) for solving this task. The bulk of the provided code serves the training & evaluation of ReferIt3DNet in our data. For more information please visit our project's webpage.

ReferIt3DNet

ReferIt3DNet

Code-Dependencies

  1. Python 3.x with numpy, pandas, matplotlib (and a few more common packages - please see setup.py)
  2. Pytorch 1.x

Our code is tested with Python 3.6.9, Pytorch 1.4 and CUDA 10.0, on Ubuntu 14.04.

Installation

  • (recommended) you are advised to create a new anaconda environment, please use the following commands to create a new one.
    conda create -n referit3d_env python=3.6.9 cudatoolkit=10.0
    conda activate referit3d_env
    conda install pytorch torchvision -c pytorch
  • Install the referit3d python package using
    cd referit3d
    pip install -e .
  • To use a PointNet++ visual-encoder you need to compile its CUDA layers for PointNet++: Note: To do this compilation also need: gcc5.4 or later.
    cd external_tools/pointnet2
    python setup.py install

Dataset

ScanNet

First you must download the train/val scans of ScanNet if you do not have them locally. To do so, please refer to the ScanNet Dataset for more details.

Our Linguistic Data

  • Nr3D you can dowloaded Nr3D here (10.7MB)
  • Sr3D / Sr3D+ you can dowloaded Sr3D/Sr3D+ here (19MB / 20MB)

Since Sr3d is a synthetic dataset, you can change the hyper-parameters to create a version customized to your needs. please see referit3d/data_generation/sr3d/

Training

  • To train on either Nr3d or Sr3d dataset, use the following commands
    cd referit3d/scripts/
    python train_referit3d.py -scannet-file the_processed_scannet_file -referit3D-file dataset_file.csv --log-dir dir_to_log --n-workers 4

feel free to change the number of workers to match your #CPUs and RAM size.

  • To train nr3d in joint with sr3d, add the following argument
    --augment-with-sr3d sr3d_dataset_file.csv

Evaluation

  • To evaluate on either Nr3d or Sr3d dataset, use the following commands
    cd referit3d/scripts/
    python train_referit3d.py --mode evaluate -scannet-file the_processed_scannet_file -referit3D-file dataset_file.csv --resume-path the_path_to_the_best_model.pth  --n-workers 4 --batch-size 64 
  • To evaluate on joint trained model, add the following argument to the above command
    --augment-with-sr3d sr3d_dataset_file.csv

Pretrained model

you can download a pretrained ReferIt3DNet models on Nr3D and Sr3D here. please extract the zip file and then copy the extracted folder to referit3d/log folder. you can run the following the command to evaluate:

cd referit3d/scripts
python train_referit3d.py --mode evaluate -scannet-file path_to_keep_all_points_00_view_with_global_scan_alignment.pkl  -referit3D-file path_to_corresponding_csv.csv  --resume-path checkpoints/best_model.pth

ReferIt3D Benchmark Challenges

We wish to aggregate and highlight results from different approaches tackling the problem of fine-grained 3D object identification via language. If you use either of our datasets with a new method, please let us know! so we can add your method and attained results in our benchmark-aggregating page.

Citation

@article{achlioptas2020referit_3d,
    title={ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes},
    author={Achlioptas, Panos and Abdelreheem, Ahmed and Xia, Fei and Elhoseiny, Mohamed and Guibas, Leonidas},
    journal={16th European Conference on Computer Vision (ECCV)},
    year={2020}
}

License

The code is licensed under MIT license (see LICENSE.md for details).

referit3d's Projects

referit3d icon referit3d

Code accompanying our ECCV-2020 paper on 3D Neural Listeners.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.