This project forked from zju3dv/gift

Code for "GIFT: Learning Transformation-Invariant Dense Visual Descriptors via Group CNNs" NeurIPS 2019

License: Apache License 2.0

GIFT: Learning Transformation-Invariant Dense Visual Descriptors via Group CNNs
Yuan Liu, Zehong Shen, Zhixuan Lin, Sida Peng, Hujun Bao, Xiaowei Zhou
NeurIPS 2019 Project Page

Questions and discussions are welcome!

Requirements & Compilation

  1. Requirements

Required packages are listed in requirements.txt.

Note that an old version of OpenCV (3.4.2) is needed because the code uses the SIFT module of OpenCV.

The code is tested with Python 3.7.3 and PyTorch 1.3.0.

  2. Compile hard example mining functions
cd hard_mining
python setup.py build_ext --inplace
  3. Compile extend utilities
cd utils/extend_utils
python build_extend_utils_cffi.py

Depending on where CUDA is installed on your machine, you may need to revise the variables cuda_version, cuda_include and cuda_library in build_extend_utils_cffi.py.

Testing

Download pretrained models

  1. The pretrained GIFT model can be found here.

  2. The pretrained SuperPoint model can be found here.

  3. Make a directory called data and arrange these files as follows.

data/
├── superpoint/
|   └── superpoint_v1.pth
└── model/
    └── GIFT-stage2-pretrain/
        └── 20001.pth
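
Before running the demo or the evaluation, it can help to confirm that the checkpoints sit where the layout above puts them. The following is a minimal sketch, not part of the repository: the helper name `missing_pretrained_files` is hypothetical, and the relative paths are copied from the tree above.

```python
from pathlib import Path

# Paths copied from the directory tree above, relative to the repository root.
EXPECTED_FILES = (
    "data/superpoint/superpoint_v1.pth",
    "data/model/GIFT-stage2-pretrain/20001.pth",
)


def missing_pretrained_files(root="."):
    """Return the expected checkpoint paths that do not exist under root.

    Hypothetical helper for illustration; the repository does not ship it.
    """
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_pretrained_files()
    if missing:
        print("Missing pretrained files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All pretrained files found.")
```

Run it from the repository root; an empty result means both files are in place.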

Demo

We provide some examples of relative pose estimation in demo.ipynb.

Test on HPatches dataset

  1. Download the Resized HPatches, ER-HPatches and ES-HPatches datasets here. Optionally, you can also generate these datasets from the original HPatches sequences using correspondence_database.py.

  2. Extract these datasets as follows.

data/
├── hpatches_resize/
├── hpatches_erotate/
├── hpatches_erotate_illm/
├── hpatches_escale/
└── hpatches_escale_illm/
  3. Evaluation
# use keypoints detected by superpoint and descriptors computed by GIFT
python run.py --task=eval_original \
              --det_cfg=configs/eval/superpoint_det.yaml \
              --desc_cfg=configs/eval/gift_pretrain_desc.yaml \
              --match_cfg=configs/eval/match_v2.yaml

# use keypoints detected by superpoint and descriptors computed by superpoint
python run.py --task=eval_original \
              --det_cfg=configs/eval/superpoint_det.yaml \
              --desc_cfg=configs/eval/superpoint_desc.yaml \
              --match_cfg=configs/eval/match_v2.yaml

The output looks like es superpoint_det gift_pretrain_desc match_v2 pck-5 0.290 -2 0.132 -1 0.057 cost 267.826 s. The fields are, in order, datasetname detector_name descriptor_name matching_strategy PCK-5 PCK-2 PCK-1, followed by the running time in seconds. PCK-5 means that a correspondence counts as correct if the distance between the matched keypoint and its ground truth location is less than 5 pixels; PCK-2 and PCK-1 use thresholds of 2 pixels and 1 pixel.
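
To make the PCK definition concrete, here is a minimal sketch of how PCK at a given pixel threshold could be computed from matched keypoints and their ground truth locations. The function name `pck` and the input format (lists of (x, y) tuples) are assumptions for illustration, as is dividing by the number of matches; the repository's own evaluation code may be organized differently.

```python
import math


def pck(matched, ground_truth, threshold):
    """Fraction of matches lying strictly within `threshold` pixels of ground truth.

    matched, ground_truth: equal-length sequences of (x, y) pixel coordinates.
    Hypothetical helper; not the repository's implementation.
    """
    if len(matched) != len(ground_truth):
        raise ValueError("matched and ground_truth must have the same length")
    if not matched:
        return 0.0
    correct = sum(
        1
        for (mx, my), (gx, gy) in zip(matched, ground_truth)
        if math.hypot(mx - gx, my - gy) < threshold
    )
    return correct / len(matched)


# Three matches with errors of 0, 3 and 10 pixels.
matched = [(10.0, 10.0), (20.0, 23.0), (40.0, 30.0)]
truth = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0)]
for t in (5, 2, 1):
    print("PCK-%d = %.3f" % (t, pck(matched, truth, t)))
```

In this toy example two of the three matches fall within 5 pixels, and only the exact match falls within 2 pixels or 1 pixel.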

Test on relative pose estimation dataset

  1. Download the st_peters_squares dataset from here. (This dataset is a part of st_peters.)

  2. Extract the dataset and arrange directories like the following.

data
└── st_peters_square_dataset/
    └── test/
  3. Evaluation
# use keypoints detected by superpoint and descriptors computed by GIFT
python run.py --task=rel_pose \
              --det_cfg=configs/eval/superpoint_det.yaml \
              --desc_cfg=configs/eval/gift_pretrain_desc.yaml \
              --match_cfg=configs/eval/match_v0.yaml

# use keypoints detected by superpoint and descriptors computed by superpoint
python run.py --task=rel_pose \
              --det_cfg=configs/eval/superpoint_det.yaml \
              --desc_cfg=configs/eval/superpoint_desc.yaml \
              --match_cfg=configs/eval/match_v0.yaml

The output looks like sps_100_200_first_100 superpoint_det gift_pretrain_desc match_v0 ang diff 24.100 inlier 62.240 correct-5 0.170 -10 0.350 -20 0.650. The fields are, in order, datasetname detector_name descriptor_name matching_strategy average_angle_difference average_inlier_number correct_rate_5_degree correct_rate_10_degree correct_rate_20_degree. In relative pose estimation, we compute the angle difference between the estimated rotation and the ground truth rotation. average_angle_difference is the average of this angle difference over all image pairs. average_inlier_number is the average number of inlier keypoints after RANSAC. correct_rate_5_degree indicates the percentage of image pairs whose angle difference is less than 5 degrees; correct_rate_10_degree and correct_rate_20_degree use thresholds of 10 and 20 degrees.
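
As an illustration of these metrics, the sketch below computes the angle difference between two rotation matrices using the standard identity trace(R_est^T R_gt) = 1 + 2 cos(angle), and then the fraction of pairs under a threshold. The function names and the nested-list matrix format are assumptions for this example, not the repository's code.

```python
import math


def rotation_angle_deg(r_est, r_gt):
    """Angle in degrees of the relative rotation between two 3x3 rotation matrices.

    Uses trace(r_est^T r_gt) = 1 + 2*cos(angle). Matrices are nested lists.
    """
    # trace(A^T B) equals the sum of element-wise products of A and B.
    trace = sum(r_est[i][j] * r_gt[i][j] for i in range(3) for j in range(3))
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # clamp rounding error
    return math.degrees(math.acos(cos_angle))


def correct_rate(angle_diffs, threshold_deg):
    """Fraction of image pairs whose angle difference is below threshold_deg."""
    if not angle_diffs:
        return 0.0
    return sum(1 for a in angle_diffs if a < threshold_deg) / len(angle_diffs)


def rot_z(deg):
    """Rotation about the z axis, used here only to build example inputs."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Four synthetic image pairs whose estimates are off by 3, 8, 15 and 40 degrees.
diffs = [rotation_angle_deg(rot_z(est), rot_z(0.0)) for est in (3.0, 8.0, 15.0, 40.0)]
print("average angle difference: %.3f" % (sum(diffs) / len(diffs)))
for t in (5, 10, 20):
    print("correct-%d: %.3f" % (t, correct_rate(diffs, t)))
```

The clamp before `acos` guards against values slightly outside [-1, 1] caused by floating-point rounding.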

Training

  1. Download the train2014 and val2014 sets of the COCO dataset and the SUN397 dataset.

  2. Organize the files as follows.

data
├── SUN2012Images/
|   └── JPEGImages/
└── coco/
    ├── train2014/ 
    └── val2014/
  3. Training
mkdir data/record
python run.py --task=train --cfg=configs/GIFT-stage1.yaml # train group extractor (Vanilla CNN)
python run.py --task=train --cfg=configs/GIFT-stage2.yaml # train group embedder (Group CNNs)

Acknowledgements

We have used code or datasets from the following projects:

Copyright

This work is affiliated with ZJU-SenseTime Joint Lab of 3D Vision, and its intellectual property belongs to SenseTime Group Ltd.

Copyright SenseTime. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
