motionhint's Introduction

MotionHint: Self-Supervised Monocular Visual Odometry with Motion Constraints

This is the official PyTorch implementation of "MotionHint: Self-Supervised Monocular Visual Odometry with Motion Constraints".

MotionHint: Self-Supervised Monocular Visual Odometry with Motion Constraints

Cong Wang, Yu-Ping Wang and Dinesh Manocha

ICRA 2022 (arXiv pdf)

[Figure: quantitative results]

This code is for non-commercial use; please see the license file for terms.

If you find our work useful in your research, please consider citing our paper:

@inproceedings{DBLP:conf/icra/WangWM22,
  author    = {Cong Wang and
               Yu{-}Ping Wang and
               Dinesh Manocha},
  title     = {MotionHint: Self-Supervised Monocular Visual Odometry with Motion
               Constraints},
  booktitle = {2022 International Conference on Robotics and Automation, {ICRA} 2022,
               Philadelphia, PA, USA, May 23-27, 2022},
  pages     = {1265--1272},
  publisher = {{IEEE}},
  year      = {2022}
}

⚙️ Setup

First, assuming you are using a fresh Anaconda distribution, you can set up the environment with:

conda env create -f environment.yaml

This command will install all packages used during training and testing.
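
After the environment is created, activate it before running any of the commands below. The environment name is defined in environment.yaml; 'motionhint' below is only an assumed example:

conda activate motionhint     # replace 'motionhint' with the name defined in environment.yaml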

Dataset

You can download the KITTI Odometry dataset from here. You need to download the color odometry dataset (~65 GB) and the ground-truth poses (~4 MB), and organize them as follows:

data_path
│
└───poses
│   │   00.txt
│   │   01.txt
│   │   ...
│   │   21.txt
│
└───sequences
    │
    └───00
    │   │   calib.txt
    │   │   times.txt
    │   └───image_2
    │   │   │   000000.png
    │   │   │   000001.png
    │   │   │   ...
    │   │
    │   └───image_3
    │       │   000000.png
    │       │   000001.png
    │       │   ...
    │
    └───01
    │   │   calib.txt
    │   │   times.txt
    │   └───image_2
    │   │   │   000000.png
    │   │   │   000001.png
    │   │   │   ...
    │   │
    │   └───image_3
    │       │   000000.png
    │       │   000001.png
    │       │   ...
    │
    └───02
    │
    ...
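
As a quick sanity check (not part of the official code), you can confirm that the layout matches what the training code expects; 'data_path' below is a placeholder for your own dataset root:

ls data_path/poses/00.txt                      # ground truth poses should be present
ls data_path/sequences/00/calib.txt            # each sequence needs calib.txt and times.txt
ls data_path/sequences/00/image_2 | head -n 3  # left color images (image_3 holds the right camera)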

Training

To train our MotionHint models, first download the pretrained Monodepth2 model and replace 'PRETRAINED_MONODEPTH2' in all training scripts with your own path.

Using the shell scripts in ./scripts, you can train the self-supervised visual odometry network with our MotionHint.

There are three training setups in our paper.

To train the network with the 'Ground Truth' setup, run:

./scripts/train_gt_setup.sh 0     # the argument 0 selects GPU 0

To train the network with the 'Paired Poses' setup, run:

./scripts/train_pair_setup.sh 0     # the argument 0 selects GPU 0

To train the network with the 'Unpaired Poses' setup, run:

./scripts/train_unpair_setup.sh 0     # the argument 0 selects GPU 0

Before running any of these scripts, set the values of 'DATA_PATH' and 'LOG_PATH' in them. The models will be saved in LOG_PATH/models, and the latest model will be used for evaluation.
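
For illustration only, the path variables you need to edit near the top of a training script might look like the following; the variable names come from this README, but the exact layout inside the provided scripts may differ:

# Illustrative placeholders inside ./scripts/train_gt_setup.sh (adjust to your machine)
DATA_PATH=/path/to/kitti_odometry              # dataset root organized as in the Dataset section
LOG_PATH=/path/to/logs/motionhint_gt           # checkpoints are written to $LOG_PATH/models
PRETRAINED_MONODEPTH2=/path/to/monodepth2      # pretrained Monodepth2 weights downloaded above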

You can also change parameters in the scripts and in options.py to run ablation studies.

Evaluation

We directly employ the KITTI Odometry Evaluation Toolbox to evaluate our models. For convenience, we have integrated the toolbox into our code and written a script to run the evaluation.

To evaluate a MotionHint model, run:

./scripts/eval_model.sh YOUR_MODEL_PATH EVALUATION_NAME

The evaluation results will be shown in the terminal, and saved in ./evaluations/result/EVALUATION_NAME.
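
For example, evaluating a checkpoint from a 'Ground Truth' run could look like this; the model folder name is hypothetical and depends on how your checkpoints were saved under LOG_PATH/models:

./scripts/eval_model.sh /path/to/LOG_PATH/models/latest_weights gt_setup_eval
# results appear in the terminal and in ./evaluations/result/gt_setup_eval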

Results and Pre-trained Models

Model name                  | Seq 09 trans. error (%) | Seq 09 rot. error (°/100m) | Seq 09 ATE (m) | Seq 10 trans. error (%) | Seq 10 rot. error (°/100m) | Seq 10 ATE (m)
MonoDepth2                  | 15.079                  | 3.335                      | 69.019         | 12.102                  | 4.927                      | 19.371
MotionHint (Ground Truth)   | 13.502                  | 2.998                      | 62.337         | 10.377                  | 4.453                      | 17.541
MotionHint (Paired Pose)    | 14.071                  | 3.099                      | 64.704         | 10.976                  | 4.495                      | 17.752
MotionHint (Unpaired Pose)  | 9.761                   | 2.226                      | 46.036         | 8.679                   | 3.334                      | 13.282

All models above are trained for 20 epochs, and the latest checkpoint is taken as the final model.

motionhint's Issues

A question about Combining Losses

Hi, in the Motion Loss, why is the difference between the pseudo-label pose and the pose predicted by the ego-motion network computed by direct subtraction? A pose transformation matrix contains both translation and rotation, so isn't it somewhat unreasonable to measure the difference between two pose matrices with a plain Euclidean distance? Would it make sense to compute the translation and rotation errors separately and then combine them with weights?
I look forward to your reply.

Result of MonoDepth2?

MonoDepth2 (Baseline): 14.837 3.299 68.180 12.097 4.949 19.408
Results table above:   15.079 3.335 69.019 12.102 4.927 19.371

Thank you very much for your work. However, I found that the MonoDepth2 (baseline) results in the paper differ from those reported for the code. May I ask why?

Training of PPNet

I see that no training script is provided for pre-training PPNet; only the pre-trained weights are given. Could you share the training script as well?

Questions about the 09 trajectory in the article

Hello, thanks for sharing your code. It is great work!
I used the weights provided at the end of your homepage to evaluate sequences 09 and 10. I get the same trajectory errors and other metrics, but I cannot obtain a trajectory close to a closed loop like Figure 1 in your article. Which weights were used to produce Figure 1? With both the MotionHint-GT and MotionHint-Unpair weights from the end of your homepage, I cannot reproduce the same result in the evo toolbox.

I hope you can reply soon. Thanks a lot.
