gengshan-y / rigidmask

Code for "Learning to Segment Rigid Motions from Two Frames". CVPR 2021.

Home Page: https://gengshan-y.github.io/rigidmask/

License: MIT License

Python 76.00% C++ 13.12% C 2.62% Cuda 8.26% Shell 0.01%
scene-flow rigid-motion-estimation rigid-motion-segmentation

rigidmask

** This is a partial release with inference and evaluation code. The project is still being tested and documented. There might be implementation changes in future releases. Thanks for your interest.

Visuals on Sintel/KITTI/Coral (not temporally smoothed):

If you find this work useful, please consider citing:

@inproceedings{yang2021rigidmask,
title={Learning to Segment Rigid Motions from Two Frames},
author={Yang, Gengshan and Ramanan, Deva},
booktitle={CVPR},
year={2021}
}

Data and precomputed results

Download

Additional inputs (coral reef images) and precomputed results are hosted on Google Drive. Run the following (assuming gdown is installed):

gdown https://drive.google.com/uc?id=1Up2cPCjzd_HGafw1AB2ijGmiKqaX5KTi -O ./input.tar.gz
gdown https://drive.google.com/uc?id=12C7rl5xS66NpmvtTfikr_2HWL5SakLVY -O ./rigidmask-sf-precomputed.zip
tar -xzvf ./input.tar.gz 
unzip ./rigidmask-sf-precomputed.zip -d precomputed/

To compute the results in Tab.1, Tab.2 on KITTI,

modelname=rigidmask-sf
python eval/eval_seg.py  --path precomputed/$modelname/  --dataset 2015
python eval/eval_sf.py   --path precomputed/$modelname/  --dataset 2015

Install

The code is tested with Python 3.8, PyTorch 1.7.0, and CUDA 10.2. Install dependencies with

conda env create -f rigidmask.yml
conda activate rigidmask_v0
conda install -c conda-forge kornia=0.5.3 # install a compatible kornia version
python -m pip install detectron2 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.7/index.html

Compile DCNv2 and ngransac.

cd models/networks/DCNv2/; python setup.py install; cd -
cd models/ngransac/; python setup.py install; cd -
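
After installing, it can help to confirm that the Python packages named above are visible to the active interpreter. The following is a minimal sketch, not part of the repository: `missing_modules` is a hypothetical helper, and it only checks the import names `torch`, `kornia`, and `detectron2`. The import names of the compiled DCNv2 and ngransac extensions are not stated here, so they are left out.

```python
import importlib.util
import sys

# Import names of the packages installed in the steps above.
REQUIRED_MODULES = ("torch", "kornia", "detectron2")


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    missing = missing_modules()
    if missing:
        print("not importable:", ", ".join(missing))
    else:
        print("all listed packages are importable")
```

Run it inside the activated `rigidmask_v0` environment. An empty result only shows that the packages can be located, not that their versions match the tested ones.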

Pretrained models

Download pre-trained models to ./weights (assuming gdown is installed),

mkdir weights
mkdir weights/rigidmask-sf
mkdir weights/rigidmask-kitti
gdown https://drive.google.com/uc?id=1H2khr5nI4BrcrYMBZVxXjRBQYBcgSOkh -O ./weights/rigidmask-sf/weights.pth
gdown https://drive.google.com/uc?id=1sbu6zVeiiK1Ra1vp_ioyy1GCv_Om_WqY -O ./weights/rigidmask-kitti/weights.pth
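
Interrupted downloads can leave missing or empty checkpoint files, so a quick check before inference may save time. This is a minimal sketch under the directory layout created by the commands above. `missing_weights` is a hypothetical helper and is not provided by the repository.

```python
from pathlib import Path

# Model names and file layout taken from the download commands above.
MODEL_NAMES = ("rigidmask-sf", "rigidmask-kitti")


def missing_weights(root="./weights", names=MODEL_NAMES):
    """Return expected checkpoint paths that are absent or empty."""
    missing = []
    for name in names:
        path = Path(root) / name / "weights.pth"
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(path)
    return missing


if __name__ == "__main__":
    for path in missing_weights():
        print("missing or empty:", path)
```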

modelname                | training set | flow model    | flow err. (K: Fl-err/EPE) | motion-in-depth err. (K: 1e4) | seg. acc. (K:obj/K:bg/S:bg)
rigidmask-sf (mono)      | SF           | C+SF+V        | 10.9%/3.128px             | 120.4                         | 90.71%/97.05%/86.72%
rigidmask-kitti (stereo) | SF+KITTI     | C+SF+V->KITTI | 4.1%/1.155px              | 49.7                          | 95.58%/98.91%/-

** C: FlyingChairs, SF: SceneFlow (including FlyingThings, Monkaa, and Driving), K: KITTI scene flow training set, V: VIPER, S: Sintel. Averaged over the 200 annotated KITTI pairs.
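
For readers unfamiliar with the two flow metrics, the sketch below shows how EPE and Fl-err are commonly computed. It assumes the usual KITTI 2015 convention, where a pixel counts as an outlier when its end-point error exceeds both 3 px and 5% of the ground-truth flow magnitude. `flow_errors` is a hypothetical helper written for illustration. It is not the repository's `eval/eval_sf.py`, which may mask or aggregate pixels differently.

```python
import math


def flow_errors(pred, gt):
    """Return (mean EPE in px, Fl outlier ratio) for paired (u, v) flow vectors.

    A pixel is an Fl outlier when its end-point error exceeds both 3 px and
    5% of the ground-truth flow magnitude (assumed KITTI 2015 convention).
    """
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and the same length")
    total_epe = 0.0
    outliers = 0
    for (pu, pv), (gu, gv) in zip(pred, gt):
        epe = math.hypot(pu - gu, pv - gv)  # Euclidean end-point error
        mag = math.hypot(gu, gv)            # ground-truth flow magnitude
        total_epe += epe
        if epe > 3.0 and epe > 0.05 * mag:
            outliers += 1
    return total_epe / len(gt), outliers / len(gt)


if __name__ == "__main__":
    pred = [(0, 0), (10, 0), (100, 0)]
    gt = [(0, 0), (14, 0), (104, 0)]
    print(flow_errors(pred, gt))
```

In this toy input the errors are 0, 4, and 4 px. Only the second pixel is an outlier, because 4 px is below 5% of the third pixel's 104 px flow magnitude, so the mean EPE is 8/3 px and the Fl ratio is 1/3.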

Inference

Run and visualize rigid segmentation of the coral reef video (pass --refine to turn on rigid motion refinement). Results will be saved at ./weights/$modelname/seq/ and an output-seg.gif file will be generated in the current folder.

modelname=rigidmask-sf
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset seq-coral --datapath input/imgs/coral/   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth --testres 1
python eval/generate_visual.py --datapath weights/$modelname/seq-coral/ --imgpath input/imgs/coral

Run and visualize two-view depth estimation on the KITTI video. An output-depth.gif file will be saved to the current folder.

modelname=rigidmask-sf
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset seq-kitti --datapath input/imgs/kitti_2011_09_30_drive_0028_sync_11xx/   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth --testres 1.2 --refine
python eval/generate_visual.py --datapath weights/$modelname/seq-kitti/ --imgpath input/imgs/kitti_2011_09_30_drive_0028_sync_11xx
python eval/render_scene.py --inpath weights/rigidmask-sf/seq-kitti/pc0-0000001110.ply
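
When running several sequences, the inference command can be assembled programmatically. This is a minimal sketch that only uses flags appearing in the commands above (`--dataset`, `--datapath`, `--outdir`, `--loadmodel`, `--testres`, `--refine`). `submission_command` is a hypothetical helper, not part of the repository.

```python
import shlex


def submission_command(modelname, dataset, datapath, testres, refine=False):
    """Assemble the submission.py argument list used in the commands above."""
    cmd = [
        "python", "submission.py",
        "--dataset", dataset,
        "--datapath", datapath,
        "--outdir", f"./weights/{modelname}/",
        "--loadmodel", f"./weights/{modelname}/weights.pth",
        "--testres", str(testres),
    ]
    if refine:
        cmd.append("--refine")
    return cmd


if __name__ == "__main__":
    cmd = submission_command("rigidmask-sf", "seq-coral", "input/imgs/coral/", 1)
    # Print the command; pass `cmd` to subprocess.run(cmd, check=True) to execute it.
    print(shlex.join(cmd))
```

Building the command as a list avoids shell quoting problems when a data path contains spaces. GPU selection via CUDA_VISIBLE_DEVICES would be set through the environment of the launched process.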

Run and evaluate kitti-sceneflow (monocular setup, Tab. 1 and Tab. 2),

modelname=rigidmask-sf
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset 2015 --datapath path-to-kitti-sceneflow-training   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth  --testres 1.2 --refine
python eval/eval_seg.py   --path weights/$modelname/  --dataset 2015
python eval/eval_sf.py   --path weights/$modelname/  --dataset 2015

Run and evaluate Sintel,

modelname=rigidmask-sf
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset sintel_mrflow_val --datapath path-to-sintel-training   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth  --testres 1.5 --refine
python eval/eval_seg.py   --path weights/$modelname/  --dataset sintel
python eval/eval_sf.py   --path weights/$modelname/  --dataset sintel

Run and evaluate kitti-sceneflow (stereo setup, Tab. 6),

modelname=rigidmask-kitti
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset 2015 --datapath path-to-kitti-sceneflow-images   --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth  --disp_path input/disp/kittisf-train-hsm-disp/ --fac 2 --maxdisp 512 --refine --sensor stereo
python eval/eval_seg.py   --path weights/$modelname/  --dataset 2015
python eval/eval_sf.py    --path weights/$modelname/  --dataset 2015

To generate results for kitti-sceneflow benchmark (stereo setup, Tab. 3),

modelname=rigidmask-kitti
mkdir ./benchmark_output
CUDA_VISIBLE_DEVICES=1 python submission.py --dataset 2015test --datapath path-to-kitti-sceneflow-images  --outdir ./weights/$modelname/ --loadmodel ./weights/$modelname/weights.pth  --disp_path input/disp/kittisf-test-ganet-disp/ --fac 2 --maxdisp 512 --refine --sensor stereo

Training (TODO)

Training on synthetic dataset

First download and unzip the scene flow dataset under the same folder. You'll need RGB images, camera data, object segmentation, disparity, disparity change, and optical flow. It takes 1 to 2 TB of disk space. Then download the pre-trained optical flow and expansion network (trained on synthetic datasets).

gdown https://drive.google.com/uc?id=11F_dI6o37nzA9B5V7OT-UwAl66LWlu-4 -O ./weights/flowexp-sf.pth

To train the rigidmask-sf model, run

datapath=path-to-data-dir
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 train.py --logname logname --database $datapath --savemodel ./weights --stage segsf --nproc 4 --loadmodel ./weights/flowexp-sf.pth

Training on synthetic dataset + KITTI

First download and place the KITTI-SF dataset under the same folder as the scene flow dataset. The pre-computed relative camera poses of KITTI-SF can be downloaded here, and placed under kitti_scene/training/.

Then download the pre-trained optical flow and expansion network (trained on synthetic datasets and fine-tuned on KITTI).

gdown https://drive.google.com/uc?id=1uWPvL71KXeFY0U4wFluJfYJdcFfwEH_G -O ./weights/flowexp-kitti.pth

To train the rigidmask-kitti model, run

datapath=path-to-data-dir
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 train.py --logname logname --database $datapath --savemodel ./weights --stage segkitti --nproc 4 --loadmodel ./weights/flowexp-kitti.pth

Acknowledgement (incomplete)

rigidmask's Issues

KITTI 2015 test inference

Hi,
We want to get the flow result on KITTI 2015 testing. We use the following command:

python eval/eval_flow.py   --path precomputed/rigidmask-sf/  --dataset 2015test

But we found the error:

FileNotFoundError: [Errno 2] No such file or directory: 'precomputed/rigidmask-sf//2015test/flo-000000_10.pfm'

Do you have any suggestions?

sintel result

Hello, may I ask which Sintel sequences were used for the Sintel test results in the paper? I tested all the Sintel sequences and obtained a BG IoU of 76, instead of the 86 in Table 1 of the paper.

problem

Hello, I have a problem. In 'exploader.py', line 168 shows that change_size[:,:,1:4]=d1,d2,d2 and change_size[:,:,4:6]=flow3d. We then get the rectified 3D flow named 'p3d' at line 202, but line 203 replaces 'd1,d2,d2' with 'p3d'. Why do we not replace the old flow3d, which is change_size[:,:,4:6], with 'p3d' instead?

different results

Hello, in your paper, why are the results of 'ours' in Table 1 different from those of 'Reference' in Table 4?

Problem about loading resnext101_32x8d_wsl

Thank you for your wonderful work. But when I reproduced the results, some problems appeared. The model resnext101_32x8d_wsl cannot be downloaded, as shown in the figure below. The problem seems like #404 error when pulling model from hub. Can you provide this file that you downloaded before? If so, I would be very grateful!
image
Thank you again for your wonderful work, and hope to get your reply!

Missing models/ngransac

Hi,
firstly, thanks a lot for your work :) I have a short note after following your installation guidelines: it seems that ngransac is missing in the models folder. Although one can clone the repository oneself, can you please still commit its working version?
Thanks!

how to train the model

Hi, I am training the model using the KITTI dataset only, but I ran into a problem. During training, it reports that ./ssd/kitti scene/training/calib/000044.txt doesn't exist. Is there a calib file for KITTI training, or is calib cam_to_cam the calib file? Here is a list of the files I can fetch in the training section. Is a calib file provided that is not in my list? If so, could you please share the link to that calib file?
help

problem

Hello! Thank you for the training code, but I found some problems. First, at line 400 of the training code, 'modela' should be replaced with 'mode'. Second, I cannot find 'max_epo' anywhere in the code except at line 503. Lastly, the pre-trained optical flow and expansion networks provided are the same whether training on the synthetic dataset or on the synthetic dataset + KITTI.

speed

Thanks for your work. Could you please tell me the approximate running speed of your network, and on what kind of hardware it was measured?

pose.txt

Hello, I cannot find 'pose.txt' in 'kitti_scene/training'. Could you share it with me?

Kitti Inference

Hello! I was wondering how the calibration file was constructed for the KITTI video example. I am trying to run the model on other KITTI videos, but the calibration files look entirely different. Thank you!

How to use it on own data?

I want to detect moving objects and their direction in my own video, but I don't have the calib file. How can I use it on my own video?

rigid segmentation

Hi,
can you release your code for the rigid segmentation? I seem to be stuck at this part.
Thank you
