albert100121 / 360sd-net


PyTorch implementation of ICRA 2020 paper "360° Stereo Depth Estimation with Learnable Cost Volume"

License: MIT License

Python 100.00%
360 360-photo 360degree 360-degree deep-learning deeplearning stereo-vision stereo-matching stereo stereovision

360sd-net's Introduction

360SD-Net

project page | paper | dataset

This is the implementation of our ICRA 2020 paper "360° Stereo Depth Estimation with Learnable Cost Volume" by Ning-Hsu Wang

Overview

How to Use

  • Set up a directory for all experiments. The initial setup may look like this:
# SETUP REPO
>> git clone https://github.com/albert100121/360SD-Net.git
>> cd 360SD-Net
>> mkdir output
>> cd conda_env
>> conda create --name 360SD-Net python=2.7
>> conda activate 360SD-Net
>> conda install --file requirement.txt

# DOWNLOAD MP3D Dataset
>> cd ./data
# request download MP3D Dataset
>> unzip MP3D Dataset
# request download SF3D Dataset
>> unzip SF3D Dataset
  • Set up data and directories (the layout is up to you as long as the data paths are linked correctly). The expected directory structure for the data is as follows:
# MP3D Dataset
./data/
     |--MP3D/
                 |--train/
                       |--image_up/
                       |--image_down/
                       |--disp_up/
                 |--val/
                       |--image_up/
                       |--image_down/
                       |--disp_up/
                 |--test/
                       |--image_up/
                       |--image_down/
                       |--disp_up/
# SF3D Dataset
./data/
     |--SF3D/
                 |--train/
                       |--image_up/
                       |--image_down/
                       |--disp_up/
                 |--val/
                       |--image_up/
                       |--image_down/
                       |--disp_up/
                 |--test/
                       |--image_up/
                       |--image_down/
                       |--disp_up/
  • Training procedure:
# For MP3D Dataset
>> python main.py --datapath data/MP3D/train/ --datapath_val data/MP3D/val/ --batch 8

# For SF3D Dataset
>> python main.py --datapath data/SF3D/train/ --datapath_val data/SF3D/val/ --batch 8 --SF3D
  • Testing procedure:
# For MP3D Dataset
>> python testing.py --datapath data/MP3D/test/ --checkpoint checkpoints/MP3D_checkpoint/checkpoint.tar --outfile output/MP3D

# For SF3D Dataset
>> python testing.py --datapath data/SF3D/test/ --checkpoint checkpoints/SF3D_checkpoint/checkpoint.tar --outfile output/SF3D

# For Real World Data
>> python testing.py --datapath data/realworld/ --checkpoint checkpoints/Realworld_checkpoint/checkpoint.tar --real --outfile output/realworld

# For small inference
>> python testing.py --datapath data/inference/MP3D/ --checkpoint checkpoints/MP3D_checkpoint/checkpoint.tar --outfile output/small_inference
  • Disparity to Depth:
>> python utils/disp2de.py --path PATH_TO_DISPARITY
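
As a rough illustration of what a disparity-to-depth conversion involves for a top-bottom 360° rig, the sketch below triangulates range from an angular disparity with the law of sines. It is not the code in utils/disp2de.py; the function name, the angle convention (polar angle measured from the upward baseline direction), and the baseline value are assumptions made for this example.

```python
import math


def range_from_angular_disparity(phi_top, disparity, baseline):
    """Distance from the top camera to a point (hypothetical helper).

    phi_top   : polar angle of the ray seen by the top camera, in radians,
                measured from the upward vertical (assumed convention).
    disparity : angular disparity phi_top - phi_bottom, in radians (> 0).
    baseline  : vertical distance between the two camera centres.
    """
    if disparity <= 0.0:
        raise ValueError("angular disparity must be positive")
    phi_bottom = phi_top - disparity
    # Law of sines in the triangle (top camera, bottom camera, point):
    # the angle at the point equals the angular disparity.
    return baseline * math.sin(phi_bottom) / math.sin(disparity)


# Synthetic check: bottom camera at the origin, top camera 0.2 m above it.
baseline = 0.2
x, z = 2.0, 0.5                        # point: 2 m sideways, 0.5 m up
phi_bottom = math.atan2(x, z)          # polar angle seen from the bottom camera
phi_top = math.atan2(x, z - baseline)  # polar angle seen from the top camera
estimated = range_from_angular_disparity(phi_top, phi_top - phi_bottom, baseline)
expected = math.hypot(x, z - baseline)  # true distance from the top camera
```

The range grows quickly as the disparity approaches zero, so small disparity errors on distant points translate into large depth errors.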

Notes

  • Training requires a lot of GPU memory. Make sure you have a GPU with 32 GB of memory or more.
  • For testing, a GTX 1080 Ti (11 GB) is enough for a 512 x 1024 image.
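
Before committing GPU time to a training run, it can help to confirm that the data folders match the layout listed under How to Use. The following is a minimal sketch using only the standard library; `missing_dirs` is a hypothetical helper, not part of this repository, and it only checks that the folders exist, not their contents.

```python
from pathlib import Path

# Sub-folders taken from the directory layout listed in this README.
SPLITS = ("train", "val", "test")
SUBDIRS = ("image_up", "image_down", "disp_up")


def missing_dirs(dataset_root):
    """Return the expected sub-directories that do not exist (hypothetical helper)."""
    root = Path(dataset_root)
    return [
        str(Path(split) / sub)
        for split in SPLITS
        for sub in SUBDIRS
        if not (root / split / sub).is_dir()
    ]


# Report on both datasets, relative to the repository root.
for name in ("MP3D", "SF3D"):
    gaps = missing_dirs(Path("data") / name)
    print(name, "OK" if not gaps else "missing: " + ", ".join(gaps))
```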

Synthetic Results

  • Depth / Error Map

  • Projected PCL

Real-World Results

  • Camera Setting

  • Real World Results

Citation

@article{wang2019360sdnet,
	title={360SD-Net: 360° Stereo Depth Estimation with Learnable Cost Volume},
	author={Ning-Hsu Wang and Bolivar Solarte and Yi-Hsuan Tsai and Wei-Chen Chiu and Min Sun},
	journal={arXiv preprint arXiv:1911.04460},
	year={2019}
}

360sd-net's Issues

Error

tensor.sub_(mean[:, None, None]).div_(std[:, None, None])
RuntimeError: The size of tensor a (4) must match the size of tensor b (3) at non-singleton dimension 0

Could you please help me?
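
One plausible cause, not confirmed in this thread: the input images have four channels (for example RGBA PNGs) while the normalization statistics have three. The sketch below shows the idea with plain nested lists in channel-first layout; `drop_alpha` and `normalize` are hypothetical helpers, not functions from this repository. With Pillow, `Image.open(path).convert('RGB')` would achieve the same channel reduction at load time.

```python
MEAN = [0.485, 0.456, 0.406]  # three-channel ImageNet statistics
STD = [0.229, 0.224, 0.225]


def drop_alpha(chw):
    """Keep only the first three channels of a channel-first image."""
    if len(chw) < 3:
        raise ValueError("expected at least 3 channels")
    return chw[:3]


def normalize(chw, mean=MEAN, std=STD):
    """Per-channel (x - mean) / std; the channel counts must match."""
    if len(chw) != len(mean):
        raise ValueError(
            "channel count %d does not match stats %d" % (len(chw), len(mean))
        )
    return [
        [[(v - m) / s for v in row] for row in channel]
        for channel, m, s in zip(chw, mean, std)
    ]


# A 4 x 1 x 2 RGBA image: normalizing it directly fails, dropping alpha first works.
rgba = [[[0.5, 0.5]], [[0.5, 0.5]], [[0.5, 0.5]], [[1.0, 1.0]]]
rgb = normalize(drop_alpha(rgba))
```

Dropping the alpha channel keeps the pretrained three-channel statistics meaningful, whereas padding the statistics with a fourth value feeds the network a channel it was presumably never trained on.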

About the LCV module

Hi, nice work, and thank you for sharing!
From my perspective, the pixel disparity in an equirectangular projection is proportional to the degree_disparity.
For a 2π × π equirectangular projection map, degree_disparity = i/pi, where i is the disparity in pixel coordinates.
Please correct me if I'm wrong.
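
For reference, a sketch of the usual pixel-to-angle relation, assuming a full equirectangular image whose height H spans π radians (and whose width spans 2π): a vertical offset of i pixels then corresponds to i·π/H radians. Whether this matches the convention used inside the LCV module is not confirmed here, and the function name is hypothetical.

```python
import math


def pixels_to_radians(disparity_px, image_height):
    """Vertical pixel offset -> angle, for an image whose height covers pi radians."""
    if image_height <= 0:
        raise ValueError("image_height must be positive")
    return disparity_px * math.pi / image_height


# One pixel in a 512-row equirectangular image, in radians and in degrees.
one_px_rad = pixels_to_radians(1, 512)
one_px_deg = math.degrees(one_px_rad)
```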

The data contained in 360SD-Net/data are not suitable.

Hello, I downloaded the sample images from MP3D and SF3D provided in this repository, but I found that the up-view and down-view pictures in your folders are almost the same, so there are no differences between them.
The same thing occurs in the realworld folder, too. Please check at your convenience, and let me know if I am wrong.

Best.

PreTrained Model

Can you please provide the pretrained model as I do not have the resources to train this data set?

Learnable Cost Volume model

Can you help me with this?

Epoch: 0%| | 0/500 [00:17<?, ?it/s]
Traceback (most recent call last):
File "main.py", line 348, in <module>
main()
File "main.py", line 286, in main
loss = train(imgU_crop, imgD_crop, disp_crop)
File "main.py", line 196, in train
output1, output2, output3 = model(imgU, imgD)
File "/ssoft/spack/arvine/v1/opt/spack/linux-rhel7-skylake_avx512/gcc-8.4.0/py-torch-1.6.0-43xbre3fdhzp6upz6mfe3jk6rpwt5uky/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/ssoft/spack/arvine/v1/opt/spack/linux-rhel7-skylake_avx512/gcc-8.4.0/py-torch-1.6.0-43xbre3fdhzp6upz6mfe3jk6rpwt5uky/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 155, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/ssoft/spack/arvine/v1/opt/spack/linux-rhel7-skylake_avx512/gcc-8.4.0/py-torch-1.6.0-43xbre3fdhzp6upz6mfe3jk6rpwt5uky/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 165, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/ssoft/spack/arvine/v1/opt/spack/linux-rhel7-skylake_avx512/gcc-8.4.0/py-torch-1.6.0-43xbre3fdhzp6upz6mfe3jk6rpwt5uky/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 85, in parallel_apply
output.reraise()
File "/ssoft/spack/arvine/v1/opt/spack/linux-rhel7-skylake_avx512/gcc-8.4.0/py-torch-1.6.0-43xbre3fdhzp6upz6mfe3jk6rpwt5uky/lib/python3.7/site-packages/torch/_utils.py", line 395, in reraise
raise self.exc_type(msg)
TypeError: Caught TypeError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/ssoft/spack/arvine/v1/opt/spack/linux-rhel7-skylake_avx512/gcc-8.4.0/py-torch-1.6.0-43xbre3fdhzp6upz6mfe3jk6rpwt5uky/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
output = module(*input, **kwargs)
File "/ssoft/spack/arvine/v1/opt/spack/linux-rhel7-skylake_avx512/gcc-8.4.0/py-torch-1.6.0-43xbre3fdhzp6upz6mfe3jk6rpwt5uky/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
result = self.forward(*input, **kwargs)
File "/work/vita/danial/360SD-Net/models/LCV_ours_sub3.py", line 187, in forward
refimg_fea.size()[3]).zero_()).cuda()
TypeError: new(): argument 'size' must be tuple of ints, but found element of type float at pos 3
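
A likely explanation, offered as an assumption rather than a confirmed diagnosis: the README sets up Python 2.7, where `/` on two integers yields an integer, while this traceback comes from Python 3.7, where `/` always yields a float. A size computed with `/` would then be rejected by the tensor constructor. The sketch below uses a hypothetical `cost_volume_shape` helper and made-up sizes to show the difference; it is not copied from LCV_ours_sub3.py.

```python
def cost_volume_shape(batch, channels, maxdisp, height, width):
    """Hypothetical shape computation that stays integral on Python 2 and 3."""
    # Floor division keeps the disparity dimension an int.
    return (batch, channels * 2, maxdisp // 4, height, width)


float_dim = 68 / 4   # Python 3: a float, not accepted as a tensor size
int_dim = 68 // 4    # an int on both Python 2 and Python 3
shape = cost_volume_shape(2, 32, 68, 128, 256)
```

Replacing `/` with `//` in the size expression (or wrapping it in `int(...)`) is the usual remedy when porting such code to Python 3.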

Is CUDA11.2 supported?

I'm trying to run the test script with your model and this command:
python testing.py --datapath data/realworld/ --checkpoint checkpoints/Realworld_checkpoint/checkpoint.tar --real --outfile output/realworld

but I ran into this problem:
[screenshot: Capture 5]

Is CUDA 11.2 supported?

Some questions about reproduction.

Hi,

I am trying to run this code with our own synthetic dataset.

I rendered two equirectangular images 512x1024 with a vertical stereo setup as yours.

However, I got a questionable result.

Below are input_up_image, input_down_image, result_with_MP3D_ckpt, result_with_SF3D_ckpt, and result_with_Real_ckpt.
For the last one, I set args.real = True.

Do you think I am running it incorrectly, or are these the expected results? Please let me know.

For more information, I ran testing.py, and I changed __imagenet_stats in preprocess.py.

It didn't work with the original code:
__imagenet_stats = { 'mean': [0.485, 0.456, 0.406], 'std': [0.229, 0.224, 0.225] }

so I added a fourth value to each list:
__imagenet_stats = { 'mean': [0.485, 0.456, 0.406, 1], 'std': [0.229, 0.224, 0.225, 1] }

and it didn't matter much which number I gave. Should I change these numbers to reproduce your performance?

[images: up, down, MP3D, SF3D, Real]

Problems encountered while downloading the dataset

Hi,
I received this email and then downloaded the dataset from the first link,
[image]
but it looked like this after downloading:
[image]
Did I make a mistake? How do I download the dataset correctly?

Thank you!
