qt-zhu / aa-rmvsnet

[ICCV 2021] Code for AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network.

License: MIT License

Python 97.21% Shell 2.79%

multi-view-stereo 3d-reconstruction computer-vision

aa-rmvsnet's Introduction

AA-RMVSNet

Code for AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network (ICCV 2021) in PyTorch.

paper link: arXiv | CVF

Change Log

Jun 17, 2021: Initialize repo
Jun 27, 2021: Update code
Aug 10, 2021: Update paper link
Oct 14, 2021: Update bibtex
May 23, 2022: Update network architecture & pretrained model

Data Preparation

Download the preprocessed DTU training data (also available at BaiduYun, PW: s2v2).
For other datasets, please follow the practice in Yao Yao's MVSNet repo.
Note that the newly released pretrained models are not compatible with the old codebase. Please update the code as well.

How to run

Install required dependencies:

conda create -n drmvsnet python=3.6
conda activate drmvsnet
conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=10.0 -c pytorch
conda install -c conda-forge py-opencv plyfile tensorboardx

Set the dataset root directories as environment variables in env.sh.
Train AA-RMVSNet on DTU dataset (note that training requires a large amount of GPU memory):
```
./scripts/train_dtu.sh
```
Predict depth maps and fuse them to get point clouds of DTU:
```
./scripts/eval_dtu.sh
./scripts/fusion_dtu.sh
```
Predict depth maps and fuse them to get point clouds of Tanks and Temples:
```
./scripts/eval_tnt.sh
./scripts/fusion_tnt.sh
```
Note: if permission issues are encountered, try chmod +x <script_filename> to allow execution.

Citation

@inproceedings{wei2021aa,
  title={AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network},
  author={Wei, Zizhuang and Zhu, Qingtian and Min, Chen and Chen, Yisong and Wang, Guoping},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={6187--6196},
  year={2021}
}

Acknowledgements

This repository is heavily based on Xiaoyang Guo's PyTorch implementation.

aa-rmvsnet's Issues

Parameters for torch summary

Hello, I want to list all the layers of the AARMVSNet model together with their parameter counts. I tried torch summary, but I got errors related to the input arguments. Could you provide the arguments for summary(model, ........)? I tried many combinations, but none of them worked.

How much memory is required for training

make sure d * interval_scale = 203.52

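Hello, I saw this comment in the training script: make sure d * interval_scale = 203.52. What is the significance of satisfying this condition? I would also like to ask what the interval_scale variable does, as I don't quite understand it. Could you explain? Many thanks.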

Does the benchmarking on Tanks and Temples use the model fine-tuned on the training set of BlendedMVS?

Hi, thank you for your excellent work.
I have a question about the model used for Tanks and Temples. In your paper, you said that

So I wonder whether you also used the model fine-tuned on the BlendedMVS training set for benchmarking on Tanks and Temples. Thank you!

About numdepth and interval_scale

Hello, I'm sorry to bother you. Why are numdepth and interval_scale set to the following values in the code?

d=150

interval_scale=1.35 #make sure d * interval_scale = 203.52

Why do the values stated in the paper differ from those in the code?
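The arithmetic behind the script comment can be checked directly. The following is a minimal sketch, not taken from the repository: it assumes the target 203.52 corresponds to 192 depth planes at an interval scale of 1.06. The 192 figure is mentioned in another issue on this page as the paper's setting; the 1.06 value is an assumption based on the common MVSNet DTU configuration and is not confirmed here. The helper name `required_interval_scale` is hypothetical.

```python
# Target product taken from the comment quoted in the issue.
TARGET_RANGE = 203.52


def required_interval_scale(num_depth: int, target: float = TARGET_RANGE) -> float:
    """Return the interval_scale for which num_depth * interval_scale == target."""
    if num_depth <= 0:
        raise ValueError("num_depth must be positive")
    return target / num_depth


# Assumed paper-style setting: 192 planes at scale 1.06 (1.06 is an assumption).
paper_product = 192 * 1.06
# Script defaults quoted in the issue: d=150, interval_scale=1.35.
script_product = 150 * 1.35
# Scale that would hit the target exactly with 150 planes.
exact_scale = required_interval_scale(150)

print(paper_product, script_product, exact_scale)
```

Under these assumptions, using fewer depth planes with a proportionally larger interval scale covers roughly the same overall depth range. Note that the quoted defaults give 150 × 1.35 = 202.5, which is close to, but not exactly, 203.52.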

Question about the survey paper

Hello, Dr. @QT-Zhu
I have recently been reading your survey paper, "Deep Learning for Multi-view Stereo via Plane Sweep: A Survey".
I find Fig. 2 of the paper visually appealing and easy to understand.
Now I want to produce similar plane-sweep illustrations using my own 3D scene and 2D images. How could I achieve this?
Could you give me some suggestions?
Looking forward to your reply.

Nan while using mvsnet_cls_loss in other framework

Hi, thanks for your great work!
I find the mvsnet_cls_loss useful for improving the representational ability of the classification scheme in MVS networks.
However, when I transfer mvsnet_cls_loss to other CasMVSNet-like frameworks, the loss sometimes becomes NaN during training.
Have you met this problem before? Hoping for your reply~

BTW, does the mvsnet_cls_loss correspond to the finer ground truth in the paper?
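Two common sources of NaN in a masked cross-entropy over depth hypotheses are taking log(0) and dividing by a zero count of valid pixels. Whether either applies to mvsnet_cls_loss is not confirmed in this issue. The sketch below is plain Python for clarity and is not the repository's implementation; `masked_cross_entropy` and `EPS` are hypothetical names, and the clamp value is an assumption.

```python
import math
from typing import Sequence

EPS = 1e-6  # hypothetical clamp value; an assumption, not taken from the repo


def masked_cross_entropy(
    probs: Sequence[Sequence[float]],
    one_hot: Sequence[Sequence[float]],
    mask: Sequence[bool],
) -> float:
    """Mean cross-entropy over valid pixels.

    probs[p][d] and one_hot[p][d] are per-pixel distributions over depth
    planes; mask[p] marks pixels that have valid ground truth.
    """
    total = 0.0
    valid = 0
    for p_row, gt_row, keep in zip(probs, one_hot, mask):
        if not keep:
            continue
        # Clamp probabilities so that log(0) can never occur.
        total -= sum(g * math.log(max(p, EPS)) for p, g in zip(p_row, gt_row))
        valid += 1
    # Guard against an empty mask, which would otherwise divide by zero.
    return total / max(valid, 1)


probs = [[0.0, 1.0], [0.5, 0.5]]
one_hot = [[1.0, 0.0], [0.0, 1.0]]
loss_all_valid = masked_cross_entropy(probs, one_hot, [True, True])
loss_no_valid = masked_cross_entropy(probs, one_hot, [False, False])
print(loss_all_valid, loss_no_valid)
```

In a tensor framework the same two guards would typically be a clamp before the logarithm and a lower bound on the valid-pixel count.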

Dataloader tensor size error

Hello, thank you for your paper and code contributions! I encountered an error while testing a dataset created with colmap2mvsnet.py. Setting the batch size to 1 avoids it, as long as there is not too much data to process.

Traceback (most recent call last):
  File "eval.py", line 125, in <module>
    save_depth()
  File "eval.py", line 87, in save_depth
    for batch_idx, sample in enumerate(TestImgLoader):
  File "/home/west/.conda/envs/d2hc/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 652, in __next__
    data = self._next_data()
  File "/home/west/.conda/envs/d2hc/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1347, in _next_data
    return self._process_data(data)
  File "/home/west/.conda/envs/d2hc/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1373, in _process_data
    data.reraise()
  File "/home/west/.conda/envs/d2hc/lib/python3.7/site-packages/torch/_utils.py", line 461, in reraise
    raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/west/.conda/envs/d2hc/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/west/.conda/envs/d2hc/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 52, in fetch
    return self.collate_fn(data)
  File "/home/west/.conda/envs/d2hc/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 160, in default_collate
    return elem_type({key: default_collate([d[key] for d in batch]) for key in elem})
  File "/home/west/.conda/envs/d2hc/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 160, in <dictcomp>
    return elem_type({key: default_collate([d[key] for d in batch]) for key in elem})
  File "/home/west/.conda/envs/d2hc/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 149, in default_collate
    return default_collate([torch.as_tensor(b) for b in batch])
  File "/home/west/.conda/envs/d2hc/lib/python3.7/site-packages/torch/utils/data/_utils/collate.py", line 141, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: stack expects each tensor to be equal size, but got [512] at entry 0 and [513] at entry 1

However, being limited to a batch size of 1 is a bit inconvenient. Do you know why this happens? All the images in my dataset have the same dimensions, so I don't understand what causes the error.
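The error message says that a one-dimensional field has 512 entries in one sample and 513 in another, so the mismatch is not in the image dimensions. One plausible cause, assumed here and not confirmed by the issue, is that the depth hypotheses are derived from each view's own depth range, so their count varies between samples. The sketch below contrasts that with a fixed-count scheme; both function names are hypothetical and neither is taken from the repository.

```python
from typing import List


def depth_values_by_range(depth_min: float, depth_max: float, interval: float) -> List[float]:
    """Step from depth_min until depth_max is reached; the length depends on the range."""
    values = []
    d = depth_min
    while d < depth_max:
        values.append(d)
        d += interval
    return values


def depth_values_fixed_count(depth_min: float, interval: float, num_depth: int) -> List[float]:
    """Always return exactly num_depth hypotheses, so samples can be stacked."""
    return [depth_min + i * interval for i in range(num_depth)]


# Two hypothetical views whose depth ranges differ by a single interval.
view_a = depth_values_by_range(0.0, 256.0, 0.5)
view_b = depth_values_by_range(0.0, 256.5, 0.5)
print(len(view_a), len(view_b))  # lengths differ, so batching would fail

fixed_a = depth_values_fixed_count(0.0, 0.5, 512)
fixed_b = depth_values_fixed_count(0.0, 0.5, 512)
print(len(fixed_a), len(fixed_b))  # lengths match, so batching is possible
```

If this is the cause, fixing the number of depth hypotheses per sample (or padding them to a common length in a custom collate function) would allow batch sizes above 1.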

Load your existing model to continue training

Hello, thank you for your contribution. I tested your network on the TNT dataset, and the results are very good. The README says "Note that the newly released pretrained models are not compatible with the old codebase. Please update the code as well.", and I have downloaded the latest version of the entire codebase. Why do I still get an error when I load your ckpt?
————————————————————————
Missing key(s) in state_dict:
"module.feature.init_conv.0.0.weight", "module.feature.init_conv.0.0.bias", "module.feature.init_conv.0.1.weight", "module.feature.init_conv.0.1.bias", "module.feature.init_conv.1.0.weight", "module.feature.init_conv.1.0.bias",
......

Unexpected key(s) in state_dict:
"cost_regularization.cell_list.0.conv.weight", "cost_regularization.cell_list.0.conv.bias", "cost_regularization.cell_list.1.conv.weight", "cost_regularization.cell_list.1.conv.bias", "cost_regularization.cell_list.2.conv.weight", "cost_regularization.cell_list.2.conv.bias",
......
————————————————————————
My environment is Ubuntu 20.04, torch 1.12.0, CUDA 11.4, Python 3.7.0, RTX 3090.

Questions about testing on BlendedMVS

Hello, I tested on BlendedMVS using the model you provided. My GPU is a 3090, but it runs out of memory when executing the test script. How did you set the test parameters to get your results?

Question about sample details?

Hello, thank you for your great work and generous contribution. There are two details in the code: (1) in the script (train_dtu.sh), there is the comment # make sure num_depth* interval = 203.45; (2) in the T&T dataset, the depths are reversed. What is the role of these two operations?

Error when running testing

I tried to run eval.sh and got this error with both DTU and TNT:

RuntimeError: Error(s) in loading state_dict for AARMVSNet:
Missing key(s) in state_dict: "feature.init_conv.0.0.weight", "feature.init_conv.0.0.bias",.....
Unexpected key(s) in state_dict: "module.feature.init_conv.0.0.weight"...

When I trained, I changed view_num from 7 to 2 and the batch size to 1 because of memory and time constraints (I'm running on Colab).
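The missing keys here are the unexpected keys without a leading "module.", which is the prefix torch.nn.DataParallel adds to parameter names when a wrapped model is saved. A minimal sketch of renaming the keys in either direction follows; the helper names are hypothetical, and a plain dict stands in for the checkpoint's state_dict. The "Load your existing model to continue training" issue above shows the opposite direction (the model expects "module." keys), although an architecture change could also contribute there.

```python
from typing import Any, Dict

PREFIX = "module."  # prefix added to parameter names by torch.nn.DataParallel


def strip_module_prefix(state_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Remove a leading 'module.' from every key that has one."""
    return {
        (key[len(PREFIX):] if key.startswith(PREFIX) else key): value
        for key, value in state_dict.items()
    }


def add_module_prefix(state_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Add a leading 'module.' to every key that lacks one."""
    return {
        (key if key.startswith(PREFIX) else PREFIX + key): value
        for key, value in state_dict.items()
    }


# Plain dict standing in for a checkpoint's state_dict (values are placeholders).
ckpt = {
    "module.feature.init_conv.0.0.weight": 0,
    "module.feature.init_conv.0.0.bias": 1,
}
stripped = strip_module_prefix(ckpt)
restored = add_module_prefix(stripped)
print(sorted(stripped))
```

The renamed mapping would then be passed to model.load_state_dict. How the checkpoint file nests its weights is not shown in the issue, so that step is left out. Renaming only fixes the prefix mismatch; keys that differ because of a changed architecture would still fail to load.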

The settings of fine-tune on BlendedMVS

Hi! I used the script "train_blend.sh" with the default settings to fine-tune the released pre-trained model on BlendedMVS, hoping to reproduce the results on Tanks and Temples. I have one NVIDIA 3090 (24G), but I ran into an out-of-memory problem. Do I need to change some settings such as "--max_h" and "--max_w"?

About the pretrained model on BlendedMVS

Hello!
I wonder whether the released model has been fine-tuned on the BlendedMVS dataset. If not, could you please provide the fine-tuned model?

Testing on the Tanks and Temples benchmark

I successfully ran eval.sh and generated the "intermediate" point clouds with fusion.sh.
Could you tell me how to submit these results to the Tanks and Temples website? I don't have the *.log files, so I guess I should use the *.log files from the Tanks and Temples benchmark data provided by MVSNet.
I am just trying to upload the data to Tanks and Temples. Can you help me?

Code to obtain ground truth depth maps for network training?

Hi, thanks for your work. Could you please provide a high-resolution depth map like CascadeMVS (https://github.com/alibaba/cascade-stereo)? Or would you consider releasing the code for obtaining the depth map?

tank

Tanks and Temples results

Hello, I see that AA-RMVSNet performs very well on the Tanks and Temples dataset. Could you tell me how to reproduce the results of AA-RMVSNet on the Tanks and Temples leaderboard?

Question about the training cost

Thanks for your great work!

I am retraining AA-RMVSNet on the DTU dataset with the default settings on one V100 32GB GPU.
However, it uses about 17GB with batchsize=1, and batchsize=2 causes an OOM error.

This is strange because in the paper, batchsize=4 costs only 20.16GB. Besides, depth_num is set to 192 in the paper, while it is only 150 in the default settings.

Another issue is that training is very slow. One step takes about 4.6s with batch=1.

Can you offer any advice on this?

the finer DTU ground truth

Could you publish the code used to improve the DTU ground truth? We need 640*512 GT for our experiment, thanks!

NaN value in testing

When running the testing script, I found that NaN values occur in the computation, so the output depth is all zeros. What are the possible reasons? Thanks

Where can I download the pre-trained model

Ground Truth improvement code

Hi,
Thank you for this amazing repo. I am really curious about the improvement of the ground truth depth maps. You mention that it is similar to the attention-aware multi-view stereo paper by Luo et al. 2020, but I am unable to find the code that was used for it. Could you please point me to the relevant code for the GT improvement?

Thanks,

Batch size and timing for single GPU setup

First, let me add my thanks for your excellent work.

With a single GPU setup, I infer that I should set 'batch=1' in 'env.sh'. Is that correct?

Using the default parameters with batch size of 1, computing depth for DTU models takes ~170 seconds per image on a single GPU (nVidia 1060).

Does that seem correct?

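Problem with camera parameter estimation

Sorry to bother you. While using your code for MVSNet depth estimation recently, I ran into a problem that I find hard to solve, and I would like to ask for your advice.
I used COLMAP (https://github.com/colmap/colmap/tree/3.5) to estimate the camera intrinsics and extrinsics, and then used colmap2mvsnet.py to convert the COLMAP results into the MVSNet input format.
The problem is that the intrinsics and extrinsics estimated by COLMAP differ greatly from the ground truth. The figure below shows the COLMAP estimates for DTU input data.

The figure below shows the values provided by DTU.

The values differ greatly.
Moreover, when the estimated intrinsics and extrinsics are used as network input, the output depth estimates are completely wrong. Is there any way to solve this?
Thank you!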

Scripts to train/test/eval on custom data.

Hi, thanks for the great work.
Could you provide a script to train/test/eval on custom data? The README only suggests preparing data in MVSNet format, but doesn't say how to run on it.