xueliancheng / leastereo

Hierarchical Neural Architecture Search for Deep Stereo Matching (NeurIPS 2020)

License: MIT License

Python 95.66% Shell 4.34%

leastereo's Issues

forward pass in layer 1

The line
level6_new = normalized_betas[layer][0][2] * level6_new_1 + normalized_betas[layer][1][2] * level6_new_2
does not seem correct.

I think it should be
level6_new = normalized_betas[layer][0][2] * level6_new_1 + normalized_betas[layer][1][1] * level6_new_2

How to test pictures with 1280*720?

Hi, I want to predict the disparity of pictures with a resolution of 1280*720. I set --crop_height=720 --crop_width=1280, but it fails with an error.
Could you tell me how to fix it? Thanks a lot!

GPU too small

Apparently my GPU is too small (GeForce GTX 1660 Ti with Max-Q Design, 5944MiB)

I added the following at line 24 of utils/multadds_count.py:
torch.cuda.empty_cache()

After this, new memory could be allocated and the code worked (only tried with the KITTI 2015 dataset).

Background model of LEAStereo and other questions

What is the background model you used to build a LEAStereo model?
Also, I would like to know why you used skip connections between nodes 2 and 5 and between nodes 5 and 9.
Finally, is there a reason why you replaced both the feature extraction and cost aggregation parts with NAS?

predict_md.sh (Middlebury 2014 dataset) GPU running out of memory

I'm able to run inference on the KITTI 2015 dataset.
Do you know how I can run prediction on Middlebury 2014 on a single GPU with 24 GB?
It always runs out of memory. Should I downsize the input?

I'm using MiddEval3-data-H (1000 x 1500 images).

Exception has occurred: RuntimeError
CUDA out of memory. Tried to allocate 5.49 GiB (GPU 0; 23.68 GiB total capacity; 16.71 GiB already allocated; 3.46 GiB free; 18.42 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
File "/home/andreaa/dev/stereo_depth/LEAStereo/retrain/skip_model_3d.py", line 47, in forward
s1 = F.interpolate(s1, [feature_size_d, feature_size_h, feature_size_w], mode='trilinear', align_corners=True)
File "/home/andreaa/dev/stereo_depth/LEAStereo/retrain/skip_model_3d.py", line 155, in forward
out10= self.cells[10](out9[0], out9[1])
File "/home/andreaa/dev/stereo_depth/LEAStereo/retrain/LEAStereo.py", line 41, in forward
cost = self.matching(cost)
File "/home/andreaa/dev/stereo_depth/LEAStereo/utils/multadds_count.py", line 21, in comp_multadds
_ = model(input_data, input_data)
File "/home/andreaa/dev/stereo_depth/LEAStereo/predict.py", line 48, in
mult_adds = comp_multadds(model, input_size=(3,opt.crop_height, opt.crop_width)) #(3,192, 192))

Does search.py run on multiple GPUs?

I tried to expose multiple GPUs to the search script, but nvidia-smi shows that the script uses only one of them.

Are KITTI Calibration Files needed? If so, where are they processed in the pipeline

Hi all,

I am attempting to use this model on a custom dataset, and I have not been able to find where if at all the stereo calibration parameters are used. The stereo projection matrices are different of course from KITTI's so I am curious where I need to update within the model to reflect these changes, if at all.

Best,
Eric

Exception search.sh "Exception: No GPU found, please run without --cuda"

In line 1 of search.sh you set
CUDA_VISIBLE_DEVICES=1

Because of this I got the error:
File "search.py", line 33, in
raise Exception("No GPU found, please run without --cuda")
Exception: No GPU found, please run without --cuda

This happened because torch.cuda.is_available() returned False.

But when I executed the code manually, I could not reproduce it, because torch.cuda.is_available() returned True.

After I changed the line to CUDA_VISIBLE_DEVICES=0, it worked (I double-checked with nvidia-smi).

It took me some hours to figure this out. I am new to Linux, CUDA, virtual environments and PyTorch, so maybe this isn't a real issue and is obvious to others.
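
The behavior described above is consistent with how CUDA_VISIBLE_DEVICES works: it lists the physical device indices a process is allowed to see, and on a machine with a single GPU (index 0) the value 1 names a device that does not exist. It would also explain why a manual run without the variable set could not reproduce the error. The sketch below models that rule in plain Python; `visible_gpus` is a hypothetical helper for illustration, not part of this repository or of PyTorch, and it ignores UUID-style entries.

```python
def visible_gpus(env_value, physical_count):
    """Model which physical GPU indices stay visible for a given
    CUDA_VISIBLE_DEVICES value (integer indices only, simplified)."""
    if env_value is None:
        # Variable unset: every physical device is visible.
        return list(range(physical_count))
    visible = []
    for token in env_value.split(","):
        token = token.strip()
        # An invalid entry hides itself and everything listed after it.
        if not token.isdigit() or int(token) >= physical_count:
            break
        visible.append(int(token))
    return visible


# One physical GPU (index 0), as on a typical laptop.
print(visible_gpus("1", 1))  # nothing visible, so a CUDA availability check fails
print(visible_gpus("0", 1))  # the only GPU is visible
```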

Required Operations

Some information is missing from the paper.

You report the number of parameters of the network, but could you report the number of estimated floating point operations of the inference network?

Python 3.8 conflicts with conda install opencv. How do you resolve it?

Three pixel error benchmark code

Thank you for your great work.

I have a question about the three-pixel error benchmark in your code. Can you explain why disp_true is compared against 1? Should the threshold 1 be replaced with 3? And why do you also consider the case disp_true < true_disp*0.05?

correct = (disp_true[index[0][:], index[1][:], index[2][:]] < 1)|(disp_true[index[0][:], index[1][:], index[2][:]] < true_disp[index[0][:], index[1][:], index[2][:]]*0.05)

This is line 145 of train.py.
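
For reference, the KITTI 2015 D1 metric counts a pixel as correct when its disparity error is below 3 px or below 5% of the ground-truth disparity. The quoted line has the same shape if `disp_true` holds the absolute error at that point, which is an assumption about `train.py` that this excerpt does not confirm; with the constant 1 it would be a stricter 1 px variant. A minimal NumPy sketch of the standard rule, using hypothetical names (`d1_error`, `pred`, `gt`):

```python
import numpy as np


def d1_error(pred, gt, abs_thresh=3.0, rel_thresh=0.05):
    """Fraction of valid pixels whose error is >= abs_thresh px
    AND >= rel_thresh of the ground-truth disparity."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = gt > 0  # assumption: 0 marks pixels without ground truth
    err = np.abs(pred[valid] - gt[valid])
    correct = (err < abs_thresh) | (err < rel_thresh * gt[valid])
    return 1.0 - correct.mean()


gt = np.array([[10.0, 100.0, 50.0, 0.0]])
pred = np.array([[12.0, 104.0, 60.0, 7.0]])
# Errors on valid pixels: 2 px (ok), 4 px (ok, below 5% of 100), 10 px (outlier).
print(d1_error(pred, gt))
```

Passing `abs_thresh=1.0` gives the variant with the constant used in the quoted line.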

GPU requirements

What are the requirements for the GPU? The paper mentions an NVIDIA V100 GPU for the training step, but the comments here mention an RTX 8000 48GB and a GeForce 3090.

Also, is there any trained model for use?

Training logs and intermediate checkpoints

Hello. Thank you for this wonderful work. Would it be possible to provide training logs and intermediate checkpoints? Because when I try to reproduce the results, I am not sure how well the training is going until it is complete.

Trained model

Hello,
Does anyone have the trained model ready? I wanted to do some testing and I would be happy to use the finished model.

How to enable multiple GPU training?

Dear authors,

I am very impressed by your excellent work!
We are trying to reproduce the experimental results with your code. We have a server equipped with 4 V100 GPUs and would like to enable multi-GPU training. Could you share some information on how to do that?
Thank you very much!

Best regards,
Qiang Wang

pre-trained weights for Scene Flow

Hi Xuelian, I am wondering if you could provide the pre-trained weights for Scene Flow? Thanks!

search.sh has error

After preparing the environment according to the instructions, I ran "sh search.sh" as the first step of Architecture Search and got the following error.

Traceback (most recent call last):
  File "/home/test/anaconda3/envs/leastereo/lib/python3.8/site-packages/PIL/Image.py", line 2813, in fromarray
    mode, rawmode = _fromarray_typemap[typekey]
KeyError: ((1, 1, 192), '|u1')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "search.py", line 303, in <module>
    trainer.training(epoch)
  File "search.py", line 211, in training
    self.summary.visualize_image_stereo(self.writer, input1, target, output, global_step)
  File "/home/test/test/LEAstereo/utils/summaries.py", line 56, in visualize_image_stereo
    writer.add_image('Predicted disparity', pr_image, global_step)
  File "/home/test/anaconda3/envs/leastereo/lib/python3.8/site-packages/tensorboardX/writer.py", line 667, in add_image
    image(tag, img_tensor, dataformats=dataformats), global_step, walltime)
  File "/home/test/anaconda3/envs/leastereo/lib/python3.8/site-packages/tensorboardX/summary.py", line 288, in image
    image = make_image(tensor, rescale=rescale)
  File "/home/test/anaconda3/envs/leastereo/lib/python3.8/site-packages/tensorboardX/summary.py", line 330, in make_image
    image = Image.fromarray(tensor)
  File "/home/test/anaconda3/envs/leastereo/lib/python3.8/site-packages/PIL/Image.py", line 2815, in fromarray
    raise TypeError("Cannot handle this data type: %s, %s" % typekey) from e
TypeError: Cannot handle this data type: (1, 1, 192), |u1

Do you have any idea?

missing train_sf.sh

Retrain cannot get the same validation error

Hi Xuelian, I just ran train_kitti15.sh on my server without changing any setting (batch size 4 on 1 GPU started from the given weights on SceneFlow), however, I cannot get similar results on the validation set as the given weights (3 px err 5.97% vs 1.88%). I am wondering if this is the same setting as the one that gives the provided weights? Thanks!

RuntimeError: CUDA out of memory.

I ran into the following error in models/operations_3d.py line 44 when I tried to predict depth on the Middlebury dataset by running "sh predict_md.sh".

RuntimeError: CUDA out of memory. Tried to allocate 2.75 GiB (GPU 0; 14.73 GiB total capacity; 11.05 GiB already allocated; 1.64 GiB free; 12.24 GiB reserved in total by PyTorch)

the configuration of my system is as follows:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla T4            On   | 00000000:D8:00.0 Off |                    0 |
| N/A   29C    P8    10W /  70W |     10MiB / 15079MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
I've tried some solutions such as "torch.cuda.empty_cache()" and "with torch.no_grad()", but they don't work.
I have CUDA 10.1 rather than the 10.2 the author recommended, because my NVIDIA driver version 410.104 doesn't support CUDA 10.2. I don't know whether this is the problem.
Does anyone know what the problem is? Does this mean I have to upgrade to a GPU with more memory?

another bug in feature aggregation

Hi Xuelian,

I think there is a bug in

LEAStereo/models/build_model_2d.py

Line 261 in 424b2b1

level6_new = normalized_betas[layer][0][2] * level6_new_1 + normalized_betas[layer][1][2] * level6_new_2

should be:
level6_new = normalized_betas[layer][0][2] * level6_new_1 + normalized_betas[layer][1][1] * level6_new_2
I created a pull request.

Cannot use code to train model obtained by NAS

Hi!
I'm using your code to find the best model on SceneFlow. I need lower latency, so I reduced the search parameters to 4 for the feature net and 8 for the matching net. NAS training worked well (despite an OOM at epoch 11). I decoded the model (there were also problems because of DataParallel saving) and tried to retrain it, but the script crashed on this line:
https://github.com/XuelianCheng/LEAStereo/blob/master/retrain/skip_model_3d.py#L148
because of a mismatch in the number of channels. As far as I understand, this is not the only such place; the code is hardcoded to use 12 cells.
How can I adapt this code to use different parameters?

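About the KITTI evaluation code

Hello, please forgive me for asking in Chinese (my English is too poor).
1. For evaluation on the SceneFlow dataset, EPE (that is, MAE) is commonly used as the metric, and the evaluation function can be implemented in one's own code.
2. For the KITTI 2012 dataset, the metrics are the error rate and number of erroneous pixels for >2px, >3px, >4px and >5px on Noc and Occ (All), plus Mean Error. Do these functions need to be implemented in one's own code, or do results have to be submitted to the official KITTI website to generate the evaluation?
3. For the KITTI 2015 dataset, the metrics are the D1-bg, D1-fg and D1-all error rates on All (Occ) and Noc. Do I need to implement the evaluation functions myself, or must the results be submitted to the KITTI website?
4. For KITTI 2012, all the metrics can be implemented in one's own code, but for KITTI 2015 I cannot implement the evaluation myself. How are D1-bg, D1-fg and D1-all implemented?
5. Also, when publishing a paper, must the KITTI 2012 and 2015 experimental numbers come from the official KITTI website?
I am still quite confused about these questions. The KITTI website seems to say that it must not be used for debugging, that each person can submit only once within a given period, and that multiple accounts are not allowed.
So I hope the author can find some time to advise on these evaluation questions. Many thanks in advance from a beginner!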

Reproduce results on the SceneFlow dataset

Hi,

I am wondering how to reproduce the SceneFlow results. I followed the author's scripts, which train the searched architecture for 20 epochs with a fixed learning rate of 0.001. The results were far from the numbers reported in the paper (1.01 EPE versus 0.78 EPE). Is there a learning rate schedule that reproduces the SceneFlow results? Also, is any checkpoint available? The SceneFlow checkpoint seems to be missing.

Thanks!

UnboundLocalError: local variable 'fea' referenced before assignment

After changing the crop size, I get this error: UnboundLocalError: local variable 'fea' referenced before assignment

Something wrong with new_model_2d.py

Hi Xuelian,

When I try to use the MiddEval3 model, I get this error.

Traceback (most recent call last):
  File "/home/jgerhards/LEAStereo-master/predic2.py", line 31, in <module>
    mult_adds = comp_multadds(model, input_size=(3,270, 1920)) #(3,192, 192))
  File "/home/jgerhards/LEAStereo-master/utils/multadds_count.py", line 21, in comp_multadds
    _ = model(input_data, input_data)
  File "/home/jgerhards/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jgerhards/LEAStereo-master/retrain/LEAStereo.py", line 28, in forward
    x = self.feature(x)
  File "/home/jgerhards/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jgerhards/LEAStereo-master/retrain/new_model_2d.py", line 165, in forward
    return fea
UnboundLocalError: local variable 'fea' referenced before assignment

predicted disparity issue

How to use the code on my own dataset?

Hi, author!

Nice paper and nice code!

I want to use your code on my own dataset. How can I do that?

Re-train learning rate and epoch schedule

I'm trying to reproduce your results.
I see that none of the train_*.sh scripts set a learning rate, so obtain_train_args() will use the default of 0.001.
However, in all the released checkpoints I see:

checkpoint['optimizer']['param_groups'][0]['initial_lr']
0.0001
checkpoint['optimizer']['param_groups'][0]['lr']
1.25e-05

Is this just a small oversight? Might there be any other differences?
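
One observation that may help narrow this down: the two values differ by exactly a factor of 8, which is what three halvings of 0.0001 would give. Whether the released checkpoints were actually trained with a halving schedule is a guess and is not stated anywhere in this issue. The arithmetic:

```python
import math

initial_lr = 1e-4   # checkpoint['optimizer']['param_groups'][0]['initial_lr']
final_lr = 1.25e-5  # checkpoint['optimizer']['param_groups'][0]['lr']

ratio = initial_lr / final_lr
# Number of x0.5 decay steps that would explain the ratio (hypothetical schedule).
halvings = math.log2(ratio)

print(ratio, halvings)
```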

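Questions about evaluation metrics

Hello, could you please provide the evaluation code for the SceneFlow test set and the KITTI 2015 validation set? I would be very grateful.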
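
The repository's own evaluation code is not shown in this issue. As a stand-in, end-point error (EPE) on SceneFlow is commonly computed as the mean absolute disparity difference over valid pixels. A minimal sketch, assuming that ground truth at or above a maximum disparity is ignored (the value 192 and the name `epe` are illustrative assumptions):

```python
import numpy as np


def epe(pred, gt, max_disp=192):
    """Mean absolute disparity error over pixels with 0 < gt < max_disp."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = (gt > 0) & (gt < max_disp)
    return np.abs(pred[valid] - gt[valid]).mean()


gt = np.array([[10.0, 20.0, 300.0]])
pred = np.array([[11.0, 18.0, 250.0]])
# Averages the errors 1 and 2; the pixel with gt = 300 is masked out.
print(epe(pred, gt))
```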

RuntimeError: Error(s) in loading state_dict for LEAStereo

Hi. Thanks for your great project.
I followed your guide to run prediction on KITTI2012 dataset. However, when I run sh predict_kitti12.sh, I got this error message:

Traceback (most recent call last):
  File "predict.py", line 58, in <module>
    model.load_state_dict(checkpoint['state_dict'], strict=True)      
  File "/home/tungngo/anaconda3/envs/leastereo_11/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for LEAStereo:
	Missing key(s) in state_dict: "feature.cells.0.pre_preprocess.conv.weight",

and a very long list of other keys. Do you know how to solve this problem?
My conda environment: Pytorch 1.7.0; CUDA 11.0
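
A frequent cause of this kind of mismatch, though not confirmed for this checkpoint, is that the weights were saved from a model wrapped in `torch.nn.DataParallel`, so every key carries a `module.` prefix that an unwrapped model does not expect. Comparing a few keys from `checkpoint['state_dict']` with the missing ones shows quickly whether that is the case. The sketch below strips the prefix from a plain dict; `strip_module_prefix` is a hypothetical helper, and the second key is made up for illustration.

```python
from collections import OrderedDict


def strip_module_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading prefix removed from each key."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Stand-in for checkpoint['state_dict']; real values would be tensors.
saved = OrderedDict([
    ("module.feature.cells.0.pre_preprocess.conv.weight", 0),
    ("module.matching.example.weight", 1),  # made-up key
])
print(list(strip_module_prefix(saved)))
```

If the prefix turns out to be the problem, the cleaned dict can be passed to `model.load_state_dict(...)` in place of the original.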

Input size for custom dataset

Hi,

Thanks for publishing the code of this amazing work.

I would like to know how to feed custom-sized data to the network. If I feed SceneFlow data I have to use a crop of 576x960, and for KITTI data I need 384x1248. If I want to feed custom data, how should I pad the inputs to run the network? Usually the input shape has to be a multiple of 64 or 32, but I was not able to figure out what size LEAStereo needs.

Thanks in advance,
Sergio
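
The exact divisibility requirement is not stated here, so the sketch below takes it as a parameter. As a data point, every crop size quoted in these issues (576x960, 384x1248, 1008x1512) is a multiple of 24, but treating 24 as the required factor is an assumption. `pad_to_multiple` is a hypothetical helper that pads on the top and right with edge replication and returns the padding so the disparity map can be cropped back afterwards.

```python
import numpy as np


def pad_to_multiple(image, multiple):
    """Pad an H x W x C image on the top and right so that H and W
    become multiples of `multiple`."""
    h, w = image.shape[:2]
    pad_h = (-h) % multiple
    pad_w = (-w) % multiple
    padded = np.pad(image, ((pad_h, 0), (0, pad_w), (0, 0)), mode="edge")
    return padded, (pad_h, pad_w)


image = np.zeros((720, 1280, 3), dtype=np.uint8)
padded, (pad_h, pad_w) = pad_to_multiple(image, 24)
print(padded.shape, pad_h, pad_w)

# Undo the padding on a result of the same spatial size.
restored = padded[pad_h:, :padded.shape[1] - pad_w]
```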

How to get real depth?

First of all, thank you for making this wonderful project available.

I'm planning to use this algorithm in my thesis, for which I need real depth that is consistent with the camera poses from the same dataset (I use KITTI Stereo 2015). I've tried to estimate depth from the LEAStereo disparity using the formula:
depth = fx * baseline / disparity

However when I combine the resulting depth with camera extrinsics to reproject pixels from one frame to another (for example from left frame at time t to left frame at time t+1), the results aren't meaningful (some pixel values turn out to be -40000 which is way beyond image size).

I'd like to know the correct formula to get real depth from the output disparity. I'd appreciate any help or insights. Thank you!
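
The formula itself is the standard one for a rectified stereo pair. Values like the ones described often come from pixels where the disparity is zero or invalid, or from a disparity file that is stored scaled: KITTI ground-truth disparity PNGs are uint16 values equal to disparity times 256, and whether a given prediction file uses the same scaling should be checked. A minimal sketch that masks invalid pixels follows; the focal length and baseline are illustrative placeholders, and the real values should come from the sequence's calibration file.

```python
import numpy as np


def disparity_to_depth(disparity, fx, baseline, min_disp=1e-3):
    """Convert disparity in pixels to depth in the units of `baseline`.
    Pixels with disparity at or below min_disp are returned as NaN."""
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full(disparity.shape, np.nan)
    valid = disparity > min_disp
    depth[valid] = fx * baseline / disparity[valid]
    return depth


# If the map was read from a 16-bit KITTI-style PNG, divide by 256 first.
raw_png = np.array([[2560, 0, 12800]], dtype=np.uint16)
disparity = raw_png.astype(np.float64) / 256.0  # 10, 0 and 50 px

fx, baseline = 720.0, 0.54  # placeholder values, roughly KITTI-like
depth = disparity_to_depth(disparity, fx, baseline)
print(depth)
```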

About Table 3 in your paper

Dear authors,
How can I obtain the numbers in Table 3 of your paper other than by submitting to the Middlebury 2014 website? Where are the test ground-truth images? I want to compare against the ground truth and analyze the error. How can I do that? Can you help me?
Thanks!

small bug in beta normalization?

Hi,

Nice work.
I noticed what looks like a bug in the beta normalization in AutoFeature, though I'm not sure I'm correct about it.
I created a pull request to fix it.

about time

Hello, I tested your code on a 1248*720 image and it takes only 54 ms on a 2080 Ti, but in your paper the reported time at KITTI resolution is about 300 ms. Did you only time the first batch? The first batch usually takes longer. Note that I did not modify your model code.
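
When comparing timings like these, two details matter: excluding warm-up iterations, and making sure the GPU has actually finished before the clock is read, since CUDA kernels are launched asynchronously. The sketch below is a generic harness built around a hypothetical `time_call` helper; on a GPU one would pass `torch.cuda.synchronize` as `sync`. It does not by itself explain the gap to the paper's figure, which may also depend on hardware and on what was included in the measurement.

```python
import time


def time_call(fn, warmup=3, iters=10, sync=None):
    """Average wall-clock seconds per call of fn(), after warm-up runs.
    `sync` is an optional callable that blocks until pending work is done."""
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if sync is not None:
        sync()  # wait for queued work before stopping the clock
    return (time.perf_counter() - start) / iters


# CPU stand-in for a model forward pass.
avg = time_call(lambda: sum(range(100_000)))
print(f"{avg * 1e3:.3f} ms per call")
```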

CUDA Out Of Memory

Hi,
In my case I tried to validate the Middlebury 2014 (Quarter) model using the ./predict_md.sh script. I tried with RTX 3070 and 1660 Ti GPUs, and every time I get a CUDA OOM error.
Any idea why this happens?
By the way, because of the apex size, I have to use the following versions (inside the conda env):
python=3.7
pytorch=1.7.1 torchvision=0.8.2 cudatoolkit=11.0
and my nvcc -V returns 11.0

Error:

(leastereo) alisvndk@alisvndk-G3-3500:~/Desktop/LEA/LEAStereo(master)$ ./predict_md.sh 
Namespace(cell_arch_fea='run/sceneflow/best/architecture/feature_genotype.npy', cell_arch_mat='run/sceneflow/best/architecture/matching_genotype.npy', crop_height=1008, crop_width=1512, cuda=True, data_path='./dataset/MiddEval3/testQ/', fea_block_multiplier=4, fea_filter_multiplier=8, fea_num_layers=6, fea_step=3, kitti2012=0, kitti2015=0, mat_block_multiplier=4, mat_filter_multiplier=8, mat_num_layers=12, mat_step=3, maxdisp=408, middlebury=1, net_arch_fea='run/sceneflow/best/architecture/feature_network_path.npy', net_arch_mat='run/sceneflow/best/architecture/matching_network_path.npy', resume='./run/MiddEval3/best/best.pth', save_path='./predict/middlebury/images/', sceneflow=0, test_list='./dataloaders/lists/middeval3_test.list')
===> Building LEAStereo model
Feature network path:[1 0 1 0 0 0]
Matching network path:[1 1 2 2 1 2 2 2 1 1 0 1] 

Total Params = 1.81MB
Feature Net Params = 0.10MB
Matching Net Params = 1.71MB
Traceback (most recent call last):
  File "predict.py", line 48, in <module>
    mult_adds = comp_multadds(model, input_size=(3,opt.crop_height, opt.crop_width)) #(3,192, 192))
  File "/home/alisvndk/Desktop/LEA/LEAStereo/utils/multadds_count.py", line 21, in comp_multadds
    _ = model(input_data, input_data)
  File "/home/alisvndk/anaconda3/envs/leastereo/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/alisvndk/Desktop/LEA/LEAStereo/retrain/LEAStereo.py", line 32, in forward
    cost = x.new().resize_(x.size()[0], x.size()[1]*2, int(self.maxdisp/3),  x.size()[2],  x.size()[3]).zero_() 
RuntimeError: CUDA out of memory. Tried to allocate 5.49 GiB (GPU 0; 5.81 GiB total capacity; 66.60 MiB already allocated; 285.50 MiB free; 4.00 GiB reserved in total by PyTorch)

Could you please give me a hand with this problem?
Thanks.
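
The failing allocation is the cost volume built in `LEAStereo.py`, whose shape is batch x 2C x maxdisp/3 x H' x W'. The reported 5.49 GiB can be reproduced from the arguments in the log if one assumes 32 feature channels and float32 feature maps at one third of the crop size; both are assumptions chosen because they match the number, not values read from the code. The same arithmetic shows how much a smaller crop or maxdisp would save:

```python
def cost_volume_gib(crop_h, crop_w, maxdisp, feat_channels=32, scale=3,
                    batch=1, bytes_per_elem=4):
    """Size in GiB of a (batch, 2*C, maxdisp/scale, H/scale, W/scale) tensor.
    feat_channels=32 and scale=3 are assumptions, see the text above."""
    elems = (batch * 2 * feat_channels * int(maxdisp / scale)
             * (crop_h // scale) * (crop_w // scale))
    return elems * bytes_per_elem / 2**30


# Arguments from the Namespace printed by predict_md.sh.
print(round(cost_volume_gib(1008, 1512, 408), 2))
# Half the crop size and half the disparity range, for comparison.
print(round(cost_volume_gib(504, 756, 204), 2))
```

This covers only the first large tensor. The 3D matching network allocates further volumes on top of it, so peak usage is higher than this figure.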