svip-lab / planedepth

[CVPR2023] This is an official implementation for "PlaneDepth: Self-supervised Depth Estimation via Orthogonal Planes".

License: Other

Python 99.77% Shell 0.23%
cvpr2023 depth-estimation self-supervised-learning

planedepth's People

Contributors

dwawayu

planedepth's Issues

`eval.sh` crashes because of missing files

Hello, I'm trying to run eval.sh with the pretrained "stage1" models provided in this repo on the KITTI dataset, which I downloaded and extracted as described in readme.md.

I downloaded the pretrained models and saved them to a new folder named "ckpt" inside the PlaneDepth repo folder.

The eval.sh file I run:

CUDA_VISIBLE_DEVICES=0 python evaluate_depth_HR.py \
--eval_stereo \
--load_weights_folder ./ckpt \
--models_to_load encoder depth \
--use_denseaspp \
--plane_residual \
--use_mixture_loss \
--batch_size 1 \
--width 1280 \
--height 384

console output is:

>> /usr/bin/zsh /home/.../projects/PlaneDepth/eval.sh
-> Loading weights from ./ckpt
use 49 xy planes, 14 xz planes and 0 yz planes.
use DenseAspp Block
use mixture Lap loss
use plane residual
-> Computing predictions with size 1280x384
0.6184294
Traceback (most recent call last):
  File "/home/.../projects/PlaneDepth/evaluate_depth_HR.py", line 287, in <module>
    evaluate(options.parse())
  File "/home/.../projects/PlaneDepth/evaluate_depth_HR.py", line 216, in evaluate
    gt_depths = np.load(gt_path, fix_imports=True, encoding='latin1', allow_pickle=True)["data"]
  File "/home/.../miniconda3/envs/planedepth/lib/python3.9/site-packages/numpy/lib/npyio.py", line 417, in load
    fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: '/home/.../projects/PlaneDepth/./splits/eigen_raw/gt_depths.npz'

I saw that this file is created by "PlaneDepth/splits/eigen_improved/prepare_groundtruth.py", but only for the "eigen_improved" split, and line 23 contains a hard-coded path that does not apply to my setup: GT_path = os.path.join("/public/home/wangry3/datasets/kitti_depth/train/", line[0][11:], "proj_depth/groundtruth/image_02", line[1]+".png")

The dataset structure it expects also seems to differ from the one produced by following this repo and "kitti_archives_to_download.txt".

Can someone please explain what I am missing? Thank you 🙏
My repository tree:

PlaneDepth # cloned from repo
├── ckpt
│   ├── depth.pth # from the pre-trained section of the PlaneDepth repo
│   └── encoder.pth # from the pre-trained section of the PlaneDepth repo
├── kitti # downloaded and extracted as described in the PlaneDepth repo
│   ├── 2011_09_26
│   │   ├── 2011_09_26_drive_0001_sync
│   │   ├── 2011_09_26_drive_0002_sync
│   │   ├── 2011_09_26_drive_0005_sync
...
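
A pre-flight check can surface this kind of missing-file failure before the model is loaded. The sketch below is only an illustration: the function name `missing_eval_inputs` and its defaults are hypothetical, and the expected paths are taken from the traceback and the tree above (`splits/eigen_raw/gt_depths.npz`, `ckpt/encoder.pth`, `ckpt/depth.pth`), not from the repository's own code.

```python
import os


def missing_eval_inputs(repo_root, split="eigen_raw",
                        weights_folder="ckpt", models=("encoder", "depth")):
    """Return the files an evaluation run would need but that do not exist.

    Hypothetical helper: the path layout is an assumption based on the
    traceback in this issue, not on the repository's documented interface.
    """
    # Ground-truth depths, at the location named in the FileNotFoundError.
    expected = [os.path.join(repo_root, "splits", split, "gt_depths.npz")]
    # One checkpoint file per name passed to --models_to_load.
    expected += [os.path.join(repo_root, weights_folder, name + ".pth")
                 for name in models]
    return [path for path in expected if not os.path.isfile(path)]


if __name__ == "__main__":
    for path in missing_eval_inputs("."):
        print("missing:", path)
```

Run from the repository root, this lists every expected input that is absent, so a missing ground-truth file shows up immediately rather than after inference has finished.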

GPU and efficiency

Hi @Dwawayu, thank you for sharing the code!
Could you please tell me which GPU(s) were used for this project, and how many?
It would also be helpful to know the training and inference runtimes.

I'm looking forward to your reply!

Some questions about the training process

Dear @Dwawayu,
Thank you for your amazing work!

  1. I'm trying to retrain your model with only one RTX 3090. For the first stage I use: CUDA_VISIBLE_DEVICES=0 python -u train.py --png --model_name ** --use_denseaspp --use_mixture_loss --plane_residual --flip_right, and I deleted all the code of the form dist.get_rank() == 0: ... . Are there any other details I need to pay attention to?
  2. I would like to see what your training output looks like. I've noticed that my training loss is often negative, and after about 20 epochs the validation abs_rel is around 0.11. Is this normal?
  3. I noticed that here you record the best model. But for HR fine-tuning you use --load_weights_folder ./log/ResNet/exp1/last_models. Why don't you use the best model?
  4. In the paper:
    (image)
    I'm wondering whether all of your results come from the 50th epoch (last_model) or from best_model. And all three lines use HR fine-tuning, right?
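
On point 2, a negative loss is not by itself a sign of a bug when the loss is a negative log-likelihood of a continuous density. The sketch below assumes a plain Laplace density with location `mu` and scale `b`; the training log prints "use mixture Lap loss", but the exact form of the repository's loss is not shown in this thread, so this is an illustration of the general effect rather than the authors' formula.

```python
import math


def laplace_nll(x, mu, b):
    """Negative log-likelihood of x under Laplace(mu, b).

    The density is exp(-|x - mu| / b) / (2 * b), so the NLL is
    log(2 * b) + |x - mu| / b. It is negative whenever the density
    at x exceeds 1, which happens for small scales b.
    """
    return math.log(2.0 * b) + abs(x - mu) / b


if __name__ == "__main__":
    # Perfect prediction with a confident (small) scale: the NLL is below zero.
    print(laplace_nll(0.5, 0.5, 0.05))
    # Same prediction with a wide scale: the NLL is above zero.
    print(laplace_nll(0.5, 0.5, 1.0))
```

Under this assumption the sign of the loss carries little information; the trend of the validation metrics is the more useful signal.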

Thank you for your time and help.

"train.py" crashes when using the flag `--use_mixture_loss`

I run train.py as follows:

CUDA_VISIBLE_DEVICES=0 torchrun  train.py \
--png \
--model_name exp1 \
--use_denseaspp \
--plane_residual \
--flip_right \
--use_mixture_loss

and I get:

>> CUDA_VISIBLE_DEVICES=0 torchrun  train.py \
--png \
--model_name exp1 \
--use_denseaspp \
--plane_residual \
--flip_right \
--use_mixture_loss \

./trainer_1stage.py not exist!
copy ./networks/depth_decoder.py -> ./log/ResNet/exp1/depth_decoder.py
copy ./train_ResNet.sh -> ./log/ResNet/exp1/train_ResNet.sh
train ResNet
use 49 xy planes, 14 xz planes and 0 yz planes.
use DenseAspp Block
use mixture Lap loss
use plane residual
Training model named:
   exp1
Models and tensorboard events files are saved to:
   ./log/ResNet
Training is using:
   cuda
Using split:
   eigen_full_left
There are 22600 training items and 1776 validation items

Training
[W reducer.cpp:1303] Warning: find_unused_parameters=True was specified in DDP constructor, but did not find any unused parameters in the forward pass. This flag results in an extra traversal of the autograd graph every iteration,  which can adversely affect performance. If your model indeed never has any unused parameters in the forward pass, consider turning this flag off. Note that this warning may be a false positive if your model has flow control causing later iterations to have unused parameters. (function operator())
Traceback (most recent call last):
  File "/home/bar/projects/PlaneDepth/train.py", line 21, in <module>
    trainer.train()
  File "/home/bar/projects/PlaneDepth/trainer.py", line 248, in train
    self.run_epoch()
  File "/home/bar/projects/PlaneDepth/trainer.py", line 300, in run_epoch
    losses["loss/total_loss"].backward()
  File "/home/bar/miniconda3/envs/planedepth/lib/python3.9/site-packages/torch/_tensor.py", line 307, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/bar/miniconda3/envs/planedepth/lib/python3.9/site-packages/torch/autograd/__init__.py", line 154, in backward
    Variable._execution_engine.run_backward(
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 27588) of binary: /home/bar/miniconda3/envs/planedepth/bin/python
Traceback (most recent call last):
  File "/home/bar/miniconda3/envs/planedepth/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==1.10.1', 'console_scripts', 'torchrun')())
  File "/home/bar/miniconda3/envs/planedepth/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
    return f(*args, **kwargs)
  File "/home/bar/miniconda3/envs/planedepth/lib/python3.9/site-packages/torch/distributed/run.py", line 719, in main
    run(args)
  File "/home/bar/miniconda3/envs/planedepth/lib/python3.9/site-packages/torch/distributed/run.py", line 710, in run
    elastic_launch(
  File "/home/bar/miniconda3/envs/planedepth/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/bar/miniconda3/envs/planedepth/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 259, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
train.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2023-04-24_18:06:29
  host      : clikaws105
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 27588)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

Every other "train.py" configuration I have tried without --use_mixture_loss runs smoothly.
For example, the command below runs well.

CUDA_VISIBLE_DEVICES=0 torchrun  train.py \
--png \
--model_name exp1 \
--use_denseaspp \
--plane_residual \
--flip_right

Can anyone please help me fix this?

About the validation code

Thanks for your amazing work!

Could you tell me why this is used in the compute_depth_losses part of the code?

I can't find the same code anywhere in evaluateHR.py. But don't they do the same thing, namely compute the loss metrics?

How to evaluate a single jpg image?

Hi, congratulations on publishing such an excellent paper.

When I tried to test a single image, I ran into problems with image preprocessing and with the weight files not matching the pretrained model.
Thank you very much for your time and support. I look forward to your guidance.

Inquiry about the HR images' resolution and how they are processed

Dear authors,

I am currently following your fantastic work and noticed that your paper mentions fine-tuning the network on high-resolution images of 1280x384. However, in the KITTI dataset, the maximum resolution in image_02 is 1242x375. I would therefore like to ask whether the high-resolution images were cropped or resized to reach 1280x384. If they were cropped, could you please describe how the cropping was performed? If they were resized, could you please clarify the resizing method? Thank you for your time and consideration.

Sincerely!

Monocular results

Hi, thanks for your impressive work!
I found that Table 2 has no results for the monocular training setting (only stereo and monocular plus stereo).
Did you run experiments in the monocular setting?

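Network collapse after adding a new loss

Thank you for your outstanding contribution!
I am trying to add an extra loss to help train the model, but training collapses and shows very large fluctuations. Because of how your loss is set up, the logged loss shows no clear downward trend, so it is hard for me to find the cause. I would like to ask:
Given the special case that your ph_loss is negative, is there anything I need to pay particular attention to when adding a new loss?
For the loss weights, a value such as 0.04 presumably came out of some experimentation, so is the model very sensitive to the loss weights? Is there a good way to choose the weight for a new loss?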

Possible licensing conflict

Dear authors,

I appreciate your great work and I don't mean to be the party pooper, but your codebase seems to be built upon monodepth2, which has a very restrictive license. See 3. Redistribution and modifications.

I don't think you can just use their code and re-license it under the MIT License.

I'm in no way related to Niantic. Just a fellow researcher.

Best regards,
Zeeshan Khan Suri

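Question about the probability distribution

(image) Hello, and thank you for your work. I have a question from reading the paper. D_i is a discrete disparity or depth value, and p_i is a discrete probability value. Intuitively, p_i should follow some convex (single-peaked) distribution, so why don't you fit p_i directly with a single Laplace distribution instead of obtaining it from a mixture of many Laplace distributions? If my question is fundamentally mistaken, please tell me. Thanks.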

Can't find a suitable pretrained model

Thanks for your impressive work!
When I tested on a custom model, the code calls "load_state_dict(torch.load(os.path.join(opt.load_weights_folder, "plade.pth")))", but no pretrained model file seems to be named "plade.pth". Could you please tell me which model matches PladeNet? Thank you!

Unclear requirements.txt file

Hello, I had a hard time with requirements.txt because it contains some typos. May I suggest this:

pip install ipython==8.11.0
pip install matplotlib==3.5.0
pip install numpy==1.21.2
pip install opencv-python==4.5.5.62
pip install Pillow==9.4.0
pip install six==1.16.0
pip install tensorboardX==2.6
pip install scikit-image==0.18.3
pip install torch==1.10.0
pip install torchvision==0.11.1

Output size can't be 1242x375

Dear author,
I am currently using the PlaneDepth model for image depth prediction. However, I run into an error when predicting depth at a resolution of 1242x375 (the original resolution of the KITTI dataset; in eval.sh I use --width 1242 --height 375):
Traceback (most recent call last):
  File "xx/PlaneDepth/out_depth.py", line 181, in <module>
    evaluate(options.parse())
  File "xx/PlaneDepth/out_depth.py", line 136, in evaluate
    output = depth_decoder(encoder(input_color), grids)
  File "xx/PlaneDepth/networks/depth_decoder.py", line 136, in forward
    x = torch.cat(x, 1)
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 48 but got size 47 for tensor number 1 in the list.
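
The "48 vs 47" in this error is what one would expect from an encoder that halves the resolution five times (rounding up) paired with a decoder that doubles it before each skip concatenation. The sketch below works through that arithmetic under exactly those assumptions (a ResNet-style encoder and a U-Net-style decoder); it does not inspect the actual PlaneDepth layers, and the function names are made up for illustration.

```python
import math


def encoder_sizes(n, stages=5):
    """Sizes after each stride-2 stage, assuming each stage maps n -> ceil(n / 2)."""
    sizes = []
    for _ in range(stages):
        n = math.ceil(n / 2)
        sizes.append(n)
    return sizes


def first_skip_mismatch(n, stages=5):
    """Return (upsampled, skip) for the first decoder level whose sizes differ.

    The decoder is assumed to double the coarser feature map and concatenate
    it with the next finer encoder feature. Returns None if every level matches.
    """
    sizes = encoder_sizes(n, stages)
    for coarse, fine in zip(reversed(sizes), reversed(sizes[:-1])):
        if coarse * 2 != fine:
            return (coarse * 2, fine)
    return None


def round_up(n, multiple=32):
    """Smallest multiple of `multiple` that is >= n."""
    return ((n + multiple - 1) // multiple) * multiple


if __name__ == "__main__":
    print(encoder_sizes(375))        # feature heights for a 375-pixel input
    print(first_skip_mismatch(375))  # the mismatching pair, if any
    print(round_up(1242), round_up(375))
```

Under these assumptions a height of 375 produces the same 48-versus-47 pair as the traceback, while sizes divisible by 32, such as 1280x384, pass through every level without a mismatch.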

Cannot be evaluated with pretrained models

Hi, I'm trying to run your eval.sh file with the pretrained model and got this error. Can you help me? Thanks.

eval.sh
CUDA_VISIBLE_DEVICES=0 python evaluate_depth_HR.py \
--eval_stereo \
--load_weights_folder pretrained/stage1 \
--models_to_load encoder depth \
--use_denseaspp \
--plane_residual \
--width 1280 \
--height 384

error
RuntimeError: Error(s) in loading state_dict for DepthDecoder: Missing key(s) in state_dict: "decoder.13.0.weight", "decoder.13.0.bias", "decoder.13.2.weight", "decoder.13.2.bias". Unexpected key(s) in state_dict: "decoder.14.0.weight", "decoder.14.0.bias", "decoder.14.2.weight", "decoder.14.2.bias", "decoder.13.conv.weight", "decoder.13.conv.bias".
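
A quick way to read an error like this is to compare the two sets of parameter names directly. The sketch below is a plain-Python illustration using only key names copied from the message above; the helper `diff_state_dict_keys` is hypothetical and does not load any checkpoint.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Split parameter names into those the model wants but the checkpoint
    lacks ("missing") and those the checkpoint has but the model does not
    expect ("unexpected")."""
    model_keys = set(model_keys)
    checkpoint_keys = set(checkpoint_keys)
    return {
        "missing": sorted(model_keys - checkpoint_keys),
        "unexpected": sorted(checkpoint_keys - model_keys),
    }


if __name__ == "__main__":
    # Key names copied from the error message in this issue.
    model = ["decoder.13.0.weight", "decoder.13.0.bias",
             "decoder.13.2.weight", "decoder.13.2.bias"]
    checkpoint = ["decoder.14.0.weight", "decoder.14.0.bias",
                  "decoder.14.2.weight", "decoder.14.2.bias",
                  "decoder.13.conv.weight", "decoder.13.conv.bias"]
    for kind, names in diff_state_dict_keys(model, checkpoint).items():
        print(kind, names)
```

With PyTorch, the two inputs would typically be `model.state_dict().keys()` and the keys of the dictionary returned by `torch.load(path, map_location="cpu")`. The pattern here (the checkpoint has one more decoder block than the constructed model) suggests the model was built with different options than the checkpoint was trained with; one difference worth checking is that this command omits `--use_mixture_loss`, which the eval.sh in the first issue includes.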
