
penet_icra2021's People

Contributors

jugghm


penet_icra2021's Issues

Experimental results using the NYU dataset

Hello, I am still training PENet on the NYU dataset; please help me take a look. The third one in this graph is the predicted result, right? I think this also proves that the network can run on the NYU dataset, is that correct? I want to know whether this network is suitable for a densely labeled dataset. Thanks, and looking forward to your reply.
[image: comparison_best]

Training with Multiple GPUs

Hi,
Thanks for the code. I am using 4 GPUs to train the model and the training pass goes well. However, when validation starts I receive the error given below.

RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/parallel/parallel_apply.py", line 60, in _worker
    output = module(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 545, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nazir/PENet_ICRA2021/model.py", line 440, in forward
    sparsed_feature3 = self.depth_layer3(sparsed_feature2_plus, geo_s2, geo_s3) # b 64 88 304
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 545, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nazir/PENet_ICRA2021/basic.py", line 312, in forward
    x = torch.cat((x, g1), 1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 176 and 160 in dimension 2 at /tmp/pip-req-build-akjifb_7/aten/src/THC/generic/THCTensorMath.cu:71

I assume this is due to the multi-GPU training. I see that you also used 2 GPUs for training; did you encounter this issue while training?

Thanks,
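For reference, one hedged workaround for replica-size mismatches like this (not confirmed by the authors) is to bypass nn.DataParallel during validation so the whole batch stays on one device; the batch_data dict and the model(batch_data) call follow main.py, everything else is an assumption.

import torch

def validate_single_gpu(model, val_loader, device="cuda:0"):
    # Unwrap the DataParallel container so validation runs on a single replica
    # and all intermediate feature maps share one spatial size.
    net = model.module if isinstance(model, torch.nn.DataParallel) else model
    net.eval()
    with torch.no_grad():
        for batch_data in val_loader:
            # move only the tensor entries of the batch dict to the device
            batch_data = {k: (v.to(device) if torch.is_tensor(v) else v)
                          for k, v in batch_data.items()}
            pred = net(batch_data)
            # ... evaluate pred against batch_data['gt'] here ...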

Use other datasets to train PENet

Hello
I now want to use the PENet architecture for depth completion on a dataset of my own, so I chose the HandNet dataset, which contains depth and image data collected by RealSense depth cameras. I would especially like to know where I need to modify the code. Thank you in advance!
Looking forward to your reply!
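For anyone adapting the loader, a purely illustrative skeleton of a custom dataset is sketched below; the key names ('rgb', 'd', 'gt') and the millimetre-to-metre scaling are assumptions that must be checked against KittiDepth.__getitem__ in this repo before use.

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class HandNetDepth(Dataset):
    """Minimal RGB + sparse depth + ground-truth depth dataset sketch."""
    def __init__(self, rgb_paths, sparse_paths, gt_paths):
        self.rgb_paths = rgb_paths
        self.sparse_paths = sparse_paths
        self.gt_paths = gt_paths

    def __len__(self):
        return len(self.rgb_paths)

    def __getitem__(self, idx):
        rgb = np.asarray(Image.open(self.rgb_paths[idx]), dtype=np.float32)            # H x W x 3
        d = np.asarray(Image.open(self.sparse_paths[idx]), dtype=np.float32) / 1000.0  # assumed mm -> m
        gt = np.asarray(Image.open(self.gt_paths[idx]), dtype=np.float32) / 1000.0     # assumed mm -> m
        return {
            'rgb': torch.from_numpy(rgb).permute(2, 0, 1),  # 3 x H x W
            'd':   torch.from_numpy(d).unsqueeze(0),        # 1 x H x W
            'gt':  torch.from_numpy(gt).unsqueeze(0),       # 1 x H x W
        }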

Using the NYU dataset

I'm sorry to bother you again. I looked at the NLSPN network a few days ago, but I still want to use the NYU dataset with PENet to see the effect, and I especially want to know why the prediction error is so large. I have been reading this paper, so I still need to ask you a few questions:

  1. I only downloaded the labeled portion of the NYU dataset. Because of the limited hardware in our lab, I downloaded rgb (16-bit jpg), depths (16-bit png), and rawDepths (16-bit png) from the labeled dataset; as far as I know, they are all aligned. I believe depths is the ground truth, but I am not sure, because the information I found online is not specific enough.
  2. depths is the depth map obtained by completing rawDepths. In PENet, I take rawDepths and rgb as input; is this right? When I run the code, I see that the pixel values of rawDepths are all 0, and I don't know why. Is there any other preprocessing I need to do before feeding them in?
  3. I don't know if I have expressed myself clearly. I'm sorry to bother you again; I am just getting to know this direction and have no other way right now. My teacher has no time, and I can't fully understand what I read online. Thank you very much! Looking forward to your reply!

How to use KITTI-raw data

Hi, I have recently been through your great work and would like to train the network, but which categories should I download? The KITTI raw dataset is very large, so please guide me on which categories of KITTI raw data are needed for training. Thanks in advance!

Runtime measurement

Hi there,
thank you very much for your excellent work and for publishing it.

I am trying to implement a "lightweight" version of ENet that aims to be faster in computation.
In order to have the runtime of ENet on my hardware (Tesla V100 GPU) as a benchmark, I tried to measure it. Being aware of issue #4, I took the torch.cuda.synchronize() command into account.
Measuring the time this way, I obtained a runtime of 10.5 ms for processing a single image.
However, I realized that I cannot compute more than 6 depth images per second (while the GPU load is at 100 %), which indicated to me that something was wrong.
Investigating further, I came across the PyTorch profiler, which seems to be the official tool for correct GPU time measurement. Measuring the time that way I got 150 ms, which is consistent with my maximum frame rate once the data preprocessing on the CPU is added on top.

Is it possible that the times you measured are still not the proper execution times of the network, but rather the kernel launch times?
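For reference, a minimal timing sketch that explicitly waits for the GPU using CUDA events, with warm-up iterations; the model(batch_data) call follows main.py, everything else is an assumption, and it measures the network forward pass only (no CPU preprocessing).

import torch

def time_forward(model, batch_data, n_warmup=10, n_runs=50):
    model.eval()
    with torch.no_grad():
        for _ in range(n_warmup):          # warm up cuDNN autotuning and caches
            model(batch_data)
        torch.cuda.synchronize()
        start = torch.cuda.Event(enable_timing=True)
        end = torch.cuda.Event(enable_timing=True)
        start.record()
        for _ in range(n_runs):
            model(batch_data)
        end.record()
        torch.cuda.synchronize()           # wait until all recorded work has finished
    return start.elapsed_time(end) / n_runs  # average milliseconds per forward pass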

Inference: own data

Hi,

I just had a few questions regarding using our own data and running inference using PENet pretrained weights.

  1. How sparse can the depth map be?
    Currently, my inference image is from the KITTI-360 dataset, which is quite similar to the KITTI data the network was trained on, but there is no GT depth to sample from, so my sparse depth map is quite sparse.
    When I run inference on this image, the prediction is also sparse, i.e. I only have predictions in the regions covered by the sparse depth map. Is this expected behaviour?

  2. What should my input be for 'positions' (i.e. the cropped image)? I don't want to crop the images for inference, so should I just set input['positions'] = input['rgb']?

It would be great if you can answer these questions when time permits :)

Regards,
Shrisha
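Editor's note, purely as an assumption to illustrate the 'positions' question: if it is a CoordConv-style 2-channel pixel-coordinate map (as used in several depth-completion codebases), it would be rebuilt for the uncropped resolution rather than set to the RGB tensor; check how the dataloader in this repo constructs it before relying on this sketch.

import numpy as np

def position_map(h, w):
    # normalized pixel coordinates in [-1, 1], CoordConv style
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing='ij')
    u = xs / (w - 1) * 2.0 - 1.0
    v = ys / (h - 1) * 2.0 - 1.0
    return np.stack([u, v]).astype(np.float32)   # shape (2, H, W)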

How to convert .bin lidar files to .png

I want to run inference with your model on KITTI .bin files, but your model takes .png files as input. If possible, could you provide the code for this conversion?
Thanks in advance.
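For reference, a hedged sketch (not from this repo) of projecting a KITTI velodyne .bin scan into the left color camera and saving it as a 16-bit sparse-depth PNG in the KITTI depth-completion convention (depth_png = round(depth_m * 256)); P2 (3x4), R0_rect (3x3) and Tr_velo_to_cam (3x4) come from the KITTI calibration files.

import numpy as np
from PIL import Image

def bin_to_depth_png(bin_path, out_path, P2, R0_rect, Tr_velo_to_cam, h, w):
    pts = np.fromfile(bin_path, dtype=np.float32).reshape(-1, 4)[:, :3]        # x, y, z in velodyne frame
    pts_h = np.hstack([pts, np.ones((pts.shape[0], 1), dtype=np.float32)])     # homogeneous coordinates
    cam = R0_rect @ (Tr_velo_to_cam @ pts_h.T)                                 # 3 x N, rectified camera frame
    cam = cam[:, cam[2] > 0]                                                   # keep points in front of the camera
    proj = P2 @ np.vstack([cam, np.ones((1, cam.shape[1]), dtype=np.float32)])
    u = np.round(proj[0] / proj[2]).astype(int)
    v = np.round(proj[1] / proj[2]).astype(int)
    z = cam[2]
    ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    depth = np.zeros((h, w), dtype=np.float32)
    # iterate far-to-near so the nearest point wins when several land on one pixel
    for ui, vi, zi in sorted(zip(u[ok], v[ok], z[ok]), key=lambda t: -t[2]):
        depth[vi, ui] = zi
    Image.fromarray((depth * 256.0).astype(np.uint16)).save(out_path)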

Normalization Input Data

Hi and thank you very much for your great repo.
As mentioned in #43, you did not normalize the input data. However, as far as I know it is common practice to do so, for more efficient training and to keep the weights and biases small. May I ask why you decided not to normalize your data?
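For comparison, the common normalization the question refers to typically looks like the sketch below (the repo itself feeds raw intensities, so this is only an illustration; the ImageNet mean/std values are the usual convention, not something taken from this code).

import torch

def normalize_rgb(rgb_uint8: torch.Tensor) -> torch.Tensor:
    # rgb_uint8: (3, H, W) tensor with values in [0, 255]
    mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)  # ImageNet statistics
    std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
    return (rgb_uint8.float() / 255.0 - mean) / std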

Projection result confirmation

Hello,

I'm now working on the cross-modality detection tasks in 3D space.
Since SFDNet uses this method for depth completion, I tried this repo as well.
The following is the output of PENet and the result I obtained when projecting it back into 3D.
Does it look correct?
I know that the 3D positions recovered from the depth map suffer from artifacts, as you discuss in issue #3.
But it looks far more severe than I expected.
Thanks for any help.

[images: PENet depth output and two screenshots of the back-projected point cloud]
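For reference, a hedged sketch of the back-projection used to produce point clouds like the ones above, assuming the depth PNG follows the KITTI convention (uint16, metres * 256) and K is the 3x3 intrinsic matrix of the same (cropped) image.

import numpy as np
from PIL import Image

def depth_to_points(depth_png_path, K):
    depth = np.asarray(Image.open(depth_png_path), dtype=np.float32) / 256.0  # metres
    h, w = depth.shape
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing='ij')
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - K[0, 2]) * z / K[0, 0]   # pinhole back-projection
    y = (v[valid] - K[1, 2]) * z / K[1, 1]
    return np.stack([x, y, z], axis=1)       # N x 3 points in the camera frame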

Questioning Inference Speed

Good day,

First of all, congratulations on your work and paper. The idea of separating depth-dominant and color-dominant branches is interesting. Also, thank you for releasing the source code to the public. I have been replicating your code over the past few days, and so far inference has been straightforward (I am getting RMSE scores around ~760).

However, correct me if I'm wrong, but I think there might be a mistake in the inference time computation. In main.py line 213/216, the predictions are generated from the ENet/PENet models, after which gpu_time is computed. I tried adding a print(pred) call (see the image below).
[image]

I got very different inference times with and without the print(pred) call. I ran this on a machine with an RTX 2080 Ti, i7-9700K, CUDA 11.2, torch==1.3.1, torchvision==0.4.2. Below are my runtimes:

[image] Original code: a bit faster than your official runtime, presumably due to my newer CUDA version(?)

[image] Modified code: much slower when print(pred) was added

My understanding is that calling pred = model(batch_data) does not yet run the model prediction; the model inference only actually runs when you call result.evaluate() in line 268 (i.e. lazy execution):
[image]

This results in a nearly 10x increase in inference time (i.e. 151 ms vs 17 ms). Can you confirm that this also happens in your environment?
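For context, a hedged note: PyTorch executes eagerly, but CUDA kernels run asynchronously, so without a synchronization the timer stops after the kernels are launched, not after they finish; print(pred) copies the result to the CPU and therefore forces exactly such a synchronization. A minimal pattern that times the full forward pass (model(batch_data) follows main.py, the rest is an assumption):

import time
import torch

def timed_forward(model, batch_data):
    torch.cuda.synchronize()            # drain pending GPU work before starting the clock
    t0 = time.time()
    with torch.no_grad():
        pred = model(batch_data)
    torch.cuda.synchronize()            # wait for the forward kernels to finish
    return pred, (time.time() - t0) * 1000.0  # milliseconds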

Questions about KITTI-odometry

[image]
Hello, your work is excellent!
I would like to apply it to the KITTI odometry dataset and used the pretrained PENet model, but the results are not satisfactory. May I ask for your advice on this problem?
Many thanks!

Cannot load ENet Model

Hi,

I am trying to load the e.pth.tar model and I am not able to do it. All I am doing is

import torch
torch.load('e.pth.tar', 'cuda:0')

and it gives me this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/core_uc/.local/lib/python3.6/site-packages/torch/serialization.py", line 608, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/core_uc/.local/lib/python3.6/site-packages/torch/serialization.py", line 787, in _legacy_load
    result = unpickler.load()
ModuleNotFoundError: No module named 'metrics'

Would you know what the issue could be? I did pip3 install metrics just in case, but it did not work.
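One workaround that usually resolves this class of error: the checkpoint was pickled with references to classes defined in the repo (e.g. metrics.Result), so the repo root has to be importable before torch.load is called. A hedged sketch (the '/path/to/PENet_ICRA2021' path is a placeholder and the 'model' key is an assumption; inspect checkpoint.keys()):

import sys
import torch

sys.path.insert(0, '/path/to/PENet_ICRA2021')        # directory containing metrics.py
checkpoint = torch.load('e.pth.tar', map_location='cuda:0')
state_dict = checkpoint['model']                     # assumed key; check checkpoint.keys()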

Is that possible to use 3 inputs to your model instead of 2 ?

Hi, I am wondering about using 3 inputs to your model. Right now it takes 2 inputs, a sparse depth map and the corresponding RGB image. Would it be possible to add one more input, such as a coarse dense depth map obtained from hand-crafted methods and aligned with the sparse depth map? Please let me know if this is possible. Thanks in advance.
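Purely as an illustration of the general technique (the layer below is a placeholder, not the actual first PENet module): an extra coarse-depth channel can be fed by widening the first convolution from 4 to 5 input channels, reusing the pretrained RGB-D filters and zero-initializing the new channel.

import torch
import torch.nn as nn

old_conv = nn.Conv2d(4, 32, kernel_size=5, stride=1, padding=2)   # stand-in for the first RGB-D conv
new_conv = nn.Conv2d(5, 32, kernel_size=old_conv.kernel_size,
                     stride=old_conv.stride, padding=old_conv.padding)
with torch.no_grad():
    new_conv.weight[:, :4] = old_conv.weight   # keep the pretrained RGB-D filters
    new_conv.weight[:, 4:] = 0.0               # start the coarse-depth channel at zero
    new_conv.bias.copy_(old_conv.bias)
# the input then becomes torch.cat([rgb, sparse_d, coarse_d], dim=1)  # (B, 5, H, W)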

input images with standard size like 720p

Thank you for your amazing work!
I am interested in using the ENet depth completion model. It works very well with the KITTI dataset, but when I try to feed it images of a different size (640x360) instead of (1216x352), I get this error:
RuntimeError: Given groups=1, weight of size [32, 4, 5, 5], expected input[1, 5, 360, 640] to have 4 channels, but got 5 channels instead.
Is the input layer dedicated to this form factor? What do you recommend to adapt to an input image of size 640x360 or 720p?
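A hedged sanity-check sketch: the error indicates the first layer expects a 4-channel RGB-D tensor, and the KITTI shape 1216x352 is a multiple of 32 in both dimensions, so padding 640x360 up to 640x384 is a reasonable first step (this only checks tensor shape and channel count, not the model's exact input format).

import torch
import torch.nn.functional as F

def prepare_rgbd(rgb, d, multiple=32):
    # rgb: (B, 3, H, W); d: (B, 1, H, W) sparse depth in metres
    x = torch.cat([rgb, d], dim=1)                 # 4 channels, not 5
    h, w = x.shape[-2:]
    pad_h = (multiple - h % multiple) % multiple
    pad_w = (multiple - w % multiple) % multiple
    return F.pad(x, (0, pad_w, 0, pad_h))          # pad right and bottom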

About the training data

Thanks for your code!
I have a question about the KITTI raw data. Do I need to download all of it? I know that in monocular depth estimation, generally only a part of it is needed. Is there a download list that I can refer to? Thanks!

Does the intensity of RGB image pixels get normalized to the range [0, 1]?

Hello,
Thank you very much for such an amazing work.
I have a question about the input of ENet. As I understand from the __getitem__ method of the KittiDepth class, the images represented by the variable rgb are not normalized to pixel intensities in the range [0, 1.0]; instead the intensity range is [0, 255]. Is that correct?
Looking forward to your answer.

Why can sparse loss functions supervise the generation of dense depth maps?

Thank you for your excellent work, network design gives me a lot of inspiration.
But there is one thing I never understood. I noticed that the training loss in the paper only covers pixels with valid (sparse) depth, so why can the network generate dense depth maps? How is accuracy ensured for pixels without supervision?
[image]
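For reference, a minimal sketch of the masked loss being asked about (following the common depth-completion recipe, not copied from the repo): only pixels with LiDAR ground truth receive a direct error signal, while the remaining pixels are constrained indirectly through shared convolution weights and the image branch.

import torch

def masked_mse(pred, gt, eps=1e-3):
    valid = gt > eps                      # LiDAR ground truth is zero where unlabeled
    diff = pred[valid] - gt[valid]
    return (diff ** 2).mean()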

question about the implementation of DE-CSPN++

Hi,

When I read the code for the DE-CSPN++ implementation, I noticed that during the iteration you continuously update depth3, depth5 and depth7. However, depth3, depth5 and depth7 are stored in a list without cloning the underlying data. I am wondering if this is incorrect, since assignment in PyTorch is only a reference assignment, so the depths stored in the list would also change as the iteration continues.

Thanks for your reply.
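A small stand-alone demo of the distinction at play here: appending a tensor to a list stores a reference, so later in-place edits are visible through the list, whereas re-binding the variable name to a new tensor leaves the stored element untouched.

import torch

depth3 = torch.zeros(2)
history = [depth3]
depth3 = depth3 + 1      # re-binding creates a new tensor; the list still holds the old one
print(history[0])        # tensor([0., 0.])

depth3 = torch.zeros(2)
history = [depth3]
depth3.add_(1)           # in-place update: the stored reference sees the change
print(history[0])        # tensor([1., 1.])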

Can I reduce kitti_Raw data for training ?

Hi, I have recently been through your great work and would like to train the network with less KITTI raw data. Is that possible? If so, please guide me on how to reduce the KITTI raw data used for training. Thanks in advance!

How to use the sparse depth?

Thank you for your outstanding contribution!
I want to know how the color-dominant branch is combined with the point cloud and fed to the network. Does the LiDAR point cloud only contribute valid points, and does the color image use the same valid points as the point cloud? How can we get a dense depth map this way?
This question has been bothering me. I hope you can answer it for me. Thank you again!

Query Regarding iRMSE calculation.

Hello,
Thank you for the great work. I want to ask one small query regarding the iRMSE calculation. In your code, iRMSE is calculated as follows:

        # convert from meters to km
        inv_output_km = (1e-3 * output[valid_mask])**(-1)
        inv_target_km = (1e-3 * target[valid_mask])**(-1)
        abs_inv_diff = (inv_output_km - inv_target_km).abs()
        irmse = math.sqrt((torch.pow(abs_inv_diff, 2)).mean())

I want to ask two things

  1. In the first two lines, according to your comment, you are converting meters to kilometers. Why are you multiplying by 1e-3 and raising to the power of -1? Shouldn't it be 1e3?
  2. inv_output_km is causing the iRMSE to go NaN. This is because output[valid_mask] is 0 at some entries of the tensor. Is this normal behavior? I am using the same dataset as yours. Is it expected to decrease in later epochs? I only ran training for 5 epochs and iRMSE and iMAE are going NaN.
    P.S. I am training ENet, so I am on round 1.

Thanks for helping out.
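For reference on the unit question: multiplying by 1e-3 converts metres to kilometres, and raising to the power of -1 then yields inverse depth in 1/km, the unit KITTI reports iRMSE in, so (1e-3 * d)**(-1) equals 1e3 / d (e.g. d = 20 m gives 50 km^-1). A hedged sketch that also masks zero predictions, which is an assumed cause of the NaN rather than a confirmed fix:

import math
import torch

def irmse_per_km(output, target, eps=1e-6):
    valid = (target > 0) & (output > eps)   # drop unlabeled pixels and zero predictions
    inv_out = 1e3 / output[valid]           # 1/km
    inv_tgt = 1e3 / target[valid]
    return math.sqrt(torch.mean((inv_out - inv_tgt) ** 2).item())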

Ran PENet pre-trained model and results do not match Kitti benchmark depth completion page

Thank you very much for a very interesting paper.
I have run the PENet pre-trained model you provided in evaluation mode on the cropped images.
In the results produced by the code (val.csv under results) I got RMSE=757.197 and MAE=209.001, compared to RMSE=730.08 and MAE=210.55 as reported on the KITTI benchmark page.
Is there a different PENet model that matches the submitted results, or did I set a parameter incorrectly (I kept the parameters as they are in this repository)?
Thanks a lot,
Mani

RuntimeError: CUDA out of memory.

Hello!

I encountered some difficulties while training the model. I ran it on a single NVIDIA GTX 2080 GPU, but an error occurred:

RuntimeError: CUDA out of memory. Tried to allocate 38.00 MiB (GPU 0; 7.79 GiB total capacity; 5.85 GiB already allocated; 87.56 MiB free; 136.18 MiB cached)

I think the memory should be sufficient, but training only runs successfully when the batch size is reduced to 1, and the gradient jitter is very severe. Do you have any advice for this? :(
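One hedged option when only batch size 1 fits in memory is gradient accumulation, which emulates a larger effective batch and usually smooths the jitter; model, criterion, optimizer and the 'gt' key are placeholders for whatever the repo's training loop actually uses.

import torch

def train_with_accumulation(model, train_loader, criterion, optimizer, accum_steps=4):
    model.train()
    optimizer.zero_grad()
    for i, batch_data in enumerate(train_loader):
        pred = model(batch_data)
        loss = criterion(pred, batch_data['gt']) / accum_steps  # scale so gradients average
        loss.backward()
        if (i + 1) % accum_steps == 0:       # step once every accum_steps mini-batches
            optimizer.step()
            optimizer.zero_grad()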

A rather common but vital problem about confidence map

Thank you for your good work!
I notice that in both the CD branch and the DD branch, the confidence maps (concatenated with the CD depth and DD depth respectively) are generated by the last convolutional layer, which is composed of ordinary conv+bn+relu layers.

self.rgb_decoder_output = deconvbnrelu(in_channels=32, out_channels=2, kernel_size=3, stride=1, padding=1, output_padding=0)

rgb_output = self.rgb_decoder_output(rgb_feature0_plus)
rgb_depth = rgb_output[:, 0:1, :, :]
rgb_conf = rgb_output[:, 1:2, :, :]
self.decoder_layer6 = convbnrelu(in_channels=32, out_channels=2, kernel_size=3, stride=1, padding=1)

depth_output = self.decoder_layer6(decoder_feature5)
d_depth, d_conf = torch.chunk(depth_output, 2, dim=1)
rgb_conf, d_conf = torch.chunk(self.softmax(torch.cat((rgb_conf, d_conf), dim=1)), 2, dim=1)

I wonder how a convbnrelu layer can output a clean confidence map and a depth map without explicit confidence supervision. Could you point me to some relevant works or papers? Thanks.
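For context, a hedged reading of why no explicit confidence labels are needed: the softmax couples the two maps and the fused depth is what the depth loss supervises, so gradients flow back into both confidence channels. A toy sketch of that fusion (the exact fusion line should be checked against model.py):

import torch

# toy tensors standing in for the two decoder outputs
rgb_depth, d_depth = torch.rand(1, 1, 4, 4), torch.rand(1, 1, 4, 4)
rgb_conf_raw, d_conf_raw = torch.rand(1, 1, 4, 4), torch.rand(1, 1, 4, 4)

# softmax across the two confidence channels makes them sum to 1 per pixel
rgb_conf, d_conf = torch.chunk(
    torch.softmax(torch.cat((rgb_conf_raw, d_conf_raw), dim=1), dim=1), 2, dim=1)
fused_depth = rgb_conf * rgb_depth + d_conf * d_depth   # this is what the depth loss sees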

Questions about the principal point of the cropped image

In "kitti_loader.py" file, the function load_calib() changes the principal point of image by the code
"""
K[0, 2] = K[0, 2] - 13; # from width = 1242 to 1216, with a 13-pixel cut on both sides
K[1, 2] = K[1, 2] - 11.5; # from width = 375 to 352, with a 11.5-pixel cut on both sides
"""
but I find the data is croped from the bottom. In this case, I think the change of the principal point should be calculated as
"""
K[0, 2] = K[0, 2] - 13; # from width = 1242 to 1216, with a 13-pixel cut on both sides
K[1, 2] = K[1, 2] - 23; # from width = 375 to 352, with a 23-pixel cut from the bottom
"""

How to use .bin (Raw point Cloud) file for Depth completion ?

Hi, I would like to estimate dense depth for the KITTI 3D object detection dataset, which also includes images and their corresponding point clouds, but the point clouds cover 360 degrees rather than only the front view. How can I convert the raw point cloud to a front-view depth map and save it as a .png so that I can estimate a dense depth map from it? If you could provide code for this conversion, I would be very happy to use it. Thanks in advance.
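For reference, the projection sketch under the earlier ".bin to .png" issue above applies here as well once the calibration files of the 3D object detection split are used; the resulting front-view map only needs to follow the KITTI depth-completion PNG convention (16-bit, metres * 256), for example:

import numpy as np
from PIL import Image

def save_depth_png(depth_m: np.ndarray, path: str) -> None:
    Image.fromarray((depth_m * 256.0).astype(np.uint16)).save(path)

def load_depth_png(path: str) -> np.ndarray:
    return np.asarray(Image.open(path), dtype=np.float32) / 256.0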

Trouble loading .pth.tar: Fix: metrics.py must be in the main directory

Hi,

Torch has officially removed version 1.3.1 from their website, and I am unable to load the pretrained weights using torch versions >= 1.4. I get this error: ModuleNotFoundError: No module named 'metrics', which is then followed by AttributeError: Can't get attribute 'Result' on <module 'metrics' from '/home/lib/python3.8/site-packages/metrics/__init__.py'> after I pip install metrics.

All of these errors arise when I try torch.load('pe.pth.tar').

I searched for related issues, and this may be because PyTorch has changed the way models are saved and loaded. Would it be possible to save the model with _use_new_zipfile_serialization=True as a flag?

Or did you face this error as well? The latest version of the metrics package is 0.3.3.
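As the issue title suggests, the checkpoint was pickled with references to classes in the repo's metrics.py, so the repo root must be importable when torch.load runs. A hedged sketch of loading it once that way and re-saving a plain state_dict for newer PyTorch versions (the '/path/to/PENet_ICRA2021' path is a placeholder and the 'model' key is an assumption; inspect the checkpoint's keys first):

import sys
import torch

sys.path.insert(0, '/path/to/PENet_ICRA2021')    # makes the repo's 'metrics' module importable
ckpt = torch.load('pe.pth.tar', map_location='cpu')
torch.save({'model': ckpt['model']}, 'pe_state_dict.pth',
           _use_new_zipfile_serialization=True)  # zipfile format loads cleanly on torch >= 1.4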

Modify the backbone network

Hi!
I have a question for you!
Recently I changed PENet's backbone network from ENet to attention-UNet, and I only used part of the KITTI dataset. The training felt a little slow, so I modified the learning rate when training the backbone so that the network could learn faster.
[image]
It's OK when I train ENet. (Blue is the training set, yellow is the validation set.)
But when I used the trained backbone network for the second stage of training, the network overfitted severely. I wonder if this has something to do with the learning rate?
[image]
We can see that CSPN++ trains very poorly; the validation set error is large.
What is the reason for this?
Below is the learning rate I used when training ENet.
[image: learning-rate settings]

Inquiry on the dataset download

Hi, thanks for your impressive work! My future research interest is in depth completion, so I would like to start with your work!

Just one minor question on downloading the KITTI raw data: should I download the rectified or the unrectified data? And are all the categories (road, campus, city, person) used?

Thanks again for your great work and looking forward to your reply!

Questions about data augmentation

Thanks for your great work!
I have a question about the data augmentation. You apply a random crop to the images, and I notice that you do not apply the corresponding adjustment to the camera intrinsic matrix. When the images are cropped, parameters of the intrinsic matrix such as cx and cy no longer match. Do you think fixing this would bring some improvement, or did you leave it this way on purpose?

Thanks again for open-sourcing the code!

Training Log

Thanks for the nice work!
Is there any chance the training loss log could be released? It would greatly help my debugging process.

Query regarding stage 3 training.

Thank you for the great work.
I am currently running stage 3 training to further refine the depth maps. However, the RMSE is not great so far (up to 15 epochs); it is still hovering around 860-890. After how many epochs do you generally see a drop in RMSE, especially in the third stage? And are the hyperparameters in the code repository for stage 3 the same as those you used in your stage 3 experiment?
