barbararoessle / dense_depth_priors_nerf

Dense Depth Priors for Neural Radiance Fields from Sparse Input Views

License: MIT License

Python 81.17% CMake 0.77% C++ 18.06%

dense_depth_priors_nerf's Issues

Question about depth_prior_network_path and db_path

Hello, thanks for this excellent project.
I ran into some problems while trying to reproduce the project. Could you please tell me how to set "--depth_prior_network_path" for run_nerf.py? (I downloaded the pretrained depth completion model from the link provided by the author and unzipped it, which gave me a folder named "Archive".)
What's more, I also want to know how to get the "scannet_sift_database.db" of step 1.
Thanks for your reply. (Maybe the issue seems a little bit silly lol)

The torch version of the completion depth network

I use PyTorch 1.13, and when I load the depth completion network I get an error saying that the pkl was not saved with version 1.13. Which PyTorch version was used to save the depth completion network? (I didn't find the torch version in the readme.)

Is there a way of using RGBD images as the input?

This solution extracts depth data for photos as part of the pipeline (plus uncertainties?). Is there a way of providing RGBD data directly from depth cameras?

The error I encountered while preparing my scenes using code extract_scannet_scene

Hi, thanks for your code!
I am preparing my scenes with extract_scannet_scene, which takes two arguments: "path to scene" and "path to ScanNet". For "path to ScanNet" I passed the path of the trained network downloaded in step 1 (named 20211027_092436.tar), but running the code fails with "!_src.empty() in 'cvtColor'". In extract_scannet_scene.cpp, lines 161 and 163 of main() read "boost::filesystem::path path2scannet(argv[2])" and "boost::filesystem::path path2scannetscene(path2scannet/"scans_test"/config.kname);". The downloaded file (20211027_092436.tar) does not contain a "scans_test" folder, so I would like to know how to deal with this. Do I need to train on ScanNet myself when preparing my own scenes, or is there something wrong with my approach? Looking forward to your reply!

About depth completion?

This paper uses depth completion rather than other depth-prediction approaches (such as MVS). Is there a difference in practice? Is the depth completion network better?

On the standard deviation of depth map

std = (((z_vals - depth.unsqueeze(-1)).pow(2) * weights).sum(-1)).sqrt()
I don't quite understand what the standard deviation means here. Could you elaborate?
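
One possible reading of the expression above, offered as a sketch and not as the author's explanation: if `weights` are the volume rendering weights of the samples along a ray and `depth` is the weighted mean of `z_vals`, then the expression is the standard deviation of the distribution of where the ray terminates. The pure-Python version below handles a single ray and assumes the weights sum to one; the function name `ray_depth_and_std` is hypothetical.

```python
import math

def ray_depth_and_std(z_vals, weights):
    """Weighted mean and standard deviation of sample depths along one ray."""
    # Expected depth: sum_i w_i * z_i
    depth = sum(w * z for w, z in zip(weights, z_vals))
    # Weighted second moment about the expected depth: sum_i w_i * (z_i - depth)^2
    var = sum(w * (z - depth) ** 2 for w, z in zip(weights, z_vals))
    return depth, math.sqrt(var)

z_vals = [1.0, 2.0, 3.0, 4.0]
# All weight on one sample: the ray terminates at a sharp surface.
sharp = ray_depth_and_std(z_vals, [0.0, 1.0, 0.0, 0.0])
# Weight spread evenly: the termination depth is very uncertain.
spread = ray_depth_and_std(z_vals, [0.25, 0.25, 0.25, 0.25])
print(sharp, spread)
```

Under this reading, a small value means the weights are concentrated around one depth, and a large value means the density along the ray is spread out.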

About the depth-guided ray sampling

dense_depth_priors_nerf/run_nerf.py

Line 579 in d541ebe

    
           return forward_with_additonal_samples(z_vals, raw, z_vals_2, rays_o, rays_d, viewdirs, embedded_cam, network_fn, network_query_fn, raw_noise_std, pytest)

Why is network_fn used here instead of network_fine for the final rendering after the depth-guided sampling? network_fine does not seem to be used anywhere in the rendering process.

Question with regards to the target depth map

Hello, can you please tell me what kind of depth is used in the target depth map? As far as I understood from the paper, the RGB-D images in the ScanNet dataset were acquired with a Structure sensor, is that right?

Depth loss is negative

Hello and thank you for your great work! I noticed that the paper is using GaussianNLLLoss to satisfy the true sampling distribution as much as possible, as this formula shows:

But I find that this loss is always negative, is this as expected?

Also I would like to ask, have you tried to do MSE loss directly on the rendered depth map with the GT depth?

Sorry to bother you, and I look forward to your reply!
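
A negative value is not necessarily a bug. A Gaussian negative log-likelihood is the negative log of a probability density, and a density can exceed 1 when the variance is small, which makes the loss negative. The sketch below assumes the per-element form that PyTorch documents for GaussianNLLLoss with the constant term omitted; whether the repository uses exactly this form is an assumption, and `gaussian_nll` is a hypothetical helper.

```python
import math

def gaussian_nll(pred, target, var, eps=1e-6):
    """Per-element Gaussian NLL without the constant 0.5 * log(2 * pi) term."""
    var = max(var, eps)  # clamp for numerical stability
    return 0.5 * (math.log(var) + (pred - target) ** 2 / var)

# Small variance and small error: log(var) is strongly negative, so the loss is negative.
confident = gaussian_nll(2.00, 2.01, var=0.01)
# Same error with a large variance: the log(var) term dominates and is positive.
uncertain = gaussian_nll(2.00, 2.01, var=4.0)
print(confident, uncertain)
```

With depths in metres and a standard deviation well below one metre, the `log(var)` term is negative, so a consistently negative loss is plausible in this setting.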

What scenes are used for the training of pre-trained depth_prior_network

Hi, thanks for your work. I would like to know which scenes you used to train the depth prior network, because I need to split the data into train and test sets. Did you use all ScanNet scenes for depth prior training and set aside only a few images from each scene for few-shot NeRF optimization?
Thanks!

I encountered the error while Optimizing NeRF with Dense Depth Priors

Hi, thanks for your code!
While optimizing, it raises a RuntimeError: CUDA out of memory. Is there any way to reduce the required memory?

fail to build extract_scannet_scene.cpp

After running make install to build the preprocessing code, I got this error:

preprocessing/io_colmap/src/colmap_reader.cpp:84:117: error: no matching function for call to ‘rotate(glm::mat4, float, glm::vec3)’
const glm::mat4 rot_cam2cam = glm::rotate(glm::mat4(1.0f), glm::radians(180.0f), glm::vec3(1.0f, 0.0f, 0.0f));
^
In file included from /usr/include/glm/gtc/quaternion.hpp:434:0,
from dense_depth_priors_nerf-master/preprocessing/io_colmap/src/colmap_reader.cpp:6:
/usr/include/glm/gtc/quaternion.inl:560:33: note: candidate: template<class T, glm::qualifier Q> glm::tquat<T, Q> glm::rotate(const glm::tquat<T, Q>&, const T&, const glm::vec<3, T, Q>&)
GLM_FUNC_QUALIFIER tquat<T, Q> rotate(tquat<T, Q> const& q, T const& angle, vec<3, T, Q> const& v)
^~~~~~

I installed glm with sudo apt install libglm-dev, which gave me glm-0.9.9~a2-2, but that version does not seem to work with this program. Could you tell me how you installed glm, or how you set up the environment for preprocessing?

How do I run this project on my own dataset?

How should I get the max_depth and dist2m of config.json?

How to compute the standard deviation?

If I already have the dense depth map, how can I compute the standard deviation?

About the data after processing scannet with colmap

Hi,

Thanks for your nice work! I have a question regarding the depth/poses obtained in the transform.json files. Is the ground-truth depth consistent with the camera to world pose given? i.e. do I need to scale them or can I directly use them to obtain correspondences for example?

Thanks a lot!

Error while making preprocessing/build

Hey, I keep encountering the following error when I run make -j in preprocessing/build: //usr/lib/x86_64-linux-gnu/libboost_system.so.1.65.1: error adding symbols: DSO missing from command line
Any idea how do I go about this?
Thanks.

Can you tell me which PyTorch version was used to train the depth completion model? Thank you!

Can I obtain a dense depth map for each RGB image with your depth completion network?

Hello, I have reproduced your code, but my main goal is to use depth completion to obtain dense depth maps. I have my own RGB images with corresponding sparse depth maps and ground-truth depth maps. Would it be possible to use your code to run depth completion and obtain dense depth maps?

Hi, I can't figure out how to load "the depth completion model trained on ScanNet", or more specifically, the "data.pkl"

I use torch.load("the path of pretrained model"),
but it says "_pickle.UnpicklingError: A load persistent id instruction was encountered,
but no persistent_load function was specified."

My PyTorch version is 1.8.1.
I would like to know which PyTorch version you used to save "data.pkl".

Sorry to trouble you, this bug has bothered me for a long time.
I hope you can clear up my confusion.
Thanks!!!
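
One plausible cause, stated here as an assumption and not as a confirmed diagnosis: PyTorch 1.6 and later save checkpoints as zip archives that contain a data.pkl, and torch.load expects the path of the archive itself. If the downloaded file was unzipped (some browsers and operating systems do this automatically) and torch.load is pointed at the extracted data.pkl, unpickling fails with this kind of message. The stdlib-only sketch below classifies a path so the two cases can be told apart; `describe_checkpoint_path` is a hypothetical helper and not part of the repository.

```python
import os
import zipfile

def describe_checkpoint_path(path):
    """Classify a path as a zip-format checkpoint, an extracted checkpoint, or something else."""
    if os.path.isdir(path):
        # An extracted zip checkpoint is a folder that contains data.pkl somewhere inside.
        for _root, _dirs, files in os.walk(path):
            if "data.pkl" in files:
                return "extracted"
        return "directory"
    if os.path.isfile(path):
        if zipfile.is_zipfile(path):
            # Intact archive, regardless of its file extension.
            return "zip-checkpoint"
        if os.path.basename(path) == "data.pkl":
            # A bare data.pkl pulled out of its archive.
            return "extracted"
    return "other"
```

If the result is "extracted", passing the original, un-extracted download to "--depth_prior_network_path" (or downloading it again without unpacking) is worth trying.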

Could you provide the pretrained depth net on Matterport3D?

Could you provide more complete preprocessing scripts?

The preprocessing code seems incomplete, so I added some code of my own. With that combination, the reconstructed poses come out wrong and strange, and I can't tell whether the error is in the provided preprocessing code or in the code I wrote. Could you provide a more complete version of the preprocessing code?

UnpicklingError: A load persistent id instruction was encountered

Traceback (most recent call last):
  File "/home/lhs/project/nerf/dense_depth_priors_nerf-master/run_nerf.py", line 1104, in <module>
    run_nerf()
  File "/home/lhs/project/nerf/dense_depth_priors_nerf-master/run_nerf.py", line 1070, in run_nerf
    train_nerf(images, depths, valid_depths, poses, intrinsics, i_split, args, scene_sample_params, lpips_alex, gt_depths, gt_valid_depths)
  File "/home/lhs/project/nerf/dense_depth_priors_nerf-master/run_nerf.py", line 788, in train_nerf
    depths, valid_depths = complete_and_check_depth(images, depths, valid_depths, i_train, gt_depths_train, gt_valid_depths_train,
  File "/home/lhs/project/nerf/dense_depth_priors_nerf-master/run_nerf.py", line 726, in complete_and_check_depth
    depths[i_train], valid_depths[i_train] = complete_depth(images[i_train], depths[i_train], valid_depths[i_train],
  File "/home/lhs/project/nerf/dense_depth_priors_nerf-master/run_nerf.py", line 678, in complete_depth
    ckpt = torch.load(model_path)
  File "/home/lhs/.conda/envs/liao/lib/python3.10/site-packages/torch/serialization.py", line 713, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/lhs/.conda/envs/liao/lib/python3.10/site-packages/torch/serialization.py", line 920, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: A load persistent id instruction was encountered,
but no persistent_load function was specified.

Hello, when reproducing the results, I first downloaded the depth prior network model, then downloaded the scenes listed in the readme, and ran python3 run_nerf.py train --scene_id <scene, e.g. scene0710_00> --data_dir <directory containing the scenes> --depth_prior_network_path --ckpt_dir with the above data. The error above occurred. Is there a version problem with some of the files?

How are camera translations scaled?

Hi,

I've a dataset of images with given camera intrinsics and extrinsics (pose). So, I'm trying to generate the transforms_train.json file without needing to run colmap and other stuff. To do this, I'm trying to figure out how transforms_train.json is created based on colmap sparse reconstruction for the ScanNet scenes. I figured out the relation between rotation matrices, however, I'm not able to figure out how translation is scaled. I found different scaling factors for the 5 scenes. I tried to understand the C++ code that generates the transforms_train.json file, but I'm not used to C++ and hence couldn't figure it out.

Can you please tell me how to compute the scaling factor for the camera translation?

PS: I also noticed that, unlike the original NeRF which scales translation so that nearest depth becomes unity, you do not do such a scaling here.

Models to reproduce the results on ScanNet and Matterport3D

Hi,

Thank you so much for your work. Will you provide your pre-trained models to reproduce the results in your paper (e.g., Tab2 and Tab3)?

Thank you for your attention.

can't run extract_scannet_scene

It is unable to read the images and keeps reporting "what(): OpenCV(3.4.6) /home/mvs18/Downloads/opencv/opencv-3.4.6/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'".
May I ask what is going wrong?

Good work, but why can't we achieve good results after 100000 iterations of training?

make a google colab example to run

How to get the rendered model?

Hi! I have run this project successfully. The result is really good!

I found that this project provides the method of render video visualization. Does it support the output of the 3D model, like the ".ply" or ".pcd" format? I want to access the 3D model for further processing.

Thank you!
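
Nothing in this thread says whether the project exports 3D models. As a generic workaround sketch, a rendered depth map can be back-projected into a point cloud and written as an ASCII .ply file. The code below assumes a pinhole camera, depth measured along the camera z-axis (not along the ray), and OpenCV-style camera axes; both function names are hypothetical, and the points stay in camera space until a camera-to-world pose is applied.

```python
def backproject_depth(depth, fx, fy, cx, cy):
    """Turn a depth map (list of rows) into camera-space (x, y, z) points."""
    points = []
    for v, row in enumerate(depth):
        for u, d in enumerate(row):
            if d <= 0.0:  # treat non-positive depth as invalid
                continue
            points.append(((u - cx) * d / fx, (v - cy) * d / fy, d))
    return points

def write_ascii_ply(path, points):
    """Write points as an ASCII PLY file with float x, y, z vertex properties."""
    with open(path, "w") as f:
        f.write("ply\nformat ascii 1.0\n")
        f.write(f"element vertex {len(points)}\n")
        f.write("property float x\nproperty float y\nproperty float z\n")
        f.write("end_header\n")
        for x, y, z in points:
            f.write(f"{x} {y} {z}\n")

# Tiny 2x2 depth map with one invalid pixel; intrinsics chosen only for illustration.
points = backproject_depth([[1.0, 0.0], [2.0, 2.0]], fx=1.0, fy=1.0, cx=0.0, cy=0.0)
```

Merging the points from several views into one cloud requires transforming each view's points with its pose first.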

Confused about the coordinate system and COLMAP sparse reconstruction

Hi, thanks for your work. I want to use the ScanNet ground-truth camera poses to compute the depth map. However, I am confused by the coordinate system and camera model.

I found that ScanNet provides a camera-to-world matrix as the camera pose. Do I need to convert it to world-to-camera (W2C) so that COLMAP can use it?
Also, I don't see any code that converts COLMAP coordinates to OpenGL coordinates, as the original NeRF does. Do you use COLMAP coordinates as model input?
The cameras.txt in the sparse directory has a WIDTH and HEIGHT of 624 x 468. However, the original ScanNet image size is 1296 x 968. Does this mean that COLMAP has to be run on resized images? I thought the resizing was done by extract_scannet_scene.cpp, but extract_scannet_scene needs the COLMAP sparse reconstruction as input, which is confusing.
Could you help me with these questions? Thanks a lot, and sorry, I am new to NeRF and COLMAP.
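
On the first question: COLMAP's images.txt stores the world-to-camera rotation and translation of each image, so a camera-to-world matrix has to be inverted before it is written in that format. For a rigid transform [R | t] the inverse is [R^T | -R^T t]. The sketch below shows only this inversion on nested lists; `invert_rigid` is a hypothetical helper, and any axis-convention change between the two systems is not covered and would need to be checked separately.

```python
def invert_rigid(c2w):
    """Invert a 4x4 rigid transform [R | t] given as nested lists: returns [R^T | -R^T t]."""
    R = [row[:3] for row in c2w[:3]]
    t = [row[3] for row in c2w[:3]]
    # Transpose of the rotation block.
    Rt = [[R[j][i] for j in range(3)] for i in range(3)]
    # Translation of the inverse transform.
    t_inv = [-sum(Rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [Rt[i] + [t_inv[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

# Example pose: 90 degree rotation about z, camera centre at (1, 2, 3).
c2w = [[0.0, -1.0, 0.0, 1.0],
       [1.0, 0.0, 0.0, 2.0],
       [0.0, 0.0, 1.0, 3.0],
       [0.0, 0.0, 0.0, 1.0]]
w2c = invert_rigid(c2w)
```

The rotation block of the result would still have to be converted to a quaternion for images.txt.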

How to get the sequence of the poses to render a video?

Hi, thanks for sharing your nice work!

I couldn't find the camera poses to render a video. How can I get the sequence of camera poses to make a video?

Error in compute_depth_loss

Hi, thanks for your nice work. When I run the code, I get an error in compute_depth_loss:

Traceback (most recent call last):
  File "/public/home/luanzl/.pycharm_helpers/pydev/pydevd.py", line 2195, in <module>
    main()
  File "/public/home/luanzl/.pycharm_helpers/pydev/pydevd.py", line 2177, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/public/home/luanzl/.pycharm_helpers/pydev/pydevd.py", line 1489, in run
    return self._exec(is_module, entry_point_fn, module_name, file, globals, locals)
  File "/public/home/luanzl/.pycharm_helpers/pydev/pydevd.py", line 1496, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/public/home/luanzl/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/public/home/luanzl/WorkSpace/dense_depth_priors_nerf/run_nerf.py", line 1137, in <module>
    run_nerf()
  File "/public/home/luanzl/WorkSpace/dense_depth_priors_nerf/run_nerf.py", line 1100, in run_nerf
    train_nerf(images, depths, valid_depths, poses, intrinsics, i_split, args, scene_sample_params, lpips_alex, gt_depths, gt_valid_depths)
  File "/public/home/luanzl/WorkSpace/dense_depth_priors_nerf/run_nerf.py", line 859, in train_nerf
    depth_loss = compute_depth_loss(extras['depth_map'], extras['z_vals'], extras['weights'], target_d, target_vd)
  File "/public/home/luanzl/WorkSpace/dense_depth_priors_nerf/model/run_nerf_helpers.py", line 44, in compute_depth_loss
    return float(pred_mean.shape[0]) / float(target_valid_depth.shape[0]) * f(pred_mean, target_mean, pred_var)
  File "/public/home/luanzl/anaconda3/envs/dense_depth_priors_nerf/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/public/home/luanzl/anaconda3/envs/dense_depth_priors_nerf/lib/python3.8/site-packages/torch/nn/modules/loss.py", line 372, in forward
    return F.gaussian_nll_loss(input, target, var, full=self.full, eps=self.eps, reduction=self.reduction)
  File "/public/home/luanzl/anaconda3/envs/dense_depth_priors_nerf/lib/python3.8/site-packages/torch/nn/functional.py", line 2804, in gaussian_nll_loss
    raise ValueError("var has negative entry/entries")
ValueError: var has negative entry/entries

It seems that I got a negative value during the loss calculation. May I know how to tackle this issue? Hope for your reply, thanks!

Colmap pose

Hi, thanks for sharing the code!
I would like to ask for details about how the poses are generated. When I run COLMAP on each scene with the train and test images together, I fail to obtain poses for some images. Is that because the feature extractor should be run on all ScanNet images? If so, can you share the scannet_sift_database.db? Otherwise, can you provide the code for generating camera poses? Many thanks!

Video generation with less than 3 images

My goal is to create a video from a very limited number of images, specifically one to three. From what I have read, NeRF with dense depth priors can be effective for sparse inputs. However, I am not sure whether this approach can be applied to my case of only one to three images.

cannot load the depth completion model trained on ScanNet

Hello,

I'm trying to optimize NeRF with dense depth priors using the depth completion network trained on ScanNet provided in the corresponding readme section and I get the following error:


Traceback (most recent call last):
  File "run_nerf.py", line 1104, in <module>
    run_nerf()
  File "run_nerf.py", line 1070, in run_nerf
    train_nerf(images, depths, valid_depths, poses, intrinsics, i_split, args, scene_sample_params, lpips_alex, gt_depths, gt_valid_depths)
  File "run_nerf.py", line 789, in train_nerf
    scene_sample_params, args)
  File "run_nerf.py", line 728, in complete_and_check_depth
    invalidate_large_std_threshold=args.invalidate_large_std_threshold)
  File "run_nerf.py", line 678, in complete_depth
    ckpt = torch.load(model_path)
  File "/miniconda/envs/env/lib/python3.7/site-packages/torch/serialization.py", line 713, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/miniconda/envs/env/lib/python3.7/site-packages/torch/serialization.py", line 920, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: A load persistent id instruction was encountered,
but no persistent_load function was specified.

I've tried the following pytorch versions: 1.11.0, 1.10.0, 1.9.1, and 1.9.0.
Could you please confirm that this is the correct link for the depth completion network weights?

Thank you in advance for your help.

Question about the resolution of the images

I tried to reuse the depth network that you trained on ScanNet:

When I input my own images, I get some errors:

Does the model have any requirements for input? The resolution of my image is (1920, 1080)