zju3dv / enerf

SIGGRAPH Asia 2022: Code for "Efficient Neural Radiance Fields for Interactive Free-viewpoint Video"

Home Page: https://zju3dv.github.io/enerf

License: Other

Python 100.00%

4d-reconstruction dynamic-view-synthesis novel-view-synthesis siggraph-asia-2022

enerf's Issues

resolution ratio of input image

Hi, the results look a little blurry when I visualize them with gui_human.py. Is the resolution ratio (input_ratio in the YAML) the cause? Would the results be noticeably sharper if it were set to 1.0 for both training and inference? Thank you!

About video on the website

Hi Haotong and Sida,

awesome work! I believe many are as impressed as I am. My question is about the experimental setting for the video on your website since it's not mentioned in the main paper. I wonder:
(1) how many cameras are you using?
(2) what are the training and testing splits? e.g. are testing done on completely new videos? Are the training data any similar to the test videos? etc.
(3) Are these generated with finetuning?

Thank you very much for your awesome work!

Super low GPU/CPU usage while training

I'm training with enerf-outdoor from scratch using the default configuration. The GPU/CPU usage is super low. What might be the cause?

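How to train ENeRF on my own dataset

Dear authors,
Hello! I recently read your group's ENeRF paper. It is great work and has inspired me a lot. After reproducing part of your results, I tried to run ENeRF on my own dataset but ran into some problems. Could you share your method for converting a COLMAP SfM result into ENeRF input?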

No such file or directory: 'lib/visualizers/enerf.py'

Hi, thank you for your hard work. I have been trying to run the visualize module, but maybe it is not published yet? It says No such file or directory: 'lib/visualizers/enerf.py'. Or maybe I am doing something wrong.

I ran the following command: python run.py --type visualize --cfg_file configs/enerf/llff_eval.yaml

Looking forward to your release of the visualize, interactive rendering code and outdoor dataset!

Thank you

How to train on my own llff data?

Thanks for your great work! The README only describes how to train on the DTU dataset, and Depth_raw is difficult to obtain if I want to train on my own LLFF data. What should I do? I would appreciate a reply.

Question about quantitative evaluation on pretrained model

I used the provided generalization model to run the evaluation on the DTU dataset as described in the README, and got the following PSNR, SSIM and LPIPS values:

whereas the README reports the following quantitative results:

Why are the quantitative results different? Could you also share your own evaluation results? Thanks.

weird result after visualizing on the ENeRF-Outdoor dataset

Hi, thanks for your great work.
But when I run inference on your ENeRF-Outdoor dataset, I get a weird result:

It doesn't make sense, as the colors are so gray. Could you please tell us what causes this?

Camera Color Calibration

Thanks for your contribution. How were the camera colors calibrated when capturing the dataset used in your project? Do the cameras need any manual settings?

How to handle our own data, can you provide a tutorial?

Hello, I want to use my own data for training and rendering, how do I process my own data? In addition, in the video on your project page, can you provide interfaces and data sets for testing? I would be very grateful if a tutorial could be provided on this.

different result of standing and sitting person

Hi, I built two datasets, one of a standing person and one of a sitting person. The training result on the standing data is much better than on the sitting data (by about 4 PSNR). I noticed a comment saying that OpenPose achieves better results on standing people than on sitting people:
zju3dv/EasyMocap#94
Could that be what causes ENeRF to achieve different results? Thank you!

Coordinate transformation problem

I'm not quite clear about the calculation of self.XYZ in enerf_interactivate.py, could you please help explain? I know it's roughly pixel coordinates to camera coordinates, but it's a little confusing why I need to invert the matrix and then transpose.
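A minimal sketch of why an inverse and a transpose can both appear, assuming self.XYZ is built from homogeneous pixel coordinates stored as row vectors (an assumption; the repository's exact code is not reproduced here). For a column vector p the camera-space direction is inv(K) @ p; when pixels are stored as the rows of an (H, W, 3) array, the same product is written p_row @ inv(K).T. The intrinsics below are made-up values.

```python
import numpy as np

# Hypothetical pinhole intrinsics (focal lengths and principal point are made up).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
H, W = 480, 640

# Homogeneous pixel coordinates (u, v, 1), one row vector per pixel: shape (H, W, 3).
u, v = np.meshgrid(np.arange(W), np.arange(H))
pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(np.float64)

# Column-vector form: d = inv(K) @ p.
# Row-vector form of the same product: p_row @ inv(K).T
dirs = pix @ np.linalg.inv(K).T

# Both forms agree for any single pixel.
p = pix[100, 200]
assert np.allclose(dirs[100, 200], np.linalg.inv(K) @ p)
```

The transpose is therefore only a consequence of the row-vector layout, not an extra geometric operation.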

Pre-trained model link has expired

Hi,

The pre-trained model link in Readme.md (dtu_pretrain) has expired.
Kindly re-share the model.

Thank You!

Strange error while fine-tuning on zjumocap

I want to fine-tune on zjumocap 313, as the config does, but I ran into the following issue.
Is it caused by Windows?

By the way, I changed this line in ENeRF/lib/utils/net_utils.py (line 391 in fe1adae):

os.system('mkdir -p {}'.format(model_dir))

to something like os.makedirs(model_dir, exist_ok=True), because on Windows the original command creates a folder named "-p" under the workspace.

Error logs:

(smpl-py38-torch110-cu111) PS D:\workspace4tian\ENeRF> python train_net.py --cfg_file configs/enerf/zjumocap/zjumocap_train.yaml
Workspace:  D:\workspace4tian\ENeRF
configs/enerf/dtu_pretrain.yaml
configs/enerf/zjumocap/zjumocap_train.yaml
EXP NAME:  zjumocap
D:\sdk\envs\smpl-py38-torch110-cu111\lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
D:\sdk\envs\smpl-py38-torch110-cu111\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
D:\sdk\envs\smpl-py38-torch110-cu111\lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
  warnings.warn(
D:\sdk\envs\smpl-py38-torch110-cu111\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
  warnings.warn(msg)
Loading model from: D:\sdk\envs\smpl-py38-torch110-cu111\lib\site-packages\lpips\weights\v0.1\vgg.pth
A subdirectory or file D:\workspace4tian\ENeRF\result\enerf\zjumocap\default already exists.
Error occurred while processing: D:\workspace4tian\ENeRF\result\enerf\zjumocap\default.
Load pretrain model: D:\workspace4tian\ENeRF\trained_model\enerf\dtu_pretrain\latest.pth
Traceback (most recent call last):
  File "train_net.py", line 117, in <module>
    main()
    w.start()
  File "C:\Program Files\Python38\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\Program Files\Python38\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Program Files\Python38\lib\multiprocessing\context.py", line 327, in _Popen
    return Popen(process_obj)
  File "C:\Program Files\Python38\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Program Files\Python38\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <class 'lib.datasets.zjumocap.enerf.Dataset'>: it's not the same object as lib.datasets.zjumocap.enerf.Dataset
Workspace:  D:\workspace4tian\ENeRF
configs/enerf/dtu_pretrain.yaml
configs/enerf/zjumocap/zjumocap_train.yaml
EXP NAME:  zjumocap
(smpl-py38-torch110-cu111) PS D:\workspace4tian\ENeRF> Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Program Files\Python38\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Program Files\Python38\lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
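
On Windows, worker processes are started with the spawn method, so the dataset object has to be pickled before it is sent to each worker. The PicklingError above is what pickle raises when the class it finds under the recorded module and name is a different object from the class of the instance. The sketch below reproduces that kind of failure with the standard library only. That the dataset module ends up loaded twice in the repository is an assumption, not something confirmed here. A common workaround to try is setting the number of DataLoader workers to 0.

```python
import pickle


class Dataset:
    """Stand-in for a dataset class (hypothetical, not the repository's class)."""


first_instance = Dataset()


# Defining the class again simulates the module being loaded a second time:
# the name `Dataset` now refers to a new class object.
class Dataset:  # noqa: F811
    """Second class object with the same module and name."""


def try_pickle(obj):
    """Return None on success, or the error message if pickling fails."""
    try:
        pickle.dumps(obj)
        return None
    except pickle.PicklingError as exc:
        return str(exc)


# The instance still points at the first class object, so the lookup mismatches.
error_message = try_pickle(first_instance)
print(error_message)
```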

Where the code confuses me

Why does input_views_num need to be incremented by 1 in lib/datasets/zjumocap/enerf.py?
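
One plausible reason, offered as an assumption rather than a reading of the repository's code: when the target view can itself be among the candidate source views (as is common during training), a nearest-view search returns the target first, so one extra view is requested and the target is then dropped. The helper and the 1D camera positions below are hypothetical.

```python
def pick_source_views(target, candidates, positions, num_views):
    """Pick the num_views candidates closest to the target, never the target itself."""
    ranked = sorted(candidates, key=lambda view: abs(positions[view] - positions[target]))
    nearest = ranked[:num_views + 1]  # one extra, in case the target is included
    return [view for view in nearest if view != target][:num_views]


positions = {0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}  # hypothetical cameras on a line
views = list(positions)

with_target = pick_source_views(2, views, positions, num_views=2)            # target is a candidate
without_target = pick_source_views(2, [0, 1, 3, 4], positions, num_views=2)  # target is held out
print(with_target, without_target)  # [1, 3] [1, 3]
```

Without the extra slot, the first call would return only one usable source view after the target is removed.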

Run ENeRF on a self-made ZJU-MoCap dataset

I wanted to run ENeRF on a dataset I made myself, which I obtained with EasyMocap and this link. Now I have a file structure like this:

I'm still missing files such as new_vertices that are needed to run the visualization. How can I fix this?

Which kinds of scenes generalize well without fine-tuning?

How to make bounding box data?

Hi, I want to train ENeRF with my own data.

My data was captured with a fixed 3×7 grid of multi-view cameras.

At the moment, I need to create bounding box data (*.npy) for my data in order to train with the ENeRF-Outdoor dataset setting.

Please let me know how to create the bounding boxes
(or is there any code for creating them?)
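
One generic way to obtain a 3D bounding box, assuming a set of 3D points on the subject is already available (for example from a visual hull or triangulated keypoints), is to take the per-axis minimum and maximum and add a margin. The sketch below does that with NumPy. The (2, 3) min/max layout, the margin, and the file name are assumptions; the format the ENeRF-Outdoor loader expects should be checked against the dataset's own .npy files.

```python
import numpy as np


def bounding_box(points, margin=0.05):
    """Axis-aligned box around an (N, 3) point set, returned as [[min_xyz], [max_xyz]]."""
    points = np.asarray(points, dtype=np.float32)
    lower = points.min(axis=0) - margin
    upper = points.max(axis=0) + margin
    return np.stack([lower, upper])  # shape (2, 3)


# Made-up points standing in for one frame of reconstructed geometry.
points = np.array([[0.1, -0.2, 0.0],
                   [0.4, 0.3, 1.7],
                   [-0.3, 0.1, 0.9]])
bbox = bounding_box(points)
print(bbox)

# np.save("000000.npy", bbox)  # hypothetical file name, one file per frame
```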

Results on ENeRF-Outdoor dataset and poor quality depth

Hi, thanks for the great work!
However, after running your training script (python train_net.py --cfg_file configs/enerf/enerf_outdoor/actor1.yaml) on Actor1 for 50 epochs, I am getting the following results. The results for the color prediction are not as good as advertised on your project page, with lots of warping of the background. Also, the depth maps are quite poor, with the depth of the shadow region being incorrectly predicted. Do you know why this might be?

Color:
https://user-images.githubusercontent.com/9107279/219429442-24e2cc1d-bb5b-4d78-9f58-588e318fdbaa.mp4

Depth:
https://user-images.githubusercontent.com/9107279/219429583-eccd8139-173f-4a6c-b4c6-26d0e83e5db9.mp4

Question about the mask_util

Hi, Thanks for your great work.
However, after reading your paper and code, I noticed a mask_util file that includes the ADE20K labels, which is not mentioned in your paper.
I am curious about the purpose of this file.
Is it related to your future work, or can I just ignore it?

Question about the dimension of Forward_feat and render_rays in network_composite.py

Hi, I am confused about the dimensions in the Forward_feat and render_rays functions: they use B, S, C, H, W, which differs from the conventional 5D layout B, C, D, H, W.
I assume B is the batch size and C, H, W are color, height and width, but I am not sure what the S dimension means.
Could you give me some intuition for this dimension?

Thanks!
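
For illustration, a sketch of how a (B, S, C, H, W) array can arise, reading S as the number of source views. This reading is an assumption based on the line `B, S_V, C, H, W = batch['src_inps'].shape` quoted in the homo_warp issue on this page, not a confirmed statement about network_composite.py; the sizes are made up.

```python
import numpy as np

B, S, C, H, W = 2, 3, 3, 4, 5  # batch, source views, channels, height, width (made-up sizes)

# One (B, C, H, W) image batch per source view.
views = [np.random.rand(B, C, H, W) for _ in range(S)]

# Stacking along a new axis 1 gives the per-view layout (B, S, C, H, W).
src_inps = np.stack(views, axis=1)

# 2D convolutions expect 4D input, so the view axis is often folded into the
# batch axis before feature extraction and unfolded again afterwards.
flat = src_inps.reshape(B * S, C, H, W)
restored = flat.reshape(B, S, C, H, W)
print(src_inps.shape, flat.shape)
```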

Config file for fine-tune zjumocap dataset

Hi,

Thanks for sharing this nice work.

I would like to know how to fine-tune the model on the zjumocap dataset.

I modified the config starting from zjumocap_eval.yaml, but the results are worse than those of the pre-trained model.

Do you have any suggestions?

Thanks !!!

Visualization error in ENeRF-Outdoor dataset

Hi, thanks for the great work!

I tried to visualize the ENeRF-Outdoor dataset by running the command below (using the pretrained model you provided).

python run.py --type visualize --cfg_file configs/enerf/enerf_outdoor/actor1_path.yaml

However, the error below occurred. When I looked for a solution, I guessed that it happens because the pretrained model was trained on multiple GPUs and I loaded it on a single GPU.

Therefore, I would like to ask the following question.

Please tell me how to visualize the ENeRF-Outdoor dataset on a single GPU using the pretrained model you provided.

The ZJU-MoCap interactive rendering and the ZJU-MoCap / LLFF / NeRF / DTU dataset evaluations you provided work normally on my single GPU.

I would like to request the ENeRF-Outdoor / ST-NeRF dataset interactive rendering code that can be found on your project page.
I think it has already been developed, so I sincerely ask you to share it.

-----------------------------------ERROR Detail--------------------------------------

EXP NAME: actor1

load model: /home/ubuntu/ENeRF/trained_model/enerf/actor1/latest.pth
Traceback (most recent call last):
  File "run.py", line 106, in <module>
    globals()['run_' + args.type]()
  File "run.py", line 90, in run_visualize
    load_network(network,
  File "/home/ubuntu/ENeRF/lib/utils/net_utils.py", line 443, in load_network
    net.load_state_dict(pretrained_model['net'], strict=strict)
  File "/home/ubuntu/.conda/envs/enerf/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1406, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Network:
Missing key(s) in state_dict: "feature_net_bg.conv0.0.conv.weight", "feature_net_bg.conv0.0.bn.weight", "feature_net_bg.conv0.0.bn.bias", "feature_net_bg.conv0.0.bn.running_mean", "feature_net_bg.conv0.0.bn.running_var", "feature_net_bg.conv0.1.conv.weight", "feature_net_bg.conv0.1.bn.weight", "feature_net_bg.conv0.1.bn.bias", "feature_net_bg.conv0.1.bn.running_mean", "feature_net_bg.conv0.1.bn.running_var", "feature_net_bg.conv1.0.conv.weight", "feature_net_bg.conv1.0.bn.weight", "feature_net_bg.conv1.0.bn.bias", "feature_net_bg.conv1.0.bn.running_mean", "feature_net_bg.conv1.0.bn.running_var", "feature_net_bg.conv1.1.conv.weight", "feature_net_bg.conv1.1.bn.weight", "feature_net_bg.conv1.1.bn.bias", "feature_net_bg.conv1.1.bn.running_mean", "feature_net_bg.conv1.1.bn.running_var", "feature_net_bg.conv2.0.conv.weight", "feature_net_bg.conv2.0.bn.weight", "feature_net_bg.conv2.0.bn.bias", "feature_net_bg.conv2.0.bn.running_mean", "feature_net_bg.conv2.0.bn.running_var", "feature_net_bg.conv2.1.conv.weight", "feature_net_bg.conv2.1.bn.weight", "feature_net_bg.conv2.1.bn.bias", "feature_net_bg.conv2.1.bn.running_mean", "feature_net_bg.conv2.1.bn.running_var", "feature_net_bg.toplayer.weight", "feature_net_bg.toplayer.bias", "feature_net_bg.lat1.weight", "feature_net_bg.lat1.bias", "feature_net_bg.lat0.weight", "feature_net_bg.lat0.bias", "feature_net_bg.smooth1.weight", "feature_net_bg.smooth1.bias", "feature_net_bg.smooth0.weight", "feature_net_bg.smooth0.bias", "cost_reg_0_layer0.conv0.conv.weight", "cost_reg_0_layer0.conv0.bn.weight", "cost_reg_0_layer0.conv0.bn.bias", "cost_reg_0_layer0.conv0.bn.running_mean", "cost_reg_0_layer0.conv0.bn.running_var", "cost_reg_0_layer0.conv1.conv.weight", "cost_reg_0_layer0.conv1.bn.weight", "cost_reg_0_layer0.conv1.bn.bias", "cost_reg_0_layer0.conv1.bn.running_mean", "cost_reg_0_layer0.conv1.bn.running_var", "cost_reg_0_layer0.conv2.conv.weight", "cost_reg_0_layer0.conv2.bn.weight", "cost_reg_0_layer0.conv2.bn.bias", 
"cost_reg_0_layer0.conv2.bn.running_mean", "cost_reg_0_layer0.conv2.bn.running_var", "cost_reg_0_layer0.conv3.conv.weight", "cost_reg_0_layer0.conv3.bn.weight", "cost_reg_0_layer0.conv3.bn.bias", "cost_reg_0_layer0.conv3.bn.running_mean", "cost_reg_0_layer0.conv3.bn.running_var", "cost_reg_0_layer0.conv4.conv.weight", "cost_reg_0_layer0.conv4.bn.weight", "cost_reg_0_layer0.conv4.bn.bias", "cost_reg_0_layer0.conv4.bn.running_mean", "cost_reg_0_layer0.conv4.bn.running_var", "cost_reg_0_layer0.conv9.0.weight", "cost_reg_0_layer0.conv9.1.weight", "cost_reg_0_layer0.conv9.1.bias", "cost_reg_0_layer0.conv9.1.running_mean", "cost_reg_0_layer0.conv9.1.running_var", "cost_reg_0_layer0.conv11.0.weight", "cost_reg_0_layer0.conv11.1.weight", "cost_reg_0_layer0.conv11.1.bias", "cost_reg_0_layer0.conv11.1.running_mean", "cost_reg_0_layer0.conv11.1.running_var", "cost_reg_0_layer0.depth_conv.0.weight", "cost_reg_0_layer0.feat_conv.0.weight", "nerf_0_layer0.agg.global_fc.0.weight", "nerf_0_layer0.agg.global_fc.0.bias", "nerf_0_layer0.agg.agg_w_fc.0.weight", "nerf_0_layer0.agg.agg_w_fc.0.bias", "nerf_0_layer0.agg.fc.0.weight", "nerf_0_layer0.agg.fc.0.bias", "nerf_0_layer0.lr0.0.weight", "nerf_0_layer0.lr0.0.bias", "nerf_0_layer0.sigma.0.weight", "nerf_0_layer0.sigma.0.bias", "nerf_0_layer0.color.0.weight", "nerf_0_layer0.color.0.bias", "nerf_0_layer0.color.2.weight", "nerf_0_layer0.color.2.bias", "cost_reg_0_bg.conv0.conv.weight", "cost_reg_0_bg.conv0.bn.weight", "cost_reg_0_bg.conv0.bn.bias", "cost_reg_0_bg.conv0.bn.running_mean", "cost_reg_0_bg.conv0.bn.running_var", "cost_reg_0_bg.conv1.conv.weight", "cost_reg_0_bg.conv1.bn.weight", "cost_reg_0_bg.conv1.bn.bias", "cost_reg_0_bg.conv1.bn.running_mean", "cost_reg_0_bg.conv1.bn.running_var", "cost_reg_0_bg.conv2.conv.weight", "cost_reg_0_bg.conv2.bn.weight", "cost_reg_0_bg.conv2.bn.bias", "cost_reg_0_bg.conv2.bn.running_mean", "cost_reg_0_bg.conv2.bn.running_var", "cost_reg_0_bg.conv3.conv.weight", "cost_reg_0_bg.conv3.bn.weight", 
"cost_reg_0_bg.conv3.bn.bias", "cost_reg_0_bg.conv3.bn.running_mean", "cost_reg_0_bg.conv3.bn.running_var", "cost_reg_0_bg.conv4.conv.weight", "cost_reg_0_bg.conv4.bn.weight", "cost_reg_0_bg.conv4.bn.bias", "cost_reg_0_bg.conv4.bn.running_mean", "cost_reg_0_bg.conv4.bn.running_var", "cost_reg_0_bg.conv9.0.weight", "cost_reg_0_bg.conv9.1.weight", "cost_reg_0_bg.conv9.1.bias", "cost_reg_0_bg.conv9.1.running_mean", "cost_reg_0_bg.conv9.1.running_var", "cost_reg_0_bg.conv11.0.weight", "cost_reg_0_bg.conv11.1.weight", "cost_reg_0_bg.conv11.1.bias", "cost_reg_0_bg.conv11.1.running_mean", "cost_reg_0_bg.conv11.1.running_var", "cost_reg_0_bg.depth_conv.0.weight", "cost_reg_0_bg.feat_conv.0.weight", "nerf_0_bg.agg.global_fc.0.weight", "nerf_0_bg.agg.global_fc.0.bias", "nerf_0_bg.agg.agg_w_fc.0.weight", "nerf_0_bg.agg.agg_w_fc.0.bias", "nerf_0_bg.agg.fc.0.weight", "nerf_0_bg.agg.fc.0.bias", "nerf_0_bg.lr0.0.weight", "nerf_0_bg.lr0.0.bias", "nerf_0_bg.sigma.0.weight", "nerf_0_bg.sigma.0.bias", "nerf_0_bg.color.0.weight", "nerf_0_bg.color.0.bias", "nerf_0_bg.color.2.weight", "nerf_0_bg.color.2.bias", "cost_reg_1_layer0.conv0.conv.weight", "cost_reg_1_layer0.conv0.bn.weight", "cost_reg_1_layer0.conv0.bn.bias", "cost_reg_1_layer0.conv0.bn.running_mean", "cost_reg_1_layer0.conv0.bn.running_var", "cost_reg_1_layer0.conv1.conv.weight", "cost_reg_1_layer0.conv1.bn.weight", "cost_reg_1_layer0.conv1.bn.bias", "cost_reg_1_layer0.conv1.bn.running_mean", "cost_reg_1_layer0.conv1.bn.running_var", "cost_reg_1_layer0.conv2.conv.weight", "cost_reg_1_layer0.conv2.bn.weight", "cost_reg_1_layer0.conv2.bn.bias", "cost_reg_1_layer0.conv2.bn.running_mean", "cost_reg_1_layer0.conv2.bn.running_var", "cost_reg_1_layer0.conv3.conv.weight", "cost_reg_1_layer0.conv3.bn.weight", "cost_reg_1_layer0.conv3.bn.bias", "cost_reg_1_layer0.conv3.bn.running_mean", "cost_reg_1_layer0.conv3.bn.running_var", "cost_reg_1_layer0.conv4.conv.weight", "cost_reg_1_layer0.conv4.bn.weight", 
"cost_reg_1_layer0.conv4.bn.bias", "cost_reg_1_layer0.conv4.bn.running_mean", "cost_reg_1_layer0.conv4.bn.running_var", "cost_reg_1_layer0.conv9.0.weight", "cost_reg_1_layer0.conv9.1.weight", "cost_reg_1_layer0.conv9.1.bias", "cost_reg_1_layer0.conv9.1.running_mean", "cost_reg_1_layer0.conv9.1.running_var", "cost_reg_1_layer0.conv11.0.weight", "cost_reg_1_layer0.conv11.1.weight", "cost_reg_1_layer0.conv11.1.bias", "cost_reg_1_layer0.conv11.1.running_mean", "cost_reg_1_layer0.conv11.1.running_var", "cost_reg_1_layer0.depth_conv.0.weight", "cost_reg_1_layer0.feat_conv.0.weight", "nerf_1_layer0.agg.global_fc.0.weight", "nerf_1_layer0.agg.global_fc.0.bias", "nerf_1_layer0.agg.agg_w_fc.0.weight", "nerf_1_layer0.agg.agg_w_fc.0.bias", "nerf_1_layer0.agg.fc.0.weight", "nerf_1_layer0.agg.fc.0.bias", "nerf_1_layer0.lr0.0.weight", "nerf_1_layer0.lr0.0.bias", "nerf_1_layer0.sigma.0.weight", "nerf_1_layer0.sigma.0.bias", "nerf_1_layer0.color.0.weight", "nerf_1_layer0.color.0.bias", "nerf_1_layer0.color.2.weight", "nerf_1_layer0.color.2.bias", "cost_reg_1_bg.conv0.conv.weight", "cost_reg_1_bg.conv0.bn.weight", "cost_reg_1_bg.conv0.bn.bias", "cost_reg_1_bg.conv0.bn.running_mean", "cost_reg_1_bg.conv0.bn.running_var", "cost_reg_1_bg.conv1.conv.weight", "cost_reg_1_bg.conv1.bn.weight", "cost_reg_1_bg.conv1.bn.bias", "cost_reg_1_bg.conv1.bn.running_mean", "cost_reg_1_bg.conv1.bn.running_var", "cost_reg_1_bg.conv2.conv.weight", "cost_reg_1_bg.conv2.bn.weight", "cost_reg_1_bg.conv2.bn.bias", "cost_reg_1_bg.conv2.bn.running_mean", "cost_reg_1_bg.conv2.bn.running_var", "cost_reg_1_bg.conv3.conv.weight", "cost_reg_1_bg.conv3.bn.weight", "cost_reg_1_bg.conv3.bn.bias", "cost_reg_1_bg.conv3.bn.running_mean", "cost_reg_1_bg.conv3.bn.running_var", "cost_reg_1_bg.conv4.conv.weight", "cost_reg_1_bg.conv4.bn.weight", "cost_reg_1_bg.conv4.bn.bias", "cost_reg_1_bg.conv4.bn.running_mean", "cost_reg_1_bg.conv4.bn.running_var", "cost_reg_1_bg.conv9.0.weight", "cost_reg_1_bg.conv9.1.weight", 
"cost_reg_1_bg.conv9.1.bias", "cost_reg_1_bg.conv9.1.running_mean", "cost_reg_1_bg.conv9.1.running_var", "cost_reg_1_bg.conv11.0.weight", "cost_reg_1_bg.conv11.1.weight", "cost_reg_1_bg.conv11.1.bias", "cost_reg_1_bg.conv11.1.running_mean", "cost_reg_1_bg.conv11.1.running_var", "cost_reg_1_bg.depth_conv.0.weight", "cost_reg_1_bg.feat_conv.0.weight", "nerf_1_bg.agg.global_fc.0.weight", "nerf_1_bg.agg.global_fc.0.bias", "nerf_1_bg.agg.agg_w_fc.0.weight", "nerf_1_bg.agg.agg_w_fc.0.bias", "nerf_1_bg.agg.fc.0.weight", "nerf_1_bg.agg.fc.0.bias", "nerf_1_bg.lr0.0.weight", "nerf_1_bg.lr0.0.bias", "nerf_1_bg.sigma.0.weight", "nerf_1_bg.sigma.0.bias", "nerf_1_bg.color.0.weight", "nerf_1_bg.color.0.bias", "nerf_1_bg.color.2.weight", "nerf_1_bg.color.2.bias".
Unexpected key(s) in state_dict: "cost_reg_0.conv0.conv.weight", "cost_reg_0.conv0.bn.weight", "cost_reg_0.conv0.bn.bias", "cost_reg_0.conv0.bn.running_mean", "cost_reg_0.conv0.bn.running_var", "cost_reg_0.conv0.bn.num_batches_tracked", "cost_reg_0.conv1.conv.weight", "cost_reg_0.conv1.bn.weight", "cost_reg_0.conv1.bn.bias", "cost_reg_0.conv1.bn.running_mean", "cost_reg_0.conv1.bn.running_var", "cost_reg_0.conv1.bn.num_batches_tracked", "cost_reg_0.conv2.conv.weight", "cost_reg_0.conv2.bn.weight", "cost_reg_0.conv2.bn.bias", "cost_reg_0.conv2.bn.running_mean", "cost_reg_0.conv2.bn.running_var", "cost_reg_0.conv2.bn.num_batches_tracked", "cost_reg_0.conv3.conv.weight", "cost_reg_0.conv3.bn.weight", "cost_reg_0.conv3.bn.bias", "cost_reg_0.conv3.bn.running_mean", "cost_reg_0.conv3.bn.running_var", "cost_reg_0.conv3.bn.num_batches_tracked", "cost_reg_0.conv4.conv.weight", "cost_reg_0.conv4.bn.weight", "cost_reg_0.conv4.bn.bias", "cost_reg_0.conv4.bn.running_mean", "cost_reg_0.conv4.bn.running_var", "cost_reg_0.conv4.bn.num_batches_tracked", "cost_reg_0.conv9.0.weight", "cost_reg_0.conv9.1.weight", "cost_reg_0.conv9.1.bias", "cost_reg_0.conv9.1.running_mean", "cost_reg_0.conv9.1.running_var", "cost_reg_0.conv9.1.num_batches_tracked", "cost_reg_0.conv11.0.weight", "cost_reg_0.conv11.1.weight", "cost_reg_0.conv11.1.bias", "cost_reg_0.conv11.1.running_mean", "cost_reg_0.conv11.1.running_var", "cost_reg_0.conv11.1.num_batches_tracked", "cost_reg_0.depth_conv.0.weight", "cost_reg_0.feat_conv.0.weight", "nerf_0.agg.view_fc.0.weight", "nerf_0.agg.view_fc.0.bias", "nerf_0.agg.global_fc.0.weight", "nerf_0.agg.global_fc.0.bias", "nerf_0.agg.agg_w_fc.0.weight", "nerf_0.agg.agg_w_fc.0.bias", "nerf_0.agg.fc.0.weight", "nerf_0.agg.fc.0.bias", "nerf_0.lr0.0.weight", "nerf_0.lr0.0.bias", "nerf_0.sigma.0.weight", "nerf_0.sigma.0.bias", "nerf_0.color.0.weight", "nerf_0.color.0.bias", "nerf_0.color.2.weight", "nerf_0.color.2.bias", "cost_reg_1.conv0.conv.weight", 
"cost_reg_1.conv0.bn.weight", "cost_reg_1.conv0.bn.bias", "cost_reg_1.conv0.bn.running_mean", "cost_reg_1.conv0.bn.running_var", "cost_reg_1.conv0.bn.num_batches_tracked", "cost_reg_1.conv1.conv.weight", "cost_reg_1.conv1.bn.weight", "cost_reg_1.conv1.bn.bias", "cost_reg_1.conv1.bn.running_mean", "cost_reg_1.conv1.bn.running_var", "cost_reg_1.conv1.bn.num_batches_tracked", "cost_reg_1.conv2.conv.weight", "cost_reg_1.conv2.bn.weight", "cost_reg_1.conv2.bn.bias", "cost_reg_1.conv2.bn.running_mean", "cost_reg_1.conv2.bn.running_var", "cost_reg_1.conv2.bn.num_batches_tracked", "cost_reg_1.conv3.conv.weight", "cost_reg_1.conv3.bn.weight", "cost_reg_1.conv3.bn.bias", "cost_reg_1.conv3.bn.running_mean", "cost_reg_1.conv3.bn.running_var", "cost_reg_1.conv3.bn.num_batches_tracked", "cost_reg_1.conv4.conv.weight", "cost_reg_1.conv4.bn.weight", "cost_reg_1.conv4.bn.bias", "cost_reg_1.conv4.bn.running_mean", "cost_reg_1.conv4.bn.running_var", "cost_reg_1.conv4.bn.num_batches_tracked", "cost_reg_1.conv5.conv.weight", "cost_reg_1.conv5.bn.weight", "cost_reg_1.conv5.bn.bias", "cost_reg_1.conv5.bn.running_mean", "cost_reg_1.conv5.bn.running_var", "cost_reg_1.conv5.bn.num_batches_tracked", "cost_reg_1.conv6.conv.weight", "cost_reg_1.conv6.bn.weight", "cost_reg_1.conv6.bn.bias", "cost_reg_1.conv6.bn.running_mean", "cost_reg_1.conv6.bn.running_var", "cost_reg_1.conv6.bn.num_batches_tracked", "cost_reg_1.conv7.0.weight", "cost_reg_1.conv7.1.weight", "cost_reg_1.conv7.1.bias", "cost_reg_1.conv7.1.running_mean", "cost_reg_1.conv7.1.running_var", "cost_reg_1.conv7.1.num_batches_tracked", "cost_reg_1.conv9.0.weight", "cost_reg_1.conv9.1.weight", "cost_reg_1.conv9.1.bias", "cost_reg_1.conv9.1.running_mean", "cost_reg_1.conv9.1.running_var", "cost_reg_1.conv9.1.num_batches_tracked", "cost_reg_1.conv11.0.weight", "cost_reg_1.conv11.1.weight", "cost_reg_1.conv11.1.bias", "cost_reg_1.conv11.1.running_mean", "cost_reg_1.conv11.1.running_var", "cost_reg_1.conv11.1.num_batches_tracked", 
"cost_reg_1.depth_conv.0.weight", "cost_reg_1.feat_conv.0.weight", "nerf_1.agg.view_fc.0.weight", "nerf_1.agg.view_fc.0.bias", "nerf_1.agg.global_fc.0.weight", "nerf_1.agg.global_fc.0.bias", "nerf_1.agg.agg_w_fc.0.weight", "nerf_1.agg.agg_w_fc.0.bias", "nerf_1.agg.fc.0.weight", "nerf_1.agg.fc.0.bias", "nerf_1.lr0.0.weight", "nerf_1.lr0.0.bias", "nerf_1.sigma.0.weight", "nerf_1.sigma.0.bias", "nerf_1.color.0.weight", "nerf_1.color.0.bias", "nerf_1.color.2.weight", "nerf_1.color.2.bias".

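The two key lists in the error can be compared mechanically. The sketch below, using plain Python sets and a few key names copied from the log, groups missing and unexpected keys by their top-level module. Checkpoints saved from a DataParallel-wrapped model usually differ only by a `module.` prefix on every key, which is not the pattern here; the names in the log (`cost_reg_0_layer0` and `cost_reg_0_bg` versus `cost_reg_0`) look more like two different network definitions, though that reading is an assumption.

```python
from collections import Counter


def top_level_modules(keys):
    """Count parameter keys by the name before the first dot."""
    return Counter(key.split(".", 1)[0] for key in keys)


def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) the way a strict state-dict load reports them."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)     # the model wants them, the file lacks them
    unexpected = sorted(checkpoint_keys - model_keys)  # the file has them, the model does not
    return missing, unexpected


# A few key names copied from the error log (not the full lists).
model_keys = [
    "feature_net_bg.conv0.0.conv.weight",
    "cost_reg_0_layer0.conv0.conv.weight",
    "cost_reg_0_bg.conv0.conv.weight",
    "nerf_0_layer0.sigma.0.weight",
]
checkpoint_keys = [
    "cost_reg_0.conv0.conv.weight",
    "nerf_0.sigma.0.weight",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing modules:   ", dict(top_level_modules(missing)))
print("unexpected modules:", dict(top_level_modules(unexpected)))
```

If the grouped names differ in structure rather than by a common prefix, checking that the config file and the checkpoint belong to the same network variant is a reasonable first step.
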
The agreement file of the ENeRF-Outdoor dataset

I want to request the ENeRF-Outdoor dataset, but I can't find the agreement file that you mentioned for requesting the dataset.

Error in DTU Eval

When training with train_net.py, an error occurs during evaluation:
cv2.error: resize.cpp:4062: error: (-215:Assertion failed) !ssize.empty() in function 'resize'

I traced the error to lines 90-93 of dtu/enerf.py (the shape after each step is given in the trailing comment):
tar_dpt = data_utils.read_pfm(scene_info['dpt_paths'][tar_view])[0].astype(np.float32)  # tar_dpt.shape: (128, 160)
tar_dpt = cv2.resize(tar_dpt, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_NEAREST)  # tar_dpt.shape: (64, 80)
tar_dpt = tar_dpt[44:556, 80:720]  # tar_dpt.shape: (20, 0)
tar_mask = (tar_dpt > 0.).astype(np.uint8)  # tar_mask.shape: (20, 0)

After the crop, tar_mask is empty.
How can I solve this?
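
The shapes reported above follow directly from Python's slicing rules. The sketch below uses only slice arithmetic to show that the hard-coded crop [44:556, 80:720] yields a 512×640 window on a 600×800 map but an empty (20, 0) result on the 64×80 map from this issue. That the crop was written for a 1200×1600 depth map halved to 600×800 is an inference from the numbers, not confirmed by the repository.

```python
def cropped_shape(shape, row_slice, col_slice):
    """Shape of array[row_slice, col_slice] for a 2D array of the given shape."""
    height, width = shape
    rows = len(range(*row_slice.indices(height)))
    cols = len(range(*col_slice.indices(width)))
    return rows, cols


crop_rows, crop_cols = slice(44, 556), slice(80, 720)

# Depth map from the issue: (128, 160) halved to (64, 80), then cropped.
small = cropped_shape((64, 80), crop_rows, crop_cols)
print(small)  # (20, 0)

# A depth map large enough for the crop: (1200, 1600) halved to (600, 800).
large = cropped_shape((600, 800), crop_rows, crop_cols)
print(large)  # (512, 640)
```

The column slice starts at 80, which is already past the last column of an 80-pixel-wide map, so the depth files being read are probably smaller than the crop expects.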

AttributeError: frames

Great work! I ran into a problem when running gui_human.py on the flower scene of the LLFF dataset with the parameters '--cfg_file configs/enerf/llff/flower.yaml'.
The error is:
Traceback (most recent call last):
  File "D:\pycharm\ENeRF\gui_human.py", line 380, in <module>
    main()
  File "D:\pycharm\ENeRF\gui_human.py", line 231, in main
    rend = Renderer() # prepare network and dataloader
  File "D:\pycharm\ENeRF\gui_human.py", line 50, in __init__
    self.frame_start = cfg.test_dataset.frames[0]
  File "D:\pycharm\ENeRF\lib\config\yacs.py", line 115, in __getattr__
    raise AttributeError(name)
AttributeError: frames
It seems that test_dataset has no 'frames' entry.

Could you please help me with this problem? Thanks in advance!
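
The traceback shows cfg.test_dataset.frames being read unconditionally, while the LLFF config apparently defines no frames key. A hedged sketch of a defensive lookup follows. AttrDict is a small stand-in written for this example, not the repository's yacs class, and the fallback value (0, 1, 1) is hypothetical. Whether gui_human.py is meant to support static scenes at all is not stated in this issue.

```python
class AttrDict(dict):
    """Minimal stand-in for a config node: dict keys readable as attributes."""

    def __getattr__(self, name):
        if name in self:
            return self[name]
        raise AttributeError(name)


def frame_range(dataset_cfg, default=(0, 1, 1)):
    """Return the frames entry if present, else a fallback for static scenes."""
    return list(dataset_cfg.get("frames", default))


human_cfg = AttrDict(scene="CoreView_313", frames=[0, 600, 100])  # made-up values
static_cfg = AttrDict(scene="flower")                             # no frames key

print(frame_range(human_cfg))   # [0, 600, 100]
print(frame_range(static_cfg))  # [0, 1, 1]
```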

Is there a GUI in this project?

Hi, thanks for your hard work. I ran the ENeRF evaluation and it works normally. I'm curious whether this project has a GUI for displaying the reconstructed model. Could you please tell me how to run it?

Dataset access

Dear author, hi!
Could I ask for the agreement form for downloading the dataset? The link to the agreement form that you provided earlier is no longer valid.

Great Work!

ENeRF is the first to achieve real-time photorealistic rendering of arbitrary dynamic scenes, which will greatly benefit future research. It also has great practical value for everyday applications!

RuntimeError: CUDA out of memory.

Hi, thanks for sharing your great work!

I am trying to run the evaluation on scan114 only (have not had the space to download the other datasets yet). However, I have encountered a CUDA out of memory runtime error as shown, after running the command python run.py --type evaluate --cfg_file configs/enerf/dtu_pretrain.yaml enerf.cas_config.render_if False,True enerf.cas_config.volume_planes 48,8 enerf.eval_depth True:

load model: /home/ENeRF-master/trained_model/enerf/dtu_pretrain/latest.pth
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Loading model from: /home/anaconda3/lib/python3.9/site-packages/lpips/weights/v0.1/vgg.pth
  0%|                                                                                                                                                                                        | 0/4 [00:03<?, ?it/s]
Traceback (most recent call last):
  File "/home/ENeRF-master/run.py", line 111, in <module>
    globals()['run_' + args.type]()
  File "/home/ENeRF-master/run.py", line 70, in run_evaluate
    output = network(batch)
  File "/home/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "lib/networks/enerf/network.py", line 96, in forward
    ret_i = self.batchify_rays(
  File "lib/networks/enerf/network.py", line 49, in batchify_rays
    ret = self.render_rays(rays[:, i:i + chunk], **kwargs)
  File "lib/networks/enerf/network.py", line 40, in render_rays
    net_output = nerf_model(vox_feat, img_feat_rgb_dir)
  File "/home/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/ENeRF-master/lib/networks/enerf/nerf.py", line 40, in forward
    x = torch.cat((x, img_feat_rgb_dir), dim=-1)
RuntimeError: CUDA out of memory. Tried to allocate 774.00 MiB (GPU 0; 23.70 GiB total capacity; 1.13 GiB already allocated; 321.56 MiB free; 1.45 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I have tried to include os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512" at the beginning of the run.py file, however, I have received the exact same error. Any suggestion on how I should resolve this error?

Thank you!
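
The traceback passes through batchify_rays, which renders rays[:, i:i + chunk] one chunk at a time, so a smaller chunk lowers peak memory at the cost of more iterations. The sketch below shows that pattern on plain Python lists; render_fn and the chunk size are hypothetical, and the configuration key that controls the chunk size in the repository is not shown in this issue. Note also that the message reports only 321.56 MiB free out of 23.70 GiB while PyTorch itself has reserved 1.45 GiB, which suggests that other processes on the same GPU may be holding most of the memory.

```python
def batchify(render_fn, rays, chunk):
    """Apply render_fn to rays in slices of at most `chunk` items and join the results."""
    outputs = []
    for start in range(0, len(rays), chunk):
        outputs.extend(render_fn(rays[start:start + chunk]))
    return outputs


peak = {"size": 0}  # largest chunk seen, a stand-in for peak memory use


def render_fn(ray_chunk):
    """Hypothetical renderer: doubles each value and records the chunk size."""
    peak["size"] = max(peak["size"], len(ray_chunk))
    return [2 * ray for ray in ray_chunk]


rays = list(range(10))
result = batchify(render_fn, rays, chunk=4)  # processed as chunks of 4, 4 and 2 rays
print(result, peak["size"])
```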

custom outdoor dataset

Hi! About building a custom outdoor dataset like yours, could you please give some suggestions about how to get my own 'background.ply'? Also, for the 3D bounding box in 'vhull', is there any faster method than those mentioned in Easymocap? Thank you for your help!

Issues on evaluation

Hello! Thanks for sharing the code!

When I evaluate this model on the test part of the LLFF dataset (unseen scenes in theory), I find that it performs much better on the 'fortress' scene than other methods such as IBRNet, while the other scenes show similar performance. So I would like to ask whether the 'latest.pth' you released is a model that has not been fine-tuned.

In addition, if I want to render an image whose size does not fit the cost volume network (such as the original image size of the LLFF dataset), how should I do it? (Maybe resize the depth and the original image?)

I'm looking forward to your reply.
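
On the image-size question, one generic option (an assumption, not the authors' answer) is to pad or crop the image so that its height and width are multiples of the network's total downsampling factor. The factor of 32 below is a placeholder and the 756×1008 image size is only an example.

```python
def padded_size(height, width, multiple=32):
    """Smallest (height, width) not below the input that is divisible by `multiple`."""
    def round_up(value):
        return ((value + multiple - 1) // multiple) * multiple
    return round_up(height), round_up(width)


def padding_amounts(height, width, multiple=32):
    """Extra rows and columns needed to reach the padded size."""
    new_height, new_width = padded_size(height, width, multiple)
    return new_height - height, new_width - width


print(padded_size(756, 1008))      # (768, 1024)
print(padding_amounts(756, 1008))  # (12, 16)
```

If the image is padded or resized, the camera intrinsics have to be adjusted consistently (a shift of the principal point for padding, a scale of the first two rows for resizing).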

About the dataloader codes for the Real Forward-facing and NeRF Synthetic datasets

Hi, thanks for sharing your great work!
I want to know how to run your code on the Real Forward-facing and NeRF Synthetic datasets? It seems that you only release the dataloader code for DTU dataset? Can you release the dataloader codes for the Real Forward-facing and NeRF Synthetic datasets?
Thanks very much!

Missing package in requirement.txt for running enerf_outdoor

Hi, I noticed that trimesh is used when training on the enerf_outdoor dataset, but it is not included in your requirement.txt.

Config files and dataloaders for ST-NeRF and DynamicCap dataset

Thank you for sharing this great work!

I would like to try ST-NeRF and DynamicCap dataset.
Could you release config files and dataloaders for ST-NeRF and DynamicCap dataset ?

Thanks in advance.

How to run ENeRF on my own data?

Hi, the real-time dynamic rendering demo on your project page is so cool! And I want to make the same thing on my own data (sequence of images). What should I do?

Question about config file

Hi, thanks for your great work! During training, I find that the code passes through the function build_feature_volume twice. Does this correspond to the coarse-to-fine stages? I am also wondering how to change the parameters in cfg.enerf.cas_config; I cannot find the related parameters in lib.config. Looking forward to your reply!
Best wishes!

Composition of the enerf-outdoor dataset

How can I obtain the background.ply and the npy files in vhull for the enerf-outdoor dataset? I would be very grateful if a tutorial could be provided on this.

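About the role of certain entries in zjumocap_train.yaml

I am trying to adjust some values in this file in order to finetune the pretrained model, and a few questions came up. What does train_input_views_prob under the enerf entry do? My current understanding is that input_views under train_dataset controls the number of input views used for training, while input_views under test_dataset controls the number used for evaluation.

I used the following settings when finetuning:
train_dataset:
    data_root: 'zju_mocap'
    scene: 'CoreView_test4'
    split: train
    frames: [0, 599, 1]
    input_views: [0, -1, 1]
    render_views: [0, -1, 1]
    input_ratio: 0.5

test_dataset:
    data_root: 'zju_mocap'
    scene: 'CoreView_test4'
    split: test
    frames: [0, 600, 100]
    input_views: [0, -1, 2]
    render_views: [1, -1, 2]
    input_ratio: 0.5
(The rest is the same as in the original file.)
In addition, when training on my own dataset, which is similar to zju-mocap (6 synchronized cameras), I see something strange: after training, the PSNR, SSIM, and LPIPS values are all better than before, but the result rendered in the GUI is less sharp than with the pretrained model. What could be the reason?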
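
To make the view split in these settings concrete, the sketch below expands the `[start, end, step]` triplets into camera indices for a 6-camera rig. It assumes that an `end` of `-1` means "up to and including the last camera"; that reading is an assumption and has not been checked against the ENeRF dataloader, and `expand_range` is a hypothetical helper.

```python
def expand_range(spec, count):
    """Expand a [start, end, step] triplet into a list of indices.

    Assumption: end == -1 stands for `count` (i.e. run to the last index).
    """
    start, end, step = spec
    if end == -1:
        end = count
    return list(range(start, end, step))


num_cameras = 6
print(expand_range([0, -1, 1], num_cameras))  # train inputs:  [0, 1, 2, 3, 4, 5]
print(expand_range([0, -1, 2], num_cameras))  # test inputs:   [0, 2, 4]
print(expand_range([1, -1, 2], num_cameras))  # test renders:  [1, 3, 5]
```

Under this reading, the test split renders the odd cameras from the even ones, while the train split uses every camera both as input and as render target.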

Doubt with homo_warp

Hi,

I was working with your code, and when reviewing the projection of the feature maps into the cost volume, there's something that I don't understand. For homo_warp, the projection matrix that is computed goes from the target view camera coordinates to the source view in order to interpolate the features:

def get_proj_mats(batch, src_scale, tar_scale):
    B, S_V, C, H, W = batch['src_inps'].shape
    src_ext = batch['src_exts']
    src_ixt = batch['src_ixts'].clone()
    src_ixt[:, :, :2] *= src_scale
    src_projs = src_ixt @ src_ext[:, :, :3]

    tar_ext = batch['tar_ext']
    tar_ixt = batch['tar_ixt'].clone()
    tar_ixt[:, :2] *= tar_scale
    tar_projs = tar_ixt @ tar_ext[:, :3]
    tar_ones = torch.zeros((B, 1, 4)).to(tar_projs.device)
    tar_ones[:, :, 3] = 1
    tar_projs = torch.cat((tar_projs, tar_ones), dim=1)
    tar_projs_inv = torch.inverse(tar_projs)

    src_projs = src_projs.view(B, S_V, 3, 4)
    tar_projs_inv = tar_projs_inv.view(B, 1, 4, 4)

    proj_mats = src_projs @ tar_projs_inv
    return proj_mats

But when projecting the grid into the image, I don't understand which coordinates are used. Only pixel indices seem to be used, and they are projected into the source image by slicing the projection matrix into a rotation and a translation, even though the matrix also contains the intrinsics:

def homo_warp(src_feat, proj_mat, depth_values, batch):
    B, D, H_T, W_T = depth_values.shape
    C, H_S, W_S = src_feat.shape[1:]
    device = src_feat.device

    R = proj_mat[:, :, :3] # (B, 3, 3)
    T = proj_mat[:, :, 3:] # (B, 3, 1)
    # create grid from the ref frame
    ref_grid = create_meshgrid(H_T, W_T, normalized_coordinates=False,
                               device=device) # (1, H, W, 2)
    ref_grid = ref_grid.permute(0, 3, 1, 2) # (1, 2, H, W)
    ref_grid = ref_grid.reshape(1, 2, H_T*W_T) # (1, 2, H*W)
    ref_grid = ref_grid.expand(B, -1, -1) # (B, 2, H*W)
    ref_grid = torch.cat((ref_grid, torch.ones_like(ref_grid[:,:1])), 1) # (B, 3, H*W)
    ref_grid_d = ref_grid.repeat(1, 1, D) # (B, 3, D*H*W)
    src_grid_d = R @ ref_grid_d + T/depth_values.view(B, 1, D*H_T*W_T)
    del ref_grid_d, ref_grid, proj_mat, R, T, depth_values # release (GPU) memory

    # project negative depth pixels to somewhere outside the image
    # negative_depth_mask = src_grid_d[:, 2:] <= 1e-7
    # src_grid_d[:, 0:1][negative_depth_mask] = W
    # src_grid_d[:, 1:2][negative_depth_mask] = H
    # src_grid_d[:, 2:3][negative_depth_mask] = 1

    src_grid = src_grid_d[:, :2] / torch.clamp_min(src_grid_d[:, 2:], 1e-6) # divide by depth (B, 2, D*H*W)
    # del src_grid_d
    src_grid[:, 0] = (src_grid[:, 0])/((W_S - 1) / 2) - 1 # scale to -1~1
    src_grid[:, 1] = (src_grid[:, 1])/((H_S - 1) / 2) - 1 # scale to -1~1
    src_grid = src_grid.permute(0, 2, 1) # (B, D*H*W, 2)
    src_grid = src_grid.view(B, D, H_T*W_T, 2)

    warped_src_feat = F.grid_sample(src_feat, src_grid,
                                    mode='bilinear', padding_mode='zeros',
                                    align_corners=True) # (B, C, D, H*W)
    warped_src_feat = warped_src_feat.view(B, C, D, H_T, W_T)
    src_grid = src_grid.view(B, D, H_T, W_T, 2)
    if torch.isnan(warped_src_feat).any():
        __import__('ipdb').set_trace()
    return warped_src_feat, src_grid

Could you explain how the coordinates from the grid are projected into the source image, and in which coordinate system the grid is defined?

Thanks in advance,
Sergio
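
Reading the two functions above: `ref_grid` holds homogeneous pixel coordinates (u, v, 1) of the target image, and `proj_mat = src_projs @ tar_projs_inv` maps the target-side vector (d·u, d·v, d, 1) to homogeneous source pixels. Dividing that product by the depth d gives `proj_mat[:, :3] @ (u, v, 1) + proj_mat[:, 3] / d`, which differs from the true projection only by a scale factor that the later perspective divide removes. So the slices named `R` and `T` are the left 3x3 block and last column of the composite matrix (intrinsics included), not a pure rotation and translation. The NumPy sketch below checks this numerically; all camera values are made up for illustration.

```python
import numpy as np


def rot_y(angle):
    """Rotation about the y axis by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


# Hypothetical target and source cameras (world -> camera extrinsics).
K_tar = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
K_src = np.array([[480.0, 0.0, 300.0], [0.0, 480.0, 250.0], [0.0, 0.0, 1.0]])
R_tar, t_tar = rot_y(0.05), np.array([0.1, 0.0, 0.2])
R_src, t_src = rot_y(-0.10), np.array([-0.3, 0.05, 0.1])

# Same construction as get_proj_mats: P_src @ inverse(4x4 P_tar).
P_tar = np.vstack([K_tar @ np.hstack([R_tar, t_tar[:, None]]), [0.0, 0.0, 0.0, 1.0]])
P_src = K_src @ np.hstack([R_src, t_src[:, None]])
proj_mat = P_src @ np.linalg.inv(P_tar)      # shape (3, 4)
A, b = proj_mat[:, :3], proj_mat[:, 3]       # called "R" and "T" in homo_warp

u, v, d = 100.0, 150.0, 2.5                  # target pixel and depth hypothesis
pix = np.array([u, v, 1.0])

# Path 1: what homo_warp computes, followed by the perspective divide.
warped = A @ pix + b / d
warped_uv = warped[:2] / warped[2]

# Path 2: explicit unprojection to world space and reprojection into the source.
x_cam_tar = d * (np.linalg.inv(K_tar) @ pix)
x_world = R_tar.T @ (x_cam_tar - t_tar)
x_src = K_src @ (R_src @ x_world + t_src)
explicit_uv = x_src[:2] / x_src[2]

print(np.allclose(warped_uv, explicit_uv))
```

The grid is therefore defined in target pixel coordinates (unnormalized, as produced by `create_meshgrid(..., normalized_coordinates=False)`), and the result lands in source pixel coordinates before being rescaled to [-1, 1] for `grid_sample`.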

Only 13 FPS?

I reran your algorithm on the LLFF dataset at a resolution of 512*512, but only got about 13 FPS. How can I reach the 25 FPS mentioned in your paper?
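
Measured FPS depends a lot on how the timing is done (warm-up iterations, whether data loading and image saving are included, GPU synchronization) in addition to the hardware. Below is a minimal timing sketch. `render_fn` is a hypothetical stand-in for one forward pass, and `sync` is an optional callable; for a CUDA model one would pass `torch.cuda.synchronize`, since kernels are launched asynchronously and the clock would otherwise be read too early.

```python
import time


def measure_fps(render_fn, warmup=5, iters=50, sync=None):
    """Time `iters` calls of `render_fn` after `warmup` untimed calls."""
    for _ in range(warmup):
        render_fn()
    if sync is not None:
        sync()  # make sure warm-up work has finished before starting the clock
    start = time.perf_counter()
    for _ in range(iters):
        render_fn()
    if sync is not None:
        sync()  # wait for all timed work to finish
    elapsed = time.perf_counter() - start
    return iters / elapsed


# Stand-in workload so the sketch runs without a model.
fps = measure_fps(lambda: sum(range(10000)), warmup=2, iters=20)
print(f"{fps:.1f} FPS")
```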

win_size exceeds image extent

How can I solve this problem?
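
The title matches the message raised by scikit-image's `structural_similarity` when its window (7 by default) is larger than some image dimension. No traceback is included here, so that origin is an assumption. A frequent trigger is an H x W x 3 image passed without marking the channel axis, so that the size-3 channel dimension is treated as spatial; recent scikit-image versions take a `channel_axis` argument for this. The helper below is a hypothetical sketch that shows which window size would be valid for a given shape.

```python
def valid_win_size(shape, channel_axis=None, default=7):
    """Largest odd window that fits every spatial side, capped at `default`."""
    ndim = len(shape)
    if channel_axis is not None:
        channel_axis = channel_axis % ndim  # allow negative axes such as -1
    spatial = [side for axis, side in enumerate(shape) if axis != channel_axis]
    limit = min(min(spatial), default)
    if limit < 1:
        raise ValueError("image has an empty spatial dimension")
    return limit if limit % 2 == 1 else limit - 1


print(valid_win_size((512, 512, 3)))                   # 3: channels counted as spatial
print(valid_win_size((512, 512, 3), channel_axis=-1))  # 7: default window fits
```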

Properly formatted annots.npy file

May I ask how to obtain the annots.json and the new annots.npy? I want to finetune the model on my own dataset, which is similar to zju-mocap, but the annots.npy formats of the two differ, so my file cannot be read. How can I generate these files in a format the code accepts?

NaN in training

Hi, when I trained my own dataset, error occurred as below:

I set 'shuffle' to False to check whether particular images in my dataset cause this error, but it still occurs randomly (mostly in the first epoch, though once it occurred in the second epoch after the first epoch seemed fine).
Do you have any idea? Thank you for your help!

parameter 'volume_planes'

Why are different 'enerf.cas_config.volume_planes' values used in the training and testing stages?

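Restricting the near_far interpolation range

Hello, the current code uses the 6890 vertices of the fitted single-person SMPL model to obtain a 3D bbox, which is used to bound the depth range (near_far) for cost volume interpolation. For the multi-person cases and the cases with background shown on the project page, is a different method used?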
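
As an illustration of the approach described in this question, the sketch below derives a near/far depth range for one camera from the 3D bounding box of a set of vertices, and handles several people by taking the box over all of their vertices. It is a sketch under stated assumptions, not the repository's implementation: the function names, the padding value, and the world-to-camera convention `x_cam = R @ x_world + t` are all hypothetical choices.

```python
import numpy as np


def bbox_corners(points, padding=0.05):
    """Return the 8 corners of the axis-aligned box around `points`, grown by `padding`."""
    lo = points.min(axis=0) - padding
    hi = points.max(axis=0) + padding
    return np.array([[x, y, z]
                     for x in (lo[0], hi[0])
                     for y in (lo[1], hi[1])
                     for z in (lo[2], hi[2])])


def near_far_from_points(points, R, t, padding=0.05, min_near=1e-2):
    """Depth range of the padded box as seen by a camera with extrinsics (R, t)."""
    corners_cam = bbox_corners(points, padding) @ R.T + t  # world -> camera
    z = corners_cam[:, 2]
    return max(float(z.min()), min_near), float(z.max())


# Two made-up "people", each reduced to two extreme vertices for brevity.
person_a = np.array([[-0.3, -0.9, -0.2], [0.3, 0.9, 0.2]])
person_b = person_a + np.array([1.0, 0.0, 1.0])
R, t = np.eye(3), np.array([0.0, 0.0, 3.0])

print(near_far_from_points(person_a, R, t))
print(near_far_from_points(np.vstack([person_a, person_b]), R, t))
```

A static background would not be covered by such a box, which is presumably why the outdoor dataset ships a separate background point cloud.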

KeyError: 'rgb_level0'

I retrained on the zjumocap dataset with the command: python train_net.py --cfg_file configs/enerf/zjumocap_eval.yaml
and got the following error:
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Loading model from: /opt/conda/lib/python3.8/site-packages/lpips/weights/v0.1/vgg.pth
Traceback (most recent call last):
  File "train_net.py", line 117, in <module>
    main()
  File "train_net.py", line 109, in main
    train(cfg, network)
  File "train_net.py", line 51, in train
    trainer.train(epoch, train_loader, optimizer, recorder)
  File "/dfs/data/ENeRF/lib/train/trainers/trainer.py", line 56, in train
    output, loss, loss_stats, image_stats = self.network(batch)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1015, in _call_impl
    return forward_call(*input, **kwargs)
  File "lib/train/losses/enerf.py", line 23, in forward
    color_loss = self.color_crit(batch[f'rgb{i}'], output[f'rgb_level{i}'])
KeyError: 'rgb_level0'

About the camera?

May I ask what camera was used to capture the dataset? A GoPro or another camera?

run gui_human.py on my own dataset. Problem with visualization

I built my own dataset similar to zju-mocap. The input images and masks are 1088 x 1920 and the input ratio is 0.5, but I got an error during visualization:

I noticed that the length of the output is 786432 = 512 x 512 x 3, which happens to be the output size for the default zju-mocap dataset (1024 x 1024 x 3) under the same input_ratio. What should I do to get the output at the correct size?
P.S. If I force the reshape to 512 x 512 x 3, the visualization is incomplete:
pred_img = output[f'rgb_level{i}'][b].reshape(512, 512, 3)
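
The arithmetic behind the mismatch can be checked directly. The sketch below computes the pixel count expected for 1088 x 1920 inputs at ratio 0.5 and compares it with the 786432 values actually produced. `expected_hw` and `check_rgb_length` are hypothetical helpers, and truncating with `int()` is an assumption about how the scaled size is rounded.

```python
def expected_hw(height, width, input_ratio):
    """Image size after scaling by input_ratio (truncation is an assumption)."""
    return int(height * input_ratio), int(width * input_ratio)


def check_rgb_length(flat_length, height, width):
    """True if a flat RGB buffer of `flat_length` values can be reshaped to (H, W, 3)."""
    return flat_length == height * width * 3


h, w = expected_hw(1088, 1920, 0.5)
print(h, w, h * w * 3)                    # 544 960 1566720
print(check_rgb_length(786432, h, w))     # False: buffer is 512 x 512 x 3
print(check_rgb_length(786432, 512, 512)) # True
```

Since the buffer holds 512 x 512 pixels rather than 544 x 960, the render resolution seems to be coming from somewhere other than the input images (for example a fixed size in the GUI or config), so changing only the reshape cannot recover the full frame.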

The video of ZJU-Mocap

Hi, I am trying to prepare my own Mocap dataset to run your code, but the YouTube link in https://chingswy.github.io/easymocap-public-doc/quickstart/capture_youtube.html is private or does not exist. Could you please release the multi-view video of your Mocap capture so that I can try preparing a Mocap dataset? Thanks in advance!
