zju3dv / enerf
SIGGRAPH Asia 2022: Code for "Efficient Neural Radiance Fields for Interactive Free-viewpoint Video"
Home Page: https://zju3dv.github.io/enerf
License: Other
Hi, the results look a little blurry when I use your gui_human.py to visualize them. Does the resolution ratio (input_ratio in the yaml) cause the problem? Will the results look much clearer if the parameter is set to 1.0 for training and inference? Thank you!
Hi Haotong and Sida,
awesome work! I believe many are as impressed as I am. My question is about the experimental setting for the video on your website since it's not mentioned in the main paper. I wonder:
(1) how many cameras are you using?
(2) what are the training and testing splits? e.g. is testing done on completely new videos? Is the training data similar to the test videos? etc.
(3) Are these generated with finetuning?
Thank you very much for your awesome work!
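Dear senior researcher,
Hello! I recently read your group's paper ENeRF. It is great work and has inspired me a lot. After reproducing part of your work, I wanted to run ENeRF on my own dataset, but ran into some problems. Could you share your method for converting a COLMAP SfM result into ENeRF input?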
Hi, thank you for your hard work. I have been trying to run the visualize module, but maybe it is not published yet? It says No such file or directory: 'lib/visualizers/enerf.py'
or maybe I am doing something wrong.
I was running the following command: python run.py --type visualize --cfg_file configs/enerf/llff_eval.yaml
Looking forward to your release of the visualization and interactive rendering code and the outdoor dataset!
Thank you
Thanks for your great work! I found that the README only explains how to train on the DTU dataset, but Depth_raw is difficult to obtain if I try to train on my own llff data. What should I do? I would appreciate a reply.
I used the provided generalization model to run the evaluation on the DTU dataset as described in the README, and got the PSNR, SSIM and LPIPS values:
whereas in the README, the quantitative results should be
Why do the quantitative evaluation results differ? Could you share your own evaluation results? Thanks
Thanks for your contribution. How were the cameras color-calibrated when capturing the dataset used in your project? Are any manual camera settings needed?
Hello, I want to use my own data for training and rendering. How do I process my own data? In addition, can you provide the interfaces and datasets used in the video on your project page, for testing? I would be very grateful if a tutorial could be provided on this.
Hi, I built two datasets, one of a standing person and one of a sitting person. The training result on the standing data is much better than on the sitting data (about 4 PSNR). I noticed that someone said OpenPose achieves better results on a standing person than on a sitting person.
zju3dv/EasyMocap#94
Is that why ENeRF achieves different results? Thank you!
I'm not quite clear about the calculation of self.XYZ in enerf_interactivate.py, could you please help explain? I know it's roughly pixel coordinates to camera coordinates, but it's a little confusing why I need to invert the matrix and then transpose.
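One common reason an inverse followed by a transpose appears in pixel-to-camera code is the row-vector convention. Back-projecting a homogeneous pixel p is K^{-1} p when p is a column vector, and it becomes p^T (K^{-1})^T when the pixels are stored as rows of an (H, W, 3) array. The sketch below only illustrates that identity with NumPy. The intrinsics K and the image size are made-up values, and the actual computation of self.XYZ in the repository may differ.

```python
import numpy as np

# Hypothetical intrinsics (fx, fy, cx, cy are made-up values).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
H, W = 4, 6

# Homogeneous pixel coordinates (u, v, 1) stored as an (H, W, 3) array.
u, v = np.meshgrid(np.arange(W, dtype=float), np.arange(H, dtype=float))
pix = np.stack([u, v, np.ones_like(u)], axis=-1)

# Column-vector form: apply K^{-1} to each pixel individually.
rays_col = np.einsum("ij,hwj->hwi", np.linalg.inv(K), pix)

# Row-vector form: one matrix product, which needs the transpose of K^{-1}.
rays_row = pix @ np.linalg.inv(K).T

print(np.allclose(rays_col, rays_row))
print(rays_row[0, 0])  # camera-space direction of pixel (0, 0) at depth 1
```

Multiplying such a direction by a depth value gives the 3D point in camera coordinates.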
Hi,
The pre-trained model link in Readme.md (dtu_pretrain) has expired.
Kindly re-share the model.
Thank You!
I want to finetune on zjumocap 313, just like the config does, but I ran into the following issue.
Is it caused by windows?
btw I modified some lines in
Line 391 in fe1adae
os.makedirs(model_dir, exist_ok=True)
, which will create "-p" folder under the workspace in windows.
Error logs:
(smpl-py38-torch110-cu111) PS D:\workspace4tian\ENeRF> python train_net.py --cfg_file configs/enerf/zjumocap/zjumocap_train.yaml
Workspace: D:\workspace4tian\ENeRF
configs/enerf/dtu_pretrain.yaml
configs/enerf/zjumocap/zjumocap_train.yaml
EXP NAME: zjumocap
D:\sdk\envs\smpl-py38-torch110-cu111\lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
D:\sdk\envs\smpl-py38-torch110-cu111\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
warnings.warn(msg)
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
D:\sdk\envs\smpl-py38-torch110-cu111\lib\site-packages\torchvision\models\_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
D:\sdk\envs\smpl-py38-torch110-cu111\lib\site-packages\torchvision\models\_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing `weights=VGG16_Weights.IMAGENET1K_V1`. You can also use `weights=VGG16_Weights.DEFAULT` to get the most up-to-date weights.
warnings.warn(msg)
Loading model from: D:\sdk\envs\smpl-py38-torch110-cu111\lib\site-packages\lpips\weights\v0.1\vgg.pth
A subdirectory or file D:\workspace4tian\ENeRF\result\enerf\zjumocap\default already exists.
Error occurred while processing: D:\workspace4tian\ENeRF\result\enerf\zjumocap\default.
Load pretrain model: D:\workspace4tian\ENeRF\trained_model\enerf\dtu_pretrain\latest.pth
Traceback (most recent call last):
File "train_net.py", line 117, in <module>
main()
w.start()
File "C:\Program Files\Python38\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "C:\Program Files\Python38\lib\multiprocessing\context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Program Files\Python38\lib\multiprocessing\context.py", line 327, in _Popen
return Popen(process_obj)
File "C:\Program Files\Python38\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
reduction.dump(process_obj, to_child)
File "C:\Program Files\Python38\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <class 'lib.datasets.zjumocap.enerf.Dataset'>: it's not the same object as lib.datasets.zjumocap.enerf.Dataset
Workspace: D:\workspace4tian\ENeRF
configs/enerf/dtu_pretrain.yaml
configs/enerf/zjumocap/zjumocap_train.yaml
EXP NAME: zjumocap
(smpl-py38-torch110-cu111) PS D:\workspace4tian\ENeRF> Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Program Files\Python38\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Program Files\Python38\lib\multiprocessing\spawn.py", line 126, in _main
self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
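The traceback above shows Windows multiprocessing pickling the process object (popen_spawn_win32, then reduction.dump). pickle stores a class by its module name and class name, and raises "it's not the same object as ..." when the class registered under that name is a different object, for example because the module's source was executed twice. The sketch below reproduces that message with the standard library only. The module name demo_dataset is hypothetical, and whether this is what happens inside ENeRF's dataset loading is an assumption.

```python
import pickle
import sys
import types

CLASS_SOURCE = "class Dataset:\n    pass\n"

def load_module(name):
    """Execute CLASS_SOURCE in a fresh module object called `name`."""
    module = types.ModuleType(name)
    exec(CLASS_SOURCE, module.__dict__)
    return module

# "demo_dataset" is a hypothetical module name used only for this demo.
registered = load_module("demo_dataset")
sys.modules["demo_dataset"] = registered
reloaded = load_module("demo_dataset")  # second copy, never registered

# Both classes report the same module and name but are distinct objects.
same_name = registered.Dataset.__qualname__ == reloaded.Dataset.__qualname__
different_object = registered.Dataset is not reloaded.Dataset

pickle.dumps(registered.Dataset)  # succeeds: lookup returns the same object

try:
    pickle.dumps(reloaded.Dataset)
    pickling_failed = False
except pickle.PicklingError as exc:
    pickling_failed = True
    print(exc)
```

If the process being started is a data-loading worker, running with zero workers would avoid the spawn and therefore the pickling step. That workaround is untested here.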
Why does input_views_num in lib/datasets/zjumocap/enerf.py need to be increased by 1?
I wanted to run ENeRF on a dataset I made myself, which I built with EasyMocap and this link. Now I have a file structure like this:
I'm still missing files like new_vertices to run the visualization. How can I fix this?
Which kinds of scenes generalize well without finetuning?
Hi, I want to train ENeRF with my own data.
My data was captured with a fixed 3*7 grid of multi-view cameras.
At the moment, I need to create bounding box data (*.npy) for my data in order to train with the ENeRF-Outdoor dataset setting.
Please let me know how to create the bounding box
(or is there any code for creating bounding boxes?)
Hi, thanks for the great work!
However, after running your training script (python train_net.py --cfg_file configs/enerf/enerf_outdoor/actor1.yaml) on Actor1 for 50 epochs, I am getting the following results. The results for the color prediction are not as good as advertised on your project page, with lots of warping of the background. Also, the depth maps are quite poor, with the depth of the shadow region being incorrectly predicted. Do you know why this might be?
Color:
https://user-images.githubusercontent.com/9107279/219429442-24e2cc1d-bb5b-4d78-9f58-588e318fdbaa.mp4
Depth:
https://user-images.githubusercontent.com/9107279/219429583-eccd8139-173f-4a6c-b4c6-26d0e83e5db9.mp4
Hi, Thanks for your great work.
However, after reading your paper and code, I noticed a mask_util file that includes the ade20k labels, which is not mentioned in your paper.
I am just curious about the purpose of this file.
Is this related to your future work or can I just ignore it?
Hi, I am confused about the dimension in Forward_feat and render_rays functions: the B,S,C,H,W, which is different from the traditional 5D B C D H W.
I assume the B is batch size and C,H,W are color, height and width. But I am not sure about the meaning of S dimension.
Can you offer me some intuition for this dimension?
Thanks!
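In get_proj_mats from the same codebase (quoted in another post on this page), the input tensor is unpacked as B, S_V, C, H, W = batch['src_inps'].shape, which suggests that S counts source views. Under that reading, a 2D network can process all views at once by folding S into the batch axis and unfolding it afterwards. The sketch below shows only the shape bookkeeping with NumPy. The sizes are made up, and the "feature extractor" is a stand-in rather than the repository's network.

```python
import numpy as np

# Made-up sizes: batch, source views, channels, height, width.
B, S, C, H, W = 2, 3, 3, 8, 10
src_inps = np.zeros((B, S, C, H, W), dtype=np.float32)

# Fold the view axis into the batch axis: a 2D network sees (B*S, C, H, W).
flat = src_inps.reshape(B * S, C, H, W)

# Stand-in for a 2D feature extractor (hypothetical): doubles the channels.
feats = np.concatenate([flat, flat], axis=1)

# Unfold so that every batch element has one feature map per source view.
feats = feats.reshape(B, S, *feats.shape[1:])

print(flat.shape, feats.shape)
```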
Hi,
Thanks for sharing this nice work.
I would like to know how to fine-tune the model on the zjumocap dataset.
I have modified the config from zjumocap_eval.yaml, but the results are worse than with the pre-trained model.
Do you have any suggestions?
Thanks !!!
Hi, thanks for the great work!
I tried to visualize the ENeRF-Outdoor dataset by executing the command below (using the pretrained model you provided).
python run.py --type visualize --cfg_file configs/enerf/enerf_outdoor/actor1_path.yaml
However, the error below occurred. From what I found while looking for a solution, my guess is that the pretrained model was trained on multiple GPUs and I loaded it on a single GPU.
Could you help me with this error?
-----------------------------------ERROR Detail--------------------------------------
EXP NAME: actor1
load model: /home/ubuntu/ENeRF/trained_model/enerf/actor1/latest.pth
Traceback (most recent call last):
File "run.py", line 106, in <module>
globals()['run_' + args.type]()
File "run.py", line 90, in run_visualize
load_network(network,
File "/home/ubuntu/ENeRF/lib/utils/net_utils.py", line 443, in load_network
net.load_state_dict(pretrained_model['net'], strict=strict)
File "/home/ubuntu/.conda/envs/enerf/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1406, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Network:
Missing key(s) in state_dict: "feature_net_bg.conv0.0.conv.weight", "feature_net_bg.conv0.0.bn.weight", "feature_net_bg.conv0.0.bn.bias", "feature_net_bg.conv0.0.bn.running_mean", "feature_net_bg.conv0.0.bn.running_var", "feature_net_bg.conv0.1.conv.weight", "feature_net_bg.conv0.1.bn.weight", "feature_net_bg.conv0.1.bn.bias", "feature_net_bg.conv0.1.bn.running_mean", "feature_net_bg.conv0.1.bn.running_var", "feature_net_bg.conv1.0.conv.weight", "feature_net_bg.conv1.0.bn.weight", "feature_net_bg.conv1.0.bn.bias", "feature_net_bg.conv1.0.bn.running_mean", "feature_net_bg.conv1.0.bn.running_var", "feature_net_bg.conv1.1.conv.weight", "feature_net_bg.conv1.1.bn.weight", "feature_net_bg.conv1.1.bn.bias", "feature_net_bg.conv1.1.bn.running_mean", "feature_net_bg.conv1.1.bn.running_var", "feature_net_bg.conv2.0.conv.weight", "feature_net_bg.conv2.0.bn.weight", "feature_net_bg.conv2.0.bn.bias", "feature_net_bg.conv2.0.bn.running_mean", "feature_net_bg.conv2.0.bn.running_var", "feature_net_bg.conv2.1.conv.weight", "feature_net_bg.conv2.1.bn.weight", "feature_net_bg.conv2.1.bn.bias", "feature_net_bg.conv2.1.bn.running_mean", "feature_net_bg.conv2.1.bn.running_var", "feature_net_bg.toplayer.weight", "feature_net_bg.toplayer.bias", "feature_net_bg.lat1.weight", "feature_net_bg.lat1.bias", "feature_net_bg.lat0.weight", "feature_net_bg.lat0.bias", "feature_net_bg.smooth1.weight", "feature_net_bg.smooth1.bias", "feature_net_bg.smooth0.weight", "feature_net_bg.smooth0.bias", "cost_reg_0_layer0.conv0.conv.weight", "cost_reg_0_layer0.conv0.bn.weight", "cost_reg_0_layer0.conv0.bn.bias", "cost_reg_0_layer0.conv0.bn.running_mean", "cost_reg_0_layer0.conv0.bn.running_var", "cost_reg_0_layer0.conv1.conv.weight", "cost_reg_0_layer0.conv1.bn.weight", "cost_reg_0_layer0.conv1.bn.bias", "cost_reg_0_layer0.conv1.bn.running_mean", "cost_reg_0_layer0.conv1.bn.running_var", "cost_reg_0_layer0.conv2.conv.weight", "cost_reg_0_layer0.conv2.bn.weight", "cost_reg_0_layer0.conv2.bn.bias", 
"cost_reg_0_layer0.conv2.bn.running_mean", "cost_reg_0_layer0.conv2.bn.running_var", "cost_reg_0_layer0.conv3.conv.weight", "cost_reg_0_layer0.conv3.bn.weight", "cost_reg_0_layer0.conv3.bn.bias", "cost_reg_0_layer0.conv3.bn.running_mean", "cost_reg_0_layer0.conv3.bn.running_var", "cost_reg_0_layer0.conv4.conv.weight", "cost_reg_0_layer0.conv4.bn.weight", "cost_reg_0_layer0.conv4.bn.bias", "cost_reg_0_layer0.conv4.bn.running_mean", "cost_reg_0_layer0.conv4.bn.running_var", "cost_reg_0_layer0.conv9.0.weight", "cost_reg_0_layer0.conv9.1.weight", "cost_reg_0_layer0.conv9.1.bias", "cost_reg_0_layer0.conv9.1.running_mean", "cost_reg_0_layer0.conv9.1.running_var", "cost_reg_0_layer0.conv11.0.weight", "cost_reg_0_layer0.conv11.1.weight", "cost_reg_0_layer0.conv11.1.bias", "cost_reg_0_layer0.conv11.1.running_mean", "cost_reg_0_layer0.conv11.1.running_var", "cost_reg_0_layer0.depth_conv.0.weight", "cost_reg_0_layer0.feat_conv.0.weight", "nerf_0_layer0.agg.global_fc.0.weight", "nerf_0_layer0.agg.global_fc.0.bias", "nerf_0_layer0.agg.agg_w_fc.0.weight", "nerf_0_layer0.agg.agg_w_fc.0.bias", "nerf_0_layer0.agg.fc.0.weight", "nerf_0_layer0.agg.fc.0.bias", "nerf_0_layer0.lr0.0.weight", "nerf_0_layer0.lr0.0.bias", "nerf_0_layer0.sigma.0.weight", "nerf_0_layer0.sigma.0.bias", "nerf_0_layer0.color.0.weight", "nerf_0_layer0.color.0.bias", "nerf_0_layer0.color.2.weight", "nerf_0_layer0.color.2.bias", "cost_reg_0_bg.conv0.conv.weight", "cost_reg_0_bg.conv0.bn.weight", "cost_reg_0_bg.conv0.bn.bias", "cost_reg_0_bg.conv0.bn.running_mean", "cost_reg_0_bg.conv0.bn.running_var", "cost_reg_0_bg.conv1.conv.weight", "cost_reg_0_bg.conv1.bn.weight", "cost_reg_0_bg.conv1.bn.bias", "cost_reg_0_bg.conv1.bn.running_mean", "cost_reg_0_bg.conv1.bn.running_var", "cost_reg_0_bg.conv2.conv.weight", "cost_reg_0_bg.conv2.bn.weight", "cost_reg_0_bg.conv2.bn.bias", "cost_reg_0_bg.conv2.bn.running_mean", "cost_reg_0_bg.conv2.bn.running_var", "cost_reg_0_bg.conv3.conv.weight", "cost_reg_0_bg.conv3.bn.weight", 
"cost_reg_0_bg.conv3.bn.bias", "cost_reg_0_bg.conv3.bn.running_mean", "cost_reg_0_bg.conv3.bn.running_var", "cost_reg_0_bg.conv4.conv.weight", "cost_reg_0_bg.conv4.bn.weight", "cost_reg_0_bg.conv4.bn.bias", "cost_reg_0_bg.conv4.bn.running_mean", "cost_reg_0_bg.conv4.bn.running_var", "cost_reg_0_bg.conv9.0.weight", "cost_reg_0_bg.conv9.1.weight", "cost_reg_0_bg.conv9.1.bias", "cost_reg_0_bg.conv9.1.running_mean", "cost_reg_0_bg.conv9.1.running_var", "cost_reg_0_bg.conv11.0.weight", "cost_reg_0_bg.conv11.1.weight", "cost_reg_0_bg.conv11.1.bias", "cost_reg_0_bg.conv11.1.running_mean", "cost_reg_0_bg.conv11.1.running_var", "cost_reg_0_bg.depth_conv.0.weight", "cost_reg_0_bg.feat_conv.0.weight", "nerf_0_bg.agg.global_fc.0.weight", "nerf_0_bg.agg.global_fc.0.bias", "nerf_0_bg.agg.agg_w_fc.0.weight", "nerf_0_bg.agg.agg_w_fc.0.bias", "nerf_0_bg.agg.fc.0.weight", "nerf_0_bg.agg.fc.0.bias", "nerf_0_bg.lr0.0.weight", "nerf_0_bg.lr0.0.bias", "nerf_0_bg.sigma.0.weight", "nerf_0_bg.sigma.0.bias", "nerf_0_bg.color.0.weight", "nerf_0_bg.color.0.bias", "nerf_0_bg.color.2.weight", "nerf_0_bg.color.2.bias", "cost_reg_1_layer0.conv0.conv.weight", "cost_reg_1_layer0.conv0.bn.weight", "cost_reg_1_layer0.conv0.bn.bias", "cost_reg_1_layer0.conv0.bn.running_mean", "cost_reg_1_layer0.conv0.bn.running_var", "cost_reg_1_layer0.conv1.conv.weight", "cost_reg_1_layer0.conv1.bn.weight", "cost_reg_1_layer0.conv1.bn.bias", "cost_reg_1_layer0.conv1.bn.running_mean", "cost_reg_1_layer0.conv1.bn.running_var", "cost_reg_1_layer0.conv2.conv.weight", "cost_reg_1_layer0.conv2.bn.weight", "cost_reg_1_layer0.conv2.bn.bias", "cost_reg_1_layer0.conv2.bn.running_mean", "cost_reg_1_layer0.conv2.bn.running_var", "cost_reg_1_layer0.conv3.conv.weight", "cost_reg_1_layer0.conv3.bn.weight", "cost_reg_1_layer0.conv3.bn.bias", "cost_reg_1_layer0.conv3.bn.running_mean", "cost_reg_1_layer0.conv3.bn.running_var", "cost_reg_1_layer0.conv4.conv.weight", "cost_reg_1_layer0.conv4.bn.weight", 
"cost_reg_1_layer0.conv4.bn.bias", "cost_reg_1_layer0.conv4.bn.running_mean", "cost_reg_1_layer0.conv4.bn.running_var", "cost_reg_1_layer0.conv9.0.weight", "cost_reg_1_layer0.conv9.1.weight", "cost_reg_1_layer0.conv9.1.bias", "cost_reg_1_layer0.conv9.1.running_mean", "cost_reg_1_layer0.conv9.1.running_var", "cost_reg_1_layer0.conv11.0.weight", "cost_reg_1_layer0.conv11.1.weight", "cost_reg_1_layer0.conv11.1.bias", "cost_reg_1_layer0.conv11.1.running_mean", "cost_reg_1_layer0.conv11.1.running_var", "cost_reg_1_layer0.depth_conv.0.weight", "cost_reg_1_layer0.feat_conv.0.weight", "nerf_1_layer0.agg.global_fc.0.weight", "nerf_1_layer0.agg.global_fc.0.bias", "nerf_1_layer0.agg.agg_w_fc.0.weight", "nerf_1_layer0.agg.agg_w_fc.0.bias", "nerf_1_layer0.agg.fc.0.weight", "nerf_1_layer0.agg.fc.0.bias", "nerf_1_layer0.lr0.0.weight", "nerf_1_layer0.lr0.0.bias", "nerf_1_layer0.sigma.0.weight", "nerf_1_layer0.sigma.0.bias", "nerf_1_layer0.color.0.weight", "nerf_1_layer0.color.0.bias", "nerf_1_layer0.color.2.weight", "nerf_1_layer0.color.2.bias", "cost_reg_1_bg.conv0.conv.weight", "cost_reg_1_bg.conv0.bn.weight", "cost_reg_1_bg.conv0.bn.bias", "cost_reg_1_bg.conv0.bn.running_mean", "cost_reg_1_bg.conv0.bn.running_var", "cost_reg_1_bg.conv1.conv.weight", "cost_reg_1_bg.conv1.bn.weight", "cost_reg_1_bg.conv1.bn.bias", "cost_reg_1_bg.conv1.bn.running_mean", "cost_reg_1_bg.conv1.bn.running_var", "cost_reg_1_bg.conv2.conv.weight", "cost_reg_1_bg.conv2.bn.weight", "cost_reg_1_bg.conv2.bn.bias", "cost_reg_1_bg.conv2.bn.running_mean", "cost_reg_1_bg.conv2.bn.running_var", "cost_reg_1_bg.conv3.conv.weight", "cost_reg_1_bg.conv3.bn.weight", "cost_reg_1_bg.conv3.bn.bias", "cost_reg_1_bg.conv3.bn.running_mean", "cost_reg_1_bg.conv3.bn.running_var", "cost_reg_1_bg.conv4.conv.weight", "cost_reg_1_bg.conv4.bn.weight", "cost_reg_1_bg.conv4.bn.bias", "cost_reg_1_bg.conv4.bn.running_mean", "cost_reg_1_bg.conv4.bn.running_var", "cost_reg_1_bg.conv9.0.weight", "cost_reg_1_bg.conv9.1.weight", 
"cost_reg_1_bg.conv9.1.bias", "cost_reg_1_bg.conv9.1.running_mean", "cost_reg_1_bg.conv9.1.running_var", "cost_reg_1_bg.conv11.0.weight", "cost_reg_1_bg.conv11.1.weight", "cost_reg_1_bg.conv11.1.bias", "cost_reg_1_bg.conv11.1.running_mean", "cost_reg_1_bg.conv11.1.running_var", "cost_reg_1_bg.depth_conv.0.weight", "cost_reg_1_bg.feat_conv.0.weight", "nerf_1_bg.agg.global_fc.0.weight", "nerf_1_bg.agg.global_fc.0.bias", "nerf_1_bg.agg.agg_w_fc.0.weight", "nerf_1_bg.agg.agg_w_fc.0.bias", "nerf_1_bg.agg.fc.0.weight", "nerf_1_bg.agg.fc.0.bias", "nerf_1_bg.lr0.0.weight", "nerf_1_bg.lr0.0.bias", "nerf_1_bg.sigma.0.weight", "nerf_1_bg.sigma.0.bias", "nerf_1_bg.color.0.weight", "nerf_1_bg.color.0.bias", "nerf_1_bg.color.2.weight", "nerf_1_bg.color.2.bias".
Unexpected key(s) in state_dict: "cost_reg_0.conv0.conv.weight", "cost_reg_0.conv0.bn.weight", "cost_reg_0.conv0.bn.bias", "cost_reg_0.conv0.bn.running_mean", "cost_reg_0.conv0.bn.running_var", "cost_reg_0.conv0.bn.num_batches_tracked", "cost_reg_0.conv1.conv.weight", "cost_reg_0.conv1.bn.weight", "cost_reg_0.conv1.bn.bias", "cost_reg_0.conv1.bn.running_mean", "cost_reg_0.conv1.bn.running_var", "cost_reg_0.conv1.bn.num_batches_tracked", "cost_reg_0.conv2.conv.weight", "cost_reg_0.conv2.bn.weight", "cost_reg_0.conv2.bn.bias", "cost_reg_0.conv2.bn.running_mean", "cost_reg_0.conv2.bn.running_var", "cost_reg_0.conv2.bn.num_batches_tracked", "cost_reg_0.conv3.conv.weight", "cost_reg_0.conv3.bn.weight", "cost_reg_0.conv3.bn.bias", "cost_reg_0.conv3.bn.running_mean", "cost_reg_0.conv3.bn.running_var", "cost_reg_0.conv3.bn.num_batches_tracked", "cost_reg_0.conv4.conv.weight", "cost_reg_0.conv4.bn.weight", "cost_reg_0.conv4.bn.bias", "cost_reg_0.conv4.bn.running_mean", "cost_reg_0.conv4.bn.running_var", "cost_reg_0.conv4.bn.num_batches_tracked", "cost_reg_0.conv9.0.weight", "cost_reg_0.conv9.1.weight", "cost_reg_0.conv9.1.bias", "cost_reg_0.conv9.1.running_mean", "cost_reg_0.conv9.1.running_var", "cost_reg_0.conv9.1.num_batches_tracked", "cost_reg_0.conv11.0.weight", "cost_reg_0.conv11.1.weight", "cost_reg_0.conv11.1.bias", "cost_reg_0.conv11.1.running_mean", "cost_reg_0.conv11.1.running_var", "cost_reg_0.conv11.1.num_batches_tracked", "cost_reg_0.depth_conv.0.weight", "cost_reg_0.feat_conv.0.weight", "nerf_0.agg.view_fc.0.weight", "nerf_0.agg.view_fc.0.bias", "nerf_0.agg.global_fc.0.weight", "nerf_0.agg.global_fc.0.bias", "nerf_0.agg.agg_w_fc.0.weight", "nerf_0.agg.agg_w_fc.0.bias", "nerf_0.agg.fc.0.weight", "nerf_0.agg.fc.0.bias", "nerf_0.lr0.0.weight", "nerf_0.lr0.0.bias", "nerf_0.sigma.0.weight", "nerf_0.sigma.0.bias", "nerf_0.color.0.weight", "nerf_0.color.0.bias", "nerf_0.color.2.weight", "nerf_0.color.2.bias", "cost_reg_1.conv0.conv.weight", 
"cost_reg_1.conv0.bn.weight", "cost_reg_1.conv0.bn.bias", "cost_reg_1.conv0.bn.running_mean", "cost_reg_1.conv0.bn.running_var", "cost_reg_1.conv0.bn.num_batches_tracked", "cost_reg_1.conv1.conv.weight", "cost_reg_1.conv1.bn.weight", "cost_reg_1.conv1.bn.bias", "cost_reg_1.conv1.bn.running_mean", "cost_reg_1.conv1.bn.running_var", "cost_reg_1.conv1.bn.num_batches_tracked", "cost_reg_1.conv2.conv.weight", "cost_reg_1.conv2.bn.weight", "cost_reg_1.conv2.bn.bias", "cost_reg_1.conv2.bn.running_mean", "cost_reg_1.conv2.bn.running_var", "cost_reg_1.conv2.bn.num_batches_tracked", "cost_reg_1.conv3.conv.weight", "cost_reg_1.conv3.bn.weight", "cost_reg_1.conv3.bn.bias", "cost_reg_1.conv3.bn.running_mean", "cost_reg_1.conv3.bn.running_var", "cost_reg_1.conv3.bn.num_batches_tracked", "cost_reg_1.conv4.conv.weight", "cost_reg_1.conv4.bn.weight", "cost_reg_1.conv4.bn.bias", "cost_reg_1.conv4.bn.running_mean", "cost_reg_1.conv4.bn.running_var", "cost_reg_1.conv4.bn.num_batches_tracked", "cost_reg_1.conv5.conv.weight", "cost_reg_1.conv5.bn.weight", "cost_reg_1.conv5.bn.bias", "cost_reg_1.conv5.bn.running_mean", "cost_reg_1.conv5.bn.running_var", "cost_reg_1.conv5.bn.num_batches_tracked", "cost_reg_1.conv6.conv.weight", "cost_reg_1.conv6.bn.weight", "cost_reg_1.conv6.bn.bias", "cost_reg_1.conv6.bn.running_mean", "cost_reg_1.conv6.bn.running_var", "cost_reg_1.conv6.bn.num_batches_tracked", "cost_reg_1.conv7.0.weight", "cost_reg_1.conv7.1.weight", "cost_reg_1.conv7.1.bias", "cost_reg_1.conv7.1.running_mean", "cost_reg_1.conv7.1.running_var", "cost_reg_1.conv7.1.num_batches_tracked", "cost_reg_1.conv9.0.weight", "cost_reg_1.conv9.1.weight", "cost_reg_1.conv9.1.bias", "cost_reg_1.conv9.1.running_mean", "cost_reg_1.conv9.1.running_var", "cost_reg_1.conv9.1.num_batches_tracked", "cost_reg_1.conv11.0.weight", "cost_reg_1.conv11.1.weight", "cost_reg_1.conv11.1.bias", "cost_reg_1.conv11.1.running_mean", "cost_reg_1.conv11.1.running_var", "cost_reg_1.conv11.1.num_batches_tracked", 
"cost_reg_1.depth_conv.0.weight", "cost_reg_1.feat_conv.0.weight", "nerf_1.agg.view_fc.0.weight", "nerf_1.agg.view_fc.0.bias", "nerf_1.agg.global_fc.0.weight", "nerf_1.agg.global_fc.0.bias", "nerf_1.agg.agg_w_fc.0.weight", "nerf_1.agg.agg_w_fc.0.bias", "nerf_1.agg.fc.0.weight", "nerf_1.agg.fc.0.bias", "nerf_1.lr0.0.weight", "nerf_1.lr0.0.bias", "nerf_1.sigma.0.weight", "nerf_1.sigma.0.bias", "nerf_1.color.0.weight", "nerf_1.color.0.bias", "nerf_1.color.2.weight", "nerf_1.color.2.bias".
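The key names in this log point at a naming difference rather than a device difference. The network expects modules such as cost_reg_0_layer0 and cost_reg_0_bg, while the checkpoint contains cost_reg_0 and nerf_0. A checkpoint saved from a multi-GPU wrapper would usually differ only by a leading "module." prefix, which does not appear here. The sketch below groups keys by their top-level module name to make such a mismatch easier to read. The helper summarize_key_mismatch is hypothetical, and the key lists are a short excerpt copied from the log. It does not determine which config matches the checkpoint.

```python
from collections import Counter

def summarize_key_mismatch(model_keys, checkpoint_keys):
    """Count missing and unexpected keys per top-level module name."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = model_keys - checkpoint_keys      # in the model, not in the file
    unexpected = checkpoint_keys - model_keys   # in the file, not in the model

    def top(key):
        return key.split(".", 1)[0]

    return Counter(map(top, missing)), Counter(map(top, unexpected))

# Excerpt of the keys reported in the error message above.
model_keys = [
    "feature_net_bg.conv0.0.conv.weight",
    "cost_reg_0_layer0.conv0.conv.weight",
    "cost_reg_0_bg.conv0.conv.weight",
    "nerf_0_layer0.sigma.0.weight",
]
checkpoint_keys = [
    "cost_reg_0.conv0.conv.weight",
    "nerf_0.sigma.0.weight",
    "nerf_0.agg.view_fc.0.weight",
]

missing, unexpected = summarize_key_mismatch(model_keys, checkpoint_keys)
has_dataparallel_prefix = any(k.startswith("module.") for k in checkpoint_keys)
print(dict(missing), dict(unexpected), has_dataparallel_prefix)
```

With a real model, the two inputs would be the keys of the network's state_dict and the keys of the 'net' entry of the loaded checkpoint (the entry name that appears in the traceback).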
I want to request the ENeRF-Outdoor dataset, but I can't find the agreement file that you mentioned for requesting the dataset.
When using train_net.py to train, an error occurs during evaluation:
cv2.error: resize.cpp:4062: error: (-215:Assert failed) !ssize.empty() in function 'resize'
I traced the error to lines 90-93 of dtu/enerf.py (the shape after each line is given in the comment):
tar_dpt = data_utils.read_pfm(scene_info['dpt_paths'][tar_view])[0].astype(np.float32)  # tar_dpt: (128, 160)
tar_dpt = cv2.resize(tar_dpt, None, fx=0.5, fy=0.5, interpolation=cv2.INTER_NEAREST)  # tar_dpt: (64, 80)
tar_dpt = tar_dpt[44:556, 80:720]  # tar_dpt: (20, 0)
tar_mask = (tar_dpt > 0.).astype(np.uint8)  # tar_mask: (20, 0)
After cropping, tar_mask is empty.
How can I solve it?
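The shapes reported above follow directly from the slicing arithmetic: the crop 44:556, 80:720 produces a 512 x 640 window only if the halved depth map is at least 556 x 720, so a 128 x 160 input is far too small. The sketch below reproduces the reported (20, 0) shape and shows that a larger input, for example 1200 x 1600, survives the crop. The helper names are hypothetical, and which depth files the loader is meant to read is not something this sketch can confirm.

```python
def halve(height, width):
    """Size after resizing by a factor of 0.5 (exact for even sizes)."""
    return height // 2, width // 2

def crop_shape(height, width, rows=(44, 556), cols=(80, 720)):
    """Shape of array[rows[0]:rows[1], cols[0]:cols[1]] for a (height, width) array."""
    n_rows = len(range(*slice(*rows).indices(height)))
    n_cols = len(range(*slice(*cols).indices(width)))
    return n_rows, n_cols

# Depth map size reported in the post: the crop leaves zero columns.
small = crop_shape(*halve(128, 160))

# An example size large enough for the crop to yield a full window.
large = crop_shape(*halve(1200, 1600))

print(small, large)
```

An array with zero columns is what makes a later cv2.resize call fail with the !ssize.empty() assertion.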
Great work! I ran into a problem when running gui_human.py on the flower scene of the llff dataset with the parameters '--cfg_file configs/enerf/llff/flower.yaml'.
The error is:
Traceback (most recent call last):
File "D:\pycharm\ENeRF\gui_human.py", line 380, in <module>
main()
File "D:\pycharm\ENeRF\gui_human.py", line 231, in main
rend = Renderer() # prepare network and dataloader
File "D:\pycharm\ENeRF\gui_human.py", line 50, in __init__
self.frame_start = cfg.test_dataset.frames[0]
File "D:\pycharm\ENeRF\lib\config\yacs.py", line 115, in __getattr__
raise AttributeError(name)
AttributeError: frames
It seems that test_dataset does not have a 'frames' entry.
Could you please help me with this problem? Thanks in advance!
Hi, thanks for your hard work. I'm trying to run the ENeRF evaluation and it works normally. I'm curious whether this project has a GUI to display the reconstructed model. Could you please tell me how to run it?
Dear author, hi~
Could I ask for the agreement form for the dataset download? The link to the agreement form that you provided earlier is no longer valid.
ENeRF is the first to achieve real-time photorealistic rendering of arbitrary dynamic scenes, which will greatly promote future scientific research. It also has great application value in everyday life!
Hi, thanks for sharing your great work!
I am trying to run the evaluation on scan114 only (I have not had the space to download the other datasets yet). However, I encountered the CUDA out of memory runtime error shown below after running the command python run.py --type evaluate --cfg_file configs/enerf/dtu_pretrain.yaml enerf.cas_config.render_if False,True enerf.cas_config.volume_planes 48,8 enerf.eval_depth True:
load model: /home/ENeRF-master/trained_model/enerf/dtu_pretrain/latest.pth
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Loading model from: /home/anaconda3/lib/python3.9/site-packages/lpips/weights/v0.1/vgg.pth
0%| | 0/4 [00:03<?, ?it/s]
Traceback (most recent call last):
File "/home/ENeRF-master/run.py", line 111, in <module>
globals()['run_' + args.type]()
File "/home/ENeRF-master/run.py", line 70, in run_evaluate
output = network(batch)
File "/home/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "lib/networks/enerf/network.py", line 96, in forward
ret_i = self.batchify_rays(
File "lib/networks/enerf/network.py", line 49, in batchify_rays
ret = self.render_rays(rays[:, i:i + chunk], **kwargs)
File "lib/networks/enerf/network.py", line 40, in render_rays
net_output = nerf_model(vox_feat, img_feat_rgb_dir)
File "/home/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "/home/ENeRF-master/lib/networks/enerf/nerf.py", line 40, in forward
x = torch.cat((x, img_feat_rgb_dir), dim=-1)
RuntimeError: CUDA out of memory. Tried to allocate 774.00 MiB (GPU 0; 23.70 GiB total capacity; 1.13 GiB already allocated; 321.56 MiB free; 1.45 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
I have tried to include os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:512"
at the beginning of the run.py file, however, I have received the exact same error. Any suggestion on how I should resolve this error?
Thank you!
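The numbers in the error message can be checked against each other. PyTorch reports 1.45 GiB reserved and 321.56 MiB free on a 23.70 GiB device, so roughly 21.9 GiB is neither free nor held by this PyTorch process. The sketch below does that arithmetic with the values copied from the log. Interpreting the remainder as memory used by other processes on the same GPU is an assumption, one that a tool such as nvidia-smi could confirm.

```python
MIB_PER_GIB = 1024

# Values copied from the error message above.
total_gib = 23.70
reserved_by_pytorch_gib = 1.45
free_mib = 321.56
requested_mib = 774.00

free_gib = free_mib / MIB_PER_GIB

# Memory that is neither free nor reserved by this PyTorch process.
unaccounted_gib = total_gib - reserved_by_pytorch_gib - free_gib

print(f"free: {free_gib:.2f} GiB, unaccounted: {unaccounted_gib:.2f} GiB")
print("request fits in free memory:", requested_mib <= free_mib)
```

If the unaccounted memory does belong to other processes, changing max_split_size_mb would not help, because fragmentation inside a 1.45 GiB pool is not what is exhausting the device.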
Hi! About building a custom outdoor dataset like yours, could you please give some suggestions about how to get my own 'background.ply'? Also, for the 3D bounding box in 'vhull', is there any faster method than those mentioned in Easymocap? Thank you for your help!
Hello! Thanks for sharing the code!
When I evaluate this model on the test part of the LLFF dataset (unseen scenes in theory), I find that it performs much better on the 'fortress' scene than other methods such as IBRNet, while the other scenes show similar performance. So I would like to ask whether the 'latest.pth' you released is a model that has not been finetuned.
In addition, if I want to render an image whose size does not fit the cost volume network (such as the original image size of the LLFF dataset), how should I do it? (Maybe resize the depth and the original image?)
I'm looking forward to your reply.
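One generic alternative to resizing is to pad the image up to the next size the network accepts and crop the output back afterwards. The sketch below computes the padding for a given multiple. The multiple of 32 and the example sizes are assumptions for illustration, not values taken from the ENeRF code, and pad_to_multiple is a hypothetical helper.

```python
def pad_to_multiple(height, width, multiple):
    """Return (padded_height, padded_width, pad_bottom, pad_right)."""
    pad_bottom = (-height) % multiple
    pad_right = (-width) % multiple
    return height + pad_bottom, width + pad_right, pad_bottom, pad_right

# Example sizes (assumed); 32 is an assumed network stride.
padded = pad_to_multiple(756, 1008, 32)
already_aligned = pad_to_multiple(512, 640, 32)

print(padded, already_aligned)
```

Padding only on the bottom and right keeps pixel coordinates unchanged, so the camera intrinsics stay valid. Resizing instead would require scaling the intrinsics by the same factors.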
Hi, thanks for sharing your great work!
I want to know how to run your code on the Real Forward-facing and NeRF Synthetic datasets? It seems that you only release the dataloader code for DTU dataset? Can you release the dataloader codes for the Real Forward-facing and NeRF Synthetic datasets?
Thanks very much!
Hi, I notice that you are using trimesh when training the enerf_outdoor dataset, which is not included in your requirement.txt
Thank you for sharing this great work!
I would like to try the ST-NeRF and DynamicCap datasets.
Could you release config files and dataloaders for the ST-NeRF and DynamicCap datasets?
Thanks in advance.
Hi, the real-time dynamic rendering demo on your project page is so cool! And I want to make the same thing on my own data (sequence of images). What should I do?
Hi, thanks for your great work! During training, I find that the code passes through the function build_feature_volume twice. Does this mean coarse to fine? I am also wondering how to change the parameters in cfg.enerf.cas_config, since I cannot find the related parameters in lib.config. Looking forward to your reply!
Best wishes!
How can I get the background.ply and the npy files in vhull for the enerf-outdoor dataset? I would be very grateful if a tutorial could be provided on this.
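I am trying to adjust some values in this file to fine-tune the pretrained model, and a question came up: what does train_input_views_prob under enerf do? My current understanding is that input_views under train_dataset controls the number of input views used for training, while input_views under test_dataset controls the number used in evaluation.
I used the following settings for finetuning:
train_dataset:
    data_root: 'zju_mocap'
    scene: 'CoreView_test4'
    split: train
    frames: [0, 599, 1]
    input_views: [0, -1, 1]
    render_views: [0, -1, 1]
    input_ratio: 0.5
test_dataset:
    data_root: 'zju_mocap'
    scene: 'CoreView_test4'
    split: test
    frames: [0, 600, 100]
    input_views: [0, -1, 2]
    render_views: [1, -1, 2]
    input_ratio: 0.5
(The rest is the same as the original file.)
In addition, when I train on my own dataset, which is similar to zju-mocap (6 synchronized cameras), I see something strange: after training, the psnr, ssim and lpips values are better than before, but the result rendered in the GUI is less sharp than the one rendered with the pretrained model. What could be the reason?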
Hi,
I was working with your code and when reviewing the projection of the feature maps into the cost volume, there's something that I don't understand. In the function homo_warp, the projection matrix that is computed is to go from the target view camera coordinates to the source view in order to interpolate the features:
def get_proj_mats(batch, src_scale, tar_scale):
    B, S_V, C, H, W = batch['src_inps'].shape
    src_ext = batch['src_exts']
    src_ixt = batch['src_ixts'].clone()
    src_ixt[:, :, :2] *= src_scale
    src_projs = src_ixt @ src_ext[:, :, :3]
    tar_ext = batch['tar_ext']
    tar_ixt = batch['tar_ixt'].clone()
    tar_ixt[:, :2] *= tar_scale
    tar_projs = tar_ixt @ tar_ext[:, :3]
    tar_ones = torch.zeros((B, 1, 4)).to(tar_projs.device)
    tar_ones[:, :, 3] = 1
    tar_projs = torch.cat((tar_projs, tar_ones), dim=1)
    tar_projs_inv = torch.inverse(tar_projs)
    src_projs = src_projs.view(B, S_V, 3, 4)
    tar_projs_inv = tar_projs_inv.view(B, 1, 4, 4)
    proj_mats = src_projs @ tar_projs_inv
    return proj_mats
But when projecting the grid into the image, I don't understand which coordinates are used. Only pixel indices seemed to be used and are projected into the source image by slicing the projection matrix into rotation and translation, when it also contains the intrinsic matrix:
def homo_warp(src_feat, proj_mat, depth_values, batch):
    B, D, H_T, W_T = depth_values.shape
    C, H_S, W_S = src_feat.shape[1:]
    device = src_feat.device
    R = proj_mat[:, :, :3] # (B, 3, 3)
    T = proj_mat[:, :, 3:] # (B, 3, 1)
    # create grid from the ref frame
    ref_grid = create_meshgrid(H_T, W_T, normalized_coordinates=False,
                               device=device) # (1, H, W, 2)
    ref_grid = ref_grid.permute(0, 3, 1, 2) # (1, 2, H, W)
    ref_grid = ref_grid.reshape(1, 2, H_T*W_T) # (1, 2, H*W)
    ref_grid = ref_grid.expand(B, -1, -1) # (B, 2, H*W)
    ref_grid = torch.cat((ref_grid, torch.ones_like(ref_grid[:,:1])), 1) # (B, 3, H*W)
    ref_grid_d = ref_grid.repeat(1, 1, D) # (B, 3, D*H*W)
    src_grid_d = R @ ref_grid_d + T/depth_values.view(B, 1, D*H_T*W_T)
    del ref_grid_d, ref_grid, proj_mat, R, T, depth_values # release (GPU) memory
    # project negative depth pixels to somewhere outside the image
    # negative_depth_mask = src_grid_d[:, 2:] <= 1e-7
    # src_grid_d[:, 0:1][negative_depth_mask] = W
    # src_grid_d[:, 1:2][negative_depth_mask] = H
    # src_grid_d[:, 2:3][negative_depth_mask] = 1
    src_grid = src_grid_d[:, :2] / torch.clamp_min(src_grid_d[:, 2:], 1e-6) # divide by depth (B, 2, D*H*W)
    # del src_grid_d
    src_grid[:, 0] = (src_grid[:, 0])/((W_S - 1) / 2) - 1 # scale to -1~1
    src_grid[:, 1] = (src_grid[:, 1])/((H_S - 1) / 2) - 1 # scale to -1~1
    src_grid = src_grid.permute(0, 2, 1) # (B, D*H*W, 2)
    src_grid = src_grid.view(B, D, H_T*W_T, 2)
    warped_src_feat = F.grid_sample(src_feat, src_grid,
                                    mode='bilinear', padding_mode='zeros',
                                    align_corners=True) # (B, C, D, H*W)
    warped_src_feat = warped_src_feat.view(B, C, D, H_T, W_T)
    src_grid = src_grid.view(B, D, H_T, W_T, 2)
    if torch.isnan(warped_src_feat).isnan().any():
        __import__('ipdb').set_trace()
    return warped_src_feat, src_grid
Could you explain how the coordinates from the grid are projected into the source image, and in which coordinate system the grid is defined?
Thanks in advance,
Sergio
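A small numerical check may help with the question above. Because proj_mat is built as the source projection (intrinsics times extrinsics) multiplied by the inverse of the padded target projection, it maps the homogeneous target-pixel vector (u*d, v*d, d, 1) to source pixel coordinates. Dividing that vector by the depth d gives A @ (u, v, 1) + b / d, where A and b are the 3x3 and 3x1 blocks of proj_mat (named R and T in homo_warp). On this reading, the grid holds plain target pixel coordinates and the two blocks are not a pure rotation and translation. The NumPy sketch below compares that formula with an explicit unproject-and-reproject route. The camera poses, intrinsics, pixel, and depth are made-up test values, and the helper make_extrinsic is hypothetical.

```python
import numpy as np


def make_extrinsic(rot_y_deg, t):
    """World-to-camera 4x4 matrix for a hypothetical test pose."""
    a = np.deg2rad(rot_y_deg)
    E = np.eye(4)
    E[:3, :3] = np.array([[np.cos(a), 0.0, np.sin(a)],
                          [0.0, 1.0, 0.0],
                          [-np.sin(a), 0.0, np.cos(a)]])
    E[:3, 3] = t
    return E


# Made-up intrinsics and poses, shared K for simplicity.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
tar_ext = make_extrinsic(0.0, [0.0, 0.0, 0.0])
src_ext = make_extrinsic(10.0, [0.2, 0.0, 0.0])

# Same construction as get_proj_mats, for one source view.
src_proj = K @ src_ext[:3]                              # (3, 4)
tar_proj = np.vstack([K @ tar_ext[:3], [0, 0, 0, 1]])   # (4, 4)
proj_mat = src_proj @ np.linalg.inv(tar_proj)           # (3, 4)
A, b = proj_mat[:, :3], proj_mat[:, 3:]                 # "R" and "T" in homo_warp

u, v, d = 100.0, 50.0, 2.5  # target pixel and a depth hypothesis

# Route 1: the formula used in homo_warp.
p = A @ np.array([[u], [v], [1.0]]) + b / d
uv_warp = (p[:2] / p[2]).ravel()

# Route 2: unproject to the target camera, go to world, project into the source.
x_cam = d * np.linalg.inv(K) @ np.array([u, v, 1.0])
x_world = np.linalg.inv(tar_ext) @ np.append(x_cam, 1.0)
q = src_proj @ x_world
uv_ref = q[:2] / q[2]

print(uv_warp, uv_ref)
```

Both routes should give the same source pixel, which is then rescaled to the [-1, 1] range expected by grid_sample.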
I reran your algorithm on the LLFF dataset at a resolution of 512 x 512, but only get about 13 FPS. How can I reach the 25 FPS mentioned in your paper?
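When comparing frame rates like the ones above, it helps to time only the rendering call, after a few warm-up iterations. The following is a generic timing sketch, not the benchmark used in the paper. measure_fps, render_fn, and sync_fn are hypothetical names; on a GPU one would typically pass torch.cuda.synchronize as sync_fn so that asynchronous kernels have finished before the clock is read.

```python
import time


def measure_fps(render_fn, n_warmup=5, n_iters=50, sync_fn=None):
    """Return frames per second for render_fn, excluding warm-up calls."""
    for _ in range(n_warmup):
        render_fn()
    if sync_fn is not None:
        sync_fn()  # make sure warm-up work has finished
    start = time.perf_counter()
    for _ in range(n_iters):
        render_fn()
    if sync_fn is not None:
        sync_fn()  # wait for queued work before stopping the clock
    elapsed = time.perf_counter() - start
    return n_iters / elapsed


# Stand-in workload; replace with a call that renders one frame.
fps = measure_fps(lambda: sum(range(10000)))
print(f"{fps:.1f} FPS")
```

Data loading, disk I/O, and image saving are left outside the timed loop here, since they can dominate the measurement.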
Hi, when I trained on my own dataset, the error below occurred:
I set 'shuffle' to False to check whether particular images in my dataset cause this error, but it still occurs randomly (mostly in the first epoch, but once in the second epoch after the first epoch seemed fine).
Do you have any idea? Thank you for your help!
Why are different 'enerf.cas_config.volume_planes' values used in the training and testing stages?
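Hello, the code currently uses the 6890 vertices of the fitted single-person SMPL model to obtain a 3D bounding box, which bounds the depth range (near_far) used for cost volume interpolation. For the multi-person scenes and the scenes with background shown on the project page, is this method not used?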
I retrained on the zjumocap dataset with the command: python train_net.py --cfg_file configs/enerf/zjumocap_eval.yaml
but I get the following error:
Setting up [LPIPS] perceptual loss: trunk [vgg], v[0.1], spatial [off]
Loading model from: /opt/conda/lib/python3.8/site-packages/lpips/weights/v0.1/vgg.pth
Traceback (most recent call last):
  File "train_net.py", line 117, in <module>
    main()
  File "train_net.py", line 109, in main
    train(cfg, network)
  File "train_net.py", line 51, in train
    trainer.train(epoch, train_loader, optimizer, recorder)
  File "/dfs/data/ENeRF/lib/train/trainers/trainer.py", line 56, in train
    output, loss, loss_stats, image_stats = self.network(batch)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1015, in _call_impl
    return forward_call(*input, **kwargs)
  File "lib/train/losses/enerf.py", line 23, in forward
    color_loss = self.color_crit(batch[f'rgb{i}'], output[f'rgb_level{i}'])
KeyError: 'rgb_level0'
May I ask what camera was used for the dataset? A GoPro or another camera?
I built my own dataset similar to zju-mocap. The images and masks I input are 1088 x 1920, and the input ratio is 0.5, but I got an error during visualization:
I noticed that the length of the output is 786432 = 512 x 512 x 3, which happens to be the output size for the stock zju-mocap dataset (1024 x 1024 x 3) under the same input_ratio. What should I change to get the correct output size?
P.S. I found that if I force the reshape to 512 x 512 x 3, the visualization is incomplete:
pred_img = output[f'rgb_level{i}'][b].reshape(512, 512, 3)
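The arithmetic in this report can be checked with a short sketch. It assumes that input_ratio simply scales both image sides, which is an assumption about the data loader rather than something confirmed here; scaled_hw and reshape_dims are hypothetical helpers. Under that assumption a 1088 x 1920 input at ratio 0.5 should produce 544 x 960 x 3 = 1566720 values, so an output of 786432 values suggests the images are being brought to 1024 x 1024 somewhere before the ratio is applied.

```python
def scaled_hw(height, width, ratio):
    """Image size after scaling both sides by ratio (assumed behaviour)."""
    return int(height * ratio), int(width * ratio)


def reshape_dims(num_values, height, width, channels=3):
    """Return (H, W, C) if it matches the flat length, else raise."""
    expected = height * width * channels
    if num_values != expected:
        raise ValueError(
            f"flat output has {num_values} values, "
            f"but {height} x {width} x {channels} needs {expected}"
        )
    return height, width, channels


h, w = scaled_hw(1088, 1920, 0.5)
expected_len = h * w * 3
print(h, w, expected_len)

try:
    reshape_dims(786432, h, w)
    mismatch = False
except ValueError as err:
    mismatch = True
    print(err)
```

Deriving the reshape size from the loaded image rather than hard-coding 512 makes a mismatch like this visible at the point where it happens.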
Hi, I am trying to prepare my own Mocap dataset to run your code, but the YouTube link in https://chingswy.github.io/easymocap-public-doc/quickstart/capture_youtube.html is private or does not exist. Could you please release the multi-view video of your Mocap capture so that I can try preparing a Mocap dataset? Thanks in advance!