whu-usi3dv / freereg
[ICLR 2024] FreeReg: Image-to-Point Cloud Registration Leveraging Pretrained Diffusion Models and Monocular Depth Estimators
Home Page: https://whu-usi3dv.github.io/FreeReg/
I have a problem loading the local model, shown below. The MiDaS repository is also downloaded from the Internet, yet in this case a failed Internet connection does not raise an error. Could you please send me a copy of it?
midas = torch.hub.load("/mnt/proj/SOTAs/ZoeDepth-main/pkgs/intel-isl_MiDaS_master", midas_model_type, pretrained=use_pretrained_midas, source='local')
A modification to the contents of the file is suggested.
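Really impressive work! I have a question, though: following the environment setup in the README, I cannot install MinkowskiEngine successfully. I have heard how hard this library is to install. Did you actually install MinkowskiEngine in the environment described in the README? Are there any other details you could share?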
I noticed the dpt_intrinsic of the demo is [[574.541,0,322.522],[0,577.584,238.559],[0,0,1]]. But for a new pair, how should we set this value?
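The demo value has the usual pinhole layout [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]. Assuming dpt_intrinsic is a standard intrinsic matrix of that kind, the sketch below shows how such a matrix is assembled from focal lengths and principal point, and how the entries scale when the image is resized. The helper names are hypothetical and not part of FreeReg, and this thread does not say whether FreeReg expects intrinsics for the original or the resized resolution.

```python
def make_intrinsic(fx, fy, cx, cy):
    """Build a 3x3 pinhole intrinsic matrix as nested lists."""
    return [[fx, 0.0, cx],
            [0.0, fy, cy],
            [0.0, 0.0, 1.0]]


def scale_intrinsic(K, sx, sy):
    """Rescale K for an image resized by factors sx (width) and sy (height)."""
    return [[K[0][0] * sx, 0.0, K[0][2] * sx],
            [0.0, K[1][1] * sy, K[1][2] * sy],
            [0.0, 0.0, 1.0]]


# Values quoted above for the demo pair.
K_demo = make_intrinsic(574.541, 577.584, 322.522, 238.559)

# Hypothetical example: the same camera after halving the image resolution.
K_half = scale_intrinsic(K_demo, 0.5, 0.5)
```

For a new pair, fx, fy, cx and cy would come from the calibration of whichever camera (real or virtual) produced the depth map.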
The work is inspiring.
Note that we uniformly sample a dense grid of keypoints on both the depth map and the image.
So how many points are sampled for matching?
RuntimeError: CUDA out of memory. Tried to allocate 968.00 MiB (GPU 0; 11.76 GiB total capacity; 8.87 GiB already allocated; 484.12 MiB free; 9.95 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
My device: RTX 3060 with 12GB capacity.
Is there some way to run this? Thanks!
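The error message itself points at one thing to try: the max_split_size_mb allocator option, set through the PYTORCH_CUDA_ALLOC_CONF environment variable. A minimal sketch follows. The value 128 is an arbitrary assumption, and the option only targets fragmentation, so it may not be enough if the pipeline simply needs more than 12 GB.

```python
import os

# Set this before PyTorch initializes CUDA; placing it before `import torch`
# is the safe choice. The value 128 is an illustrative guess, not a
# recommendation from the FreeReg authors.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# import torch  # import PyTorch only after the variable is set
```

The same setting can be exported in the shell before launching the script.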
Does anyone still have a copy of this ckpt? The model weights have been taken down from Hugging Face.
Hi. I have been able to run the demo, but when I run it on my own images and point cloud, it fails:
/home/researcher/anaconda3/envs/freereg/lib/python3.8/site-packages/MinkowskiEngine-0.5.4-py3.8-linux-x86_64.egg/MinkowskiEngine/__init__.py:36: UserWarning: The environment variable `OMP_NUM_THREADS` not set. MinkowskiEngine will automatically set `OMP_NUM_THREADS=16`. If you want to set `OMP_NUM_THREADS` manually, please export it on the command line before running a python script. e.g. `export OMP_NUM_THREADS=12; python your_program.py`. It is recommended to set it below 24.
warnings.warn(
logging improved.
Overwriting config with config_version None
img_size [384, 512]
/home/researcher/anaconda3/envs/freereg/lib/python3.8/site-packages/torch/functional.py:512: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:3587.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Params passed to Resize transform:
width: 512
height: 384
resize_target: True
keep_aspect_ratio: True
ensure_multiple_of: 32
resize_method: minimal
/home/researcher/anaconda3/envs/freereg/lib/python3.8/site-packages/torch/nn/modules/transformer.py:306: UserWarning: enable_nested_tensor is True, but self.use_nested_tensor is False because encoder_layer.self_attn.batch_first was not True(use batch_first for better inference performance)
warnings.warn(f"enable_nested_tensor is True, but self.use_nested_tensor is False because {why_not_sparsity_fast_path}")
Using pretrained resource local::./tools/zoe/models/ZoeD_M12_NK.pt
Loaded successfully
No module 'xformers'. Proceeding without it.
ControlLDM: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
/home/researcher/anaconda3/envs/freereg/lib/python3.8/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
/home/researcher/anaconda3/envs/freereg/lib/python3.8/site-packages/transformers/modeling_utils.py:433: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
with safe_open(checkpoint_file, framework="pt") as f:
Loaded model config from [./tools/controlnet/models/control_v11f1p_sd15_depth.yaml]
Loaded state_dict from [./tools/controlnet/models/v1-5-pruned.ckpt]
Loaded state_dict from [./tools/controlnet/models/control_v11f1p_sd15_depth_ft.pth]
Global seed set to 12345
We force to use step-150 (~150 rather than 150) for our control process use 20 steps!
source-feat:['rgb_df', 'rgb_gf']
target-feat:['dpt_df', 'dpt_gf']
weight: [0.5 0.5]
we use zoe-ransac solver for source-rgb and target-dpt!
[Open3D WARNING] Read PTS: only points and colors attributes are supported.
Estimating zoe-depth for rgb on demo:
100%|██████████████████████████████████████████| 2/2 [00:00<00:00, 66576.25it/s]
50%|██████████████████████▌ | 1/2 [00:08<00:08, 8.99s/it]Segmentation fault (core dumped)
How can I make it work? Thank you very much.
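Hello author. After running the demo on the provided data, I tried testing on my own data. I changed rgb_size to (2158,3844) and updated the camera intrinsics accordingly, but the following error appears at run time: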
logging improved.
Overwriting config with config_version None
img_size [384, 512]
/root/miniconda3/envs/gs_model/lib/python3.9/site-packages/torch/functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at /opt/conda/conda-bld/pytorch_1678402412426/work/aten/src/ATen/native/TensorShape.cpp:3483.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
Params passed to Resize transform:
width: 512
height: 384
resize_target: True
keep_aspect_ratio: True
ensure_multiple_of: 32
resize_method: minimal
Using pretrained resource local::./tools/zoe/models/ZoeD_M12_N.pt
Loaded successfully
No module 'xformers'. Proceeding without it.
ControlLDM: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
making attention of type 'vanilla' with 512 in_channels
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla' with 512 in_channels
Loaded model config from [./tools/controlnet/models/control_v11f1p_sd15_depth.yaml]
Loaded state_dict from [./tools/controlnet/models/v1-5-pruned.ckpt]
Loaded state_dict from [./tools/controlnet/models/control_v11f1p_sd15_depth_ft.pth]
Seed set to 12344
We force to use step-150 (~150 rather than 150) for our control process use 20 steps!
source-feat:['rgb_df', 'rgb_gf']
target-feat:['dpt_df', 'dpt_gf']
weight: [0.5 0.5]
we use zoe-ransac solver for source-rgb and target-dpt!
Estimating zoe-depth for rgb on demo:
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 9289.71it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [02:17<00:00, 68.83s/it]
0%| | 0/1 [00:00<?, ?it/s]/root/miniconda3/envs/gs_model/lib/python3.9/site-packages/torchvision/transforms/functional.py:1603: UserWarning: The default value of the antialias parameter of all the resizing transforms (Resize(), RandomResizedCrop(), etc.) will change from None to True in v0.17, in order to be consistent across the PIL and Tensor backends. To suppress this warning, directly pass antialias=True (recommended, future default), antialias=None (current default, which means False for Tensors and True for PIL), or antialias=False (only works on Tensors - PIL will still use antialiasing). This also applies if you are using the inference transforms from the models weights: update the call to weights.transforms(antialias=True).
warnings.warn(
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:12<00:00, 12.33s/it]
Evaling on demo...
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/root/data_user/ysl/FreeReg/demo.py", line 150, in <module>
mm_reg.run()
File "/root/data_user/ysl/FreeReg/demo.py", line 121, in run
self.eval()
File "/root/data_user/ysl/FreeReg/demo.py", line 109, in eval
self.evalor.run({'demo':self.meta})
File "/root/data_user/ysl/FreeReg/pipeline/gen_eval.py", line 67, in run
smatch_xyz, tmatch_xyz = self.eval_pair(stype, ttype, sitem, titem, pps)
File "/root/data_user/ysl/FreeReg/pipeline/gen_eval.py", line 46, in eval_pair
gts, smask = self.eval_mask(source_type,sitem)
File "/root/data_user/ysl/FreeReg/pipeline/gen_eval.py", line 34, in eval_mask
gtd = gtd[uv[:,1],uv[:,0]]
IndexError: index 986 is out of bounds for axis 0 with size 968
What is the cause of this?
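The traceback says that at least one row index in uv (986) exceeds the height of gtd (968), so some pixel coordinates fall outside the depth map being indexed. Why the two sizes disagree cannot be determined from the log alone. As a hedged illustration of the general guard, not the repository's fix, the sketch below keeps only coordinates that lie inside the map. `in_bounds_mask` is a hypothetical helper, and plain lists stand in for the arrays.

```python
def in_bounds_mask(uv, height, width):
    """Return one boolean per (u, v) pair: True if it indexes a valid pixel.

    u is the column (x) and v is the row (y), matching gtd[v, u].
    """
    return [0 <= u < width and 0 <= v < height for u, v in uv]


# Toy data: a map with 968 rows; the width of 1724 is made up.
height, width = 968, 1724
uv = [(10, 20), (100, 986), (1723, 967), (1724, 5)]

mask = in_bounds_mask(uv, height, width)
valid_uv = [p for p, keep in zip(uv, mask) if keep]
print(mask)      # [True, False, True, False]
print(valid_uv)  # [(10, 20), (1723, 967)]
```

Filtering like this only hides the symptom. If the configured image size and the actual depth map size differ, that mismatch is the thing to track down.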
Hi!
Can you please help to check this error?
File "/FreeReg/tools/zoe/zoedepth/models/model_io.py", line 49, in load_state_dict
model.load_state_dict(state)
File "/anaconda3/envs/freereg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1604, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ZoeDepth:
Unexpected key(s) in state_dict: "core.core.pretrained.model.blocks.0.attn.relative_position_index", "core.core.pretrained.model.blocks.1.attn.relative_position_index", "core.core.pretrained.model.blocks.2.attn.relative_position_index", "core.core.pretrained.model.blocks.3.attn.relative_position_index", "core.core.pretrained.model.blocks.4.attn.relative_position_index", "core.core.pretrained.model.blocks.5.attn.relative_position_index", "core.core.pretrained.model.blocks.6.attn.relative_position_index", "core.core.pretrained.model.blocks.7.attn.relative_position_index", "core.core.pretrained.model.blocks.8.attn.relative_position_index", "core.core.pretrained.model.blocks.9.attn.relative_position_index", "core.core.pretrained.model.blocks.10.attn.relative_position_index", "core.core.pretrained.model.blocks.11.attn.relative_position_index", "core.core.pretrained.model.blocks.12.attn.relative_position_index", "core.core.pretrained.model.blocks.13.attn.relative_position_index", "core.core.pretrained.model.blocks.14.attn.relative_position_index", "core.core.pretrained.model.blocks.15.attn.relative_position_index", "core.core.pretrained.model.blocks.16.attn.relative_position_index", "core.core.pretrained.model.blocks.17.attn.relative_position_index", "core.core.pretrained.model.blocks.18.attn.relative_position_index", "core.core.pretrained.model.blocks.19.attn.relative_position_index", "core.core.pretrained.model.blocks.20.attn.relative_position_index", "core.core.pretrained.model.blocks.21.attn.relative_position_index", "core.core.pretrained.model.blocks.22.attn.relative_position_index", "core.core.pretrained.model.blocks.23.attn.relative_position_index".
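The error lists checkpoint entries that the constructed model does not expect, all ending in relative_position_index. One generic way past such a mismatch is to drop those entries before calling load_state_dict, or to pass strict=False. Whether that is appropriate here, and whether a library version mismatch is the underlying cause, are assumptions that should be checked. The helper name below is hypothetical, and a plain dict with one made-up weight key stands in for the real checkpoint.

```python
def drop_keys_with_suffix(state_dict, suffix):
    """Return a copy of state_dict without keys that end with suffix."""
    return {k: v for k, v in state_dict.items() if not k.endswith(suffix)}


# Toy stand-in for a checkpoint: real values would be tensors, and the
# second key is invented for illustration.
state = {
    "core.core.pretrained.model.blocks.0.attn.relative_position_index": 0,
    "core.core.pretrained.model.blocks.0.attn.qkv.weight": 1,
}

filtered = drop_keys_with_suffix(state, "relative_position_index")
print(list(filtered))
# ['core.core.pretrained.model.blocks.0.attn.qkv.weight']

# With the real objects this would be followed by something like:
# model.load_state_dict(filtered)
```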
Thank you for sharing the work. I have managed to reproduce demo.py, but it uses a lot of GPU memory. I will try to use it to register my own dataset.
Hello!
As I understand it, the Tpre matrix estimated from the matches is the transformation from the extrinsics used to project the input point cloud (pcd) to the camera pose at which the image is registered within the point cloud.
Could you provide an example tool or code snippet that lets the user directly obtain the final pose of the registered image? If my understanding is incorrect, could you please explain how to determine the final pose of the registered image accurately?
Thank you!
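How Tpre should be combined with the projection extrinsics depends on FreeReg's conventions, which this thread does not settle. Purely as an illustration of the composition described in the question, and assuming both matrices are 4x4 rigid transforms with Tpre applied after the projection extrinsic, the final pose would be their product. All names below are hypothetical. If the actual convention is the opposite, an inverse is needed somewhere.

```python
def matmul4(A, B):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def translation(tx, ty, tz):
    """4x4 rigid transform with identity rotation and the given translation."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


# Toy inputs (made up): the extrinsic used to project the point cloud to a
# depth map, and the estimated transform from that depth camera to the
# image camera.
T_proj = translation(1.0, 0.0, 0.0)
T_pre = translation(0.0, 2.0, 0.0)

# Assumed convention: x_img = T_pre @ T_proj @ x_world.
T_img = matmul4(T_pre, T_proj)
print(T_img[0][3], T_img[1][3], T_img[2][3])  # 1.0 2.0 0.0
```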