python3 scenerf/scripts/train.py \
--bs=1 --n_gpus=1 --enable_log=True \
--preprocess_root=/home/trainer/Datasets/preprocess \
--root=/home/trainer/Datasets/Kitti/ \
--logdir=./kitti/logs \
--n_gaussians=4 --n_pts_per_gaussian=8 --max_epochs=50 --exp_prefix=Train

root@devbox:/home/trainer/scenerf# ./train.sh
Global seed set to 42
Using cache found in /root/.cache/torch/hub/rwightman_gen-efficientnet-pytorch_master
Loading base model ()...Done.
Removing last two layers (global_pool & classifier).
Building Encoder-Decoder model..Done.
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
Global seed set to 42
initializing ddp: GLOBAL_RANK: 0, MEMBER: 1/1
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All DDP processes registered. Starting ddp with 1 processes
----------------------------------------------------------------------------------------------------
00 5 23
01 4 11
02 7 21
03 10 20
04 7 8
05 2 22
06 7 23
07 2 22
09 7 20
10 5 20
Preprocess time: --- 3.7724807262420654 seconds ---
08 2 23
Preprocess time: --- 0.8777365684509277 seconds ---
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
  | Name              | Type               | Params
---------------------------------------------------------
0 | spherical_mapping | SphericalMapping   | 0
1 | net_rgb           | UNet2DSphere       | 231 M
2 | pe                | PositionalEncoding | 0
3 | mlp               | ResnetFC           | 5.4 M
4 | mlp_gaussian      | ResnetFC           | 5.4 M
5 | ray_som           | RaySOM             | 0
---------------------------------------------------------
242 M     Trainable params
0         Non-trainable params
242 M     Total params
970.275   Total estimated model params size (MB)
Validation sanity check:   0%|          | 0/2 [00:00<?, ?it/s]
ERROR: Unexpected segmentation fault encountered in worker.
ERROR: Unexpected segmentation fault encountered in worker.
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 1120, in _try_get_data
    data = self._data_queue.get(timeout=timeout)
  File "/usr/lib/python3.8/queue.py", line 179, in get
    self.not_empty.wait(remaining)
  File "/usr/lib/python3.8/threading.py", line 306, in wait
    gotit = waiter.acquire(True, timeout)
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 6354) is killed by signal: Segmentation fault.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "scenerf/scripts/train.py", line 161, in <module>
    main()
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "scenerf/scripts/train.py", line 156, in main
    trainer.fit(model, data_module)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 552, in fit
    self._run(model)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 922, in _run
    self._dispatch()
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 990, in _dispatch
    self.accelerator.start_training(self)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/accelerators/accelerator.py", line 92, in start_training
    self.training_type_plugin.start_training(trainer)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 161, in start_training
    self._results = trainer.run_stage()
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1000, in run_stage
    return self._run_train()
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1035, in _run_train
    self._run_sanity_check(self.lightning_module)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/trainer/trainer.py", line 1122, in _run_sanity_check
    self._evaluation_loop.run()
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/base.py", line 111, in run
    self.advance(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/dataloader/evaluation_loop.py", line 110, in advance
    dl_outputs = self.epoch_loop.run(
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/base.py", line 111, in run
    self.advance(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/pytorch_lightning/loops/epoch/evaluation_epoch_loop.py", line 94, in advance
    batch_idx, batch = next(dataloader_iter)
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 628, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 1316, in _next_data
    idx, data = self._get_data()
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 1272, in _get_data
    success, data = self._try_get_data()
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 1133, in _try_get_data
    raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 6354) exited unexpectedly
ERROR: Unexpected segmentation fault encountered in worker.
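
For what it's worth, this crash pattern is usually not specific to SceneRF: DataLoader workers dying with a segmentation fault most often means the worker processes ran out of shared memory, which is common inside Docker containers whose /dev/shm defaults to 64 MB. If the container is the culprit, relaunching it with a larger shared-memory segment (e.g. docker run --shm-size=8g ..., or --ipc=host) typically resolves it without any code changes. Otherwise, two generic PyTorch-side workarounds are sketched below. This is a minimal standalone sketch, not SceneRF code; TensorDataset merely stands in for the real KITTI dataset.

    import torch
    import torch.multiprocessing
    from torch.utils.data import DataLoader, TensorDataset

    # Workaround 1: pass batches between processes via the filesystem instead
    # of shared-memory file descriptors; helps when /dev/shm is small.
    torch.multiprocessing.set_sharing_strategy("file_system")

    dataset = TensorDataset(torch.randn(8, 3))  # placeholder for the real dataset

    # Workaround 2: num_workers=0 loads data in the main process, which avoids
    # worker segfaults entirely (at the cost of dataloading throughput).
    loader = DataLoader(dataset, batch_size=1, num_workers=0)

    for (batch,) in loader:
        print(batch.shape)

If running with num_workers=0 makes the segfault disappear, that points at the worker/shared-memory path rather than the model code.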