
Official implementation of the paper: MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera (CVPR 2021)

License: MIT License

Python 99.90% Shell 0.10%
depth-estimation deep-learning unsupervised-learning cvpr2021

monorec's Introduction

MonoRec

Paper | Video (CVPR) | Video (Reconstruction) | Project Page

This repository is the official implementation of the paper:

MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera

Felix Wimbauer*, Nan Yang*, Lukas von Stumberg, Niclas Zeller and Daniel Cremers

CVPR 2021 (arXiv)

If you find our work useful, please consider citing our paper:

@InProceedings{wimbauer2020monorec,
  title = {{MonoRec}: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera},
  author = {Wimbauer, Felix and Yang, Nan and von Stumberg, Lukas and Zeller, Niclas and Cremers, Daniel},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2021},
}

πŸ—οΈοΈ Setup

The conda environment for this project can be set up by running the following command:

conda env create -f environment.yml

πŸƒ Running the Example Script

We provide a sample from the KITTI Odometry test set and a script to run MonoRec on it in example/. To download the pretrained model and put it into the right place, run download_model.sh. Alternatively, you can do this manually by downloading the weights from here and unpacking the file to saved/checkpoints/monorec_depth_ref.pth. The example script will plot the keyframe, depth prediction and mask prediction.

cd example
python test_monorec.py
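
For orientation, the sketch below mirrors roughly what the example script does: load one sample from the bundled data, run the pretrained model on it and plot the prediction. The MonoRecModel import path, the constructor arguments and the output key names are assumptions made for illustration only; example/test_monorec.py is the authoritative reference.

import sys
sys.path.append("..")  # run from inside example/, as above

import torch
import matplotlib.pyplot as plt

from data_loader.kitti_odometry_dataset import KittiOdometryDataset
from model.monorec.monorec_model import MonoRecModel  # import path is an assumption

# dataset_dir / frame_count / checkpoint_location follow the JSON config fields used in this repo
dataset = KittiOdometryDataset(dataset_dir="../data/dataset", frame_count=2)  # adjust to your data location
model = MonoRecModel(checkpoint_location=["../saved/checkpoints/monorec_depth_ref.pth"])
model.eval()

with torch.no_grad():
    data, _ = dataset[0]       # dataset items are (data, target) pairs
    prediction = model(data)   # dict containing the depth and mask predictions

plt.imshow(prediction["result"].squeeze().cpu())  # predicted depth; key name is an assumption
plt.show()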

🗃️ Data

In all of our experiments we used the KITTI Odometry dataset for training. For additional evaluations, we used the KITTI, Oxford RobotCar, TUM Mono-VO and TUM RGB-D datasets. All data paths can be specified in the respective configuration files. In our experiments, we put all datasets into a separate folder ../data.

KITTI Odometry

To set up KITTI Odometry, download the color images and calibration files from the official website (around 145 GB). Instead of the given velodyne laser data files, we use the improved ground truth depth for evaluation, which can be downloaded from here.

Unzip the color images and calibration files into ../data. The lidar depth maps can be extracted into the given folder structure by running data_loader/scripts/preprocess_kitti_extract_annotated_depth.py.

For training and evaluation, we use the poses estimated by Deep Virtual Stereo Odometry (DVSO). They can be downloaded from here and should be placed under ../data/{kitti_path}/poses_dso. This folder structure is ensured when unpacking the zip file in the {kitti_path} directory.

To supplement the self-supervised training, we use sparse depth maps generated by Deep Virtual Stereo Odometry (DVSO) during the pose estimation. They can be downloaded from here and should be placed under ../data/{kitti_path}/sequences/{seq_num}/image_depth_sparse. This folder structure is ensured when unpacking the zip file in the {kitti_path} directory.

The auxiliary moving object masks can be downloaded from here. They should be placed under ../data/{kitti_path}/sequences/{seq_num}/mvobj_mask. This folder structure again is ensured when unpacking the zip file in the {kitti_path} directory.

Finally, for mask training, we also use index masks for the training data, which can be downloaded from here. They should be placed under ../data/{kitti_path}/sequences/{seq_num}/. This folder structure again is ensured when unpacking the zip file in the {kitti_path} directory.
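
Before training, it can be useful to sanity-check that the layout described above is in place. The snippet below is only a convenience sketch (not part of the repository); the folder names are the ones mentioned in this section and the sequence list is just an example.

import os

kitti_path = "../data/dataset"        # default dataset_dir from the training configs
sequences = ["00", "04", "05", "07"]  # example subset; adjust to the sequences you use

def print_status(path):
    print(("OK      " if os.path.isdir(path) else "MISSING ") + path)

print_status(os.path.join(kitti_path, "poses_dso"))
for seq in sequences:
    for folder in ["image_2",                # standard KITTI Odometry color images
                   "image_depth_annotated",  # extracted improved ground truth depth
                   "image_depth_sparse",     # DVSO sparse depth maps
                   "mvobj_mask"]:            # auxiliary moving object masks
        print_status(os.path.join(kitti_path, "sequences", seq, folder))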

Oxford RobotCar

To set up Oxford RobotCar, download the camera model files and the large sample from the official website. The code, as well as the camera extrinsics, needs to be downloaded from the official GitHub repository. Please move the content of the python folder to data_loader/oxford_robotcar/. extrinsics/, models/ and sample/ need to be moved to ../data/oxford_robotcar/. Note that for poses we use the official visual odometry poses, which are not provided in the large sample. They need to be downloaded manually from the raw dataset and unpacked into the sample folder.

TUM Mono-VO

Unfortunately, TUM Mono-VO images are provided only in the original, distorted form. Therefore, they need to be undistorted before being fed into MonoRec. To obtain poses for the sequences, we run the publicly available version of Direct Sparse Odometry.

TUM RGB-D

The official sequences can be downloaded from the official website and need to be unpacked under ../data/tumrgbd/{sequence_name}. Note that our provided dataset implementation assumes intrinsics from fr3 sequences, and that the data loader for this dataset also relies on code from the Oxford RobotCar dataset.
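
For reference, the fr3 intrinsics assumed by the provided TUM RGB-D data loader correspond to the following camera matrix (the same values are quoted in an issue further below):

import torch

# fr3 pinhole intrinsics: fx=535.4, fy=539.2, cx=320.1, cy=247.6
_intrinsics = torch.tensor(
    [[535.4,   0.0, 320.1, 0.0],
     [  0.0, 539.2, 247.6, 0.0],
     [  0.0,   0.0,   1.0, 0.0],
     [  0.0,   0.0,   0.0, 1.0]], dtype=torch.float32)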

πŸ‹οΈ Training & Evaluation

This repository provides training and evaluation configurations to reproduce the results from the paper.

To train a model from scratch, first set the dataset_dir fields in the configuration files to the directory in which KITTI Odometry is located (default ../data/dataset). Then run the following commands in the given order:

python train.py --config configs/train/monorec/monorec_depth.json --options stereo                          # Depth Bootstrap
python train_monorec.py --config configs/train/monorec/monorec_mask.json --options stereo                   # Mask Bootstrap
python train_monorec.py --config configs/train/monorec/monorec_mask_ref.json --options mask_loss            # Mask Refinement
python train_monorec.py --config configs/train/monorec/monorec_depth_ref.json --options stereo stereo_repr  # Depth Refinement

The final model will be stored under saved/models/monorec_depth_ref/00/checkpoint.pth.
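
A quick way to sanity-check the resulting checkpoint is to load it with PyTorch and inspect its contents; the snippet below is just a sketch and makes no assumption about the exact dictionary keys stored by the trainer.

import torch

ckpt = torch.load("saved/models/monorec_depth_ref/00/checkpoint.pth", map_location="cpu")
# Typically a dict with the model state and training metadata; print what is actually inside.
print(type(ckpt))
if isinstance(ckpt, dict):
    print(list(ckpt.keys()))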

We also provide checkpoints for each training stage:

Training stage | Download
Depth Bootstrap | Link
Mask Bootstrap | Link
Mask Refinement | Link
Depth Refinement (final model) | Link

Run download_model.sh to download the final model. It will automatically get moved to saved/checkpoints.

To reproduce the evaluation results on different datasets, run the following commands:

python evaluate.py --config configs/evaluate/eval_monorec.json        # KITTI Odometry
python evaluate.py --config configs/evaluate/eval_monorec_oxrc.json   # Oxford Robotcar

☁️ Pointclouds

To reproduce the pointclouds depicted in the paper and video, use the following commands:

python create_pointcloud.py --config configs/test/pointcloud_monorec.json       # KITTI Odometry
python create_pointcloud.py --config configs/test/pointcloud_monorec_oxrc.json  # Oxford Robotcar
python create_pointcloud.py --config configs/test/pointcloud_monorec_tmvo.json  # TUM Mono-VO
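
The scripts write .ply files to the output_dir given in the respective config (e.g. saved/pointclouds/monorec). One simple way to view a result is open3d, which is not part of the listed dependencies and would need to be installed separately; the file name below is just the example from the config quoted in the issues section.

import open3d as o3d

pcd = o3d.io.read_point_cloud("saved/pointclouds/monorec/tmrgbd.ply")
print(pcd)  # number of points loaded
o3d.visualization.draw_geometries([pcd])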

monorec's People

Contributors

brummi, christian-hiebl, gitouni, hardik01shah, nynyg


monorec's Issues

What's the correspondence between KITTI raw data and KITTI Odometry?

Hi, thanks for your great contribution to 3d reconstruction.

For the KITTI Odometry dataset, we can download RGB images from the official website, which are named by sequence number. But how can we get "the improved ground truth depth" mentioned in the README.md, which is named like "2011_09_26_drive_0001"? What is the correspondence between them?

Thanks a lot.

Clarification regarding mask module

I wanted to clarify the following:

  1. What type of information does the stereo image add to the MonoRec network?
  2. Is it possible to train the end-to-end model using only monocular images?
  3. Is it possible to train only the mask module using only monocular images?

Question about ground truth Depth Annotations

When you say the "improved ground truth depth," does that correspond to "Download annotated depth maps data set (14 GB)" on the KITTI website? Does this contain all of the annotations I need to evaluate the model, or should I download the other links on the website?

Mask module bootstrap does not converge

Hi:

Thank you for sharing this amazing work!

I tried to train the model several times, but the mask module bootstrapping cannot converge on the KITTI dataset. The only thing I changed in the code is from here.

The training IoU is increasing but the validation IoU is decreasing, so is it overfitting?
image

The result looks very noisy even in the training set:
image

However, the mask module refinement converged as well as claimed in the paper.

Below is the training log from the mask module bootstrapping:

2022-05-21 18:56:28,980 - train - INFO - 1503377 trainable parameters
2022-05-21 19:33:24,088 - trainer - INFO - epoch : 1
2022-05-21 19:33:24,089 - trainer - INFO - loss : 0.33975716474378714
2022-05-21 19:33:24,090 - trainer - INFO - a1_sparse_metric: 0.0
2022-05-21 19:33:24,091 - trainer - INFO - abs_rel_sparse_metric: 1.975
2022-05-21 19:33:24,092 - trainer - INFO - rmse_sparse_metric: 1.975
2022-05-21 19:33:24,092 - trainer - INFO - loss_iou : 0.08929756164550781
2022-05-21 19:33:24,093 - trainer - INFO - loss_prec : 0.09143932342529297
2022-05-21 19:33:24,094 - trainer - INFO - loss_loss : 0.33975664774576825
2022-05-21 19:33:24,095 - trainer - INFO - loss_acc : 0.9185029093424479
2022-05-21 19:33:24,096 - trainer - INFO - loss_rec : 0.9724029541015625
2022-05-21 19:33:24,096 - trainer - INFO - val_loss : 2.165919481880135
2022-05-21 19:33:24,097 - trainer - INFO - val_a1_sparse_metric: 0.0
2022-05-21 19:33:24,098 - trainer - INFO - val_abs_rel_sparse_metric: 79.0
2022-05-21 19:33:24,099 - trainer - INFO - val_rmse_sparse_metric: 79.0
2022-05-21 19:33:24,100 - trainer - INFO - val_loss_iou : 0.09720667865541247
2022-05-21 19:33:24,100 - trainer - INFO - val_loss_prec : 0.10425782865948147
2022-05-21 19:33:24,101 - trainer - INFO - val_loss_loss : 2.165919621785482
2022-05-21 19:33:24,102 - trainer - INFO - val_loss_acc : 0.8980947070651584
2022-05-21 19:33:24,103 - trainer - INFO - val_loss_rec : 0.7917230923970541
2022-05-21 19:33:24,324 - trainer - INFO - Saving checkpoint: saved/models/monorec_mask/00/checkpoint.pth ...
2022-05-21 19:33:24,545 - trainer - INFO - Saving current best: model_best.pth ...
2022-05-21 20:09:51,592 - trainer - INFO - epoch : 2
2022-05-21 20:09:51,593 - trainer - INFO - loss : 0.18604853542870842
2022-05-21 20:09:51,596 - trainer - INFO - a1_sparse_metric: 0.0
2022-05-21 20:09:51,599 - trainer - INFO - abs_rel_sparse_metric: 1.9338541666666667
2022-05-21 20:09:51,601 - trainer - INFO - rmse_sparse_metric: 1.9338541666666667
2022-05-21 20:09:51,604 - trainer - INFO - loss_iou : 0.13620360056559244
2022-05-21 20:09:51,606 - trainer - INFO - loss_prec : 0.13805244445800782
2022-05-21 20:09:51,609 - trainer - INFO - loss_loss : 0.1860488001505534
2022-05-21 20:09:51,612 - trainer - INFO - loss_acc : 0.9541281127929687
2022-05-21 20:09:51,614 - trainer - INFO - loss_rec : 0.9810747273763021
2022-05-21 20:09:51,617 - trainer - INFO - val_loss : 3.564138030219409
2022-05-21 20:09:51,619 - trainer - INFO - val_a1_sparse_metric: 0.0
2022-05-21 20:09:51,621 - trainer - INFO - val_abs_rel_sparse_metric: 79.0
2022-05-21 20:09:51,623 - trainer - INFO - val_rmse_sparse_metric: 79.0
2022-05-21 20:09:51,625 - trainer - INFO - val_loss_iou : 0.10033636622958714
2022-05-21 20:09:51,627 - trainer - INFO - val_loss_prec : 0.1108185847600301
2022-05-21 20:09:51,629 - trainer - INFO - val_loss_loss : 3.5641377766927085
2022-05-21 20:09:51,631 - trainer - INFO - val_loss_acc : 0.9085981580946181
2022-05-21 20:09:51,633 - trainer - INFO - val_loss_rec : 0.7776173485649956
2022-05-21 20:09:52,095 - trainer - INFO - Saving checkpoint: saved/models/monorec_mask/00/checkpoint.pth ...
2022-05-21 20:46:30,289 - trainer - INFO - epoch : 3
2022-05-21 20:46:30,292 - trainer - INFO - loss : 0.13589516458225262
2022-05-21 20:46:30,298 - trainer - INFO - a1_sparse_metric: 0.0
2022-05-21 20:46:30,303 - trainer - INFO - abs_rel_sparse_metric: 2.1313541666666667
2022-05-21 20:46:30,308 - trainer - INFO - rmse_sparse_metric: 2.1313541666666667
2022-05-21 20:46:30,313 - trainer - INFO - loss_iou : 0.17558348337809246
2022-05-21 20:46:30,318 - trainer - INFO - loss_prec : 0.17756632486979165
2022-05-21 20:46:30,323 - trainer - INFO - loss_loss : 0.1358951695760091
2022-05-21 20:46:30,329 - trainer - INFO - loss_acc : 0.9673453776041666
2022-05-21 20:46:30,332 - trainer - INFO - loss_rec : 0.9840897623697916
2022-05-21 20:46:30,337 - trainer - INFO - val_loss : 2.532514634024766
2022-05-21 20:46:30,342 - trainer - INFO - val_a1_sparse_metric: 0.0
2022-05-21 20:46:30,347 - trainer - INFO - val_abs_rel_sparse_metric: 79.0
2022-05-21 20:46:30,353 - trainer - INFO - val_rmse_sparse_metric: 79.0
2022-05-21 20:46:30,358 - trainer - INFO - val_loss_iou : 0.07900547981262207
2022-05-21 20:46:30,364 - trainer - INFO - val_loss_prec : 0.08653933472103542
2022-05-21 20:46:30,371 - trainer - INFO - val_loss_loss : 2.532514360215929
2022-05-21 20:46:30,377 - trainer - INFO - val_loss_acc : 0.8775181240505643
2022-05-21 20:46:30,382 - trainer - INFO - val_loss_rec : 0.7745491663614908
2022-05-21 20:46:30,914 - trainer - INFO - Saving checkpoint: saved/models/monorec_mask/00/checkpoint.pth ...
2022-05-21 21:22:54,544 - trainer - INFO - epoch : 4
2022-05-21 21:22:54,551 - trainer - INFO - loss : 0.11389048684442969
2022-05-21 21:22:54,571 - trainer - INFO - a1_sparse_metric: 0.0
2022-05-21 21:22:54,589 - trainer - INFO - abs_rel_sparse_metric: 2.0490625
2022-05-21 21:22:54,608 - trainer - INFO - rmse_sparse_metric: 2.0490625
2022-05-21 21:22:54,621 - trainer - INFO - loss_iou : 0.1996185811360677
2022-05-21 21:22:54,638 - trainer - INFO - loss_prec : 0.20159257253011068
2022-05-21 21:22:54,657 - trainer - INFO - loss_loss : 0.1138906987508138
2022-05-21 21:22:54,675 - trainer - INFO - loss_acc : 0.9730851236979167
2022-05-21 21:22:54,692 - trainer - INFO - loss_rec : 0.9856961059570313
2022-05-21 21:22:54,709 - trainer - INFO - val_loss : 8.58495722938743
2022-05-21 21:22:54,726 - trainer - INFO - val_a1_sparse_metric: 0.0
2022-05-21 21:22:54,741 - trainer - INFO - val_abs_rel_sparse_metric: 79.0
2022-05-21 21:22:54,756 - trainer - INFO - val_rmse_sparse_metric: 79.0
2022-05-21 21:22:54,772 - trainer - INFO - val_loss_iou : 0.10192386971579658
2022-05-21 21:22:54,786 - trainer - INFO - val_loss_prec : 0.19616170724232992
2022-05-21 21:22:54,802 - trainer - INFO - val_loss_loss : 8.584956698947483
2022-05-21 21:22:54,818 - trainer - INFO - val_loss_acc : 0.9627514945136176
2022-05-21 21:22:54,834 - trainer - INFO - val_loss_rec : 0.3266310956743028
2022-05-21 21:22:55,087 - trainer - INFO - Saving checkpoint: saved/models/monorec_mask/00/checkpoint.pth ...
2022-05-21 21:59:20,427 - trainer - INFO - epoch : 5
2022-05-21 21:59:20,429 - trainer - INFO - loss : 0.09532881293710185
2022-05-21 21:59:20,430 - trainer - INFO - a1_sparse_metric: 0.0
2022-05-21 21:59:20,431 - trainer - INFO - abs_rel_sparse_metric: 2.378229166666667
2022-05-21 21:59:20,432 - trainer - INFO - rmse_sparse_metric: 2.378229166666667
2022-05-21 21:59:20,432 - trainer - INFO - loss_iou : 0.22963691711425782
2022-05-21 21:59:20,433 - trainer - INFO - loss_prec : 0.23156720479329426
2022-05-21 21:59:20,434 - trainer - INFO - loss_loss : 0.09532897313435873
2022-05-21 21:59:20,435 - trainer - INFO - loss_acc : 0.9782980346679687
2022-05-21 21:59:20,436 - trainer - INFO - loss_rec : 0.9877736409505208
2022-05-21 21:59:20,437 - trainer - INFO - val_loss : 10.559455184162491
2022-05-21 21:59:20,437 - trainer - INFO - val_a1_sparse_metric: 0.0
2022-05-21 21:59:20,438 - trainer - INFO - val_abs_rel_sparse_metric: 79.0
2022-05-21 21:59:20,439 - trainer - INFO - val_rmse_sparse_metric: 79.0
2022-05-21 21:59:20,440 - trainer - INFO - val_loss_iou : 0.08753267923990886
2022-05-21 21:59:20,441 - trainer - INFO - val_loss_prec : 0.1575686534245809
2022-05-21 21:59:20,441 - trainer - INFO - val_loss_loss : 10.559455023871529
2022-05-21 21:59:20,442 - trainer - INFO - val_loss_acc : 0.9632636176215278
2022-05-21 21:59:20,443 - trainer - INFO - val_loss_rec : 0.26647456487019855
2022-05-21 21:59:20,643 - trainer - INFO - Saving checkpoint: saved/models/monorec_mask/00/checkpoint.pth ...
2022-05-21 22:35:45,561 - trainer - INFO - epoch : 6
2022-05-21 22:35:45,562 - trainer - INFO - loss : 0.07562574363061988
2022-05-21 22:35:45,563 - trainer - INFO - a1_sparse_metric: 0.0
2022-05-21 22:35:45,564 - trainer - INFO - abs_rel_sparse_metric: 2.0819791666666667
2022-05-21 22:35:45,565 - trainer - INFO - rmse_sparse_metric: 2.0819791666666667
2022-05-21 22:35:45,566 - trainer - INFO - loss_iou : 0.2568237050374349
2022-05-21 22:35:45,567 - trainer - INFO - loss_prec : 0.25863550821940107
2022-05-21 22:35:45,567 - trainer - INFO - loss_loss : 0.0756257692972819
2022-05-21 22:35:45,568 - trainer - INFO - loss_acc : 0.9828921508789062
2022-05-21 22:35:45,569 - trainer - INFO - loss_rec : 0.9893025716145833
2022-05-21 22:35:45,570 - trainer - INFO - val_loss : 11.668905624705884
2022-05-21 22:35:45,571 - trainer - INFO - val_a1_sparse_metric: 0.0
2022-05-21 22:35:45,572 - trainer - INFO - val_abs_rel_sparse_metric: 79.0
2022-05-21 22:35:45,573 - trainer - INFO - val_rmse_sparse_metric: 79.0
2022-05-21 22:35:45,573 - trainer - INFO - val_loss_iou : 0.08269011974334717
2022-05-21 22:35:45,574 - trainer - INFO - val_loss_prec : 0.19104923142327201
2022-05-21 22:35:45,575 - trainer - INFO - val_loss_loss : 11.668904622395834
2022-05-21 22:35:45,576 - trainer - INFO - val_loss_acc : 0.9682659573025174
2022-05-21 22:35:45,577 - trainer - INFO - val_loss_rec : 0.18886461522844103
2022-05-21 22:35:45,747 - trainer - INFO - Saving checkpoint: saved/models/monorec_mask/00/checkpoint.pth ...
2022-05-21 23:12:12,250 - trainer - INFO - epoch : 7
2022-05-21 23:12:12,253 - trainer - INFO - loss : 0.06769100399202822
2022-05-21 23:12:12,259 - trainer - INFO - a1_sparse_metric: 0.0
2022-05-21 23:12:12,266 - trainer - INFO - abs_rel_sparse_metric: 2.3123958333333334
2022-05-21 23:12:12,272 - trainer - INFO - rmse_sparse_metric: 2.3123958333333334
2022-05-21 23:12:12,278 - trainer - INFO - loss_iou : 0.27149088541666666
2022-05-21 23:12:12,283 - trainer - INFO - loss_prec : 0.27367930094401044
2022-05-21 23:12:12,288 - trainer - INFO - loss_loss : 0.0676910400390625
2022-05-21 23:12:12,293 - trainer - INFO - loss_acc : 0.9849922688802083
2022-05-21 23:12:12,297 - trainer - INFO - loss_rec : 0.9895144653320312
2022-05-21 23:12:12,300 - trainer - INFO - val_loss : 12.180131924545599
2022-05-21 23:12:12,304 - trainer - INFO - val_a1_sparse_metric: 0.0
2022-05-21 23:12:12,308 - trainer - INFO - val_abs_rel_sparse_metric: 79.0
2022-05-21 23:12:12,312 - trainer - INFO - val_rmse_sparse_metric: 79.0
2022-05-21 23:12:12,316 - trainer - INFO - val_loss_iou : 0.08412468433380127
2022-05-21 23:12:12,320 - trainer - INFO - val_loss_prec : 0.15997138288285997
2022-05-21 23:12:12,325 - trainer - INFO - val_loss_loss : 12.180132548014322
2022-05-21 23:12:12,328 - trainer - INFO - val_loss_acc : 0.9604445563422309
2022-05-21 23:12:12,331 - trainer - INFO - val_loss_rec : 0.2399449348449707
2022-05-21 23:12:12,700 - trainer - INFO - Saving checkpoint: saved/models/monorec_mask/00/checkpoint.pth ...
2022-05-21 23:48:32,490 - trainer - INFO - epoch : 8
2022-05-21 23:48:32,503 - trainer - INFO - loss : 0.06005031946408053
2022-05-21 23:48:32,553 - trainer - INFO - a1_sparse_metric: 0.0
2022-05-21 23:48:32,580 - trainer - INFO - abs_rel_sparse_metric: 2.2465625
2022-05-21 23:48:32,610 - trainer - INFO - rmse_sparse_metric: 2.2465625
2022-05-21 23:48:32,640 - trainer - INFO - loss_iou : 0.2933336893717448
2022-05-21 23:48:32,670 - trainer - INFO - loss_prec : 0.29533246358235676
2022-05-21 23:48:32,701 - trainer - INFO - loss_loss : 0.06005029678344727
2022-05-21 23:48:32,733 - trainer - INFO - loss_acc : 0.9866059366861979
2022-05-21 23:48:32,764 - trainer - INFO - loss_rec : 0.9910132853190105
2022-05-21 23:48:32,795 - trainer - INFO - val_loss : 14.821350382227036
2022-05-21 23:48:32,825 - trainer - INFO - val_a1_sparse_metric: 0.0
2022-05-21 23:48:32,855 - trainer - INFO - val_abs_rel_sparse_metric: 79.0
2022-05-21 23:48:32,866 - trainer - INFO - val_rmse_sparse_metric: 79.0
2022-05-21 23:48:32,870 - trainer - INFO - val_loss_iou : 0.08032339811325073
2022-05-21 23:48:32,885 - trainer - INFO - val_loss_prec : 0.2632046275668674
2022-05-21 23:48:32,913 - trainer - INFO - val_loss_loss : 14.82135009765625
2022-05-21 23:48:32,939 - trainer - INFO - val_loss_acc : 0.9833286073472765
2022-05-21 23:48:32,964 - trainer - INFO - val_loss_rec : 0.14326214790344238
2022-05-21 23:48:33,342 - trainer - INFO - Saving checkpoint: saved/models/monorec_mask/00/checkpoint.pth ...
2022-05-22 00:24:59,842 - trainer - INFO - epoch : 9
2022-05-22 00:24:59,843 - trainer - INFO - loss : 0.058349815968667826
2022-05-22 00:24:59,844 - trainer - INFO - a1_sparse_metric: 0.0
2022-05-22 00:24:59,845 - trainer - INFO - abs_rel_sparse_metric: 2.221875
2022-05-22 00:24:59,846 - trainer - INFO - rmse_sparse_metric: 2.221875
2022-05-22 00:24:59,847 - trainer - INFO - loss_iou : 0.3101000722249349
2022-05-22 00:24:59,847 - trainer - INFO - loss_prec : 0.3118464660644531
2022-05-22 00:24:59,848 - trainer - INFO - loss_loss : 0.058349742889404296
2022-05-22 00:24:59,849 - trainer - INFO - loss_acc : 0.9872409057617187
2022-05-22 00:24:59,850 - trainer - INFO - loss_rec : 0.9917488606770833
2022-05-22 00:24:59,851 - trainer - INFO - val_loss : 12.517815000481075
2022-05-22 00:24:59,851 - trainer - INFO - val_a1_sparse_metric: 0.0
2022-05-22 00:24:59,852 - trainer - INFO - val_abs_rel_sparse_metric: 79.0
2022-05-22 00:24:59,853 - trainer - INFO - val_rmse_sparse_metric: 79.0
2022-05-22 00:24:59,854 - trainer - INFO - val_loss_iou : 0.07787870698504978
2022-05-22 00:24:59,855 - trainer - INFO - val_loss_prec : 0.2857825491163466
2022-05-22 00:24:59,856 - trainer - INFO - val_loss_loss : 12.517814636230469
2022-05-22 00:24:59,856 - trainer - INFO - val_loss_acc : 0.9768172370062934
2022-05-22 00:24:59,857 - trainer - INFO - val_loss_rec : 0.13944166236453587
2022-05-22 00:25:00,027 - trainer - INFO - Saving checkpoint: saved/models/monorec_mask/00/checkpoint.pth ...
2022-05-22 01:01:40,023 - trainer - INFO - epoch : 10
2022-05-22 01:01:40,027 - trainer - INFO - loss : 0.0482115954470828
2022-05-22 01:01:40,037 - trainer - INFO - a1_sparse_metric: 0.0
2022-05-22 01:01:40,048 - trainer - INFO - abs_rel_sparse_metric: 1.9503125
2022-05-22 01:01:40,059 - trainer - INFO - rmse_sparse_metric: 1.9503125
2022-05-22 01:01:40,069 - trainer - INFO - loss_iou : 0.3312089538574219
2022-05-22 01:01:40,078 - trainer - INFO - loss_prec : 0.3328692626953125
2022-05-22 01:01:40,087 - trainer - INFO - loss_loss : 0.04821149190266927
2022-05-22 01:01:40,096 - trainer - INFO - loss_acc : 0.9895101928710938
2022-05-22 01:01:40,105 - trainer - INFO - loss_rec : 0.9922865804036458
2022-05-22 01:01:40,113 - trainer - INFO - val_loss : 13.955339850650894
2022-05-22 01:01:40,122 - trainer - INFO - val_a1_sparse_metric: 0.0
2022-05-22 01:01:40,131 - trainer - INFO - val_abs_rel_sparse_metric: 79.0
2022-05-22 01:01:40,141 - trainer - INFO - val_rmse_sparse_metric: 79.0
2022-05-22 01:01:40,150 - trainer - INFO - val_loss_iou : 0.0780107511414422
2022-05-22 01:01:40,159 - trainer - INFO - val_loss_prec : 0.26500195927090114
2022-05-22 01:01:40,168 - trainer - INFO - val_loss_loss : 13.95534176296658
2022-05-22 01:01:40,176 - trainer - INFO - val_loss_acc : 0.9831354353162978
2022-05-22 01:01:40,184 - trainer - INFO - val_loss_rec : 0.13859159416622585
2022-05-22 01:01:40,399 - trainer - INFO - Saving checkpoint: saved/models/monorec_mask/00/checkpoint.pth ...
2022-05-22 01:38:00,857 - trainer - INFO - epoch : 11
2022-05-22 01:38:00,858 - trainer - INFO - loss : 0.04933202152824833
2022-05-22 01:38:00,859 - trainer - INFO - a1_sparse_metric: 0.0
2022-05-22 01:38:00,859 - trainer - INFO - abs_rel_sparse_metric: 2.0902083333333334
2022-05-22 01:38:00,860 - trainer - INFO - rmse_sparse_metric: 2.0902083333333334
2022-05-22 01:38:00,861 - trainer - INFO - loss_iou : 0.3328869883219401
2022-05-22 01:38:00,862 - trainer - INFO - loss_prec : 0.33459765116373696
2022-05-22 01:38:00,863 - trainer - INFO - loss_loss : 0.04933205286661784
2022-05-22 01:38:00,864 - trainer - INFO - loss_acc : 0.9893855794270834
2022-05-22 01:38:00,865 - trainer - INFO - loss_rec : 0.9925393676757812
2022-05-22 01:38:00,866 - trainer - INFO - val_loss : 16.757661278049152
2022-05-22 01:38:00,867 - trainer - INFO - val_a1_sparse_metric: 0.0
2022-05-22 01:38:00,868 - trainer - INFO - val_abs_rel_sparse_metric: 79.0
2022-05-22 01:38:00,869 - trainer - INFO - val_rmse_sparse_metric: 79.0
2022-05-22 01:38:00,870 - trainer - INFO - val_loss_iou : 0.06531501478619045
2022-05-22 01:38:00,871 - trainer - INFO - val_loss_prec : 0.3991677496168349
2022-05-22 01:38:00,871 - trainer - INFO - val_loss_loss : 16.75765821668837
2022-05-22 01:38:00,872 - trainer - INFO - val_loss_acc : 0.9846226374308268
2022-05-22 01:38:00,873 - trainer - INFO - val_loss_rec : 0.10530128081639607
2022-05-22 01:38:01,035 - trainer - INFO - Saving checkpoint: saved/models/monorec_mask/00/checkpoint.pth ...
2022-05-22 02:14:19,960 - trainer - INFO - epoch : 12
2022-05-22 02:14:19,968 - trainer - INFO - loss : 0.04878552877263549
2022-05-22 02:14:19,987 - trainer - INFO - a1_sparse_metric: 0.0
2022-05-22 02:14:19,999 - trainer - INFO - abs_rel_sparse_metric: 2.0572916666666665
2022-05-22 02:14:20,012 - trainer - INFO - rmse_sparse_metric: 2.0572916666666665
2022-05-22 02:14:20,025 - trainer - INFO - loss_iou : 0.34831891377766927
2022-05-22 02:14:20,042 - trainer - INFO - loss_prec : 0.35023468017578124
2022-05-22 02:14:20,057 - trainer - INFO - loss_loss : 0.04878554026285807
2022-05-22 02:14:20,073 - trainer - INFO - loss_acc : 0.9892347208658854
2022-05-22 02:14:20,087 - trainer - INFO - loss_rec : 0.9926947021484375
2022-05-22 02:14:20,101 - trainer - INFO - val_loss : 14.85178346435229
2022-05-22 02:14:20,116 - trainer - INFO - val_a1_sparse_metric: 0.0
2022-05-22 02:14:20,129 - trainer - INFO - val_abs_rel_sparse_metric: 79.0
2022-05-22 02:14:20,228 - trainer - INFO - val_rmse_sparse_metric: 79.0
2022-05-22 02:14:20,245 - trainer - INFO - val_loss_iou : 0.07334701220194499
2022-05-22 02:14:20,266 - trainer - INFO - val_loss_prec : 0.3393416934543186
2022-05-22 02:14:20,287 - trainer - INFO - val_loss_loss : 14.851782904730904
2022-05-22 02:14:20,308 - trainer - INFO - val_loss_acc : 0.9831968943277994
2022-05-22 02:14:20,331 - trainer - INFO - val_loss_rec : 0.12560137112935385
2022-05-22 02:14:20,523 - trainer - INFO - Saving checkpoint: saved/models/monorec_mask/00/checkpoint.pth ...
2022-05-22 02:50:43,264 - trainer - INFO - epoch : 13
2022-05-22 02:50:43,266 - trainer - INFO - loss : 0.04147944352905446
2022-05-22 02:50:43,267 - trainer - INFO - a1_sparse_metric: 0.0
2022-05-22 02:50:43,269 - trainer - INFO - abs_rel_sparse_metric: 1.975
2022-05-22 02:50:43,270 - trainer - INFO - rmse_sparse_metric: 1.975
2022-05-22 02:50:43,272 - trainer - INFO - loss_iou : 0.3653294626871745
2022-05-22 02:50:43,273 - trainer - INFO - loss_prec : 0.3668243408203125
2022-05-22 02:50:43,273 - trainer - INFO - loss_loss : 0.04147947947184245
2022-05-22 02:50:43,274 - trainer - INFO - loss_acc : 0.9910907999674479
2022-05-22 02:50:43,275 - trainer - INFO - loss_rec : 0.9932733154296876
2022-05-22 02:50:43,276 - trainer - INFO - val_loss : 10.780205345609122
2022-05-22 02:50:43,277 - trainer - INFO - val_a1_sparse_metric: 0.0
2022-05-22 02:50:43,278 - trainer - INFO - val_abs_rel_sparse_metric: 79.0
2022-05-22 02:50:43,279 - trainer - INFO - val_rmse_sparse_metric: 79.0
2022-05-22 02:50:43,280 - trainer - INFO - val_loss_iou : 0.08617289861043294
2022-05-22 02:50:43,281 - trainer - INFO - val_loss_prec : 0.17777089277903238
2022-05-22 02:50:43,282 - trainer - INFO - val_loss_loss : 10.780204772949219
2022-05-22 02:50:43,283 - trainer - INFO - val_loss_acc : 0.9674543804592557
2022-05-22 02:50:43,284 - trainer - INFO - val_loss_rec : 0.2760783831278483
2022-05-22 02:50:43,848 - trainer - INFO - Saving checkpoint: saved/models/monorec_mask/00/checkpoint.pth ...
2022-05-22 03:27:08,776 - trainer - INFO - epoch : 14
2022-05-22 03:27:08,777 - trainer - INFO - loss : 0.03722713669809006
2022-05-22 03:27:08,779 - trainer - INFO - a1_sparse_metric: 0.0
2022-05-22 03:27:08,780 - trainer - INFO - abs_rel_sparse_metric: 1.8515625
2022-05-22 03:27:08,782 - trainer - INFO - rmse_sparse_metric: 1.8515625
2022-05-22 03:27:08,784 - trainer - INFO - loss_iou : 0.3863360850016276
2022-05-22 03:27:08,785 - trainer - INFO - loss_prec : 0.38789883931477864
2022-05-22 03:27:08,786 - trainer - INFO - loss_loss : 0.0372270933787028
2022-05-22 03:27:08,788 - trainer - INFO - loss_acc : 0.9919660441080729
2022-05-22 03:27:08,789 - trainer - INFO - loss_rec : 0.9940956624348959
2022-05-22 03:27:08,790 - trainer - INFO - val_loss : 15.135421420551008
2022-05-22 03:27:08,791 - trainer - INFO - val_a1_sparse_metric: 0.0
2022-05-22 03:27:08,792 - trainer - INFO - val_abs_rel_sparse_metric: 79.0
2022-05-22 03:27:08,793 - trainer - INFO - val_rmse_sparse_metric: 79.0
2022-05-22 03:27:08,794 - trainer - INFO - val_loss_iou : 0.08546278874079387
2022-05-22 03:27:08,795 - trainer - INFO - val_loss_prec : 0.3615957631005181
2022-05-22 03:27:08,796 - trainer - INFO - val_loss_loss : 15.13542005750868
2022-05-22 03:27:08,797 - trainer - INFO - val_loss_acc : 0.9845504760742188
2022-05-22 03:27:08,798 - trainer - INFO - val_loss_rec : 0.13796117570665148
2022-05-22 03:27:09,118 - trainer - INFO - Saving checkpoint: saved/models/monorec_mask/00/checkpoint.pth ...
2022-05-22 04:03:31,666 - trainer - INFO - epoch : 15
2022-05-22 04:03:31,672 - trainer - INFO - loss : 0.034510827331791914
2022-05-22 04:03:31,679 - trainer - INFO - a1_sparse_metric: 0.0
2022-05-22 04:03:31,687 - trainer - INFO - abs_rel_sparse_metric: 1.9503125
2022-05-22 04:03:31,695 - trainer - INFO - rmse_sparse_metric: 1.9503125
2022-05-22 04:03:31,703 - trainer - INFO - loss_iou : 0.3962159729003906
2022-05-22 04:03:31,711 - trainer - INFO - loss_prec : 0.3980547078450521
2022-05-22 04:03:31,720 - trainer - INFO - loss_loss : 0.03451083501180013
2022-05-22 04:03:31,729 - trainer - INFO - loss_acc : 0.9925803629557292
2022-05-22 04:03:31,739 - trainer - INFO - loss_rec : 0.9941372680664062
2022-05-22 04:03:31,749 - trainer - INFO - val_loss : 16.985714921520817
2022-05-22 04:03:31,754 - trainer - INFO - val_a1_sparse_metric: 0.0
2022-05-22 04:03:31,765 - trainer - INFO - val_abs_rel_sparse_metric: 79.0
2022-05-22 04:03:31,774 - trainer - INFO - val_rmse_sparse_metric: 79.0
2022-05-22 04:03:31,784 - trainer - INFO - val_loss_iou : 0.0905357797940572
2022-05-22 04:03:31,795 - trainer - INFO - val_loss_prec : 0.2897793451944987
2022-05-22 04:03:31,805 - trainer - INFO - val_loss_loss : 16.985714382595486
2022-05-22 04:03:31,815 - trainer - INFO - val_loss_acc : 0.9830976062350802
2022-05-22 04:03:31,826 - trainer - INFO - val_loss_rec : 0.16758330663045248
2022-05-22 04:03:32,004 - trainer - INFO - Saving checkpoint: saved/models/monorec_mask/00/checkpoint.pth ...

run create_pointcloud.py

Hi author:
I faced the following problem when running create_pointcloud.py.

Problem:
RuntimeError: The NVIDIA driver on your system is too old.

Environment on my computer:
Ubuntu 16.04 LTS, GEFORCE RTX 2060, CUDA 10.1, NVIDIA driver: 430.64.

So, would you please tell me the required environment configuration?
Looking forward to your reply.

Tool for reconstructed scene visualization

Hi!

Thanks for this amazing job :)
I am wondering if you could kindly tell me which tool you are using for the reconstructed scene visualization; it looks really cool!
Thanks!

Inference on nuScenes dataset

Hi! Thanks for the great work and open-sourced code!
I tried to test the inference of the model on the nuScenes dataset using only the pre-trained model, but the inference results are bad. Can you tell me how to use the nuScenes data for inference correctly, or should I retrain the model using the nuScenes data?
Can you give me some suggestions? Thanks!

Multi card training

Hi!
The training process takes longer than I expected. Is it possible to use multiple cards in one machine to train the model?
I tried to add --device in the args, but still only one card is working.
Thank you!

TypeError: __init__() got an unexpected keyword argument 'att_cp_loc'

Hi. Thanks for your work. When I run the last training command
python train_monorec.py --config configs/train/monorec/monorec_depth_ref.json --options stereo stereo_repr,
there was a TypeError:

Traceback (most recent call last):
  File "train_monorec.py", line 70, in <module>
    main(config, config.args.options)
  File "train_monorec.py", line 25, in main
    model = config.initialize('arch', module_arch)
  File "/home/ubuntu/space/MonoRec-main/utils/parse_config.py", line 81, in initialize
    return getattr(module, module_name)(*args, **module_args)
TypeError: __init__() got an unexpected keyword argument 'att_cp_loc'

And whenever I run any training, the following often appears:

Ground truth poses are not avaialble for sequence 01.
Ground truth poses are not avaialble for sequence 02.
Ground truth poses are not avaialble for sequence 06.
Ground truth poses are not avaialble for sequence 08.
Ground truth poses are not avaialble for sequence 09.
Ground truth poses are not avaialble for sequence 10.
Ground truth poses are not avaialble for sequence 00.
Ground truth poses are not avaialble for sequence 04.
Ground truth poses are not avaialble for sequence 05.
Ground truth poses are not avaialble for sequence 07.

IndexError: list index out of range

Hi Felix,
When I'm running the train.py file, the progress bar stays at 0% for the first epoch of training, and an error "IndexError: list index out of range" occurs after running for a while. Could you tell me how to solve this problem? Thank you.

Usage of "--options" in training command

Hi:

Thank you for the wonderful work, I have a question about the usage of the "--options" in training commands.
My understanding is that the "--options" are used for the loss functions in "model/loss_functions/monorec_loss.py". But since monorec_mask.json has defined the loss function as "mask_loss", which does not have a "stereo" option, why do we need to add "--options stereo" to train the mask module?
python train_monorec.py --config configs/train/monorec/monorec_mask.json --options stereo

Any clarification is greatly appreciated!

TUM RGBD Point Cloud Issues

I tried to run the TUM RGBD Dataset using the following config file

{
  "name": "Pointcloud Creation",
  "n_gpu": 8,
  "output_dir": "saved/pointclouds/monorec",
  "file_name": "tmrgbd.ply",
  "roi": [
    160,
    480,
    40,
    600
  ],
  "start": 0,
  "end": 300,
  "max_d": 20,
  "use_mask": true,
  "arch": {
    "type": "MonoRecModel",
    "args": {
      "pretrain_mode": 0,
      "checkpoint_location": [
          "saved/checkpoints/monorec_depth_ref.pth"
      ]
    }
  },
  "data_set": {
    "type": "TUMRGBDDataset",
    "args": {
      "dataset_dir": "data/tumrgbd/",
      "frame_count": 2,
      "target_image_size": [
        480,
        640
      ],
      "dilation": 1
    }
  }
}

Dataset: https://vision.in.tum.de/rgbd/dataset/freiburg3/rgbd_dataset_freiburg3_long_office_household.tgz

I'm using the FR3 dataset since the intrinsic parameters were already set for it in the module:

_intrinsics = torch.tensor(
    [[535.4, 0, 320.1, 0],
     [0, 539.2, 247.6, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]
     ], dtype=torch.float32)

But these are the results I get from the point cloud:

Screenshot from 2021-07-13 17-47-48
Screenshot from 2021-07-13 17-47-57
Screenshot from 2021-07-13 17-48-16

This was the output on console


<THE MODEL ARCH OUTPUT>

        (1): ConvReLU2(
          (pad_0): PadSameConv2d()
          (conv_y): Conv2d(64, 64, kernel_size=(3, 1), stride=(1, 1))
          (leaky_relu): LeakyReLU(negative_slope=0.1)
          (pad_1): PadSameConv2d()
          (conv_x): Conv2d(64, 64, kernel_size=(1, 3), stride=(1, 1))
        )
      )
      (3): Refine(
        (conv2d_t): ConvTranspose2d(192, 48, kernel_size=(4, 4), stride=(2, 2))
        (pad): PadSameConv2dTransposed()
        (leaky_relu): LeakyReLU(negative_slope=0.1)
      )
      (4): Sequential(
        (0): ConvReLU2(
          (pad_0): PadSameConv2d()
          (conv_y): Conv2d(96, 32, kernel_size=(3, 1), stride=(1, 1))
          (leaky_relu): LeakyReLU(negative_slope=0.1)
          (pad_1): PadSameConv2d()
          (conv_x): Conv2d(32, 32, kernel_size=(1, 3), stride=(1, 1))
        )
        (1): PadSameConv2d()
        (2): Conv2d(32, 24, kernel_size=(3, 3), stride=(1, 1))
        (3): LeakyReLU(negative_slope=0.1)
      )
    )
    (predictors): ModuleList(
      (0): Sequential(
        (0): PadSameConv2d()
        (1): Conv2d(256, 1, kernel_size=(3, 3), stride=(1, 1))
      )
      (1): Sequential(
        (0): PadSameConv2d()
        (1): Conv2d(128, 1, kernel_size=(3, 3), stride=(1, 1))
      )
      (2): Sequential(
        (0): PadSameConv2d()
        (1): Conv2d(64, 1, kernel_size=(3, 3), stride=(1, 1))
      )
      (3): Sequential(
        (0): PadSameConv2d()
        (1): Conv2d(24, 1, kernel_size=(3, 3), stride=(1, 1))
      )
    )
  )
)
  0%|                                                                   | 0/300 [00:00<?, ?it/s]/opt/conda/conda-bld/pytorch_1587428398394/work/torch/csrc/utils/tensor_numpy.cpp:141: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program.
/opt/conda/conda-bld/pytorch_1587428398394/work/torch/csrc/utils/tensor_numpy.cpp:141: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program.
/opt/conda/conda-bld/pytorch_1587428398394/work/torch/csrc/utils/tensor_numpy.cpp:141: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program.
/opt/conda/conda-bld/pytorch_1587428398394/work/torch/csrc/utils/tensor_numpy.cpp:141: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program.
/opt/conda/conda-bld/pytorch_1587428398394/work/torch/csrc/utils/tensor_numpy.cpp:141: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program.
/opt/conda/conda-bld/pytorch_1587428398394/work/torch/csrc/utils/tensor_numpy.cpp:141: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program.
/opt/conda/conda-bld/pytorch_1587428398394/work/torch/csrc/utils/tensor_numpy.cpp:141: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program.
/opt/conda/conda-bld/pytorch_1587428398394/work/torch/csrc/utils/tensor_numpy.cpp:141: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program.
/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py:3226: UserWarning: Default grid_sample and affine_grid behavior has changed to align_corners=False since 1.3.0. Please specify align_corners=True if the old behavior is desired. See the documentation of grid_sample for details.
  warnings.warn("Default grid_sample and affine_grid behavior has changed "
/opt/conda/lib/python3.7/site-packages/torch/nn/functional.py:1558: UserWarning: nn.functional.tanh is deprecated. Use torch.tanh instead.
  warnings.warn("nn.functional.tanh is deprecated. Use torch.tanh instead.")
100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 300/300 [02:03<00:00,  2.42it/s]

These are my dependencies

torchvision==0.6.0
jupyterlab
tensorboard
tensorboardX
opencv-python==4.2.0.34
scipy==1.7.0
kornia==0.3.2
scikit-image==0.18.2
tqdm==4.61.2
pykitti==0.3.1
colour-demosaicing==0.1.6

and I'm using PyTorch 1.5.0 from the docker image pytorch/pytorch:1.5-cuda10.1-cudnn7-devel.

Training without moving object masks

Hello!
Thanks for the interesting work. I'm now in the process of training the model on my own dataset.
The dataset contains stereo images with their depth images, and also sparse depth images with camera poses generated by DSO, as you suggested.
My dataset does not have moving object masks like those used in the KITTI example.
So my question is: is it possible to train the model without masks? And if yes, what would be the correct steps to train it without taking these masks into account?
Thanks in advance

Why set temporal frames from default 2 to 10 when 'lidar_depth' and 'annotated_lidar' are enabled?

Hi,

Thank you a lot for your impressive work.

I found a place that I do not understand, as shown in the code below:

if self.annotated_lidar and self.lidar_depth:
    extra_frames = max(extra_frames, 10)
    self._offset = max(self._offset, 5)

Why are the nearby frames set from the default 2 to at least 10 when 'lidar_depth' and 'annotated_lidar' are enabled? Is it a general setting for both training and testing, or only a setting for testing? Thank you!

I think this should do the trick without downgrading your `torchvision`.

I think this should do the trick without downgrading your torchvision.

import torchvision
import torchvision.transforms.functional as F  # imports required by this snippet

class ColorJitterMulti(torchvision.transforms.ColorJitter):
    def fix_transform(self):
        self.params = self.get_params(self.brightness, self.contrast, self.saturation, self.hue)

    def __call__(self, img):
        fn_idx, brightness_factor, contrast_factor, saturation_factor, hue_factor = self.params
        for fn_id in fn_idx:
            if fn_id == 0 and brightness_factor is not None:
                img = F.adjust_brightness(img, brightness_factor)
            elif fn_id == 1 and contrast_factor is not None:
                img = F.adjust_contrast(img, contrast_factor)
            elif fn_id == 2 and saturation_factor is not None:
                img = F.adjust_saturation(img, saturation_factor)
            elif fn_id == 3 and hue_factor is not None:
                img = F.adjust_hue(img, hue_factor)

        return img

Originally posted by @Brummi in #30 (comment)

Can I infer the model without image_depth_annotated?

Hi! Thanks for the great work and open-sourced code!

I successfully ran the "test_monorec.py" and "create_pointcloud.py" scripts. When I try to run "create_pointcloud.py" on KITTI sequences 11-21, I found that "image_depth_annotated" is needed, which is processed from the lidar data. I checked the code and found this is needed in the function "preprocess_depth_dso()".

I would like to know the purpose of using "image_depth_annotated" here, and whether I can run inference without it. I guess we can, as the required inputs for inference should only be [sequential mono images, estimated poses and sparse depths] from an existing VO system. Please give me some hints if I can do this by easily modifying some code, and let me know if I misunderstood anything. Thanks!

Do not understand the difference between Dts and Dt

Thanks for your outstanding work in this area; it can be applied in a lot of places. But there is one thing in the paper that I do not understand, quoted below:

"Specifically, for each image, we predict its depth maps Dt and DSt using
the cost volumes formed by temporal stereo images C and static stereo images CS, respectively."
(The location is in Multi-stage Training --> Bootstrapping --> Mask Module; if you cannot find it, you can use the search function in your PDF reader to find the sentence.)

Can you tell me what "static stereo images" means and where they are implemented in your code?
By the way, if convenient, can you send me the code for generating the moving object masks?
My email is "[email protected]"; you can send the code there.

Thanks for your great contribution again.
Best wishes

poses and dso_poses

  1. In the training stage, do you only use the dso_poses? What is the role of poses?
  2. If I use the ScanNet dataset for training, what should I prepare?
  3. The color images of ScanNet are monocular. If I only train the depth module, is that OK?
  4. Can I use the lidar depth (annotated) for training? I mean, just having one set of depth GT and poses, is that OK?

TypeError: 'tuple' object is not callable

When training the model with the command python train.py --config configs/train/monorec/monorec_depth.json --options stereo, I get an error:

Traceback (most recent call last):
  File "/home/hyel/python_work/MonoRec/train.py", line 70, in <module>
    main(config, config.args.options)
  File "/home/hyel/python_work/MonoRec/train.py", line 49, in main
    trainer.train()
  File "/home/hyel/python_work/MonoRec/base/base_trainer.py", line 73, in train
    result = self._train_epoch(epoch)
  File "/home/hyel/python_work/MonoRec/trainer/trainer.py", line 84, in _train_epoch
    for batch_idx, (data, target) in enumerate(self.data_loader):
  File "/home/hyel/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
    data = self._next_data()
  File "/home/hyel/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1199, in _next_data
    return self._process_data(data)
  File "/home/hyel/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1225, in _process_data
    data.reraise()
  File "/home/hyel/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/_utils.py", line 429, in reraise
    raise self.exc_type(msg)
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/hyel/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/hyel/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/hyel/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/hyel/python_work/MonoRec/data_loader/kitti_odometry_dataset.py", line 250, in __getitem__
    self._crop_boxes[dataset_index])
  File "/home/hyel/python_work/MonoRec/data_loader/kitti_odometry_dataset.py", line 126, in preprocess_image
    img = self.color_transform(img)
  File "/home/hyel/python_work/MonoRec/data_loader/kitti_odometry_dataset.py", line 383, in __call__
    return map_fn(x, self.transform)
  File "/home/hyel/python_work/MonoRec/utils/util.py", line 22, in map_fn
    return fn(batch)
TypeError: 'tuple' object is not callable

Modifying the function in kitti_odometry_dataset.py from:

class ColorJitterMulti(torchvision.transforms.ColorJitter):
    def fix_transform(self):
        self.transform = self.get_params(self.brightness, self.contrast,
                                         self.saturation, self.hue)

    def __call__(self, x):
        return map_fn(x, self.transform)

to

class ColorJitterMulti(torchvision.transforms.ColorJitter):
    def fix_transform(self):
        self.transform = self.get_params(self.brightness, self.contrast,
                                         self.saturation, self.hue)

    def __call__(self, x):
        return map_fn(x, self.forward)           # change self.transform to self.forward

fixes this error.

Is this a bug? Is it right to modify the code like this?

DVSO pose

Hello, I'd like to know which camera of the KITTI dataset the poses from the DVSO results correspond to, please.

Using ORB-SLAM as input for live reconstruction?

I've seen it suggested in other issues that this pairs well with ORB-SLAM. Would it be possible to do dense reconstruction in real time from monocular ORB-SLAM output? The ORB-SLAM trajectory format looks different from the one used in the demo here.

How to prepare the 'image_depth_annotated' dataset?

Hi, thank you for sharing your great work.

I can create point clouds using create_pointcloud.py with the 'image_depth_sparse' folder (downloaded DVSO depths) as depth_folder, but I'm not sure this is correct.
Do I need to create the image_depth_annotated images myself?

Thanks in advance.

Inference on custom data

Hello,

Thank you for sharing your amazing work. I am trying to run inference on custom RGB data, but I see in the example that the model uses ['keyframe', 'keyframe_pose', 'keyframe_intrinsics', 'frames', 'poses', 'intrinsics', 'sequence', 'image_id'] for inference. I am curious to know whether it is possible to run the inference pipeline with just RGB data?
Thank you

Caught IndexError in DataLoader worker process 0

Hi, when I try to generate a point cloud I am getting this error:
Caught IndexError in DataLoader worker process 0
I changed the GPU number in the JSON but it didn't work. Do you know what is causing it?

How to train with other depth annotations

Hi,

I'm following your work and wish to train the model with the dense annotated depth. Can I simply set lidar_depth=true and dso_depth=false in the monorec_depth.json file?

Moreover, in the .json file, batchsize=8 and n_gpu=8, so does that mean a total batch size of 8*8=64 during distributed training? Since I do not have that many cards, I can't train the model with a total batch size of 64. In your experience, should I set a higher learning rate or train for more than 70 epochs? Thank you a lot!

How to run on my own data

Hi. First of all, thank you for your efforts; I am also impressed by the results.
If I want to make predictions on my own data, how should I set up the configs?

train "monorec_depth_ref" problem

Question 1:

  • When I run the last training python train_monorec.py --config configs/train/monorec/monorec_depth_ref.json --options stereo stereo_repr
    the training does not continue:
    Train Epoch: 1 [12800/13714 (93%)] Loss: 0.340444 Loss_dict: {'sdl_0': tensor(0.0036, device='cuda:0'), 'static_md2l_0': tensor(0.0692, device='cuda:0'), 'dynamic_md2l_0': tensor(0.1163, device='cuda:0'), 'md2l_0': tensor(0.0693, device='cuda:0'), 'sdl_1': tensor(0.0036, device='cuda:0'), 'static_md2l_1': tensor(0.0695, device='cuda:0'), 'dynamic_md2l_1': tensor(0.1168, device='cuda:0'), 'md2l_1': tensor(0.0696, device='cuda:0'), 'sdl_2': tensor(0.0036, device='cuda:0'), 'static_md2l_2': tensor(0.0702, device='cuda:0'), 'dynamic_md2l_2': tensor(0.1136, device='cuda:0'), 'md2l_2': tensor(0.0703, device='cuda:0'), 'sdl_3': tensor(0.0039, device='cuda:0'), 'static_md2l_3': tensor(0.0722, device='cuda:0'), 'dynamic_md2l_3': tensor(0.1166, device='cuda:0'), 'md2l_3': tensor(0.0722, device='cuda:0'), 'loss': tensor(0.3404, device='cuda:0')}
    It hangs here.

Question 2:

  • Could you provide your GPU configuration and training time?

Nuscenes dataset inverse depth map generation from lidar file

Hi Felix,

I'm trying to run the monorec on the Nuscenes dataset. The visualization results seem decent. But the quantitative metrics are abnormal (e.g. Abs_rel_sparse_metric reaches 0.5) when I evaluate it.

I guess the GT depth maps may have caused this. I generate inverse depth maps for the nuScenes dataloader by computing 1/depth (pixel-wise). Does this follow the same distribution as in the training process?

Could you help me with this?
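
For reference, the pixel-wise conversion described above, with invalid (zero) depth values kept at zero, would look roughly like this; this is only a sketch, not the repository's nuScenes data loader:

import torch

def depth_to_inverse_depth(depth: torch.Tensor) -> torch.Tensor:
    # Pixel-wise 1/depth; pixels without a valid measurement (depth <= 0) stay 0.
    inv_depth = torch.zeros_like(depth)
    valid = depth > 0
    inv_depth[valid] = 1.0 / depth[valid]
    return inv_depth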

No module named 'model.layers'

When I run the scripts 'cd example' and 'python test_example.py', the following error occurred:
Traceback (most recent call last):
  File "test_monorec.py", line 10, in <module>
    from data_loader.kitti_odometry_dataset import KittiOdometryDataset
  File "../data_loader/kitti_odometry_dataset.py", line 13, in <module>
    from utils import map_fn
  File "../utils/__init__.py", line 2, in <module>
    from .ply_utils import *
  File "../utils/ply_utils.py", line 5, in <module>
    from model.layers import Backprojection
ModuleNotFoundError: No module named 'model.layers'

I don't know why, since test_example.py lines 7-8 already contain:
import sys
sys.path.append("..")

Dataset for running example

We provide a sample from the KITTI Odometry test set and a script to run MonoRec on it in example/. To download the pretrained model  ...

As mentioned in the README, is there a link to the sample from the KITTI odometry dataset?

Realtime

Hello. I wanted to ask about how suitable MonoRec could be for realtime reconstruction. Any information will be greatly appreciated.

Mask Module Cost Volume Aggregation Method

Hi @Brummi and @nynyg:

Thank you for sharing the great work!
I'm curious why you chose max-pooling to combine the multiple cost volumes for the mask module. Is it only because max-pooling works for an arbitrary number of input frames, or do you have more theoretical analysis or assumptions? Have you tried other aggregation methods like SUM, AVG, CONCAT, etc.?

Thank you again!
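
For context, the aggregation alternatives mentioned in this question differ only in how the per-frame cost volumes are reduced; a toy sketch (shapes are made up for illustration):

import torch

# N per-frame cost volumes of shape (D, H, W), stacked along a new frame axis.
cost_volumes = torch.rand(4, 32, 64, 208)    # (N, D, H, W); values are random placeholders

max_agg = cost_volumes.max(dim=0).values     # max-pooling over frames (as discussed above)
avg_agg = cost_volumes.mean(dim=0)           # AVG alternative
sum_agg = cost_volumes.sum(dim=0)            # SUM alternative
cat_agg = cost_volumes.reshape(-1, 64, 208)  # CONCAT would tie the network to a fixed N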

Different evaluation results are obtained on different GPUs

Hi,
Thanks for your amazing work!

I recently encountered a problem: I ran the evaluation on the KITTI dataset on different GPUs and got different results from the runs.

The GPUs I used are:

  1. Tesla P100-16GB (provided by Colab)
  2. RTX 3090-24GB (desktop)
  3. RTX 3060-8GB (laptop)

The versions of the dependency packages are:

torch                     1.11.0+cu113             pypi_0    pypi
torchaudio                0.11.0+cu113             pypi_0    pypi
torchvision               0.12.0+cu113             pypi_0    pypi
opencv-python             4.1.2.30                 pypi_0    pypi
python                    3.7.13               h12debd9_0
scipy                     1.4.1                    pypi_0    pypi
scikit-image              0.18.3                   pypi_0    pypi
pykitti                   0.3.1                    pypi_0    pypi
kornia                    0.6.4                    pypi_0    pypi

The final metric results are as follows:

## RTX 3090
'loss': 0.0, 'metrics': [0.15538795390363738, 2.7975851661019338, 5.625108310379362, 0.18630796428342675, 0.8499496491823949, 0.9419053045479213, 0.9715246053337913], 'metrics_correct': [0.1553879539036371, 2.797585166101927, 5.625108310379343, 0.18630796428342658, 0.849949649182393, 0.941905304547917, 0.9715246053337869], 'valid_batches': 4317.0, 'loss_loss': 0.0, 'metrics_info': ['abs_rel_sparse_metric', 'sq_rel_sparse_metric', 'rmse_sparse_metric', 'rmse_log_sparse_metric', 'a1_sparse_metric', 'a2_sparse_metric', 'a3_sparse_metric']

## RTX 3060
'loss': 0.0, 'metrics': [0.15538795390363738, 2.7975851661019338, 5.625108310379362, 0.18630796428342675, 0.8499496491823949, 0.9419053045479213, 0.9715246053337913], 'metrics_correct': [0.1553879539036371, 2.797585166101927, 5.625108310379343, 0.18630796428342658, 0.849949649182393, 0.941905304547917, 0.9715246053337869], 'valid_batches': 4317.0, 'loss_loss': 0.0, 'metrics_info': ['abs_rel_sparse_metric', 'sq_rel_sparse_metric', 'rmse_sparse_metric', 'rmse_log_sparse_metric', 'a1_sparse_metric', 'a2_sparse_metric', 'a3_sparse_metric']

## P100 (provided by colab)
'loss': 0.0, 'metrics': [0.050102306426327375, 0.29000417841479736, 2.2656219501116497, 0.08224765892285756, 0.9723125736912782, 0.9907187367026828, 0.9957081838727743], 'metrics_correct': [0.0501023064263273, 0.29000417841479575, 2.2656219501116426, 0.08224765892285742, 0.9723125736912774, 0.9907187367026803, 0.9957081838727698], 'valid_batches': 4317.0, 'loss_loss': 0.0, 'metrics_info': ['abs_rel_sparse_metric', 'sq_rel_sparse_metric', 'rmse_sparse_metric', 'rmse_log_sparse_metric', 'a1_sparse_metric', 'a2_sparse_metric', 'a3_sparse_metric']

Full evaluation logs are attached below:
colab_p100_eval_log.txt
RTX3090_eval_log.txt
RTX3060_eval_log.txt

I found that the 3090 and the 3060 produce the same result, but it differs greatly from the result on the P100, which is very close to the result reported in the paper.

I did not change any code during the above evaluation process, and all runs use the same dataset.

Have you encountered this problem?
Could you please give me some advice?

the setup of KITTI dataset

Hi, I was trying to reproduce the point clouds for the KITTI Odometry dataset with the pre-trained model you provided. I am a bit confused about the setup of the KITTI dataset for this.

You mentioned

"To setup KITTI Odometry, download the color images and calibration files from the official website (around 145 GB).

I wonder if this 145 GB is comprised of:
Download odometry data set (color, 65 GB)
Download odometry data set (velodyne laser data, 80 GB)?
at the official website.

For the task I described (not training the model, just reproducing the point clouds), I have done the following:

  1. Downloaded the odometry data set, including the color images (65 GB) and calibration files (1 MB), from the official website
  2. Replaced the calib.txt and times.txt files in the color data (65 GB) with the calib.txt and times.txt files from the calibration files (1 MB)
  3. Downloaded the pre-trained final model by running download_model.sh
    From the error I got, it seems I am still missing '../../../data/dataset/sequences/00/image_depth_annotated'

I would really appreciate it if you could let me know how I can obtain this. If possible, please also correct me if I did anything wrong.

Thanks for your help

example script list index error

I was trying to run the example script, but ran into an error:

Traceback (most recent call last):
  File "test_monorec.py", line 41, in <module>
    batch, depth = dataset.__getitem__(index)
  File "../data_loader/kitti_odometry_dataset.py", line 252, in __getitem__
    keyframe_pose = torch.tensor(dataset.poses[index + self._offset], dtype=torch.float32)
IndexError: list index out of range

I found that the dataset poses for sequence 07 are an empty list. In the code below from kitti_odometry_dataset.py, the string constant is "poses_dvso", whereas the example's poses directory is named "poses_dso".

if self.use_dso_poses:
    for dataset in self._datasets:
        dataset.pose_path = self.dataset_dir / "poses_dvso"
        dataset._load_poses()

It seems that changing the constant to "poses_dso" fixes the error.

moving mask generation

Hi Felix,

Thank you very much for open-sourcing the code!
Could you please provide the code for generating the moving object masks?

Thanks!

attribute lookup CalibData on pykitti.odometry failed

Hi, when I set up the dataset and run the command
python train.py --config configs/train/monorec/monorec_depth.json --options stereo
the call to enumerate(self.data_loader) raises pickle.PicklingError: Can't pickle <class 'pykitti.odometry.CalibData'>: attribute lookup CalibData on pykitti.odometry failed

DVSO keypoints as metric depth?

Hi, thanks for sharing this amazing work.

I have a question about the DVSO keypoints you provide in the README. As far as I understand, they actually contain disparity values, which you convert to depth as follows:

depth = (w * depth / (0.54 * f_x * 65535))

The conversion depth = (baseline * focal) / disparity makes sense to me, but I don't really understand how the width and the 65535 interact in this formula. Could you clarify how to obtain metric depth values from these keypoints?

Thank you in advance.
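
One possible reading (an assumption, not something confirmed by the authors) is that each stored uint16 value encodes disparity as a fraction of the image width, so that value / 65535 * w recovers the disparity in pixels; under that reading the quoted expression yields inverse depth rather than metric depth. A minimal sketch of this interpretation:

import numpy as np

# Assumptions (not confirmed by the repository): the stored uint16 value is a
# width-normalized disparity and the stereo baseline is 0.54 m.
def decode_keypoint_depth(value, w, f_x, baseline=0.54):
    disparity_px = value.astype(np.float32) / 65535.0 * w  # disparity in pixels
    inv_depth = disparity_px / (baseline * f_x)            # matches the quoted formula
    depth = np.zeros_like(inv_depth)
    valid = inv_depth > 0
    depth[valid] = 1.0 / inv_depth[valid]                  # metric depth, 0 = invalid
    return depth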

Vertex colors on PLY file

Hello,

Thanks for the repo! It really helped me learn a lot about 3D object detection.
I was able to build the PLY file, but when opening it I only see gray-scale voxels in Blender/MeshLab.

How were you guys able to visualize the colour information?
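
For reference, MeshLab and Blender only display per-vertex colors if the PLY header declares red/green/blue properties for each vertex. A minimal sketch of writing such a file (a hypothetical helper, not the repository's ply_utils):

import numpy as np

def write_colored_ply(path, points, colors):
    # points: (N, 3) float array, colors: (N, 3) uint8 array of RGB values.
    with open(path, "w") as f:
        f.write("ply\nformat ascii 1.0\n")
        f.write(f"element vertex {points.shape[0]}\n")
        f.write("property float x\nproperty float y\nproperty float z\n")
        f.write("property uchar red\nproperty uchar green\nproperty uchar blue\n")
        f.write("end_header\n")
        for (x, y, z), (r, g, b) in zip(points, colors):
            f.write(f"{x} {y} {z} {r} {g} {b}\n")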

FileNotFoundError

Run "python train_monorec.py --config configs/train/monorec/monorec_mask.json --options stereo"

  File "/home/user/monorec/MonoRec-main-test/data_loader/kitti_odometry_dataset.py", line 67, in __init__
    with open(self.dataset_dir / "sequences" / sequence / (index_mask_name + ".json")) as f:
FileNotFoundError: [Errno 2] No such file or directory: '../data/dataset/sequences/01/index_mask.json'
