
opendrivelab / occnet


[ICCV 2023] OccNet: Scene as Occupancy

Home Page: https://arxiv.org/abs/2306.02851

License: Apache License 2.0

Python 93.99% Shell 0.12% C++ 0.52% Cuda 5.37%
3d-object-detection autonomous-driving 3d-occupancy

occnet's Introduction

Occupancy and Flow Challenge

The tutorial for the Occupancy and Flow track of the CVPR 2024 Autonomous Grand Challenge.

Introduction

Understanding the 3D surroundings, including both background stuff and foreground objects, is important for autonomous driving. In the traditional 3D object detection task, a foreground object is represented by a 3D bounding box. However, the geometric shape of an object can be complex and cannot be captured by a simple 3D box, and the perception of background stuff is absent altogether. The goal of this task is to predict the 3D occupancy of the scene. For this task we provide a large-scale occupancy benchmark based on the nuScenes dataset. The benchmark is a voxelized representation of 3D space, and the occupancy state and semantics of each voxel are estimated jointly. The difficulty of the task lies in the dense prediction of 3D space given only surround-view images.

News

Tip

🧊 We release LightwheelOcc, a synthetic 3D occupancy dataset with dense occupancy and depth labels and a realistic sensor configuration that simulates the nuScenes setup. Check it out!

  • 2024/07/12 The test server reopens.
  • 2024/06/01 The challenge wraps up.
  • 2024/04/09 We release the technical report of the new RayIoU metric, as well as a new occupancy method: SparseOcc.
  • 2024/03/14 We release a new version (openocc_v2.1) of the occupancy ground-truth, including some bug fixes regarding the occupancy flow. Delete the old version and download the new one! Please refer to getting_started for details.
  • 2024/03/01 The challenge begins.

Table of Contents

  • Task Definition
  • Rules for Occupancy and Flow Challenge
  • Evaluation Metrics
  • OpenOcc Dataset
  • Baseline
  • Submission
  • License and Citation

Task Definition

Given images from multiple cameras, the goal is to predict the semantics and flow of each voxel in the scene. Participants are required to submit their predictions on the nuScenes OpenOcc test set.

Rules for Occupancy and Flow Challenge

  • We allow using annotations provided in the nuScenes dataset. During inference, the input modality of the model should be camera only.
  • No future frame is allowed during inference.
  • To check compliance, we will ask participants to provide technical reports to the challenge committee, and award winners will be asked to give a public talk about their method.
  • Every submission provides method information. We encourage publishing code, but do not make it a requirement.
  • Each team can have at most one account on the evaluation server. Users that create multiple accounts to circumvent the rules will be excluded from the challenge.
  • Each team can submit at most three results per day during the challenge.
  • Any attempt to circumvent these rules will result in a permanent ban of the team or company from the challenge.

(back to top)

Evaluation Metrics

Leaderboard ranking for this challenge is by the Occupancy Score. It consists of two parts: Ray-based mIoU, and absolute velocity error for occupancy flow.

The implementation is here: projects/mmdet3d_plugin/datasets/ray_metrics.py

Ray-based mIoU

We use the well-known mean intersection-over-union (mIoU) metric. However, the elements of the set are now query rays, not voxels.

Specifically, we emulate LiDAR by projecting query rays into the predicted 3D occupancy volume. For each query ray, we compute the distance it travels before it intersects any surface. We then retrieve the corresponding class label and flow prediction.

We apply the same procedure to the ground-truth occupancy to obtain the ground-truth depth, class label and flow.

A query ray is classified as a true positive (TP) if the class labels coincide and the L1 error between the ground-truth depth and the predicted depth is less than a certain threshold (e.g. 2 m).

Let $C$ be the number of classes.

$$ mIoU=\frac{1}{C}\displaystyle \sum_{c=1}^{C}\frac{TP_c}{TP_c+FP_c+FN_c}, $$

where $TP_c$ , $FP_c$ , and $FN_c$ correspond to the number of true positive, false positive, and false negative predictions for class $c$.

We finally average over distance thresholds of {1, 2, 4} meters and compute the mean across classes.
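
As a rough illustration only (the reference implementation is projects/mmdet3d_plugin/datasets/ray_metrics.py), the aggregation over classes and distance thresholds can be sketched as below, assuming per-ray class labels and depths have already been obtained from the ray-casting step:

import numpy as np

def ray_miou(gt_cls, gt_depth, pred_cls, pred_depth,
             num_classes=16, thresholds=(1.0, 2.0, 4.0)):
    # gt_cls, pred_cls: (N,) class label per query ray
    # gt_depth, pred_depth: (N,) depth in meters per query ray
    ious = []
    for thr in thresholds:
        depth_ok = np.abs(gt_depth - pred_depth) < thr
        per_class = []
        for c in range(num_classes):
            tp = np.sum((gt_cls == c) & (pred_cls == c) & depth_ok)
            fp = np.sum((pred_cls == c) & ~((gt_cls == c) & depth_ok))
            fn = np.sum((gt_cls == c) & ~((pred_cls == c) & depth_ok))
            if tp + fp + fn > 0:
                per_class.append(tp / (tp + fp + fn))
        ious.append(np.mean(per_class))
    return float(np.mean(ious))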

For more details about this metric, please refer to the technical report.

AVE for Occupancy Flow

Here we measure velocity errors over the set of true positives (TP), using a distance threshold of 2 m.

The absolute velocity error (AVE) is defined for 8 classes ('car', 'truck', 'trailer', 'bus', 'construction_vehicle', 'bicycle', 'motorcycle', 'pedestrian') in m/s.

Occupancy Score

The final occupancy score is defined to be a weighted sum of mIoU and mAVE. Note that the velocity errors are converted to velocity scores as max(1 - mAVE, 0.0). That is,

OccScore = mIoU * 0.9 + max(1 - mAVE, 0.0) * 0.1
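
In code, this weighting is simply the following (a minimal sketch; mIoU lies in [0, 1] and mAVE is in m/s):

def occ_score(miou, mave):
    # Convert the velocity error into a score clipped at zero, then take the weighted sum.
    return miou * 0.9 + max(1.0 - mave, 0.0) * 0.1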

(back to top)

OpenOcc Dataset

Basic Information

  • The nuScenes OpenOcc dataset contains 17 classes. Voxel semantics for each sample frame are given as [semantics] in labels.npz, and the occupancy flow is given as [flow] in labels.npz (a loading sketch follows the table below).
Type         Info
train        28,130
val          6,019
test         6,008
cameras      6
voxel size   0.4 m
range        [-40m, -40m, -1m, 40m, 40m, 5.4m]
volume size  [200, 200, 16]
#classes     0 - 16
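
A minimal sketch (not from the repo) of reading one frame's annotation, assuming the arrays in labels.npz follow the grid above; the per-frame directory layout and the flow shape are assumptions to verify against the released data:

import numpy as np

# Hypothetical path to one frame's ground truth inside openocc_v2.
frame_dir = 'data/nuscenes/openocc_v2/scene-0001/<sample_token>'
labels = np.load(f'{frame_dir}/labels.npz')

semantics = labels['semantics']   # expected shape (200, 200, 16), class IDs 0-16
flow = labels['flow']             # expected shape (200, 200, 16, 2), x/y flow in m/s

print(semantics.shape, flow.shape)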

Download

  1. Download the nuScenes dataset and put it into data/nuscenes

  2. Download our openocc_v2.1.zip and infos.zip from OpenDataLab or Google Drive

  3. Unzip them in data/nuscenes

Hierarchy

The hierarchy of folder data/nuscenes is described below:

nuscenes
├── maps
├── nuscenes_infos_train_occ.pkl
├── nuscenes_infos_val_occ.pkl
├── nuscenes_infos_test_occ.pkl
├── openocc_v2
├── samples
├── v1.0-test
└── v1.0-trainval
  • openocc_v2 is the occupancy GT.
  • nuscenes_infos_{train/val/test}_occ.pkl contains meta infos of the dataset.
  • Other folders are borrowed from the official nuScenes dataset.

Known Issues

  • nuScenes (issue #721) lacks translation along the z-axis, which makes it hard to recover accurate 6D localization and leads to misalignment of point clouds when accumulating them over whole scenes. Ground stratification occurs in several scenes.

(back to top)

Baseline

We provide a baseline model based on BEVFormer.

Please refer to getting_started for details.

(back to top)

Submission

Submission format

The submission must be a single dict with the following structure:

submission = {
    'method': '',                           <str> -- name of the method
    'team': '',                             <str> -- name of the team, identical to the Google Form
    'authors': [''],                        <list> -- list of str, authors
    'e-mail': '',                           <str> -- e-mail address
    'institution / company': '',            <str> -- institution or company
    'country / region': '',                 <str> -- country or region, checked by iso3166*
    'results': {
        [token]: {                          <str> -- frame (sample) token
            'pcd_cls'                       <np.ndarray> [N] -- predicted class ID, np.uint8,
            'pcd_dist'                      <np.ndarray> [N] -- predicted depth, np.float16,
            'pcd_flow'                      <np.ndarray> [N, 2] -- predicted flow, np.float16,
        },
        ...
    }
}

Below is an example of how to save the submission:

import pickle, gzip

with gzip.open('submission.gz', 'wb', compresslevel=9) as f:
    pickle.dump(submission, f, protocol=pickle.HIGHEST_PROTOCOL)

We provide example scripts based on mmdetection3d to generate the submission file; please refer to the baseline for details.
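
Before uploading, a quick sanity check of the generated file might look like the following sketch (field names and dtypes as in the structure above):

import gzip
import pickle

import numpy as np

with gzip.open('submission.gz', 'rb') as f:
    submission = pickle.load(f)

# Check that every frame token maps to arrays of the expected dtypes and shapes.
for token, pred in submission['results'].items():
    assert pred['pcd_cls'].dtype == np.uint8 and pred['pcd_cls'].ndim == 1
    assert pred['pcd_dist'].dtype == np.float16 and pred['pcd_dist'].ndim == 1
    assert pred['pcd_flow'].dtype == np.float16 and pred['pcd_flow'].shape[1] == 2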

(back to top)

Working with your own codebase

We understand that many participants may use their own codebases. Here, we provide a simple standalone package that converts your occupancy predictions to the submission format. Please follow the steps below:

  1. Save the prediction results on nuScenes OpenOcc val locally, in the same format as the occupancy ground truth.
  2. Perform ray projection locally and save the projection results:
cd tools/ray_iou
python ray_casting.py --pred-root your_prediction
  3. Test locally whether the evaluation on nuScenes OpenOcc val meets expectations:
python metric.py --pred output/my_pred_pcd.gz --gt output/nuscenes_infos_val_occ_pcd.gz
  4. Save and project the prediction results on nuScenes OpenOcc test following steps 1 and 2, then upload them to the competition server.

License and Citation

If you use the challenge dataset in your paper, please consider citing OccNet with the following BibTeX:

@article{sima2023_occnet,
    title={Scene as Occupancy},
    author={Chonghao Sima and Wenwen Tong and Tai Wang and Li Chen and Silei Wu and Hanming Deng and Yi Gu and Lewei Lu and Ping Luo and Dahua Lin and Hongyang Li},
    year={2023},
    eprint={2306.02851},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

If you use RayIoU as the evaluation metric, please consider citing the following BibTeX:

@misc{liu2024fully,
    title={Fully Sparse 3D Occupancy Prediction}, 
    author={Haisong Liu and Yang Chen and Haiguang Wang and Zetong Yang and Tianyu Li and Jia Zeng and Li Chen and Hongyang Li and Limin Wang},
    year={2024},
    eprint={2312.17118},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

This dataset is under the CC BY-NC-SA 4.0 license. Before using the dataset, you should register on the website and agree to the nuScenes terms of use. All code within this repository is under the Apache 2.0 License.

(back to top)

occnet's People

Contributors

1349949, afterthat97, chonghaosima, faikit, hangzhaomit, hli2020, sephyli, tongwwt, waveleaf27


occnet's Issues

Question about the results of other SOTA methods in the paper

Hi! Thank you for your awesome contribution.

I noticed you compared your method with other SOTA methods (BEVDet/BEVDepth/TPVFormer...) in Table 3 of the paper. However, the repos of these methods do not provide an implementation on the OpenOcc dataset. Did you reproduce them on the OpenOcc dataset? If so, could you provide any configs or code?

Does joint training of det+occ decrease the performance of occ?

Table 14 shows that BEVNet-R50 (bev_tiny_occ.py) gives 36.11 IoU and 17.37 mIoU. Since the paper doesn't provide the occ performance of the jointly trained model, I trained the joint BEVNet-R50 (bev_tiny_det_occ.py) myself and found that the occ performance decreased: the mIoU is 17.32, but the IoU drops from 36.11 to 30.90. Is this result normal (comparing bev_tiny_occ.py and bev_tiny_det_occ.py, the occ part seems unchanged)? Or did I miss something important? Thanks!

Question about the implementation of the lidar segmentation task

Thank you for your awesome contributions. Recently I've been trying to apply your work to lidar segmentation tasks. My question is how you generate the lidar segmentation results from the voxel output.

As you mentioned in your paper, i.e., "We transfer semantic occupancy prediction to LiDAR segmentation by assigning the point label based on associated voxel label, and then evaluate the model on the mIoU metric.", did you assign point labels directly from the voxel each point falls in? Or did you use some interpolation strategy, such as trilinear interpolation on the voxel logits like OccFormer?

Besides, did you retrain the networks with sparsely labeled voxels as the supervision signal, generated from the sparse lidar segmentation ground truth?

Question about the sweeps folder

Hi, in the dataset description I noticed that you only describe the data in the samples folder.
Does that mean the sweeps folder is not required for training?

Can you provide a pretrained model?

I have 2 GPUs (RTX 4090), but "CUDA out of memory" occurs even when training a tiny model.
Please provide the checkpoint of the pretrained model.

Question about this baseline

Hi, thanks for your great work.
I found that the baseline code of this challenge, which you provide based on BEVFormer, does not have the alignment operation between the previous BEV and the current BEV. In the BEVFormer repository it is defined in "transformer.py", but I cannot find it in "transformerOcc.py". Can I ask why?


Creating occ gt for mini dataset

Dear OccNet Team,

Thank you for your excellent work! I really appreciate that you release your occ data, but my computer does not have enough storage for the whole nuScenes dataset, so I would prefer to use the mini dataset first.

In my case, how can I get the occ_gt for the mini dataset? Could you please release your code for generating occ_gt?

Thanks!

'python setup.py install' error

When I switched to the mmdet directory and executed the ' python setup.py install ' command, the following error occurred:

RuntimeError: Scikit-learn requires Python 3.9 or later. The current Python version is 3.8.18 installed in /home/fky/.conda/envs/open-mmlab/bin/python.

Questions about the joint detection and occupancy for OccNet?

Hi, thanks for sharing this wonderful code.

I want to ask whether the results in Table 3 are obtained with the joint detection-and-occupancy training strategy or with occupancy-only training.

BTW, I cannot find the joint training config for OccNet in the projects/configs/hybrid folder. Could you provide the configs used to obtain the results in your Table 3 and Table 5?

Question about 'occ_invalid_path'?

Thanks for sharing the excellent work! The mask stored in 'occ_invalid_path' seems to be incorrect. In your code, '1' means a voxel can be observed in camera view and '0' means it cannot. I use the following code to load the data:

import numpy as np

voxel_num = 200 * 200 * 16
voxel_size = [16, 200, 200]
# load data
occ_path = "xxx/000_occ.npy"
occ_invalid_path = "xxx/000_occ_invalid.npy"
occ_sparse = np.load(occ_path)
mask_sparse = np.load(occ_invalid_path)
# gather data
occ = np.zeros(voxel_num)
mask = np.ones(voxel_num)
occ[occ_sparse[:, 0]] = occ_sparse[:, 1] + 1
mask[mask_sparse] = 0
occ = occ.reshape(*voxel_size).transpose(2, 1, 0)
mask = mask.reshape(*voxel_size).transpose(2, 1, 0).astype(bool)  # np.bool is deprecated
# validate the mask
occupied_voxel_num = (occ != 0).sum()
camera_occupied_voxel_num = (occ[mask] != 0).sum()
print(f'occupied voxel num :{occupied_voxel_num}')
print(f'camera occupied voxel num :{camera_occupied_voxel_num}')

I got outputs as follows:

occupied voxel num : 62363
camera occupied voxel num : 62363

That means all the voxels can be observed in the camera view, so something seems to be wrong with the mask in 'occ_invalid_path'. I also visualized the mask, and I really want to know whether the mask provided in the OpenOcc dataset is correct.

Are the evaluation results on the val normal?

What's more, I get OSError: [Errno 16] Device or resource busy: '.nfs000000002d363caa000002ee' while evaluating, though it doesn't influence the evaluation. Has this ever happened to you?

error when testing

I trained the baseline on 1 A100-40G, using ./tools/dist_train.sh ./projects/configs/bevformer/bevformer_base_occ.py 1.
After 24 epochs, I tried ./tools/dist_test.py ./projects/configs/bevformer/bevformer_base_occ.py work_dirs/bevformer_base_occ/epoch_24.pth 1.
After loading the checkpoint and evaluating the 6019 tasks, I saw the memory increase from 18G to 42G, and suddenly it failed with the error torch.distributed.elastic.multiprocessing.api:failed.
How can I fix this?

The problem of occupancy prediction by bevformer

Hello,

occupancy_classes is set to 16 in the paper, but there are 17 labels in gt_occ (considering free space). So I wanted to ask whether the first option (the original) or the second is more appropriate.

occupancy_preds = occupancy_preds.view(-1, self.occupancy_classes)
occupancy_preds = occupancy_preds.view(-1, self.occupancy_classes + 1)

I also found that both options can be trained and the initial loss of the first is much lower than that of the second.


About Planning

Hi Author, nice work!
When I read your code, I found that there is no part corresponding to the planning module, which is Table 7 in your paper. Could you please provide the source code for this part? Thanks!

Maybe the cascade voxel decoder is not necessary?

After reading the Scene as Occupancy paper I am unsure about the effectiveness of the cascade voxel decoder. When I read the OccNet code, I cannot find the implementation of the cascade voxel decoder; perhaps the decoder is not necessary?

AssertionError: The length of results is not equal to the dataset len: 12038 != 6019

I only have one GPU, so I ran this:

python tools/train.py projects/configs/hybrid/hybrid_tiny_occ.py --gpu-ids 0

but when the first epoch finishes, there is an error

2024-02-19 12:35:12,600 - mmdet - INFO - Saving checkpoint at 1 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 12038/6019, 4.9 task/s, elapsed: 2469s, ETA: -1233sTraceback (most recent call last):
  File "tools/train.py", line 257, in <module>
    main()
  File "tools/train.py", line 253, in main
    meta=meta)

many lines...

miniconda3/envs/occ/lib/python3.7/site-packages/mmdet3d-0.18.1-py3.7-linux-x86_64.egg/mmdet3d/datasets/nuscenes_dataset.py", line 443, in format_results
    format(len(results), len(self)))
AssertionError: The length of results is not equal to the dataset len: 12038 != 6019

using collect_env()

sys.platform: linux
Python: 3.7.16 (default, Jan 17 2023, 22:20:44) [GCC 11.2.0]
CUDA available: True
GPU 0: NVIDIA RTX A6000
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.3.r11.3/compiler.29920130_0
GCC: gcc (Ubuntu 9.5.0-1ubuntu1~22.04) 9.5.0
PyTorch: 1.10.0
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX512
  - CUDA Runtime 11.3
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  - CuDNN 8.2
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.2.0, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.11.0
OpenCV: 4.9.0
MMCV: 1.4.0
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.3

Could anyone provide some advice?

Submission for challenge 2024 meets errors

I ran the baseline code and got the submission.gz file. Then I submitted this file for the 2024 challenge but met an error.
Here is the model card I created for the submission.

Question about the motivation of OccNet dataset

Hi there, firstly thank you to your team for contributing significant work to the community.
I want to ask about the motivation for your dataset compared to others such as Occ3D [1] and OpenOccupancy [2]. I see that your dataset only annotates 16 classes + 1 unknown class. As I'm aware, the unknown class contains free space and some long-tail classes in nuScenes-lidarseg (animals, firetruck, etc.). Why did your team choose this approach rather than separating the long-tail classes into general objects (GO) and a distinct free class?
Best regards.
[1] Occ3D: A Large-Scale 3D Occupancy Prediction Benchmark for Autonomous Driving
[2] OpenOccupancy: A Large Scale Benchmark for Surrounding Semantic Occupancy Perception

code release for GT occupancy and flow annotation

Great work! When do you plan to release your code for the GT occupancy and flow annotations? Or do you know of any other open-source GT occupancy generation code? We'd like to convert other datasets in a similar way.

KeyError when doing "2.Prepared nuScenes 3D detection data" with create_data.py

Environment

docker container, with image
sudo nvidia-docker run -it -p 7600:7600 -p 8022:22 --name="occnet" -v /home/apan/occnet_ws:/home/occnet_ws -it pytorch/pytorch:1.10.0-cuda11.3-cudnn8-devel /bin/bash, python==3.7

Problem

When doing step '2. Prepared nuScenes 3D detection data' with create_data.py, there is a KeyError:

v1.0-trainval ./data/nuscenes
======
Loading NuScenes tables for version v1.0-trainval...
23 category,
8 attribute,
4 visibility,
64386 instance,
12 sensor,
10200 calibrated_sensor,
2631083 ego_pose,
68 log,
850 scene,
34149 sample,
2631083 sample_data,
1166187 sample_annotation,
4 map,
Done loading in 27.901 seconds.
======
Reverse indexing ...
Done reverse indexing in 6.5 seconds.
======
total scene num: 850
exist scene num: 850
train scene: 700, val scene: 150
[                                                  ] 0/34149, elapsed: 0s, ETA:Traceback (most recent call last):
  File "tools/create_data.py", line 249, in <module>
    max_sweeps=args.max_sweeps)
  File "tools/create_data.py", line 75, in nuscenes_data_prep
    root_path, out_dir, can_bus_root_path, info_prefix, version=version, max_sweeps=max_sweeps)
  File "/home/occnet_ws/OccNet/tools/data_converter/nuscenes_converter.py", line 93, in create_nuscenes_infos
    nusc, nusc_can_bus, train_scenes, val_scenes, test, max_sweeps=max_sweeps)
  File "/home/occnet_ws/OccNet/tools/data_converter/nuscenes_converter.py", line 237, in _fill_trainval_infos
    l2e_r = info['lidar2ego_rotation']
KeyError: 'lidar2ego_rotation'

I switched the console to the path /home/occnet_ws/OccNet, while the files are organized as follows:

home/occnet_ws/OccNet/data# tree -L 2
|-- can_bus
|   |-- basemap
|   |-- expansion
|   `-- prediction
|-- nuscenes
|   |-- maps
|   |-- readme.md
|   |-- samples
|   |-- sweeps
|   |-- v1.0-test
|   `-- v1.0-trainval
`-- occ_gt_release_v1_0
    |-- nuscenes_infos_temporal_train_occ_gt.pkl
    |-- nuscenes_infos_temporal_val_occ_gt.pkl
    |-- occ_gt_train.json
    |-- occ_gt_val.json
    |-- train
    `-- val

Then I found 'lidar2ego_rotation' in nuscenes_converter.py line 237, but there is no declaration of this key:

# started from line 215

        info = {
            'lidar_path': lidar_path,
            'token': sample['token'],
            'prev': sample['prev'],
            'next': sample['next'],
            'can_bus': can_bus,
            'frame_idx': frame_idx,  # temporal related info
            'sweeps': [],
            'cams': dict(),
            'scene_token': sample['scene_token'],  # temporal related info
            'scene_name': scene_name,
            'lidar2ego_translation': cs_record['translation'],
            'ego2global_rotation': pose_record['rotation'],
            'timestamp': sample['timestamp'],
        }

        if sample['next'] == '':
            frame_idx = 0
        else:
            frame_idx += 1

        l2e_r = info['lidar2ego_rotation']
        l2e_t = info['lidar2ego_translation']
        e2g_r = info['ego2global_rotation']
        e2g_t = info['ego2global_translation']
        l2e_r_mat = Quaternion(l2e_r).rotation_matrix
        e2g_r_mat = Quaternion(e2g_r).rotation_matrix

...


KeyError: 'occ_path'

Hi, thanks for your great work. I'm trying to reproduce the result, and I am using the nuScenes mini dataset. I generated the pkl file using:

python tools/create_data.py nuscenes --root-path ./data/nuscenes --out-dir ./data/nuscenes --extra-tag nuscenes --version v1.0-mini --canbus ./data

I also downloaded the provided openocc_v2. The folder is organized like this:

├── CITATION.cff
├── CODE_OF_CONDUCT.md
├── LICENSE
├── README.md
├── ckpts
│   └── r101_dcn_fcos3d_pretrain.pth
├── data
│   ├── LICENSE
│   ├── can_bus
│   ├── can_bus.zip
│   ├── download_can_bus.sh
│   ├── download_full.sh
│   ├── download_mini.sh
│   ├── nuscenes
│   └── v1.0-mini.tgz
├── docs
│   └── getting_started.md
├── figs
│   └── occupanc_1.gif
├── lib
│   └── dvr
├── projects
│   ├── __init__.py
│   ├── __pycache__
│   ├── configs
│   └── mmdet3d_plugin
├── tools
│   ├── analysis_tools
│   ├── create_data.py
│   ├── data_converter
│   ├── dist_test.sh
│   ├── dist_train.sh
│   ├── fp16
│   ├── misc
│   ├── model_converters
│   ├── slurm_train.sh
│   ├── test.py
│   └── train.py
└── utils
    └── vis.py

The nuscenes folder is organized like this:

├── nuscenes
│   ├── LICENSE
│   ├── download_map.sh
│   ├── download_occ.sh
│   ├── download_pkl.sh
│   ├── map-expansion.zip
│   ├── maps
│   ├── nuscenes_infos_temporal_train_mono3d.coco.json
│   ├── nuscenes_infos_temporal_val_mono3d.coco.json
│   ├── nuscenes_infos_train_occ.pkl
│   ├── nuscenes_infos_val_occ.pkl
│   ├── nuscenes_map_anns_val.json
│   ├── openocc_v2
│   ├── openocc_v2.zip
│   ├── samples
│   ├── sweeps
│   ├── v1.0-mini
│   └── v1.0-test

I used the checkpoint provided in BEVFormer and ran the test using:

./tools/dist_test.sh projects/configs/bevformer/bevformer_base_occ.py ./ckpts/r101_dcn_fcos3d_pretrain.pth 1

But I got this error:

 ETA:Traceback (most recent call last):
  File "./tools/test.py", line 267, in <module>
    main()
  File "./tools/test.py", line 238, in main
    outputs = custom_multi_gpu_test(model, data_loader, args.tmpdir,
  File "/data/*/Experiment-Occ/OccNet/projects/mmdet3d_plugin/bevformer/apis/test.py", line 74, in custom_multi_gpu_test
    for i, data in enumerate(data_loader):
  File "/home/*/anaconda3/envs/occ/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/*/anaconda3/envs/occ/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/home/*/anaconda3/envs/occ/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/home/*/anaconda3/envs/occ/lib/python3.8/site-packages/torch/_utils.py", line 425, in reraise
    raise self.exc_type(msg)
KeyError: Caught KeyError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/*/anaconda3/envs/occ/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/*/anaconda3/envs/occ/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/*/anaconda3/envs/occ/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/data/*/Experiment-Occ/OccNet/projects/mmdet3d_plugin/datasets/nuscenes_occ.py", line 125, in __getitem__
    return self.prepare_test_data(idx)
  File "/home/*/anaconda3/envs/occ/lib/python3.8/site-packages/mmdet3d-0.17.1-py3.8-linux-x86_64.egg/mmdet3d/datasets/custom_3d.py", line 172, in prepare_test_data
    example = self.pipeline(input_dict)
  File "/home/*/anaconda3/envs/occ/lib/python3.8/site-packages/mmdet/datasets/pipelines/compose.py", line 40, in __call__
    data = t(data)
  File "/data/*/Experiment-Occ/OccNet/projects/mmdet3d_plugin/datasets/pipelines/loading.py", line 21, in __call__
    occ_labels = np.load(results['occ_path'])
KeyError: 'occ_path'

Does anyone know what the problem is? Thanks a lot!

ERROR when train for bev_det task

Hi! Thank you for your awesome contribution.

After I finished training the Occ task I tried the BEV detection task, but I got the following error:

...error...
File "/home/user/OccNet-main/projects/mmdet3d_plugin/bevformer/dense_heads/bevformer_head.py", line 451, in loss
    device = gt_labels_list[0].device
AttributeError: 'list' object has no attribute 'device'

I printed the content of gt_labels_list; it looks like this:

 [tensor([8, 0, 2, 0, 0, 1, 0, 0, 0, 0, 8, 4, 0, 0, 0, 8, 0, 0, 0, 0, 0, 1, 0, 1,
        0, 1], device='cuda:0'), tensor([8, 0, 2, 5, 5, 0, 0, 1, 0, 0, 0, 0, 8, 4, 0, 0, 0, 0, 0, 0, 0, 5, 0, 0,
        1, 0, 0, 1, 0, 1], device='cuda:0'), tensor([8, 0, 2, 5, 0, 5, 0, 0, 1, 0, 0, 0, 0, 8, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 5, 0, 0, 0, 0, 1], device='cuda:0')]

It's true that a 'list' object has no attribute 'device'. Could you help me?

Thanks

Error while running ./tools/train.py

After running the command python3 ./tools/train.py, it asks to install mmcv>=2.0.0, but running pip install mmcv==2.0.0 gives the error shown below.

By running python3 ./tools/train.py:

Traceback (most recent call last):
File "./tools/train.py", line 21, in <module>
from mmdet3d import __version__ as mmdet3d_version
File "/home/umic/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/mmdet3d/__init__.py", line 21, in <module>
assert (mmcv_version >= digit_version(mmcv_minimum_version)
AssertionError: MMCV==1.4.0 is used but incompatible. Please install mmcv>=2.0.0rc4, <2.2.0.

By running pip install mmcv==2.0.0

Requirement already satisfied: six>=1.5 in /home/umic/anaconda3/envs/open-mmlab/lib/python3.8/site-packages (from python-dateutil>=2.7->matplotlib->mmengine>=0.2.0->mmcv==2.0.0) (1.16.0)
Building wheels for collected packages: mmcv
Building wheel for mmcv (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [587 lines of output]
running bdist_wheel
/home/umic/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/utils/cpp_extension.py:370: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
warnings.warn(msg.format('we could not find ninja.'))
running build
running build_py
gcc -pthread -B /home/umic/anaconda3/envs/open-mmlab/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DMMCV_WITH_CUDA -I/tmp/pip-install-lx22zhsn/mmcv_8d5cb1b3546046e9b22169fac12b078c/mmcv/ops/csrc/pytorch -I/tmp/pip-install-lx22zhsn/mmcv_8d5cb1b3546046e9b22169fac12b078c/mmcv/ops/csrc/common -I/tmp/pip-install-lx22zhsn/mmcv_8d5cb1b3546046e9b22169fac12b078c/mmcv/ops/csrc/common/cuda -I/home/umic/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/include -I/home/umic/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/umic/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/include/TH -I/home/umic/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/umic/anaconda3/envs/open-mmlab/include/python3.8 -c ./mmcv/ops/csrc/pytorch/points_in_boxes.cpp -o build/temp.linux-x86_64-3.8/./mmcv/ops/csrc/pytorch/points_in_boxes.o -std=c++14 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
gcc -pthread -B /home/umic/anaconda3/envs/open-mmlab/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DMMCV_WITH_CUDA -I/tmp/pip-install-lx22zhsn/mmcv_8d5cb1b3546046e9b22169fac12b078c/mmcv/ops/csrc/pytorch -I/tmp/pip-install-lx22zhsn/mmcv_8d5cb1b3546046e9b22169fac12b078c/mmcv/ops/csrc/common -I/tmp/pip-install-lx22zhsn/mmcv_8d5cb1b3546046e9b22169fac12b078c/mmcv/ops/csrc/common/cuda -I/home/umic/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/include -I/home/umic/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/umic/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/include/TH -I/home/umic/anaconda3/envs/open-mmlab/lib/python3.8/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/umic/anaconda3/envs/open-mmlab/include/python3.8 -c ./mmcv/ops/csrc/pytorch/info.cpp -o build/temp.linux-x86_64-3.8/./mmcv/ops/csrc/pytorch/info.o -std=c++14 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE="_gcc" -DPYBIND11_STDLIB="_libstdcpp" -DPYBIND11_BUILD_ABI="_cxxabi1011" -DTORCH_EXTENSION_NAME=_ext -D_GLIBCXX_USE_CXX11_ABI=0
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
./mmcv/ops/csrc/pytorch/info.cpp:15:30: fatal error: cuda_runtime_api.h: No such file or directory
#include <cuda_runtime_api.h>
^
compilation terminated.
error: command 'gcc' failed with exit status 1
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for mmcv
Running setup.py clean for mmcv
Failed to build mmcv
ERROR: Could not build wheels for mmcv, which is required to install pyproject.toml-based projects

How is the flow linked to each voxel

I am trying to train a network that predicts the dynamic voxels in a scene, so I need to link the annotated flow to the annotated voxels. Is it possible to link the flow with each voxel? I wasn't able to figure out the relation between the two by looking at the arrays, and I didn't find any information on this. Thank you!

Question about labels

A label file such as \train\scene-0001\000_occ.npy contains a numpy array with a shape of (39068, 2). I have a few questions about this array:
1) Considering that the voxel space is (200, 200, 16), do the 39,068 rows of the array represent the occupied voxels out of the 640,000 total voxels?
2) I want to train a network that uses labels of shape (200, 200, 16); is there a place in the code where you expand the (39068, 2) labels to the full voxel space? (A sketch of what I have in mind is below.)
3) Is there a way to extract only the voxels visible to the FRONT camera, in case I want to predict only the front scene and not the whole surround scene?
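
For question 2, this is the kind of expansion I have in mind (a sketch that mirrors the snippet in the occ_invalid_path issue above; treating the first column as a flattened voxel index and the second as a class label is my assumption):

import numpy as np

sparse = np.load('train/scene-0001/000_occ.npy')        # shape (39068, 2)

dense = np.zeros(200 * 200 * 16, dtype=np.uint8)        # 0 assumed to mean free / empty
dense[sparse[:, 0]] = sparse[:, 1] + 1                   # shift labels so 0 stays "free", as in the issue above
# The reshape/transpose order follows the snippet in the occ_invalid_path issue.
dense = dense.reshape(16, 200, 200).transpose(2, 1, 0)   # -> (200, 200, 16)
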
Thank you!

Is object flow the same as object velocity in nuScenes?

Hi,

Thanks for releasing the data. The supplementary material says that "we annotate the flow velocity of voxel based
on the 3D box velocity". Does this step simply take the object velocity in the Nuscenes ground truth, and rename it as the flow for each voxel in the object? Or is there any difference?

Question about the openocc dataset

Hi, I wonder about the relationship between the dataset published in the cvpr2023-challenge repository and the OpenOcc dataset in this repository. I see that their voxel size and range are different, but you put the CVPR leaderboard in your README.

Question about Generating High-quality Annotation

Hello,

You annotate voxels containing unlabeled LiDAR points from intermediate frames based on the surrounding labeled voxels to further improve the data density.

In this way the data density can be improved, but it is bound to produce additional errors, especially the shadows caused by dynamic objects. How do you handle this problem?

Where do I find the pre-trained OccNet?

I have trained OccNet with the code, but cannot reach the reported IoU of 37.69 and mIoU of 19.48 with ResNet-50. How do I access the pretrained model weights?

cascade voxel decoder for occnet-challenge version

Thanks for your great work!
I'm having some problems reading through the occnet-challenge version of the code: I can't find the code for the cascade voxel decoder in that version. Please help point out the location of the relevant code.
