cc-hy / cmkd Goto Github PK



Cross-Modality Knowledge Distillation Network for Monocular 3D Object Detection (ECCV 2022 Oral)

License: Apache License 2.0

Dockerfile 0.16% Python 85.43% C++ 5.15% Cuda 8.75% C 0.35% Shell 0.16%

cmkd's People


jlqzzz cv-det xiaolong-rrl xiaoxusanheyi tianyuez douwenhao68 wyh20000305 sadjadasghari alexandredipiazza

cmkd's Issues

AttributeError: 'EasyDict' object has no attribute 'POST_PROCESSING'

I ran python train_cmkd.py --cfg /home/ubuntu/CMKD/tools/cfgs/kitti_models/CMKD/CMKD-scd/cmkd_kitti_eigen_R50_scd_bev.yaml, which is the training recipe you recommend, and after training for 30 epochs the error above appears. What is going on?

where to find the version for nuscenes?

Hi, thanks for sharing your great work. I didn't find the mmdet version for nuscenes. Where to find it?

About the speed of multi-gpu training

Hi @Cc-Hy. When I train the model on KITTI train, using 2 GPUs takes more time than using 1 GPU, which is really strange. Have you encountered this phenomenon?

I used your two models, 2284.pth and 2304.pth, and the accuracy is sometimes higher and sometimes lower?

AttributeError: 'EasyDict' object has no attribute 'POST_PROCESSING'

I ran python train_cmkd.py --cfg /home/ubuntu/CMKD/tools/cfgs/kitti_models/CMKD/CMKD-scd/cmkd_kitti_eigen_R50_scd_bev.yaml, and after training for 30 epochs the error above appears. What is going on?

Training Problem

Sorry to disturb you, I would like to ask you a few questions:
During the calculation of MSE loss for the supervised middle layer BEV features, there are many empty values. How did you consider this? Did you add a mask to the 0 values?
When I trained with other data, I found that the overall loss is affected by the BEV layer. The loss of the later anchors is very low, and the recall rate of the test results is very poor. Have you encountered this problem?
Thank you, looking forward to your reply~
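The masking idea raised in this question can be sketched as follows. This is only an illustration of the general technique, not the loss used in the repository; `masked_mse`, `plain_mse`, and the flat-list feature layout are hypothetical, and whether CMKD applies such a mask is not stated in the thread.

```python
def masked_mse(student, teacher, eps=0.0):
    """Mean squared error over cells where the teacher feature is non-empty.

    student, teacher: flat lists of floats of equal length (hypothetical
    stand-ins for flattened BEV feature maps).
    """
    if len(student) != len(teacher):
        raise ValueError("feature maps must have the same length")
    # A cell counts as occupied when the teacher value is not (near) zero.
    mask = [abs(t) > eps for t in teacher]
    n_valid = sum(mask)
    if n_valid == 0:
        return 0.0  # nothing to supervise
    sq_err = sum((s - t) ** 2 for s, t, m in zip(student, teacher, mask) if m)
    return sq_err / n_valid


def plain_mse(student, teacher):
    """Unmasked MSE: every cell, empty or not, contributes to the average."""
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(teacher)


teacher = [0.0, 0.0, 0.0, 2.0]
student = [0.0, 0.0, 0.0, 1.0]
print(plain_mse(student, teacher))   # averaged over all 4 cells
print(masked_mse(student, teacher))  # averaged over the 1 occupied cell
```

With mostly empty maps the unmasked average is diluted by the empty cells, which is the effect the question is asking about.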

Use of customised amount of training data

Hi author, does the code support training with customised amounts of training data (to reproduce Figure 7's results in the paper)? If not, how do we go about this?
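One generic way to build a reduced training split is to subsample the list of sample IDs reproducibly before generating the data infos. The sketch below assumes the split is a plain list of ID strings; `subsample_ids` is a hypothetical helper and not part of the repository.

```python
import random


def subsample_ids(sample_ids, fraction, seed=0):
    """Return a reproducible random subset containing `fraction` of the IDs."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    k = max(1, round(len(sample_ids) * fraction))
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    return sorted(rng.sample(list(sample_ids), k))


# Hypothetical IDs in the zero-padded style used by KITTI file names.
ids = [f"{i:06d}" for i in range(100)]
subset = subsample_ids(ids, 0.25)
print(len(subset))
```

Fixing the seed keeps the subset identical across runs, so results for different data fractions remain comparable.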

About get_cls_layer_loss

I’m sorry to bother you, but I would like to know why the threshold here is 0.1 and why the loss is not divided by the batch size. And the expression in QFL is different from the original text, how was this considered?

The stability of performance

Hi @Cc-Hy. I ran the code on KITTI raw on 2 GPUs without any modification, but the performance is not very stable. First, it fluctuates during the final 10 epochs of training, with a difference of 1 or 2 points between the last two epochs. Second, different runs differ by 1 or 2 points. All of them are lower than the performance reported in the README, especially for the easy setting, where they are 2 or 3 points lower. Any suggestions?

Question about inference (Lidar camera calibration)

Hi, thanks for sharing the nice source code.

Since the bounding boxes and image voxel features generated by the model are in LiDAR coordinates,
should the pose of the camera relative to the LiDAR be fixed at inference?
In other words, should the LiDAR-camera extrinsic parameters used at inference be the same as in training?

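Where exactly in the source code is the classification distillation loss of the response distillation part?

Hello author, after reading the source code I could not find where this classification loss is implemented. Could you please point out its exact location?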

Training process

Hello,

I want to ask some details about the training process.

CMKD uses pretrained SECOND-net for the Teacher Network.

When training the CMKD's Student network (CMKD Mono),
First, you use ~bev.yaml file to train the model with feature distillation loss

In this process, you don't calculate detection loss? If so, the purpose of this process is only to train BEV image feature to have similar patterns to the BEV lidar feature?

Second, you use ~V2.yaml file to train the model with feature distillation loss + detection loss

This process is used for final 3d object detection? Was 20 epochs enough for training?

+) Also, do you freeze the teacher network(SECOND) and do updates only for the Student network(CMKD Mono) ?

Thank you!

what is the performance on kitti train

Hi @Cc-Hy, I would like to know the performance on KITTI train. Could you kindly provide it? Many thanks!

How to

It's a wonderful repository!

Can I use a custom dataset to train?
If possible, which labeling tool is suitable?

Thanks.

NameError: name 'normalize' is not defined

After switching to a 4-GPU "80" server, testing the pre-trained model raises NameError: name 'normalize' is not defined.
Command: python test_cmkd.py --cfg /home/data/xl/CMKD/tools/cfgs/kitti_models/CMKD/CMKD-scd/cmkd_kitti_R50_scd_V2.yaml --batch_size 8 --ckpt /home/data/xl/CMKD/cmkd-r50-kitti-eigen-3-class-mod-2304.pth

I don't know what went wrong. Any pointers would be appreciated.

Number of training samples used in paper

Hi there,

In the paper it says '~42k' raw KITTI samples are used - is this a typo? Apparently you only used ~18k train + eigen clean to get those results, or I am missing something here?

Hope you can clarify a bit. Thanks!

Where exactly in the code is the classification distillation of the response distillation?

On the use of 'gt_mask'

Hi, could you please explain how 'gt_mask' can be utilised? Apparently it shows up in the cmkd.py script and doesn't seem to be used in training. Many thanks.

How to test on the testing data

The results here are on KITTI val. What are the prediction results on the testing set?

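Question about multi-GPU training

I ran the multi-GPU training command you provided:

There are 30 epochs of training plus 10 epochs of deep training. Are the results of the last 10 epochs the final results, evaluated directly on the test set? Can the results I get be compared directly with the paper?
Is the validation set (val) not used?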

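Test results on the kitti_raw dataset differ hugely from the paper

1. I tested on the complete kitti_raw dataset and got the results shown in the figure below. The ped results differ hugely from the paper, while cyc and car are fine.

2. With the kitti_3D dataset (reproducing the paper), the car results are 4-5 points off from the paper, while cyc and ped are roughly fine. Why is that? Could it simply be because I used 2 "80" GPUs and set batch_size to 2? Please advise.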

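Good results on the validation set, but very poor results on the submitted test set

On the kitti-raw dataset, after modifying the loss from the paper, we trained and validated following the multi-GPU staged training (second) described on GitHub, and the results were very good. But after running on the test set and submitting the results, they were very, very poor. We don't know why. Is it overfitting?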

About model weights

Hi, How to obtain the pre-trained student and teacher models?

about training V2.yaml

Hello,

When training with V2.yaml file, in train_cmkd.py why do you load pretrained_lidar_model to model.model_img?

Thank you.

Error when using two GPUs on one node

Hi, when I use 2 GPUs on one node to run train_cmkd.py, I get the following error:
2022-11-25 09:49:08,244 INFO Start training xxx/project_3D/CMKD-main/tools/cfgs/kitti_models/CMKD/cmkd_caic_R50_scd_V2(default)

epochs: 0%| | 0/10 [00:00<?, ?it/s]
epochs: 0%| | 0/10 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/xxx/project_3D/CMKD-main/tools/train_cmkd.py", line 226, in <module>
main()
File "/xxx/project_3D/CMKD-main/tools/train_cmkd.py", line 198, in main
merge_all_iters_to_one_epoch=args.merge_all_iters_to_one_epoch,
File "/xxx/project_3D/CMKD-main/tools/train_utils/train_utils.py", line 245, in train_model_cmkd
dataloader_iter = iter(train_loader)
File "/root/anaconda3/envs/CMKDK/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 355, in __iter__
return self._get_iterator()
File "/root/anaconda3/envs/CMKDK/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 301, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
File "/root/anaconda3/envs/CMKDK/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 914, in __init__
w.start()
File "/root/anaconda3/envs/CMKDK/lib/python3.7/multiprocessing/process.py", line 112, in start
self._popen = self._Popen(self)
File "/root/anaconda3/envs/CMKDK/lib/python3.7/multiprocessing/context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/root/anaconda3/envs/CMKDK/lib/python3.7/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/root/anaconda3/envs/CMKDK/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/root/anaconda3/envs/CMKDK/lib/python3.7/multiprocessing/popen_fork.py", line 20, in __init__
self._launch(process_obj)
File "/root/anaconda3/envs/CMKDK/lib/python3.7/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/root/anaconda3/envs/CMKDK/lib/python3.7/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle spconv.core_cc.csrc.sparse.all.ops_cpu3d.Point2VoxelCPU objects

I start the code with:
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 train_cmkd.py --launcher pytorch --cfg ${CONFIG_FILE} --tcp_port 16677 --pretrained_lidar_model ${TEACHER_MODEL_PATH}

When I train with 1 GPU, it works fine. Why is that?
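The traceback goes through popen_spawn_posix.py, which suggests the DataLoader workers are started with the spawn method, so the dataset object is pickled for each worker. A general Python pattern for this class of error is to create unpicklable members lazily and drop them when pickling. The sketch below illustrates only that pattern: a threading.Lock stands in for the unpicklable spconv object, `EagerDataset` and `LazyDataset` are hypothetical names, and it is not a confirmed fix for this repository.

```python
import pickle
import threading


class EagerDataset:
    """Builds an unpicklable helper in __init__ (a lock stands in for it)."""

    def __init__(self):
        self.helper = threading.Lock()


class LazyDataset:
    """Creates the helper on first use and drops it when pickled."""

    def __init__(self):
        self._helper = None

    @property
    def helper(self):
        if self._helper is None:
            self._helper = threading.Lock()
        return self._helper

    def __getstate__(self):
        state = self.__dict__.copy()
        state["_helper"] = None  # never ship the helper to a worker process
        return state


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False on a TypeError."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False


lazy = LazyDataset()
_ = lazy.helper  # force creation, as would happen after the first sample is loaded
print(can_pickle(EagerDataset()), can_pickle(lazy))
```

Each worker then rebuilds its own helper on first access instead of receiving a pickled copy.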

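About depth estimation pre-training

Hello, has the provided image pre-trained backbone gone through depth estimation pre-training? If not, how should depth estimation pre-training be done?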

Tensorboard eval visualization

Hi,

I used the script provided in GETTING_STARTED.md and it works. But afterwards, when I point TensorBoard at the log path output/kitti_models/CMKD/CMKD-scd/cmkd_kitti_R50_scd_bev/default/ckpt, nothing is visible in the dashboard:

Am I doing something wrong?
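TensorBoard only displays data from directories that contain event files, whose names include "tfevents". One quick check is to search the experiment directory for such files and pass the directory that holds them to --logdir. The sketch below assumes the event files live somewhere under the experiment directory; where this repository actually writes them is not stated in the thread, and `find_event_dirs` is a hypothetical helper.

```python
import os


def find_event_dirs(root):
    """Return directories under `root` that contain TensorBoard event files."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if any("tfevents" in name for name in filenames):
            hits.append(dirpath)
    return sorted(hits)


if __name__ == "__main__":
    # Starting point taken from the path in the question, one level above ckpt.
    root = "output/kitti_models/CMKD/CMKD-scd/cmkd_kitti_R50_scd_bev/default"
    for d in find_event_dirs(root):
        print(d)
```

If the search prints nothing for the ckpt directory but finds a sibling directory, that sibling is the path to give to TensorBoard.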

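Low test accuracy

Hello, when I test with the CMKD-R50 model the accuracy is very low: Car Moderate@R40 is only 14.45, far below the official 23.0, and recall_roi_0.3 is 0. What could be the reason?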

2023-01-31 16:42:38,363   INFO  ==> Loading parameters from checkpoint ../checkpoints/cmkd-r50-kitti-eigen-3-class-mod-2304.pth to GPU
2023-01-31 16:42:47,954   INFO  ==> Checkpoint trained from version: pcdet+0.5.2+830fba9+py0836fc9
2023-01-31 16:42:48,062   INFO  ==> Done (loaded 649/649)
2023-01-31 16:42:48,823   INFO  *************** EPOCH 2304 EVALUATION *****************
eval: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 943/943 [16:52<00:00,  1.07s/it, recall_0.3=(0, 10421) / 17558]
2023-01-31 16:59:41,392   INFO  *************** Performance of EPOCH 2304 *****************
2023-01-31 16:59:41,393   INFO  Generate label finished(sec_per_example: 0.2687 second).
2023-01-31 16:59:41,396   INFO  Average predicted number of objects(3769 samples): 7.862
2023-01-31 16:59:41,396   INFO  recall_roi_0.3: 0.000000
2023-01-31 16:59:41,396   INFO  recall_rcnn_0.3: 0.593519
2023-01-31 16:59:41,396   INFO  precision_rcnn_0.3: 0.351692
2023-01-31 16:59:41,397   INFO  recall_roi_0.5: 0.000000
2023-01-31 16:59:41,397   INFO  recall_rcnn_0.5: 0.408873
2023-01-31 16:59:41,397   INFO  precision_rcnn_0.5: 0.242280
2023-01-31 16:59:41,397   INFO  recall_roi_0.7: 0.000000
2023-01-31 16:59:41,397   INFO  recall_rcnn_0.7: 0.194555
2023-01-31 16:59:41,397   INFO  precision_rcnn_0.7: 0.115285
2023-01-31 16:59:41,569   INFO  Result is save to /data/tc/code-bev/CMKD-main/output/kitti_models/CMKD/cmkd_kitti_eigen_R50_scd_V2/default/eval/epoch_2304/val/default
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 16 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 20 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 25 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 30 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 35 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 24 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 72 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 16 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 20 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 25 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 16 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 25 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 16 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 20 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 30 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 35 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 24 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
/home/tc/anaconda3/envs/CMKD/lib/python3.8/site-packages/numba/cuda/dispatcher.py:488: NumbaPerformanceWarning: Grid size 72 will likely result in GPU under-utilization due to low occupancy.
  warn(NumbaPerformanceWarning(msg))
2023-01-31 17:00:19,911   INFO  

Car AP_R40@0.70, 0.70, 0.70:
bbox AP:98.2600, 92.4201, 87.0927
bev  AP:28.9126, 20.5439, 17.7329
3d   AP:20.9526, 14.4463, 12.6663
aos  AP:98.20, 92.04, 86.24

Pedestrian AP_R40@0.50, 0.50, 0.50:
bbox AP:67.4147, 56.8572, 48.8748
bev  AP:10.6632, 7.1401, 5.8156
3d   AP:7.5704, 4.8982, 3.8422
aos  AP:37.82, 31.81, 27.13

Cyclist AP_R40@0.50, 0.50, 0.50:
bbox AP:64.8872, 40.9815, 38.8914
bev  AP:6.7082, 3.3724, 3.3272
3d   AP:5.1716, 2.8833, 2.8526
aos  AP:46.29, 29.05, 27.62

2023-01-31 17:00:19,922 INFO Evaluation done.
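The recall figures in this log can be cross-checked against the counts in the progress-bar line, recall_0.3=(0, 10421) / 17558. The sketch below parses that fragment and recomputes the two ratios. Reading the first count as ROI hits and the second as RCNN hits is an assumption that happens to agree with the recall_roi_0.3 and recall_rcnn_0.3 lines above; `parse_recall` is a hypothetical helper.

```python
import re


def parse_recall(progress_line):
    """Extract (roi_hits, rcnn_hits, total_gt) from a 'recall_0.3=(a, b) / c' fragment."""
    m = re.search(r"recall_[\d.]+=\((\d+),\s*(\d+)\)\s*/\s*(\d+)", progress_line)
    if m is None:
        raise ValueError("no recall fragment found")
    return tuple(int(g) for g in m.groups())


roi_hits, rcnn_hits, total = parse_recall("recall_0.3=(0, 10421) / 17558")
print(roi_hits / total)              # compare with recall_roi_0.3 in the log
print(round(rcnn_hits / total, 6))   # compare with recall_rcnn_0.3 in the log
```

Under that reading, the zero ROI recall comes directly from a first count of 0, independently of the AP numbers.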

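Unable to test!

The problem shown in the screenshot occurs during both validation and testing.

CUDA 11.6, "90" GPU with driver 11.3, py=3.8, pytorch=1.8.1. I don't know where the problem is.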

About cfg files

Thank you for sharing your great work!

I have some questions regarding the yaml files.

I think the difference between 'cmkd_kitti_R50' and 'cmkd_kitti_eigen_R50' is that the former is trained with the kitti3d dataset and the latter is trained with KITTI Raw data.

Then, what is the difference between cmkd_kitti_R50_scd_bev.yaml, cmkd_kitti_R50_scd.yaml, and cmkd_kitti_R50_scd_V2.yaml?

1) What kind of data is this -> 'train': [kitti_infos_train_soft.pkl] in cmkd_kitti_R50_scd.yaml
2) To train CMKD, the repo recommends training BEV first, then the whole model.

2-1) What does it mean to train BEV first?
2-2) Is cmkd_kitti_R50_scd_bev.yaml used for 'python train_cmkd.py --cfg xxx_bev.yaml' and
cmkd_kitti_R50_scd_V2.yaml used for 'python train_cmkd.py --cfg xxx.yaml --pretrained_img_model ${BEV_pretrained_model_path}'?

Thank you.

nuScenes dataset evaluation

Hi, thanks for your great work!

I'm trying to get results from nuScenes mini dataset.
Do you plan to add a brief documentation about this dataset?

Cheers,

Batuhan

Huge fluctuations in eval results with test.py

Hi, I ran into some very strange results. I pre-trained a SECOND teacher for 80 epochs.

When evaluating the 60th and 80th epochs' checkpoints with test.py, the results are good (Car-Mod>80), but the 70th epoch gives extremely bad results (Car-Mod~10, average predicted number of objects: 300+).

I also followed the advice in another issue to modify init.py under pcdet/datasets/ but it didn't help.

The SECOND model is trained on Train and evaluated on Val.

Table 4 results

Hello,

Can I get the experimental details of the results in Table 4?

Because this repo differs from the paper in some respects, it is a bit confusing.
Do these results use ResNet101 with bin num=80, as in the CaDDN setting?

Thank you.

Why is the downloaded pre-trained SECOND teacher a compressed file?

Inconsistent evaluation results

Hi,

I ran into inconsistent evaluation results: the automatic eval at the end of training and a manual evaluation using test.py give slightly different results for the same checkpoint. Sometimes the difference is about 1 AP and sometimes less than 1 AP. Do you have any idea why?

What I am training and evaluating is a CaDDN model using the provided CaDDN.yaml.

Support on Waymo experiments

Hi there, can you please provide the necessary code to support experiments on Waymo?

Settings in config file

Hi, is it necessary to set FOV_POINTS_ONLY: True when training the teacher model?
If FOV_POINTS_ONLY: True is set, should POINT_CLOUD_RANGE, VOXEL_SIZE_LIDAR, and VOXEL_SIZE_IMG be adjusted? And how do you set these parameters for the Waymo dataset?
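When adjusting a point cloud range together with a voxel size, a common sanity check is that each extent divides evenly by the voxel size, since the resulting grid shape fixes the feature map resolution. The sketch below shows that arithmetic only. The numeric values are illustrative and not taken from the repository's configs, and `grid_size` is a hypothetical helper.

```python
def grid_size(point_cloud_range, voxel_size):
    """Number of voxels along x, y, z for a range [x0, y0, z0, x1, y1, z1]."""
    mins, maxs = point_cloud_range[:3], point_cloud_range[3:]
    sizes = []
    for lo, hi, v in zip(mins, maxs, voxel_size):
        n = (hi - lo) / v
        # Reject ranges that do not split into a whole number of voxels.
        if abs(n - round(n)) > 1e-6:
            raise ValueError(f"extent {hi - lo} is not a multiple of voxel size {v}")
        sizes.append(round(n))
    return sizes


# Illustrative values only, not taken from the repository's configs.
print(grid_size([0.0, -40.0, -3.0, 70.4, 40.0, 1.0], [0.05, 0.05, 0.1]))
```

If the teacher and student features are compared in BEV, their grids would need to end up with matching spatial shapes, so the same check would apply to both voxel sizes.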

How to download the calib files of kitti_raw

When I try to generate the data infos of kitti_cmkd, it raises an error that data/kitti/raw/calib/2011_09_30.txt is missing.
However, the KITTI raw data downloaded with the raw dataset download script only provides three calib files: calib_cam_to_cam.txt, calib_imu_to_velo.txt, and calib_velo_to_cam.txt.
So how do I download the calib files of kitti_raw?

Extremely low eval results

Hi,

I recently started running the code on a new machine. Strangely the eval AP results become extremely bad (see the screenshot). Yet, the other results (i.e. recall, precision, avg no. of objects) look ok (I think?). What might be the cause? Thanks.

the config file of the teacher model

Hi, @Cc-Hy , could you kindly provide the config file of the teacher model? Lots of thanks!

Not familiar with PCDet, can you give more detailed training scripts?

@Cc-Hy
Hi, I am not familiar with PCDet. Can you provide a training script?
BTW, where is the KD code in your repo? I want to check the part of your code that implements the KD loss.
Thanks for your work!

CMKD on nuscenes dataset

Hi, does CMKD support training and testing on nuscenes dataset? What should be used as labelled and unlabelled data in the case of nuscenes dataset? Thanks!

Why does the training script automatically evaluate the last 10 epochs?

@Cc-Hy
Hi,
I am trying to train on KITTI train and changed the number of epochs from 10 to 20. But training always evaluates the last 10 checkpoints.
How can I evaluate checkpoints 1~10?
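One way to evaluate an arbitrary range of epochs is to select the checkpoint files by epoch number and pass each one to the test script through --ckpt, the flag used in the test_cmkd.py command quoted in another issue. The sketch below covers only the selection step. The checkpoint_epoch_N.pth file name pattern is an assumption about how checkpoints are saved, and `select_checkpoints` is a hypothetical helper.

```python
import re


def select_checkpoints(filenames, first_epoch, last_epoch):
    """Return (epoch, filename) pairs whose epoch lies in [first_epoch, last_epoch]."""
    # Assumed naming scheme; adjust the pattern if the files are named differently.
    pattern = re.compile(r"checkpoint_epoch_(\d+)\.pth$")
    picked = []
    for name in filenames:
        m = pattern.search(name)
        if m and first_epoch <= int(m.group(1)) <= last_epoch:
            picked.append((int(m.group(1)), name))
    return sorted(picked)


# Hypothetical directory listing for a 20-epoch run.
names = [f"checkpoint_epoch_{e}.pth" for e in range(1, 21)]
for epoch, name in select_checkpoints(names, 1, 10):
    print(epoch, name)
```

Sorting by the parsed integer avoids the lexicographic order in which epoch 10 would come before epoch 2.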

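Which category of monocular detector does CMKD belong to?

@Cc-Hy
Hello,

At which stages are the "depth pre-trained backbone" and "initialize the backbone with the weights pre-trained on COCO" mentioned for CMKD applied? I don't quite understand.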

Models and results in Model Zoo

Dear author, in the model zoo, I am confused by descriptions in the model zoo. Specifically, for models in rows of "kitti train", did they not use eigen data? - which in turn means these models are trained in a fully supervised way without the CMKD framework?

Generation of depth GT

Hi, thanks for the great work.

Just wondering how the dense depth GT used in the image branch is generated? According to CaDDN, it is obtained by projection from LiDAR followed by depth completion. Is depth completion actually performed?

This version of code

HELLO,

I have a question regarding this repo.

Without BEV DA module and BEV KD loss, the student model is same as CaDDN.
Does this repo's version of CaDDN (using ResNet50 with some layer modifications in detection and different loss weights) produce results comparable to the original CaDDN (ResNet101)?

+) in order to just reproduce the student model without KD loss,
what config do I have to use?

Thank you.
