mileyan / simple_shot

License: MIT License

Python 100.00%

simple_shot's Introduction

End-to-end Pseudo-LiDAR for Image-Based 3D Object Detection

This paper has been accepted by Computer Vision and Pattern Recognition 2020.

by Rui Qian*, Divyansh Garg*, Yan Wang*, Yurong You*, Serge Belongie, Bharath Hariharan, Mark Campbell, Kilian Q. Weinberger and Wei-Lun Chao

Citation

@inproceedings{qian2020end,
  title={End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection},
  author={Qian, Rui and Garg, Divyansh and Wang, Yan and You, Yurong and Belongie, Serge and Hariharan, Bharath and Campbell, Mark and Weinberger, Kilian Q and Chao, Wei-Lun},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={5881--5890},
  year={2020}
}

Abstract

Reliable and accurate 3D object detection is a necessity for safe autonomous driving. Although LiDAR sensors can provide accurate 3D point cloud estimates of the environment, they are also prohibitively expensive for many settings. Recently, the introduction of pseudo-LiDAR (PL) has led to a drastic reduction in the accuracy gap between methods based on LiDAR sensors and those based on cheap stereo cameras. PL combines state-of-the-art deep neural networks for 3D depth estimation with those for 3D object detection by converting 2D depth map outputs to 3D point cloud inputs. However, so far these two networks have to be trained separately. In this paper, we introduce a new framework based on differentiable Change of Representation (CoR) modules that allow the entire PL pipeline to be trained end-to-end. The resulting framework is compatible with most state-of-the-art networks for both tasks and in combination with PointRCNN improves over PL consistently across all benchmarks --- yielding the highest entry on the KITTI image-based 3D object detection leaderboard at the time of submission.
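
The conversion from a 2D depth map to a 3D point cloud mentioned in the abstract can be illustrated with the standard pinhole back-projection. The following is a minimal sketch, not code from this repository: the function name `depth_to_points` and the intrinsics `fx`, `fy`, `cx`, `cy` (focal lengths and principal point, in pixels) are assumptions chosen for illustration, and the depth map is a plain list of rows.

```python
def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map into camera-frame 3D points.

    depth: list of rows; depth[v][u] is the depth z at pixel (u, v).
    fx, fy, cx, cy: assumed pinhole intrinsics (hypothetical values below).
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # skip pixels without a valid depth estimate
                continue
            x = (u - cx) * z / fx  # horizontal offset scales linearly with depth
            y = (v - cy) * z / fy  # vertical offset scales linearly with depth
            points.append((x, y, z))
    return points


# Toy 2x2 depth map with one invalid pixel.
depth = [[10.0, 10.0],
         [0.0, 20.0]]
points = depth_to_points(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

Each output coordinate is linear in the predicted depth `z`, so this step by itself does not block gradients; the non-differentiable parts addressed by the paper come later, when the points are quantized or subsampled for the detector.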

Root
    | PIXOR
    | PointRCNN

We provide end-to-end modifications of a point-cloud-based detector (PointRCNN) and a voxel-based detector (PIXOR).

The PIXOR folder contains the implementation of Quantization as described in Section 3.1 of the paper. It also contains our own implementation of PIXOR.

The PointRCNN folder contains the implementation of Subsampling as described in Section 3.2 of the paper. It is built on Shaoshuai Shi's codebase.

Data Preparation

This repo is based on the KITTI dataset. Please download it and prepare the data in the same way as in Pseudo-LiDAR++; refer to its readme for details.

Training and evaluation

Please refer to each subfolder for details.

Questions

This repo is currently maintained by Rui Qian and Yurong You. Please feel free to ask any questions.

You can reach us by opening an issue or by email: [email protected], [email protected]

simple_shot's Issues

Some questions about ResNet.

Really nice work! In our own experiments we tried using a ResNet as the backbone, as you did. However, it did not improve performance as we expected and even performed worse than 4-conv. Are there any tricks you used when training your model with a ResNet? Thank you very much.

The idea of SimpleShot is almost the same as that of ProtoNet

ProtoNet also uses the mean of each support class as its center and then uses Euclidean distance to train the network. The only difference between SimpleShot and ProtoNet seems to be that SimpleShot adds normalization and centering to the feature vectors before computing the Euclidean distance. Is that correct?
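
To make the procedure described in this question concrete, here is a minimal sketch of nearest-centroid classification with centering and L2 normalization applied before the Euclidean distance. It is an illustration under stated assumptions, not the repository's code: the function names are hypothetical, and `base_mean` is assumed to be a feature mean computed elsewhere (for example over training-set features).

```python
import math


def center_and_normalize(x, mean):
    """Subtract the assumed mean, then scale to unit L2 norm."""
    shifted = [a - m for a, m in zip(x, mean)]
    norm = math.sqrt(sum(a * a for a in shifted))
    return [a / norm for a in shifted] if norm > 0 else shifted


def class_centroids(features, labels):
    """Average the (already transformed) support features per class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.get(y, [0.0] * len(x))
        sums[y] = [s + a for s, a in zip(acc, x)]
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def nearest_centroid(query, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(centroids, key=lambda y: sq_dist(query, centroids[y]))


# Toy 2-way 2-shot episode with 2-D features (all values are made up).
base_mean = [1.0, 1.0]
support = [[3.0, 1.0], [4.0, 1.0], [1.0, 3.0], [1.0, 5.0]]
labels = ["a", "a", "b", "b"]

transformed = [center_and_normalize(x, base_mean) for x in support]
centroids = class_centroids(transformed, labels)
prediction = nearest_centroid(center_and_normalize([2.0, 1.5], base_mean), centroids)
```

Dropping the `center_and_normalize` call leaves plain nearest-centroid classification on the raw features, which is the comparison the question is drawing.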

inconsistency of enlarge between validation and test

For tieredImageNet, I noticed that in the validation step during training, args.enlarge is set to False/None because there is no '--enlarge' in the training configs. However, the evaluation/testing command lines shown in the readme include '--enlarge'. Is this a typo, or is it intentional?

Thanks!

ResNet 50 performance

Thank you for your great work!

I am trying to rerun the models to check their performance. However, I noticed that the performance of the models with the ResNet-50 backbone is very low. The code output is below. Is this result expected?

Best

Meta Test: LAST
feature     UN               L2N              CL2N
GVP_1Shot   0.5127(0.0020)   0.5303(0.0020)   0.5161(0.0020)
GVP_5Shot   0.7143(0.0018)   0.7467(0.0016)   0.7165(0.0018)

Performance discrepancy when training from scratch w/ PyTorch 1.4

Hi,
Thanks for the nice work and for sharing the code.

I have tried replicating the results of the paper (training from scratch) but with no luck.
I have followed the instructions of the readme and tried both pytorch/cuda-toolkit 1.4/10.0.
For ResNet-10 and ResNet-18, on miniImageNet I am getting a discrepancy between 2% and 3% (absolute).

Thanks in advance for the help!

Nice work and some questions

Congratulations on this very effective work. I have some questions and hope you can offer some advice at your convenience.

Why subtract only the mean of the gallery? Have you tried subtracting the mean of the query, or of gallery + query?
Why does CL2N not perform as well as L2N on meta-iNat? Have you analyzed the underlying reasons?

Thanks!

The introduction part is so weird

First of all, thanks for the interesting work. The experimental results are still competitive with current SOTA works.
However, the introduction confuses me and makes me doubt the results. For example, in paragraph 3, you mention 'Prior studies suggest that using meta-learning outperforms "vanilla" nearest neighbor classification [26, 30]'. In fact, references 26 and 30 are exactly the methods that use meta-learning, as they tackle the FSL problem within a meta-learning setting. Also, in paragraph 4, you mention that your method achieves SOTA performance without using meta-learning, which is the weirdest part, because you are already using meta-learning when you use the N-way K-shot setting.

Achieve only 0.62 test accuracy with same training procedure.

We trained ResNet-18 following the description in the paper, with the parameters set the same as in your code. But we only achieved 0.62 test accuracy on miniImageNet. What do you think might be the reason? Thanks.

is it possible to use simple_shot with CPU only?

Unfortunately on my MacBook, the evaluation command is failing:

python ./src/train.py -c ./configs/mini/softmax/conv4.config --evaluate --enlarge

Traceback (most recent call last):
  File "./src/train.py", line 554, in <module>
    main()
  File "./src/train.py", line 51, in main
    model = torch.nn.DataParallel(model).cuda()
  File "/Users/seb/.virtualenvs/kiss/lib/python3.7/site-packages/torch/nn/modules/module.py", line 305, in cuda
[...]
  File "/Users/seb/.virtualenvs/kiss/lib/python3.7/site-packages/torch/cuda/__init__.py", line 95, in _check_driver
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
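
The traceback points at the unconditional `.cuda()` call in `main()`. One possible workaround, sketched here as an assumption rather than a tested patch for this repository, is to move the model to the GPU only when `torch.cuda.is_available()` returns true. The helper name `wrap_for_available_device` is hypothetical, and `train.py` may contain other `.cuda()` calls that would need the same treatment.

```python
import torch


def wrap_for_available_device(model):
    """Wrap the model as train.py does, but skip .cuda() on CPU-only machines.

    Keeping the DataParallel wrapper preserves the "module." prefix on
    parameter names; with no GPU present it simply calls the wrapped module.
    """
    model = torch.nn.DataParallel(model)
    if torch.cuda.is_available():
        model = model.cuda()
    return model


# Stand-in model for illustration; the real backbone would go here.
model = wrap_for_available_device(torch.nn.Linear(4, 2))
output = model(torch.randn(3, 4))
```

Checkpoints that were saved on a GPU would presumably also need `torch.load(path, map_location="cpu")` when loaded on a machine without CUDA.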

Would it be possible to update the Tiered-ImageNet dataset links or provide alternative sources for accessing the required data?

Example code for the experiments running on CIFAR-100

In your paper, I noticed that you trained the SimpleShot model on the CIFAR-100 dataset, but there is no corresponding code in the GitHub repository. Could you please make this part of the code public?
