yurongyou / hindsight

Code release for "Hindsight is 20/20: Leveraging Past Traversals to Aid 3D Perception" [ICLR 2022]

License: MIT License

Python 85.23% C++ 5.26% Cuda 8.83% C 0.42% Shell 0.27%
point-cloud pytorch autonomous-driving object-detection 3d-detection hindsight

Hindsight is 20/20: Leveraging Past Traversals to Aid 3D Perception

This is the official code release for

[ICLR 2022] Hindsight is 20/20: Leveraging Past Traversals to Aid 3D Perception.

by Yurong You, Katie Z Luo, Xiangyu Chen, Junan Chen, Wei-Lun Chao, Wen Sun, Bharath Hariharan, Mark Campbell, and Kilian Q. Weinberger

Video | Paper

Abstract

Self-driving cars must detect vehicles, pedestrians, and other traffic participants accurately to operate safely. Small, far-away, or highly occluded objects are particularly challenging because there is limited information in the LiDAR point clouds for detecting them. To address this challenge, we leverage valuable information from the past: in particular, data collected in past traversals of the same scene. We posit that these past data, which are typically discarded, provide rich contextual information for disambiguating the above-mentioned challenging cases. To this end, we propose a novel end-to-end trainable Hindsight framework to extract this contextual information from past traversals and store it in an easy-to-query data structure, which can then be leveraged to aid future 3D object detection of the same scene. We show that this framework is compatible with most modern 3D detection architectures and can substantially improve their average precision on multiple autonomous driving datasets, most notably by more than 300% on the challenging cases.

Citation

@inproceedings{you2022hindsight,
  title = {Hindsight is 20/20: Leveraging Past Traversals to Aid 3D Perception},
  author = {You, Yurong and Luo, Katie Z and Chen, Xiangyu and Chen, Junan and Chao, Wei-Lun and Sun, Wen and Hariharan, Bharath and Campbell, Mark and Weinberger, Kilian Q.},
  booktitle = {Proceedings of the International Conference on Learning Representations (ICLR)},
  year = {2022},
  month = apr,
  url = {https://openreview.net/forum?id=qsZoGvFiJn1}
}

Environment

conda create --name hindsight python=3.8
conda activate hindsight
conda install pytorch=1.9.0 torchvision torchaudio cudatoolkit=11.1 -c pytorch -c nvidia
pip install opencv-python matplotlib wandb scipy tqdm easydict scikit-learn

# MinkowskiEngine
git clone https://github.com/NVIDIA/MinkowskiEngine.git
cd MinkowskiEngine
git checkout c854f0c # 0.5.4
python setup.py install

For OpenPCDet, follow downstream/OpenPCDet/docs/INSTALL.md, with one exception: install spconv from the code in third_party/spconv rather than from the upstream release.
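
After the steps above, a quick check that the packages can be found may save a failed training run. The following is a minimal sketch, not part of the repository: the list of import names is an assumption derived from the install commands (opencv-python is imported as cv2, scikit-learn as sklearn), and the helper name find_missing is hypothetical.

```python
import importlib.util
import sys

# Import names assumed from the install commands above (not taken from the repo).
REQUIRED_MODULES = [
    "torch", "torchvision", "torchaudio", "cv2", "matplotlib", "wandb",
    "scipy", "tqdm", "easydict", "sklearn", "MinkowskiEngine", "spconv",
]


def find_missing(modules):
    """Return the top-level module names that cannot be located."""
    missing = []
    for name in modules:
        # find_spec locates a top-level module without importing it.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

This only confirms that each module is discoverable; it does not verify versions or that the CUDA extensions were built correctly.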

Data Pre-processing

Please refer to data_preprocessing/lyft/LYFT_PREPROCESSING.md and data_preprocessing/nuscenes/NUSCENES_PREPROCESSING.md.

Training and Evaluation

We implement the computation of SQuaSH as a submodule in OpenPCDet (as sparse_query) and modify the KITTI dataloader / augmentor to load the past traversals.
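
To give a feel for what "an easy-to-query data structure" over past traversals can look like, here is a toy sketch. It is not the sparse_query module and not the method from the paper: the learned features are replaced by a simple hand-crafted statistic (the fraction of past traversals in which a voxel was occupied), and the voxel size and all function names are hypothetical.

```python
import math
from collections import defaultdict

VOXEL_SIZE = 0.5  # metres; hypothetical value chosen for illustration


def voxel_key(point, voxel_size=VOXEL_SIZE):
    """Quantize an (x, y, z) point to integer voxel coordinates."""
    return tuple(math.floor(c / voxel_size) for c in point)


def build_history(past_traversals, voxel_size=VOXEL_SIZE):
    """Map each voxel to the fraction of past traversals that occupied it."""
    n = len(past_traversals)
    if n == 0:
        return {}
    counts = defaultdict(int)
    for scan in past_traversals:
        # Count each voxel at most once per traversal.
        for key in {voxel_key(p, voxel_size) for p in scan}:
            counts[key] += 1
    return {key: c / n for key, c in counts.items()}


def query_history(history, points, voxel_size=VOXEL_SIZE):
    """Look up the stored statistic for each point of the current scan."""
    return [history.get(voxel_key(p, voxel_size), 0.0) for p in points]


past = [
    [(1.0, 1.0, 0.0), (5.1, 2.0, 0.0)],  # traversal 1
    [(1.1, 1.2, 0.1)],                   # traversal 2
]
history = build_history(past)
print(query_history(history, [(1.2, 1.3, 0.2), (5.2, 2.1, 0.1), (9.0, 9.0, 9.0)]))
# prints [1.0, 0.5, 0.0]
```

A voxel occupied in every traversal is likely static background, while one occupied only sometimes may hold a transient object; the actual framework learns such cues end-to-end instead of hard-coding them.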

We include the corresponding configs of four detection models in downstream/OpenPCDet/tools/cfgs/lyft_models and downstream/OpenPCDet/tools/cfgs/nuscenes_boston_models. Use them to train and evaluate the base detectors and their base-detector+Hindsight counterparts.

Train:

We use 4 GPUs to train detection models by default.

cd downstream/OpenPCDet/tools
OMP_NUM_THREADS=6 bash scripts/dist_train.sh 4 --cfg_file <cfg> --merge_all_iters_to_one_epoch --fix_random_seed

Evaluation:

cd downstream/OpenPCDet/tools
OMP_NUM_THREADS=6 bash scripts/dist_test.sh 4 --cfg_file <cfg> --ckpt <ckpt_path>
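
When sweeping over several configs, it can be convenient to assemble the two commands above programmatically. The sketch below only builds the argument list and environment shown in this README; the helper names and the example config path are hypothetical, and nothing is executed.

```python
import shlex


def build_command(mode, cfg_file, num_gpus=4, ckpt=None, omp_threads=6):
    """Return (env, args) mirroring the train / evaluation commands above."""
    if mode == "train":
        args = ["bash", "scripts/dist_train.sh", str(num_gpus),
                "--cfg_file", cfg_file,
                "--merge_all_iters_to_one_epoch", "--fix_random_seed"]
    elif mode == "test":
        if ckpt is None:
            raise ValueError("evaluation needs a checkpoint path")
        args = ["bash", "scripts/dist_test.sh", str(num_gpus),
                "--cfg_file", cfg_file, "--ckpt", ckpt]
    else:
        raise ValueError("mode must be 'train' or 'test'")
    env = {"OMP_NUM_THREADS": str(omp_threads)}
    return env, args


def format_command(env, args):
    """Render the command as a single shell line for logging."""
    prefix = " ".join(f"{k}={v}" for k, v in env.items())
    return f"{prefix} {shlex.join(args)}"


# "path/to/config.yaml" is a placeholder, not a file in the repository.
env, args = build_command("train", "path/to/config.yaml")
print(format_command(env, args))
```

To actually launch a run, the pair could be passed to subprocess.run(args, env={**os.environ, **env}, cwd="downstream/OpenPCDet/tools", check=True).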

Checkpoints

Lyft experiments

Model                     Checkpoint   Config file
PointPillars              link         cfg
PointPillars+Hindsight    link         cfg
SECOND                    link         cfg
SECOND+Hindsight          link         cfg
PointRCNN                 link         cfg
PointRCNN+Hindsight       link         cfg
PV-RCNN                   link         cfg
PV-RCNN+Hindsight         link         cfg

nuScenes experiments

Model                     Checkpoint   Config file
PointPillars              link         cfg
PointPillars+Hindsight    link         cfg
PointRCNN                 link         cfg
PointRCNN+Hindsight       link         cfg

License

This project is under the MIT License. We use OpenPCDet and spconv in this project and they are under the Apache-2.0 License. We list our changes here.

Contact

Please open an issue if you have any questions about using this repo.

Acknowledgement

This work uses OpenPCDet, MinkowskiEngine and spconv. We thank them for open-sourcing excellent libraries for 3D understanding tasks. We also use the scripts from 3D_adapt_auto_driving for converting the Lyft and nuScenes datasets into KITTI format.

hindsight's Issues

application hindsight

If I want to apply Hindsight to my own project, do I need to prepare my dataset in the same way as the Lyft dataset?

ValueError: cannot reshape array of size 265728 into shape (5)

Thanks for your work! When I download the Lyft dataset and convert it to KITTI format, I run into this problem:
File "/home/biaoli/anaconda3/envs/hindsight/lib/python3.8/site-packages/lyft_dataset_sdk/utils/data_classes.py", line 283, in from_file
points = scan.reshape((-1, 5))[:, : cls.nbr_dims()]
ValueError: cannot reshape array of size 265728 into shape (5)
24%|█████████▏ | 5471/22680 [03:31<11:04, 25.90it/s]
Is this a proper solution?
lyft/nuscenes-devkit#89
Thank you!
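
The error message itself points at the cause: 265728 values cannot be split into rows of 5, because 265728 = 5 × 53145 + 3. The sketch below shows one possible way to read such a file by dropping the incomplete trailing row. It is an illustration only, not the change made in the linked pull request and not part of lyft_dataset_sdk; the function name is hypothetical, the file is assumed to hold native-endian float32 values, and skipping a damaged scan entirely may be the safer choice.

```python
from array import array


def load_lidar_rows(path, dims=5):
    """Read float32 values and group them into rows of `dims` values.

    Returns (rows, dropped), where `dropped` is the number of trailing
    values that did not fill a complete row.
    """
    with open(path, "rb") as f:
        data = f.read()
    values = array("f")
    # Ignore stray bytes that do not form a whole float32.
    usable_bytes = len(data) - len(data) % values.itemsize
    values.frombytes(data[:usable_bytes])
    full_rows, dropped = divmod(len(values), dims)
    rows = [tuple(values[i * dims:(i + 1) * dims]) for i in range(full_rows)]
    return rows, dropped


# The array size from the traceback leaves a remainder of 3:
print(divmod(265728, 5))  # prints (53145, 3)
```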
