Implementation of Monocular 3D Object Detection with Sequential Feature Association and Depth Hint Augmentation. For now, we only release the testing code and a pretrained model that reproduces the reported performance on the KITTI3D test set; the training code and additional pretrained models will be released later. Some of the code in this repository is borrowed from SMOKE, CenterNet, and 3D BoundingBox.
- NVIDIA RTX 2080Ti
- Ubuntu 16.04, CUDA 10.0
- Python 3.6
- torch==1.6.0, torchvision==0.7.0
- tqdm, easydict, scikit-image
- Download KITTI3D dataset and organize the images and labels as follows:
  --kitti
    --training
      --calib
      --image_2
      --label_2
    --testing
      --calib
      --image_2
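As an optional sanity check, a small script along these lines (hypothetical, not part of this repository) can confirm that the layout above is in place; the `./data` path assumes the soft link set up in the steps below:

```python
import os

# Expected KITTI3D sub-folders relative to the dataset root (see the layout above).
EXPECTED = [
    "training/calib", "training/image_2", "training/label_2",
    "testing/calib", "testing/image_2",
]

def check_kitti_layout(root):
    """Return the list of expected sub-folders that are missing under `root`."""
    return [d for d in EXPECTED if not os.path.isdir(os.path.join(root, d))]

if __name__ == "__main__":
    missing = check_kitti_layout("./data")
    print("layout OK" if not missing else "missing: %s" % missing)
```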
- Clone the repository.
$ git clone https://github.com/gtzly/FADNet
- Establish a soft link to the KITTI3D dataset.
$ cd FADNet
$ ln -s path_to_kitti ./data
- Install DCNv2.
$ cd lib/layers/DCNv2
$ bash ./make.sh
- Download the pretrained model and put it in a folder named 'checkpoint'.
$ cd ../../..
$ mkdir checkpoint
$ mv final.pth checkpoint
- Run FADNet on the KITTI3D test set with the pretrained model.
$ python scripts/test.py
- Visualize the detection results.
$ python scripts/visualize.py
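The detections being visualized are stored as KITTI-format text files, one object per line. As a reference, here is a minimal parser for that format (a sketch following the standard KITTI label specification; the function name is ours, not from this repository):

```python
# Each line of a KITTI label/result file has 15 fields (16 with a score):
# type, truncated, occluded, alpha, 2D bbox (left, top, right, bottom),
# dimensions (height, width, length), location (x, y, z), rotation_y[, score]

def parse_kitti_line(line):
    """Parse one line of a KITTI-format label/prediction file into a dict."""
    f = line.split()
    obj = {
        "type": f[0],
        "truncated": float(f[1]),
        "occluded": int(float(f[2])),
        "alpha": float(f[3]),
        "bbox": [float(v) for v in f[4:8]],          # left, top, right, bottom
        "dimensions": [float(v) for v in f[8:11]],   # height, width, length
        "location": [float(v) for v in f[11:14]],    # x, y, z in camera coords
        "rotation_y": float(f[14]),
    }
    if len(f) > 15:                                  # detection results carry a score
        obj["score"] = float(f[15])
    return obj
```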
- The performance on the KITTI3D test set:

| Benchmark | Easy | Moderate | Hard |
| --- | --- | --- | --- |
| Car (Detection) | 96.15 % | 90.49 % | 80.71 % |
| Car (Orientation) | 95.89 % | 89.84 % | 79.98 % |
| Car (3D Detection) | 16.37 % | 9.92 % | 8.05 % |
| Car (Bird's Eye View) | 23.00 % | 14.22 % | 12.56 % |
Note: The above statistics are accessible on the KITTI official website. Readers can submit the test results obtained with the provided pretrained model to the KITTI3D benchmark to verify the reported performance; however, such submissions are strongly discouraged by the KITTI team to prevent abuse of the test set.
- Check the qualitative results under ./output/vis_test.