This project forked from owen-liuyuxuan/visualdet3d

Official Repo for Ground-aware Monocular 3D Object Detection for Autonomous Driving / YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection

Home Page: https://owen-liuyuxuan.github.io/papers_reading_sharing.github.io/3dDetection/GroundAwareConvultion/

License: Apache License 2.0

Shell 0.80% Python 82.39% C++ 7.17% Cuda 9.64%

visualdet3d's Introduction

Visual 3D Detection Package:

This repo aims to provide flexible and reproducible visual 3D detection on the KITTI dataset. Scripts are expected to be launched from the repository root, and ./visualDet3D is treated as a package that can be modified and tested directly rather than as an installed library. Several useful scripts are provided in the main directory for easy usage.

We believe that visual tasks are interconnected, so we make this library extensible to more experiments. The package uses a registry to register datasets, models, processing functions and more, so new tasks and models can be added easily without interfering with existing ones.
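As a rough illustration of the registry idea described above, here is a minimal sketch in plain Python. It is not the package's actual registry: the `Registry` class, the `DETECTORS` name, and `ToyMonoDetector` are hypothetical names chosen for this example only.

```python
from typing import Callable, Dict


class Registry:
    """Minimal name -> class registry (illustrative only, not the repo's class)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._items: Dict[str, Callable] = {}

    def register(self, obj: Callable) -> Callable:
        """Register a class or function under its own __name__; usable as a decorator."""
        key = obj.__name__
        if key in self._items:
            raise KeyError(f"{key} is already registered in {self.name}")
        self._items[key] = obj
        return obj

    def build(self, key: str, **kwargs):
        """Look up a registered entry by name and call it with the given kwargs."""
        if key not in self._items:
            raise KeyError(f"{key} is not registered in {self.name}")
        return self._items[key](**kwargs)


DETECTORS = Registry("detectors")  # hypothetical registry, for illustration


@DETECTORS.register
class ToyMonoDetector:
    """Stand-in model; a real detector would define a network here."""

    def __init__(self, num_classes: int = 3) -> None:
        self.num_classes = num_classes


# A config can now refer to the model by name instead of importing it directly.
detector = DETECTORS.build("ToyMonoDetector", num_classes=2)
```

With this pattern, adding a new model only requires defining and decorating a new class; existing entries and the code that builds them stay untouched.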

Related Paper:

This repo contains the official implementation of the 2021 RAL & ICRA paper Ground-aware Monocular 3D Object Detection for Autonomous Driving (Arxiv Page). Pretrained models can be found on the release page.

@ARTICLE{9327478,
  author={Y. {Liu} and Y. {Yuan} and M. {Liu}},
  journal={IEEE Robotics and Automation Letters}, 
  title={Ground-aware Monocular 3D Object Detection for Autonomous Driving}, 
  year={2021},
  doi={10.1109/LRA.2021.3052442}}

This repo is also the official implementation of the 2021 ICRA paper YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection. Pretrained models can be found on the release page.

@inproceedings{liu2021yolostereo3d,
  title={YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection},
  author={Yuxuan Liu and Lujia Wang and Ming Liu},
  booktitle={2021 International Conference on Robotics and Automation (ICRA)},
  year={2021},
  organization={IEEE}
}

We further incorporate an unofficial re-implementation of Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training (KM3D) as a reference on how to integrate with other frameworks. (Note that the code comes from the original official repo, and we DO NOT guarantee a complete re-implementation.)

Update (2021.07.02): We provide an unofficial re-implementation of Objects are Different: Flexible Monocular 3D Object Detection (MonoFlex) with little additional code, based on the KM3D structure. Much of the core code comes from the original official repo. We did not implement the edge merge operation or the corner loss, but we retain most of the performance using the proposed depth fusion methods (validation AP reaches 15%).

Key Features

  • SOTA Performance: State-of-the-art results on visual 3D detection.
  • Modular Design: Modular design for datasets, networks and running pipelines.
  • Support for Various Tasks: Compatible with the training and testing of mono/stereo 3D detection and depth prediction.
  • Distributed & Single GPU: Supports training with a single GPU or with multiple GPUs.
  • Installation-Free Setup: The setup process only builds the operations and does not install anything, which keeps the environment clean.
  • Global Path-based IMDB: Data does not need to be placed inside the repo folder, which is convenient for managing data and code separately.

We provide start-up solutions for Mono3D, Stereo3D, Depth Predictions and more (until further publication).

Reference: this repo borrows code and ideas from retinanet, mmdetection, M3D-RPN, DORN, EdgeNets, det3

Setup

Environment setup.

pip3 install -r requirement.txt

or manually check dependencies.

# build ops (deform convs and iou3d); the operations are not installed into the system environment
./make.sh

Start Training

Please check the corresponding task: Mono3D, Stereo3D, Depth Predictions. More demos will become available through contributions and further paper submissions.

Config and Path setup.

Please modify the path and other parameters in config/*.py. config/*_example files are templates.

Notice: the *_example files are NOT used by the code, and *.py files under /config are ignored by .gitignore.

The content of the selected config file will be recorded in TensorBoard at the beginning of training.

Important paths to modify in the config:

  1. cfg.path.data_path: Path to the KITTI training data. We expect calib, image_2, image_3 and label_2 as subfolders (directly unzipping the downloaded zips will be fine).
  2. cfg.path.test_path: Path to the KITTI testing data. We expect calib and image_2 as subfolders.
  3. cfg.path.visualDet3D_path: Path to the "visualDet3D" directory of the current repo.
  4. cfg.path.project_path: Path to the workdirs of the projects (will contain temp_outputs, log and checkpoints).
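The four paths above could be checked before launching training with a small helper such as the following sketch. The `cfg.path.*` field names and the expected subfolders come from the list above; everything else is an assumption for illustration: the directory values are placeholders, `SimpleNamespace` stands in for whatever config object the templates actually use, and `missing_subfolders` is a hypothetical helper that is not part of the repo.

```python
from pathlib import Path
from types import SimpleNamespace
from typing import List

# Placeholder values; replace them with the locations on your machine.
cfg = SimpleNamespace(
    path=SimpleNamespace(
        data_path="/data/kitti_obj/training",
        test_path="/data/kitti_obj/testing",
        visualDet3D_path="/home/user/visualDet3D/visualDet3D",
        project_path="/home/user/visualDet3D/workdirs",
    )
)

# Subfolders the README expects under each KITTI split.
TRAIN_SUBFOLDERS = ["calib", "image_2", "image_3", "label_2"]
TEST_SUBFOLDERS = ["calib", "image_2"]


def missing_subfolders(root: str, expected: List[str]) -> List[str]:
    """Return the expected subfolder names that do not exist under root."""
    base = Path(root)
    return [name for name in expected if not (base / name).is_dir()]


# Report anything that is missing so path mistakes surface before training starts.
for label, root, expected in [
    ("data_path", cfg.path.data_path, TRAIN_SUBFOLDERS),
    ("test_path", cfg.path.test_path, TEST_SUBFOLDERS),
]:
    missing = missing_subfolders(root, expected)
    if missing:
        print(f"cfg.path.{label}: missing subfolders {missing} under {root}")
```

Running a check like this once after editing the config makes it easier to tell a wrong path apart from a genuine data-loading bug.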

Please check the template's comments and other comments in codes to fully exploit the repo.

Further Info and Bug Issues

  1. Open an issue on the repo if you run into trouble, find a bug, or have suggestions.
  2. Email [email protected]
