
Complex YOLOv4


The PyTorch implementation, based on YOLOv4, of the paper Complex-YOLO: Real-time 3D Object Detection on Point Clouds (https://arxiv.org/pdf/1803.06199.pdf). This repository is a fork of maudzung/complex-yolov4-pytorch.


Demo

[demo GIF]

Features

2. Getting Started

2.1. Requirement

pip install -U -r requirements.txt

For the mayavi and shapely libraries, please refer to the installation instructions on their official websites.
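
Both libraries can often be installed straight from PyPI as well; an unverified shortcut (fall back to the official instructions if the build fails):

pip install shapely
pip install mayavi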

2.2. Data Preparation

Download the 3D KITTI detection dataset from the KITTI website: http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d

The downloaded data includes:

  • Velodyne point clouds (29 GB): input data to the Complex-YOLO model
  • Training labels of object data set (5 MB): input label to the Complex-YOLO model
  • Camera calibration matrices of object data set (16 MB): for visualization of predictions
  • Left color images of object data set (12 GB): for visualization of predictions

Please make sure that you arrange the source code and dataset directories as shown in the folder structure below.
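
Assuming the four official archives keep their standard file names (verify against the download page before running), unpacking them in place produces the expected layout:

cd dataset/kitti
unzip data_object_velodyne.zip   # -> training/velodyne, testing/velodyne
unzip data_object_label_2.zip    # -> training/label_2
unzip data_object_calib.zip      # -> training/calib, testing/calib
unzip data_object_image_2.zip    # -> training/image_2, testing/image_2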

For 3D point cloud preprocessing, please refer to previous works such as the Complex-YOLOv2 and Complex-YOLOv3 implementations.
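
As a rough illustration of what this preprocessing produces: the paper encodes each LiDAR scan as a 3-channel bird's-eye-view (BEV) "RGB-map" holding maximum height, maximum intensity, and normalized point density per grid cell. A simplified Python sketch (grid size and ranges are illustrative defaults, not necessarily the repository's exact values):

import numpy as np

def pointcloud_to_bev(points, bev_size=608, x_range=(0.0, 50.0), y_range=(-25.0, 25.0)):
    # points: (N, 4) array of (x, y, z, intensity) from a Velodyne scan
    mask = (points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) \
         & (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1])
    pts = points[mask]
    # map metric x/y coordinates to integer grid indices
    xi = ((pts[:, 0] - x_range[0]) / (x_range[1] - x_range[0]) * bev_size).astype(int).clip(0, bev_size - 1)
    yi = ((pts[:, 1] - y_range[0]) / (y_range[1] - y_range[0]) * bev_size).astype(int).clip(0, bev_size - 1)
    bev = np.zeros((3, bev_size, bev_size), dtype=np.float32)
    counts = np.zeros((bev_size, bev_size), dtype=np.float32)
    for gx, gy, point in zip(xi, yi, pts):
        bev[0, gx, gy] = max(bev[0, gx, gy], point[2])   # channel 0: max height
        bev[1, gx, gy] = max(bev[1, gx, gy], point[3])   # channel 1: max intensity
        counts[gx, gy] += 1.0
    bev[2] = np.minimum(1.0, np.log(counts + 1.0) / np.log(64.0))  # channel 2: point density
    return bev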

2.3. Complex-YOLO architecture

architecture

This work builds on YOLOv4 for 2D object detection. Please refer to the original YOLOv4 paper and to Tianxiaomo's excellent PyTorch implementation, on which this code is based.
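
The main extension over plain 2D YOLOv4 is the complex-angle (Euler-Region-Proposal) regression from the Complex-YOLO paper: for each box, the network additionally predicts the real and imaginary parts (t_re, t_im) of a complex number whose argument is the object's yaw, which avoids the discontinuity of regressing an angle directly. A minimal sketch of the decoding step:

import math

def decode_yaw(t_im: float, t_re: float) -> float:
    # yaw is the argument of the predicted complex number t_re + i * t_im
    return math.atan2(t_im, t_re)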

2.4. How to run

2.4.1. Visualize the dataset (both BEV images from LiDAR and camera images)

cd src/data_process
python kitti_dataloader.py --batch_size 1 --num_workers 1

2.4.2. Inference

python test.py --gpu_idx 0 --pretrained_path <paths>

The trained model will be provided soon. Please watch the repo to get notified about the next update.
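
Internally, loading a checkpoint for inference follows the standard PyTorch pattern; a minimal sketch (create_model is a hypothetical stand-in for the repository's model builder, not its actual API):

import torch

model = create_model(cfgfile)  # hypothetical builder for the Darknet-style model
# assuming the checkpoint stores a plain state_dict
model.load_state_dict(torch.load(pretrained_path, map_location='cpu'))
model.eval()
with torch.no_grad():
    detections = model(bev_maps)  # bev_maps: a batch of 3-channel BEV images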

2.4.3. Training

2.4.3.1. Single machine, single GPU
python train.py --gpu_idx 0 --multiscale_training
2.4.3.2. Multi-processing Distributed Data Parallel Training

We should always use the nccl backend for multi-processing distributed training since it currently provides the best distributed training performance.

  • Single machine (node), multiple GPUs
python train.py --dist-url 'tcp://127.0.0.1:29500' --dist-backend 'nccl' --multiprocessing-distributed --world-size 1 --rank 0
  • Two machines (two nodes), multiple GPUs

First machine

python train.py --dist-url 'tcp://IP_OF_NODE1:FREEPORT' --dist-backend 'nccl' --multiprocessing-distributed --world-size 2 --rank 0

Second machine

python train.py --dist-url 'tcp://IP_OF_NODE2:FREEPORT' --dist-backend 'nccl' --multiprocessing-distributed --world-size 2 --rank 1
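
For orientation, the flags above map onto torch.distributed roughly as in the standard PyTorch ImageNet example: --world-size counts nodes, --rank is the node index, and one process is spawned per GPU. A minimal sketch (build_model and parse_train_args are hypothetical placeholders, not the repository's exact code):

import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP

def worker(local_rank, ngpus_per_node, args):
    # global rank = node rank * GPUs per node + local GPU index
    rank = args.rank * ngpus_per_node + local_rank
    dist.init_process_group(backend='nccl', init_method=args.dist_url,
                            world_size=args.world_size * ngpus_per_node,
                            rank=rank)
    torch.cuda.set_device(local_rank)
    model = build_model().cuda(local_rank)       # hypothetical model builder
    model = DDP(model, device_ids=[local_rank])
    # ... training loop with a DistributedSampler-backed DataLoader ...

if __name__ == '__main__':
    args = parse_train_args()                    # hypothetical: the argparse namespace from train.py
    ngpus_per_node = torch.cuda.device_count()
    mp.spawn(worker, nprocs=ngpus_per_node, args=(ngpus_per_node, args))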

To reproduce the results, you can run the bash script:

./train.sh

2.5. Evaluation

python eval_mAP.py
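
For reference, the metric is the mean over the three classes of the average precision of detections matched at a minimum IoU of 0.50. A generic sketch of the AP step (standard VOC-style computation, not necessarily the repository's exact code):

import numpy as np

def average_precision(recall, precision):
    # all-point interpolated AP (PASCAL VOC 2010+ style)
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    # make the precision envelope monotonically non-increasing
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    idx = np.where(r[1:] != r[:-1])[0]   # points where recall changes
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))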

A comparison of this implementation with Complex-YOLOv2 and Complex-YOLOv3 will be added soon.

mAP Comparison (min 0.50 IoU)

Model/Class       |  Car  | Pedestrian | Cyclist | Average
------------------|-------|------------|---------|--------
Complex-YOLO-v2   |  TBD  |    TBD     |   TBD   |   TBD
Complex-YOLO-v3   |  TBD  |    TBD     |   TBD   |   TBD
Complex-YOLO-v4   |  TBD  |    TBD     |   TBD   |   TBD

2.6. Bag of Freebies (BoF) & Bag of Specials (BoS) used in this implementation

BoF

  • Backbone: [x] Dropblock; [x] Random rescale, rotation (global)
  • Detector: [x] Cross mini-Batch Normalization; [x] Dropblock; [x] Random training shapes

BoS

  • Backbone: [x] Mish activation; [x] Cross-stage partial connections (CSP); [x] Multi-input weighted residual connections (MiWRC)
  • Detector: [x] Mish activation; [x] SPP-block; [x] SAM-block; [x] PAN path-aggregation block; [ ] CIoU/GIoU loss
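
As a concrete example, the Mish activation checked above is simply x · tanh(softplus(x)); in PyTorch:

import torch
import torch.nn as nn
import torch.nn.functional as F

class Mish(nn.Module):
    # smooth, non-monotonic activation used in both the backbone and the detector
    def forward(self, x):
        return x * torch.tanh(F.softplus(x))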

Contact

If you think this work is useful, please give me a star!
If you find any errors or have any suggestions, please contact me (Email: [email protected]).
Thank you!

Citation

@article{Complex-YOLO,
  author = {Martin Simon and Stefan Milz and Karl Amende and Horst-Michael Gross},
  title = {Complex-YOLO: Real-time 3D Object Detection on Point Clouds},
  year = {2018},
  journal = {arXiv},
}

@article{YOLOv4,
  author = {Alexey Bochkovskiy and Chien-Yao Wang and Hong-Yuan Mark Liao},
  title = {YOLOv4: Optimal Speed and Accuracy of Object Detection},
  year = {2020},
  journal = {arXiv},
}

Folder structure

${ROOT}
├── dataset/
│   └── kitti/
│       ├── ImageSets/
│       │   ├── train.txt
│       │   └── val.txt
│       ├── training/
│       │   ├── image_2/ <-- for visualization
│       │   ├── calib/
│       │   ├── label_2/
│       │   └── velodyne/
│       └── testing/
│           ├── image_2/ <-- for visualization
│           ├── calib/
│           └── velodyne/
├── src/
│   ├── config/
│   ├── data_process/
│   ├── models/
│   ├── utils/
│   ├── demo.py
│   ├── eval_mAP.py
│   ├── test.py
│   ├── train.py
│   └── train.sh
├── README.md
└── requirements.txt

Usage

usage: train.py [-h] [--seed SEED] [--saved_fn FN] [-a ARCH] [--cfgfile PATH]
                [--pretrained_path PATH] [--img_size IMG_SIZE]
                [--multiscale_training] [--no-val] [--num_samples NUM_SAMPLES]
                [--num_workers NUM_WORKERS] [--batch_size BATCH_SIZE]
                [--subdivisions SUBDIVISIONS] [--print_freq N]
                [--tensorboard_freq N] [--checkpoint_freq N] [--start_epoch N]
                [--num_epochs N] [--lr LR] [--minimum_lr MIN_LR]
                [--momentum M] [-wd WD] [--optimizer_type OPTIMIZER]
                [--lr_type SCHEDULER] [--burn_in N]
                [--steps [STEPS [STEPS ...]]] [--world-size N] [--rank N]
                [--dist-url DIST_URL] [--dist-backend DIST_BACKEND]
                [--gpu_idx GPU_IDX] [--no_cuda]
                [--multiprocessing-distributed] [--evaluate]
                [--resume_path PATH]

The Implementation of Complex YOLOv4

optional arguments:
  -h, --help            show this help message and exit
  --seed SEED           random seed for reproducible results
  --saved_fn FN         The name used for saving logs, models, etc.
  -a ARCH, --arch ARCH  The name of the model architecture
  --cfgfile PATH        The path for cfgfile (only for darknet)
  --pretrained_path PATH
                        the path of the pretrained checkpoint
  --img_size IMG_SIZE   the size of input image
  --multiscale_training
                        If true, use scaling data for training
  --no-val              If true, don't evaluate the model on the val set
  --num_samples NUM_SAMPLES
                        Take a subset of the dataset to run and debug
  --num_workers NUM_WORKERS
                        Number of threads for loading data
  --batch_size BATCH_SIZE
                        mini-batch size (default: 64); this is the total
                        batch size across all GPUs on the current node when
                        using Data Parallel or Distributed Data Parallel
  --subdivisions SUBDIVISIONS
                        subdivisions during training
  --print_freq N        print frequency (default: 10)
  --tensorboard_freq N  frequency of saving tensorboard (default: 10)
  --checkpoint_freq N   frequency of saving checkpoints (default: 3)
  --start_epoch N       the starting epoch
  --num_epochs N        number of total epochs to run
  --lr LR               initial learning rate
  --minimum_lr MIN_LR   minimum learning rate during training
  --momentum M          momentum
  -wd WD, --weight_decay WD
                        weight decay (default: 1e-6)
  --optimizer_type OPTIMIZER
                        the type of optimizer, it can be sgd or adam
  --lr_type SCHEDULER   the type of the learning rate scheduler (steplr or
                        ReduceLROnPlateau)
  --burn_in N           number of burn-in steps
  --steps [STEPS [STEPS ...]]
                        steps at which the learning rate is decayed
  --world-size N        number of nodes for distributed training
  --rank N              node rank for distributed training
  --dist-url DIST_URL   url used to set up distributed training
  --dist-backend DIST_BACKEND
                        distributed backend
  --gpu_idx GPU_IDX     GPU index to use.
  --no_cuda             If true, cuda is not used.
  --multiprocessing-distributed
                        Use multi-processing distributed training to launch N
                        processes per node, which has N GPUs. This is the
                        fastest way to use PyTorch for either single node or
                        multi node data parallel training
  --evaluate            only evaluate the model, not training
  --resume_path PATH    the path of the resumed checkpoint
