
yolo_v3's Introduction

YoloV3 in Pytorch and Jupyter Notebook

This repository aims to create a YoloV3 detector in Pytorch and Jupyter Notebook. I'm trying to take a more "oop" approach than other existing implementations, which construct the architecture iteratively by reading the config file from Pjreddie's repo. The notebooks are intended for study and practice; many ideas and code snippets are taken from various papers and blogs. I will try to comment as much as possible. You should be able to just Shift-Enter to the end of each notebook and see the results.

Requirements

  • Python 3.6.4
  • Pytorch 0.4.1
  • Jupyter Notebook 5.4.0
  • OpenCV 3.4.0
  • imgaug 0.2.6
  • Pycocotools
  • Cuda Support

Instructions

$ git clone https://github.com/ydixon/yolo_v3
$ cd yolo_v3

Download Yolo v3 weights:
$ wget https://pjreddie.com/media/files/yolov3.weights

Download Darknet53 weights:
$ wget https://pjreddie.com/media/files/darknet53.conv.74

Download COCO_ydixon weights:
$ cd weights/COCO_ydixon
$ # Download weights from https://drive.google.com/drive/folders/1HPxEA7kyJxLTu0mWYu5Vlcz57NcuD2U3?usp=sharing

Download COCO:
$ cd data/
$ bash get_coco_dataset.sh

How to use

These notebooks are intended to be as self-contained as possible, so you can just step through each cell and see the results. However, the later notebooks import classes that were built in earlier ones, so it's recommended to go through the notebooks in order.

yolo_detect.ipynb

This notebook takes you through the steps of building the darknet53 backbone network and the yolo detection layer, all the way up to object detection from scratch.

• Conv-bn-Relu Blocks				• Residual Blocks
• Darknet53					• Upsample Blocks
• Yolo Detection Layer				• Letterbox Transforms
• Weight Loading				• Bounding Box Drawing
• IOU - Jaccard Overlap				• Non-max suppression (NMS)
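
As a rough illustration of the last two items, here is a minimal sketch of IOU and greedy NMS in plain Python. It is not the notebook's code: the function names, the (x1, y1, x2, y2) box layout and the 0.5 threshold are assumptions chosen for the example, and the notebook itself works on Pytorch tensors.

```python
def iou(box_a, box_b):
    """Jaccard overlap of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width/height are clamped at 0 so disjoint boxes give no overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

In a detector, NMS is usually run per class, so boxes of different classes do not suppress each other.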

Data_Augmentation.ipynb

Showcases the augmentations used by the darknet cfg file, including the hue, saturation, exposure and jitter parameters. Also demos additional augmentations that could be used for different kinds of datasets, such as rotation, shear, zoom, Gaussian noise, blurring and sharpening. Most of the augmentations are powered by the imgaug library. This notebook also shows how to integrate these augmentations into Pytorch datasets.

Augmentation      Description                                     Parameter
Random Crop       +/- 30% (top, right, bottom, left)              jitter
Letterbox         Keep aspect ratio resize, pad with gray color   N/A
Horizontal Flip   50% chance                                      N/A
HSV Hue           Add +/- 179 * hue                               hue
HSV Saturation    Multiply 1/sat ~ sat                            saturation
HSV Exposure      Multiply 1/exposure ~ exposure                  exposure
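
The table above can be read as a recipe for drawing one random set of augmentation parameters per image. The sketch below is an illustration only: the function names are hypothetical, the default values are what I believe the darknet yolov3 cfg uses, and the way the saturation/exposure multiplier is drawn (uniform in [1, limit], then inverted half of the time) is an assumption about how "1/sat ~ sat" is sampled.

```python
import random


def sample_scale(limit, rng):
    """Draw a multiplier in [1/limit, limit], above or below 1 with equal chance."""
    scale = rng.uniform(1.0, limit)
    return scale if rng.random() < 0.5 else 1.0 / scale


def sample_augmentation(rng, jitter=0.3, hue=0.1, saturation=1.5, exposure=1.5):
    """Draw one set of augmentation parameters following the table above."""
    return {
        # Fraction of the image size to crop (negative) or pad (positive) per side.
        "crop": {side: rng.uniform(-jitter, jitter)
                 for side in ("top", "right", "bottom", "left")},
        "flip": rng.random() < 0.5,
        # Hue shift on an 8-bit HSV hue channel, whose range is 0..179.
        "hue_shift": rng.uniform(-hue, hue) * 179,
        "sat_scale": sample_scale(saturation, rng),
        "exp_scale": sample_scale(exposure, rng),
    }


rng = random.Random(0)  # fixed seed so the draw is repeatable
params = sample_augmentation(rng)
```

Passing an explicit `random.Random` instance rather than using the global generator keeps the draws reproducible, which matters for the deterministic loading described later.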

Deterministic_data_loading.ipynb

Pytorch's Dataset and DataLoader classes are easy and convenient to use. They do a really good job of abstracting the multiprocessing behind the scenes. However, the design also poses certain limitations when users try to add more functionality. This notebook aims to address some of these concerns:

  1. Resume-able between batches
  2. Deterministic - results are reproducible whether training was paused and resumed or run in one go.
  3. Reduced time for first batch - by default the Dataloader would need to iterate up to all the batches that came before the 'To-be-resumed batch' and that could take hours for long datasets.
  4. Cyclic - pick up left over samples that were not sufficient enough to form a batch and combine them with samples from the next epoch.

The goals are achieved by creating a controller/wrapper class around Dataset and DataLoader. This wrapper class is named DataHelper. It acts as a batch iterator and also stores information about the running iteration.
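
To make the four goals concrete, here is a minimal sketch of the underlying idea, not the actual DataHelper: the class name `BatchSchedule`, its methods and the seeding scheme are all hypothetical. It treats training as one long stream of per-epoch permutations, so the sample indices of any batch can be computed directly from the batch number.

```python
import random


class BatchSchedule:
    """Maps a global batch number to dataset indices (hypothetical helper)."""

    def __init__(self, dataset_len, batch_size, seed=0):
        self.dataset_len = dataset_len
        self.batch_size = batch_size
        self.seed = seed
        self._cached_epoch = None
        self._cached_perm = None

    def _permutation(self, epoch):
        # One shuffle per epoch, derived only from (seed, epoch), so it is
        # the same no matter when or how often it is requested.
        if epoch != self._cached_epoch:
            perm = list(range(self.dataset_len))
            random.Random(self.seed * 1_000_003 + epoch).shuffle(perm)
            self._cached_epoch, self._cached_perm = epoch, perm
        return self._cached_perm

    def batch(self, batch_idx):
        """Indices of batch `batch_idx`, computed without visiting earlier batches."""
        start = batch_idx * self.batch_size
        indices = []
        for pos in range(start, start + self.batch_size):
            # A batch that runs past the end of an epoch continues with the
            # first samples of the next epoch (the "cyclic" behaviour).
            epoch, offset = divmod(pos, self.dataset_len)
            indices.append(self._permutation(epoch)[offset])
        return indices


schedule = BatchSchedule(dataset_len=10, batch_size=4, seed=1)
first_run = [schedule.batch(i) for i in range(5)]

# "Resuming" at batch 3 in a fresh object gives the same samples.
resumed = BatchSchedule(dataset_len=10, batch_size=4, seed=1).batch(3)
```

With 10 samples and a batch size of 4, batch 2 takes the last two samples of epoch 0 and the first two of epoch 1. In a real setup the index lists would presumably be fed to a DataLoader through a batch sampler; that part is left out here.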

COCODataset.ipynb

Shows how to parse the COCO dataset in the format used by the original darknet implementation.

• Generate labels			• Image loading
• Convert labels and image to Tensors	• Box coordinates transforms
• Build Pytorch dataset			• Draw
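
For the box coordinate transforms, a small sketch may help. It assumes the usual darknet label layout of one object per line, `class cx cy w h`, with the box centre and size given relative to the image size. The function names below are made up for this example and are not the classes used in the notebook.

```python
def parse_label_line(line):
    """Parse 'class cx cy w h' into (class_id, (cx, cy, w, h))."""
    fields = line.split()
    return int(fields[0]), tuple(float(v) for v in fields[1:5])


def cxcywh_rel_to_x1y1x2y2_abs(box, img_w, img_h):
    """Relative centre/size box -> absolute corner box in pixels."""
    cx, cy, w, h = box
    return ((cx - w / 2) * img_w, (cy - h / 2) * img_h,
            (cx + w / 2) * img_w, (cy + h / 2) * img_h)


def x1y1x2y2_abs_to_cxcywh_rel(box, img_w, img_h):
    """Absolute corner box in pixels -> relative centre/size box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2 / img_w, (y1 + y2) / 2 / img_h,
            (x2 - x1) / img_w, (y2 - y1) / img_h)


class_id, rel_box = parse_label_line("0 0.5 0.5 0.5 0.5")
abs_box = cxcywh_rel_to_x1y1x2y2_abs(rel_box, img_w=640, img_h=480)
round_trip = x1y1x2y2_abs_to_cxcywh_rel(abs_box, img_w=640, img_h=480)
```

Corner coordinates are convenient for drawing and IOU, while the relative centre/size form is what the labels store, so both directions are needed.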

yolo_train.ipynb

Building on the previous notebooks, this notebook implements back-propagation and the training process of the network. The most important part is figuring out how to convert labels into target masks and tensors that can be trained against. This notebook also modifies YoloNet a little to accommodate the changes.

• Multi-box IOU 			• YoloLoss
• Build truth tensor			• Generate masks
• Loss function				• Differential learning rates
• Intermediate checkpoints		• Train-resuming

Updated to use mseloss for tx, ty. This should improve training performance.
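
To give a feel for the label-to-target conversion, the sketch below shows one common way to decide which anchor and grid cell a ground truth box is responsible for. It is an illustration under assumptions, not the notebook's implementation: the anchors are the standard YOLOv3 COCO anchors, the three detection layers are ordered here by stride 8, 16, 32, and the function and key names are hypothetical.

```python
import math

# Standard YOLOv3 COCO anchors (width, height) in pixels of the network input.
ANCHORS = [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45), (59, 119),
           (116, 90), (156, 198), (373, 326)]
STRIDES = [8, 16, 32]  # anchors 0-2 -> stride 8, 3-5 -> stride 16, 6-8 -> stride 32


def wh_iou(wh_a, wh_b):
    """IOU of two boxes that share the same centre, so only sizes matter."""
    inter = min(wh_a[0], wh_b[0]) * min(wh_a[1], wh_b[1])
    union = wh_a[0] * wh_a[1] + wh_b[0] * wh_b[1] - inter
    return inter / union


def assign_target(cx, cy, w, h):
    """Pick the single best anchor for a box given in network-input pixels."""
    best = max(range(len(ANCHORS)), key=lambda i: wh_iou((w, h), ANCHORS[i]))
    layer, anchor_in_layer = divmod(best, 3)
    stride = STRIDES[layer]
    gx, gy = cx / stride, cy / stride      # box centre in grid units
    grid_x, grid_y = int(gx), int(gy)      # cell responsible for the box
    anchor_w, anchor_h = ANCHORS[best]
    return {
        "layer": layer, "anchor": anchor_in_layer,
        "grid_x": grid_x, "grid_y": grid_y,
        "tx": gx - grid_x, "ty": gy - grid_y,  # offsets inside the cell
        "tw": math.log(w / anchor_w), "th": math.log(h / anchor_h),
    }


large = assign_target(cx=208, cy=208, w=200, h=200)
small = assign_target(cx=100, cy=50, w=12, h=14)
```

Here `tx` and `ty` are the values in [0, 1) that the mseloss mentioned above would be computed against, after a sigmoid on the network output. The masks then mark the chosen (layer, anchor, cell) as containing an object.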

yolo_train_short.ipynb

Minimal version of yolo_train.ipynb. You can use this notebook if you are only interested in testing with different datasets/augmentations/loss functions.

CVATDataset.ipynb

After using CVAT to create labels, this notebook parses the CVAT label format (xml) and converts it to a format readable by the network. We also start using OpenCV to draw and save images, because OpenCV works in pixels rather than DPI as the PLT library does, which is more convenient.

cvat_data_train.ipynb

Data is obtained by extracting images from a clip of Star Wars: Rogue One with ffmpeg. There are around 300 images, annotated using CVAT. The notebook simply overfits the model on the custom data while using darknet53 for feature extraction.
P.S. I used this notebook as a sanity test for yolo_train.ipynb while I was experimenting with the loss function.

evaluate.ipynb

mAP is an important metric for measuring the performance of object detection tasks. It's difficult to tell whether the model is doing well by looking at only the precision or the recall value. This notebook shows the steps for creating the ground truth/detection annotations for your own dataset and obtaining mAP with the official COCO API.
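
To show why a single precision or recall value is not enough, here is a minimal sketch of average precision (AP) for one class. It is a simplified all-point variant written for illustration; the function name and inputs are assumptions. The official COCO API samples the curve at fixed recall points and averages over classes, so its numbers will differ from this sketch.

```python
def average_precision(is_true_positive, num_ground_truth):
    """AP for one class.

    is_true_positive: one flag per detection, sorted by descending confidence,
                      telling whether the detection matched a ground truth box.
    num_ground_truth: number of ground truth boxes of this class.
    """
    if num_ground_truth == 0:
        return 0.0
    true_positives = 0
    precisions, recalls = [], []
    for rank, hit in enumerate(is_true_positive, start=1):
        true_positives += 1 if hit else 0
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)
    # Make precision non-increasing from left to right (the usual envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the precision/recall curve.
    ap, prev_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


ap = average_precision([True, False, True], num_ground_truth=2)
```

mAP is then the mean of the per-class AP values. The matching step that produces the true/false flags depends on an IOU threshold and is not shown here.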

[Figure: mAP of the official weights]

Training

The model is trained from scratch with darknet53 as the backbone. It ended up reaching an mAP of 54.4% on the 5K validation set. I still think the original yolov3 weights perform a lot better than this model, but I believe this is a good starting point. The weights at various stages can be downloaded from this Google Drive link. Feel free to test/experiment with them.

Epoch         Batch    mAP@0.5   Weights
50            91612    37.6%     link
100           183225   44.5%     link
150           274837   46.7%     link
200           366450   49.1%     link
250           458062   54.1%     link
273           500200   53.2%     link
COCO_ydixon   510099   54.5%     link
darknet       -        54.7%     N/A

[Figure: mAP of the COCO_ydixon weights]

Update Notes

2018/8/30:
Uploaded data/annotations for custom_data_train.ipynb. All notebooks should be working now.
2018/9/11:
Adapt data augmentations.
2018/9/30:
New loss function. Adapt darknet cfg augmentations parameters.
2018/11/04:
Accumulated gradients. Support use of subdivisions for GPUs with less memory.
2018/12/15:
Multi-scale training. New DataHelper class for batch scheduling. custom_data_train.ipynb replaced by cvat_data_train.ipynb. Deterministic data loading with Pytorch's dataset/dataloader. Training now resume-able between batches instead of epochs while maintaining deterministic behavior.
2019/1/23:
Add mAP evaluation. NMS speed improvement by reducing operations in loops. Support up to Pytorch 0.4.1.
2019/7/2:
It's been a while since the last update. I actually fixed the loss function a few months ago, but I was held up by other projects.

  • Update loss function
    • Small objects get a larger gradient.
    • Each ground truth object is only assigned to 1 anchor across 3 layers.
  • Training
    • use sum of errors instead of averaging when dealing with subdivisions
    • display loss of individual batch instead of EWMA loss
  • Update test code
    • Add correct_yolo_boxes
      • Detection boxes output by the network are either letterboxed or scaled. This function reverses these transformations.
  • COCO_ydixon weights
    • Download weights to ./weights/COCO_ydixon
    • Test/Verify them in yolo_train_short.ipynb, evaluate.ipynb
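
The letterbox reversal mentioned under correct_yolo_boxes can be sketched as follows. This is only an illustration of the geometry: the helper names are hypothetical, boxes are assumed to be (x1, y1, x2, y2) in pixels of the network input, and the padding is assumed to be split evenly on both sides.

```python
def letterbox_params(img_w, img_h, net_w, net_h):
    """Scale and padding used to fit an image into the network input."""
    scale = min(net_w / img_w, net_h / img_h)   # keep the aspect ratio
    new_w, new_h = round(img_w * scale), round(img_h * scale)
    pad_x, pad_y = (net_w - new_w) / 2, (net_h - new_h) / 2
    return scale, pad_x, pad_y


def correct_box(box, img_w, img_h, net_w, net_h):
    """Map a box from network-input pixels back to original-image pixels."""
    scale, pad_x, pad_y = letterbox_params(img_w, img_h, net_w, net_h)
    x1, y1, x2, y2 = box

    def clamp(value, upper):
        return min(max(value, 0.0), upper)

    # Undo the padding first, then the resize, then clip to the image.
    return (clamp((x1 - pad_x) / scale, img_w),
            clamp((y1 - pad_y) / scale, img_h),
            clamp((x2 - pad_x) / scale, img_w),
            clamp((y2 - pad_y) / scale, img_h))


# A 640x480 image letterboxed into a 416x416 input gets 52 px of padding
# at the top and bottom.
scale, pad_x, pad_y = letterbox_params(640, 480, 416, 416)
original_box = correct_box((104, 130, 312, 286), 640, 480, 416, 416)
```

For inputs that were simply scaled without padding, the same idea applies with separate x and y scale factors and no padding term.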

TODO:

  1. Implement backhook for YoloNet branching
  2. Make command line API
  3. Feed Video to detector

References

  1. YOLOv3: An Incremental Improvement. Joseph Redmon, Ali Farhadi
  2. Darknet Github Repo
  3. AlexeyAB Darknet Github Repo
  4. COCO API
  5. Fastai
  6. Pytorch Implementation of Yolo V3
  7. Deep Pyramidal Residual Networks
  8. eriklindernoren's Repo
  9. BobLiu20's Repo

yolo_v3's People

Contributors

ydixon


yolo_v3's Issues

importing functions from transformers.py

Hi ydixon,

I'm trying to train on custom data. In this notebook, some functions are imported from transformers.py and used elsewhere, but these functions do not exist in transformers.py.

For example: BoundingBoxFormatConvert, ToX1y1x2y2Abs

I think you moved these functions. Can you update the repo?

Thank you ..

transforms.py

from transforms import BoundingBoxFormatConvert,ToX1y1x2y2Abs, ToCxcywhRel, ToIaa, iaa_hsv_aug, iaa_random_crop, iaa_letterbox, \
                       IaaAugmentations, ToNp, IaaLetterbox, ToTensor, Compose

Some classes and functions in the "transforms.py" module (such as "BoundingBoxFormatConvert", "ToX1y1x2y2Abs", "ToCxcywhRel", "ToIaa", "ToNp") do not seem to be defined.

AttributeError: Can't pickle local object 'get_trans_fn.<locals>.getTransformByDim'

Running cvat_data_train.ipynb
in
Net_Batch Epoch loss_x loss_y loss_w loss_h loss_conf loss_cls loss_total recall

AttributeError Traceback (most recent call last)
in
18 model_id=model_id, weight_dir=weight_dir,
19 checkpoint=None, checkpoint_interval=checkpoint_interval,
---> 20 use_gpu=True)

C:\AI\yolo_v3\train.py in train(data, net, optimizer, recorder, model_id, weight_dir, checkpoint, checkpoint_interval, use_gpu)
30 train_impl(data, net, optimizer, recorder, None,
31 model_id, weight_dir, checkpoint_interval,
---> 32 use_gpu)
33
34 def train_impl(data, net, optimizer, recorder, scheduler,

C:\AI\yolo_v3\train.py in train_impl(data, net, optimizer, recorder, scheduler, model_id, weight_dir, checkpoint_interval, use_gpu, debug_log)
43
44 # data will generate mini-batches of sample
---> 45 for sample in data:
46 # batch - mini-batch index, net_batch - net batch index, epoch - epoch index
47 batch, net_batch, epoch = data.get_batch(), data.get_net_batch(), data.get_epoch()

C:\AI\yolo_v3\dataset.py in gen(self)
352 def gen(self):
353 while self.current_batch < self.max_batches:
--> 354 for i in self.dataloader:
355 yield(i)
356 self.current_batch += 1

~\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py in __iter__(self)
277 return _SingleProcessDataLoaderIter(self)
278 else:
--> 279 return _MultiProcessingDataLoaderIter(self)
280
281 @property

~\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py in __init__(self, loader)
717 # before it starts, and __del__ tries to join but will get:
718 # AssertionError: can only join a started process.
--> 719 w.start()
720 self._index_queues.append(index_queue)
721 self._workers.append(w)

~\Anaconda3\lib\multiprocessing\process.py in start(self)
110 'daemonic processes are not allowed to have children'
111 _cleanup()
--> 112 self._popen = self._Popen(self)
113 self._sentinel = self._popen.sentinel
114 # Avoid a refcycle if the target function holds an indirect

~\Anaconda3\lib\multiprocessing\context.py in _Popen(process_obj)
221 @staticmethod
222 def _Popen(process_obj):
--> 223 return _default_context.get_context().Process._Popen(process_obj)
224
225 class DefaultContext(BaseContext):

~\Anaconda3\lib\multiprocessing\context.py in _Popen(process_obj)
320 def _Popen(process_obj):
321 from .popen_spawn_win32 import Popen
--> 322 return Popen(process_obj)
323
324 class SpawnContext(BaseContext):

~\Anaconda3\lib\multiprocessing\popen_spawn_win32.py in __init__(self, process_obj)
87 try:
88 reduction.dump(prep_data, to_child)
---> 89 reduction.dump(process_obj, to_child)
90 finally:
91 set_spawning_popen(None)

~\Anaconda3\lib\multiprocessing\reduction.py in dump(obj, file, protocol)
58 def dump(obj, file, protocol=None):
59 '''Replacement for pickle.dump() using ForkingPickler.'''
---> 60 ForkingPickler(file, protocol).dump(obj)
61
62 #

AttributeError: Can't pickle local object 'get_trans_fn.<locals>.getTransformByDim'

imgaug

import imgaug as ia
from imgaug import augmenters as iaa

where is imgaug?

yolo_tiny

It sounds like the code does not work with yolo_tiny from pjreddie
