
yolov3_pytorch's Introduction

YOLOv3

Full implementation of YOLOv3 in PyTorch.

Overview

YOLOv3: An Incremental Improvement

[Paper]
[Original Implementation]

Why this project

  • Implements YOLOv3 and Darknet-53 without the original darknet cfg parser.
  • Makes it easy to customize your backbone network, e.g. ResNet, DenseNet, ...

Installation

Environment
  • pytorch >= 0.4.0
  • python >= 3.6.0
Get code
git clone https://github.com/BobLiu20/YOLOv3_PyTorch.git
cd YOLOv3_PyTorch
pip3 install -r requirements.txt --user
Download COCO dataset
cd data/
bash get_coco_dataset.sh

Training

Download pretrained weights
  1. See the weights readme for details.
  2. Download the pretrained backbone weights from Google Drive or Baidu Drive.
  3. Move the downloaded file darknet53_weights_pytorch.pth to the weights folder in this project.
Modify training parameters
  1. Review the config file training/params.py.
  2. Replace YOUR_WORKING_DIR with your working directory; it is used for saving models and temporary files.
  3. Set your GPU device(s); see parallels.
  4. Adjust other parameters as needed (a sketch of the file follows this list).
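
The exact contents of training/params.py may vary between versions; the sketch below only illustrates the kind of dictionary you will edit. The key names "parallels", "img_h", and "img_w" appear verbatim in the issues further down this page; the remaining names are illustrative assumptions, so check the real file for the exact keys.

TRAINING_PARAMS = {
    "backbone_name": "darknet_53",      # assumed key: which backbone to build
    "batch_size": 16,                   # reduce if you run out of GPU memory
    "parallels": [0],                   # GPU device ids, e.g. [0, 1, 2, 3]
    "img_h": 416,                       # input height, a multiple of 32
    "img_w": 416,                       # input width, a multiple of 32
    "working_dir": "YOUR_WORKING_DIR",  # assumed key: models and tmp files go here
}
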
Start training
cd training
python training.py params.py
Optional: visualize training
#  please install tensorboard first
python -m tensorboard.main --logdir=YOUR_WORKING_DIR   

Evaluate

Download pretrained weights
  1. See the weights readme for details.
  2. Download the pretrained full YOLOv3 weights from Google Drive or Baidu Drive.
  3. Move the downloaded file official_yolov3_weights_pytorch.pth to the weights folder in this project.
Start evaluation
cd evaluate
python eval_coco.py params.py

Quick test

Pretrained weights

Please download the pretrained weights official_yolov3_weights_pytorch.pth or use your own checkpoint.

Start test
cd test
python test_images.py params.py

You will find the result images in the output folder.

Measure FPS

Pretrained weights

Please download the pretrained weights official_yolov3_weights_pytorch.pth or use your own checkpoint.

Start test
cd test
python test_fps.py params.py
Results
  • Tested on a Titan X GPU with different input sizes and batch sizes.
  • Keep in mind this is a full end-to-end test of YOLOv3: not only the backbone, but also the YOLO layers and NMS.

Imp.   Backbone   Input Size   Batch Size   Inference Time   FPS
Paper  Darknet53  320          1            22ms             45
Paper  Darknet53  416          1            29ms             34
Paper  Darknet53  608          1            51ms             19
Ours   Darknet53  416          1            28ms             36
Ours   Darknet53  416          8            17ms             58

Credit

@article{yolov3,
	title={YOLOv3: An Incremental Improvement},
	author={Redmon, Joseph and Farhadi, Ali},
	journal={arXiv},
	year={2018}
}



yolov3_pytorch's Issues

Training error: RuntimeError: invalid argument 2: size '[1 x 3 x 6 x 13 x 13]' is invalid for input with 43095 elements at /pytorch/aten/src/TH/THStorage.c:41

Traceback (most recent call last):
  File "training.py", line 224, in <module>
    main()
  File "training.py", line 221, in main
    train(config)
  File "training.py", line 93, in train
    _loss_item = yolo_losses[i](outputs[i], labels)
  File "/home/adam/.virtualenvs/yolov3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/adam/yolov3/YOLOv3_PyTorch/training/../nets/yolo_loss.py", line 33, in forward
    self.bbox_attrs, in_h, in_w).permute(0, 1, 3, 4, 2).contiguous()
RuntimeError: invalid argument 2: size '[1 x 3 x 6 x 13 x 13]' is invalid for input with 43095 elements at /pytorch/aten/src/TH/THStorage.c:41
# prediction = input.view(bs,  self.num_anchors, self.bbox_attrs, in_h, in_w).permute(0, 1, 3, 4, 2).contiguous()
ipdb> input.size()
torch.Size([1, 255, 13, 13])
ipdb> bs, in_h, in_w, self.num_anchors, self.bbox_attrs
(1, 13, 13, 3, 6)
ipdb> self.num_classes
1
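
For reference, the numbers in this trace point to a class-count mismatch rather than a shape bug: the network head still produces COCO-sized output (255 channels) while the loss layer was configured for num_classes = 1. A quick sanity check of the arithmetic (plain Python, not repo code):

num_anchors = 3
coco_channels = num_anchors * (5 + 80)       # = 255, matches input.size() above
one_class_channels = num_anchors * (5 + 1)   # = 18, i.e. the '3 x 6' in the view()
assert 1 * 255 * 13 * 13 == 43095            # the element count in the error message

If you change num_classes, the final convolution layers (and any loaded checkpoint) must be rebuilt to output num_anchors * (5 + num_classes) channels.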

Data folder structure for fine-tuning

Hi, thank you for this work. I wonder how I shall organize the images and annotations in the 'data' folder for fine-tuning on my own dataset? Cheers.

Hi, I have some problems testing the model

I trained the model on the VOC dataset, and the loss is around 0.4-0.7. But when I test the trained model on an image, the output is empty: there is only the original image, with no bounding boxes drawn on it. Can you tell me why this is, and how I can solve it? Thank you very much!

Confused by your mAP computation!

@BobLiu20 Thanks for your code, but your mAP computation is actually recall at a specific confidence threshold, which is totally different from mAP.
Here are my test results on your models with the COCO evaluation tools.

Input size: 416x416

YOLOv3 (converted from the paper weights)
mAP: 0.291 (0.31 in the paper)   mAP(0.5): 0.532 (0.553 in the paper)

YOLOv3 (our training, 20 epochs)
mAP: 0.239   mAP(0.5): 0.461

error during training

Hi. During training on another dataset, I found the following error:
[error screenshot]

This error occurs every time but at different steps, sometimes 400, sometimes 3800. Could you figure out what the problem is?

How to train on my own dataset

Hi, thank you for your great contribution. I want to train my own dataset which is a single-class dataset. Would you please give me some suggestions on that? Thanks~

About the loss calculation

Perhaps you could set

        self.mse_loss = nn.MSELoss()
        self.bce_loss = nn.BCELoss()

to

        self.mse_loss = nn.MSELoss(size_average=False)  # sum instead of mean
        self.bce_loss = nn.BCELoss(size_average=False)

and then calculate the losses this way:

        n_mask = torch.sum(mask)              # number of positive (object) anchors
        n_noobj_mask = torch.sum(noobj_mask)  # number of negative anchors

        # with summed losses, normalize each term by the number of anchors
        # that actually contribute to it
        loss_x = self.bce_loss(x * mask, tx * mask) / n_mask
        loss_y = self.bce_loss(y * mask, ty * mask) / n_mask
        loss_w = self.mse_loss(w * mask, tw * mask) / n_mask
        loss_h = self.mse_loss(h * mask, th * mask) / n_mask
        loss_conf = self.bce_loss(conf * mask, 1.0 * mask) / n_mask + \
            0.5 * self.bce_loss(conf * noobj_mask, 0.0 * noobj_mask) / n_noobj_mask
        loss_cls = self.bce_loss(pred_cls[mask == 1], tcls[mask == 1]) / n_mask

How do I run a demo with this repo?

I don't see a demo.py. Other YOLOv3 repos provide demos for video and image detection; can I use the demo.py code from another repo to test this model's performance?

Box overlapping & false positive issue

Hi, thanks for providing the code and script. :-)

I trained on the COCO dataset using your training code.
I only changed "parallels": [0, 1, 2, 3] to "parallels": [0] in params.py; everything else is the default.
Unlike what you mentioned, the test results at 10 and 20 epochs are not good.
There is a lot of box overlap and there are many false positives, as shown below.
Do you know why this is happening?
How can I get the detection performance shown in the script with your training code?
Thank you.

[example result images showing overlapping boxes and false positives]

Implementing different backbone to this YOLOv3

Hello.

First of all, thank you for this awesome repo!

I have one question.
I want to change the backbone from darknet to something else.
After inspecting and testing the code, I found that the things I need to do are (roughly):

  1. Make sure the new backbone network outputs three tensors from different layers in its forward method
  2. While referring to nets/backbone/darknet.py, add the needed implementation details to the new backbone (attributes and methods)
  3. Set the new backbone in params.py

Is this correct? Or is there anything else that I need to do or be careful of?
Thanks.
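
A minimal sketch of that contract, as I read it, might look like the toy module below. Everything here is an assumption to be verified against nets/backbone/darknet.py, in particular the layers_out_filters attribute name and the order of the returned feature maps:

import torch
import torch.nn as nn

class ToyBackbone(nn.Module):
    """Stand-in showing the expected interface: forward() returns three
    feature maps at strides 8, 16 and 32 (shallow to deep)."""
    def __init__(self):
        super(ToyBackbone, self).__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 256, 3, 8, 1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(256, 512, 3, 2, 1), nn.ReLU())
        self.stage3 = nn.Sequential(nn.Conv2d(512, 1024, 3, 2, 1), nn.ReLU())
        # channel counts of the returned maps; attribute name assumed
        self.layers_out_filters = [256, 512, 1024]

    def forward(self, x):
        out3 = self.stage1(x)     # stride 8
        out4 = self.stage2(out3)  # stride 16
        out5 = self.stage3(out4)  # stride 32
        return out3, out4, out5

feats = ToyBackbone()(torch.randn(1, 3, 416, 416))
print([tuple(f.shape[2:]) for f in feats])  # (52, 52), (26, 26), (13, 13)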

CUDA out of memory, any tips?

I've gotten image recognition working at multiple frames per second, using a GTX 1060 with 6GB of memory. Now I'm trying to train a custom classifier, but I keep running out of memory.
With the darknet implementation, I can train using the yolov3-tiny.cfg file but not the yolov3.cfg file, which I guess is expected behavior given my hardware limitations. Now I'm trying to train with this implementation.

What parameters could I tweak in training/params.py to reduce my memory consumption?
Is there an equivalent param in this implementation for subdivisions in the darknet implementation?
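
This repo does not appear to expose a subdivisions-style option directly, but the standard PyTorch pattern below (gradient accumulation) has the same effect: run several small forward/backward passes, then take one optimizer step. A self-contained sketch with toy stand-ins for the real model and data loader; you would fold the pattern into the loop in training.py:

import torch
import torch.nn as nn

model = nn.Linear(10, 1)                     # stand-in for the real network
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
data = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(8)]

accum_steps = 4                              # effective batch = 4 * accum_steps
optimizer.zero_grad()
for step, (x, y) in enumerate(data):
    loss = nn.functional.mse_loss(model(x), y)
    (loss / accum_steps).backward()          # scale so the accumulated grad averages
    if (step + 1) % accum_steps == 0:
        optimizer.step()                     # one step per accum_steps mini-batches
        optimizer.zero_grad()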

Problem in grid generation

Hi, I ran into the following when testing on my own dataset with input size (1248, 416).

  File "/home/kuro/dev/virtualenvs/pytorch37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kuro/dev/projects/YOLOv3_PyTorch/test/../nets/yolo_loss.py", line 74, in forward
    bs * self.num_anchors, 1, 1).view(x.shape).type(FloatTensor)
RuntimeError: invalid argument 2: size '[1 x 3 x 13 x 39]' is invalid for input with 4563 elements at /pytorch/aten/src/TH/THStorage.cpp:84

I then found it works correctly when I modify lines 73-76 of yolo_loss.py as follows:

            grid_x = torch.linspace(0, in_w-1, in_w).repeat(in_h, 1).repeat(
                bs * self.num_anchors, 1, 1).view(x.shape).type(FloatTensor)
            grid_y = torch.linspace(0, in_h-1, in_h).repeat(in_w, 1).t().repeat(
                bs * self.num_anchors, 1, 1).view(y.shape).type(FloatTensor)

I exchanged the positions of repeat(in_h, 1) and repeat(in_w, 1) and found it works well.
I wonder if it's a bug; please check.
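
The element count in the trace supports this. For a 1248x416 input the stride-32 feature map is 13x39, and tiling the x-coordinates with repeat(in_w, 1) builds a square 39x39 grid instead of a 13x39 one. A standalone check (assuming in_h=13, in_w=39, bs=1, 3 anchors, as in the trace):

import torch

in_h, in_w, num_anchors, bs = 13, 39, 3, 1

# square tiling, as in the original code: 3 * 39 * 39 elements
bad = torch.linspace(0, in_w - 1, in_w).repeat(in_w, 1).repeat(bs * num_anchors, 1, 1)
print(bad.numel())   # 4563 -- the element count in the error above

# the proposed fix tiles one row of x-coordinates per grid row: (13, 39)
good = torch.linspace(0, in_w - 1, in_w).repeat(in_h, 1).repeat(bs * num_anchors, 1, 1)
print(good.shape)    # torch.Size([3, 13, 39]) -- view()s onto x.shape cleanly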

Prediction

Thanks for your excellent and easily understood script.

My simple question is how to run prediction: there is no prediction module, and I don't know whether to modify train.py or eval.py. I hope to get an answer.

Batch_size changed to 64

Thank you for sharing. I want to change the batch size to 64, but then I run out of memory. In Darknet there is a parameter called "subdivisions". Do you have any suggestions for splitting a mini-batch in order to increase the effective batch size?

About the training process and question

Hi @BobLiu20, thanks for your YOLO code. I'm training on COCO, but there are no detections with the trained model (2 epochs, 19000 steps, loss = 0.41).
(1) Is this loss too high, or has something gone wrong? At the beginning the loss is 5.1 and it quickly goes down to 0.5. Then over the following 1-2 epochs the downward trend almost disappears, staying around 0.7-0.4. Is this normal?
(2) The detections are empty or contain many mistakes. I'm not sure if the training is simply insufficient. Maybe COCO is hard to train on and needs many epochs? Could you tell me?
Thanks for your reply!

A possible bug in training.py

Dear @BobLiu20, thanks for sharing your code. It's wonderful; however, I think there might be a bug in the loss-related block of training.py:

losses = [[]] * len(losses_name)
for i in range(3):
     _loss_item = yolo_losses[i](outputs[i], labels)
     for j, l in enumerate(_loss_item):
        losses[j].append(l)
losses = [sum(l) for l in losses]
loss = losses[0]

I suppose losses should be a 7 x 3 list, but it actually ends up of size 7 x 21. Every time losses[j].append(l) is executed, l is appended to every sublist of losses, and after losses = [sum(l) for l in losses] runs, all the elements of losses are identical. I doubt this is how the code is supposed to behave. If I'm right, the issue can be addressed by changing losses = [[]] * len(losses_name) to:

losses = []
for i in range(len(losses_name)):
    losses.append([])

or something like that.
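
The diagnosis is correct: [[]] * n creates n references to a single list, which a two-line check makes obvious:

losses = [[]] * 3
losses[0].append(1.0)
print(losses)                      # [[1.0], [1.0], [1.0]] -- one shared list
losses = [[] for _ in range(3)]    # independent sublists; an equivalent one-line fix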

Details about training code

Hi @BobLiu20, thanks for your excellent code! Can you please provide more details about training? Can it be done without the pretrained weights? How long does it take (how many epochs, and with what batch size)? I am not able to get a graph like the one you have shown; the loss only oscillates around its initial value without decreasing.

A question

In the get_target function in yolo_loss.py, why is it noobj_mask[b, anch_ious > ignore_threshold] = 0 rather than noobj_mask[b, anch_ious > ignore_threshold, gi, gj] = 0? Shouldn't the overlap between the ground truth and an anchor exceed the threshold only at the grid cell responsible for the prediction? Written the first way, that anchor is ignored in the confidence loss at every grid cell.

Is loss_conf right in the loss function?

Thanks for your work.
I have a question.
loss_conf = .. + 0.5 * self.bce_loss(conf * noobj_mask, 0)
Why is there a zero?
Shouldn't it be loss_conf = .. + 0.5 * self.bce_loss(conf * noobj_mask, tconf * noobj_mask)?
@BobLiu20
And for classification, why not use CrossEntropyLoss?

test_images.py Colormap ValueError

Thank you for your great code. When I run python test_images.py params.py, I encounter the error below; my matplotlib version is 2.0.0. After I changed tab20b to Vega20b, I succeeded in running test_images.py, so I wonder if you could handle tab20b more robustly.

Traceback (most recent call last):
  File "test_images.py", line 30, in <module>
    cmap = plt.get_cmap('tab20b')
  File "/usr/share/Anaconda3/lib/python3.6/site-packages/matplotlib/cm.py", line 176, in get_cmap
    % (name, ', '.join(sorted(cmap_d.keys()))))
ValueError: Colormap tab20b is not recognized. Possible values are: Accent, Accent_r, Blues, Blues_r, BrBG, BrBG_r, BuGn, BuGn_r, BuPu, BuPu_r, CMRmap, CMRmap_r, Dark2, Dark2_r, GnBu, GnBu_r, Greens, Greens_r, Greys, Greys_r, OrRd, OrRd_r, Oranges, Oranges_r, PRGn, PRGn_r, Paired, Paired_r, Pastel1, Pastel1_r, Pastel2, Pastel2_r, PiYG, PiYG_r, PuBu, PuBuGn, PuBuGn_r, PuBu_r, PuOr, PuOr_r, PuRd, PuRd_r, Purples, Purples_r, RdBu, RdBu_r, RdGy, RdGy_r, RdPu, RdPu_r, RdYlBu, RdYlBu_r, RdYlGn, RdYlGn_r, Reds, Reds_r, Set1, Set1_r, Set2, Set2_r, Set3, Set3_r, Spectral, Spectral_r, Vega10, Vega10_r, Vega20, Vega20_r, Vega20b, Vega20b_r, Vega20c, Vega20c_r, Wistia, Wistia_r, YlGn, YlGnBu, YlGnBu_r, YlGn_r, YlOrBr, YlOrBr_r, YlOrRd, YlOrRd_r, afmhot, afmhot_r, autumn, autumn_r, binary, binary_r, bone, bone_r, brg, brg_r, bwr, bwr_r, cool, cool_r, coolwarm, coolwarm_r, copper, copper_r, cubehelix, cubehelix_r, flag, flag_r, gist_earth, gist_earth_r, gist_gray, gist_gray_r, gist_heat, gist_heat_r, gist_ncar, gist_ncar_r, gist_rainbow, gist_rainbow_r, gist_stern, gist_stern_r, gist_yarg, gist_yarg_r, gnuplot, gnuplot2, gnuplot2_r, gnuplot_r, gray, gray_r, hot, hot_r, hsv, hsv_r, inferno, inferno_r, jet, jet_r, magma, magma_r, nipy_spectral, nipy_spectral_r, ocean, ocean_r, pink, pink_r, plasma, plasma_r, prism, prism_r, rainbow, rainbow_r, seismic, seismic_r, spectral, spectral_r, spring, spring_r, summer, summer_r, terrain, terrain_r, viridis, viridis_r, winter, winter_r
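
A version-tolerant sketch of the fix the reporter is asking for: try the modern name first and fall back to the old one (the Vega* names are the pre-rename aliases listed in the ValueError above):

import matplotlib.pyplot as plt

try:
    cmap = plt.get_cmap('tab20b')    # name in newer matplotlib releases
except ValueError:
    cmap = plt.get_cmap('Vega20b')   # same palette under its older name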

test

Traceback (most recent call last):
  File "test_images.py", line 158, in <module>
    main()
  File "test_images.py", line 154, in main
    test(config)
  File "test_images.py", line 126, in test
    plt.text(x1, y1, s=classes[int(cls_pred)], color='white',
IndexError: list index out of range

cannot download coco dataset

(Pytorch) bash-3.2$ bash get_coco_dataset.sh
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
get_coco_dataset.sh: line 7: cd: coco: No such file or directory
mkdir: images: File exists
get_coco_dataset.sh: line 13: wget: command not found
get_coco_dataset.sh: line 14: wget: command not found
unzip: cannot find or open train2014.zip, train2014.zip.zip or train2014.zip.ZIP.
unzip: cannot find or open val2014.zip, val2014.zip.zip or val2014.zip.ZIP.
get_coco_dataset.sh: line 23: wget: command not found
get_coco_dataset.sh: line 24: wget: command not found
get_coco_dataset.sh: line 25: wget: command not found
get_coco_dataset.sh: line 26: wget: command not found
tar: Error opening archive: Failed to open 'labels.tgz'
unzip: cannot find or open instances_train-val2014.zip, instances_train-val2014.zip.zip or instances_train-val2014.zip.ZIP.
get_coco_dataset.sh: line 31: 5k.part: No such file or directory
paste: 5k.part: No such file or directory
get_coco_dataset.sh: line 32: trainvalno5k.part: No such file or directory
paste: trainvalno5k.part: No such file or directory
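
For what it's worth, this log is from macOS: xcrun complains because the Xcode Command Line Tools are missing, and wget is not installed by default, so every download fails and the later steps see no files. Assuming Homebrew is available, the usual fix is:

xcode-select --install
brew install wget
bash get_coco_dataset.sh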

Running out of memory when calling `.cuda()`

Hey! I've been trying to use this repo, as it looks to be the best PyTorch implementation of YOLOv3 just from reading the code a bit.

For some reason I am running out of memory when trying to load the model onto the GPU. I tried running the eval.py script with batch size 1, and also just calling .cuda() on the model from an ipython session, and I always get the dreaded cuda runtime error (2) : out of memory at /pytorch/aten/src/. I am using a GTX 1080 Ti, which has around 11GB of memory and should be more than enough for YOLO. Any ideas what could be wrong?

Cheers!

my dataset, why?

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 15 and 16 in dimension 2 at c:\a\w\1\s\tmp_conda_3.7_110206\conda\conda-bld\pytorch_1550401474361\work\aten\src\thc\generic/THCTensorMath.cu:83

CUDA driver version is insufficient.

I get this when I try to run the evals.py file. I have tried this on both the CUDA 8.0 and the CUDA 9.0 versions of pytorch, no dice.

RuntimeError: cuda runtime error (35) : CUDA driver version is insufficient for CUDA runtime version at /opt/conda/conda-bld/pytorch_1524586445097/work/aten/src/THC/THCGeneral.cpp:70

How long does it take to train on COCO?

I have a 1080 Ti and each epoch takes around 2 hours. Supposing the network needs 100 epochs to converge, that would mean 200 hrs, about 8 days. And yes, I'm already using darknet53 as the backbone. Does everyone else see similar speed?

Ran some profiling on a GTX1080Ti

Hey, I modified the eval script a bit to run some predictions and test the FPS.

I was getting much lower FPS than what is quoted in https://github.com/eriklindernoren/PyTorch-YOLOv3, so I decided to do some profiling. The core of the model (the part that runs on the GPU) runs at about 90 FPS, which is great, but when I add the rest of the pipeline, such as NMS and input image re-scaling, the FPS drops to around 15.

Am I doing something wrong? Have you tried the FPS on your setup?

Cheers!
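
For anyone reproducing these measurements: CUDA kernels launch asynchronously, so GPU-only FPS has to be bracketed by torch.cuda.synchronize(), or the timer stops before the GPU finishes. A minimal, self-contained sketch (a toy conv stands in for the real network):

import time
import torch
import torch.nn as nn

model = nn.Conv2d(3, 64, 3, padding=1).cuda().eval()  # stand-in for the real net
x = torch.randn(8, 3, 416, 416).cuda()

with torch.no_grad():
    for _ in range(10):                 # warm-up: exclude cudnn autotuning etc.
        model(x)
    torch.cuda.synchronize()
    t0 = time.time()
    for _ in range(100):
        model(x)
    torch.cuda.synchronize()            # wait for the GPU before stopping the clock
print('GPU-only FPS: %.1f' % (100 * x.size(0) / (time.time() - t0)))

End-to-end FPS (with CPU pre-processing and NMS included) is measured around the whole pipeline instead, which is where the drop to ~15 FPS comes from.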

What's the meaning of w and h in your loss function?

When you calculate the loss, the code for tw and th looks like this:

# Width and height
tw[b, best_n, gj, gi] = math.log(gw/anchors[best_n][0] + 1e-16)
th[b, best_n, gj, gi] = math.log(gh/anchors[best_n][1] + 1e-16)

I don't understand why you use log; in the paper, the loss function for w and h is
[equation screenshot from the paper]

Can you explain it? Thanks.
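
For context, the log comes from the paper's box decoding rather than from the loss itself: the network predicts an offset $t_w$, and the box width is recovered from the anchor (prior) width $p_w$ as

$$b_w = p_w \, e^{t_w} \quad\Longrightarrow\quad t_w = \log\frac{g_w}{p_w},$$

so the regression target is the log-ratio of the ground-truth size $g_w$ to the anchor size (the 1e-16 merely guards against $\log 0$); the height target $t_h = \log(g_h / p_h)$ is analogous.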

about trained weights

I used your code to train on my own data. The loss drops normally, but when I use the trained weights to detect, no detections are made.
I also tried the VOC dataset with this training code, and still there are no detections. It seems there are some issues in the training process.

KeepAspect() use

Hi, can you tell me how to use the KeepAspect() function, and what values to set for "img_h" and "img_w" in params.py, to make it work with rectangular input? Thanks.

Weights in checkpoint are in a different namespace than the weights the network expects.

Hi, I've been trying to load the weights from the checkpoint you provide, but I get the following (abbreviated) error message:

RuntimeError: Error(s) in loading state_dict for ModelMain:
Missing key(s) in state_dict: "backbone.conv1.weight", "backbone.bn1.weight", "backbone.bn1.bias", "backbone.bn1.running_mean", "backbone.bn1.running_var", "backbone.layer1.ds_conv.weight", "backbone.layer1.ds_bn.weight", "backbone.layer1.ds_bn.bias", "backbone.layer1.ds_bn.running_mean", "backbone.layer1.ds_bn.running_var", "backbone.layer1.residual_0.conv1.weight", "backbone.layer1.residual_0.bn1.weight", "backbone.layer1.residual_0.bn1.bias", "backbone.layer1.residual_0.bn1.running_mean", "backbone.layer1.residual_0.bn1.running_var", "backbone.layer1.residual_0.conv2.weight", "backbone.layer1.residual_0.bn2.weight", "backbone.layer1.residual_0.bn2.bias", "backbone.layer1.residual_0.bn2.running_mean"...

Unexpected key(s) in state_dict: "module.backbone.conv1.weight", "module.backbone.bn1.weight", "module.backbone.bn1.bias", "module.backbone.bn1.running_mean", "module.backbone.bn1.running_var", "module.backbone.layer1.ds_conv.weight", "module.backbone.layer1.ds_bn.weight", "module.backbone.layer1.ds_bn.bias", "module.backbone.layer1.ds_bn.running_mean", "module.backbone.layer1.ds_bn.running_var", "module.backbone.layer1.residual_0.conv1.weight", "module.backbone.layer1.residual_0.bn1.weight", "module.backbone.layer1.residual_0.bn1.bias", "module.backbone.layer1.residual_0.bn1.running_mean", "module.backbone.layer1.residual_0.bn1.running_var", "module.backbone.layer1.residual_0.conv2.weight", "module.backbone.layer1.residual_0.bn2.weight", "module.backbone.layer1.residual_0.bn2.bias", "module.backbone.layer1.residual_0.bn2.running_mean"...

It looks as if the checkpoint has all of its weights inside the module namespace, while the model doesn't. Is there a particular reason for this?

Thanks!
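
This is the usual symptom of a checkpoint saved from a model wrapped in nn.DataParallel, which prefixes every parameter name with module.. A standard PyTorch workaround (not repo-specific) is to strip the prefix before loading; model here is assumed to be an unwrapped ModelMain instance built as elsewhere in this repo:

from collections import OrderedDict
import torch

state_dict = torch.load('official_yolov3_weights_pytorch.pth')
# drop the 'module.' prefix that nn.DataParallel adds to every key
stripped = OrderedDict(
    (k[len('module.'):] if k.startswith('module.') else k, v)
    for k, v in state_dict.items())
model.load_state_dict(stripped)

Alternatively, wrap the freshly built model in nn.DataParallel before calling load_state_dict, so the key namespaces match.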

mAP problem

I tested the mAP:
1. official_yolov3_weights_pytorch: (AP, IoU=0.5): 0.540
2. yolov3_weights_pytorch: (AP, IoU=0.5): 0.469, which is lower than the 0.5966 you report in weights/readme.
