tianxiaomo / pytorch-yolov4

PyTorch, ONNX and TensorRT implementation of YOLOv4

License: Apache License 2.0

Python 74.18% Makefile 0.71% Cuda 1.04% C++ 24.06%
yolov4 pytorch darknet2pytorch darknet2onnx tensorrt onnx pytorch-yolov4 yolov4-tiny yolov3


pytorch-yolov4's Issues

get_region_boxes1 fails when batch size > 1, breaking NMS

Problem description

When I feed several images at once, the bboxes produced by get_region_boxes1 cannot be run through NMS correctly, although a single image works fine. With get_region_boxes, NMS works regardless of batch size.

Code that fails NMS (incorrect results):

    anchors = [12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401]
    num_anchors = 9
    anchor_masks = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    strides = [8, 16, 32]
    anchor_step = len(anchors) // num_anchors
    detections = [detection.cpu().data.numpy() for detection in detections]
    bboxs_for_imgs = []
    for i in range(3):
        masked_anchors = []
        for m in anchor_masks[i]:
            masked_anchors += anchors[m * anchor_step:(m + 1) * anchor_step]
        masked_anchors = [anchor / strides[i] for anchor in masked_anchors]
        bboxs_for_imgs.append(get_region_boxes1(detections[i], 0.6, 80, masked_anchors, len(anchor_masks[i])))

    bboxs_for_imgs = [
        bboxs_for_imgs[0][index] + bboxs_for_imgs[1][index] + bboxs_for_imgs[2][index]
        for index in range(self.batch_size)]
    # run NMS separately on each image's boxes
    detections = [nms(bboxs, self.nms_thres) for bboxs in bboxs_for_imgs]
    detections = [np.array(bboxs) for bboxs in detections]

Code that works (correct results):

    anchors = [12, 16, 19, 36, 40, 28, 36, 75, 76, 55, 72, 146, 142, 110, 192, 243, 459, 401]
    num_anchors = 9
    anchor_masks = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
    strides = [8, 16, 32]
    anchor_step = len(anchors) // num_anchors
    bboxs_for_imgs = []
    for i in range(3):
        masked_anchors = []
        for m in anchor_masks[i]:
            masked_anchors += anchors[m * anchor_step:(m + 1) * anchor_step]
        masked_anchors = [anchor / strides[i] for anchor in masked_anchors]
        bboxs_for_imgs.append(get_region_boxes(detections[i], 0.6, 80, masked_anchors, len(anchor_masks[i])))
    bboxs_for_imgs = [
        bboxs_for_imgs[0][index] + bboxs_for_imgs[1][index] + bboxs_for_imgs[2][index]
        for index in range(self.batch_size)]
    # run NMS separately on each image's boxes
    detections = [nms(bboxs, self.nms_thres) for bboxs in bboxs_for_imgs]
    detections = [np.array(bboxs) for bboxs in detections]

Moreover, although get_region_boxes works, it takes more than twice as long as get_region_boxes1.

Some mistakes

First of all, thanks for the code.
1. In models.py: the paper writes Neck, not Neek.
2. In the /tool/ folder, isn't a letter missing: coco_annotatin.py -> coco_annotation.py?
3. In dataset.py, a small fraction of the generated pixel values exceed 255, produced when converting RGB to HSV for augmentation and then back to RGB. Is this allowed / expected? (See the clipping sketch after this list.)
4. In train.py, the val_loader instance is missing collate_fn, so enumerate(val_loader) raises an error.
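Regarding the values above 255: if the overflow is unintended, a minimal fix (my suggestion, not code from this repo) is to clip after the HSV round-trip:

    import numpy as np

    # clamp augmented pixels back into the valid 8-bit range
    img = np.clip(img, 0, 255).astype(np.uint8)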

pad=0 or pad=1

In the source, pad equals 1 regardless of whether the filter size is 3 or 1, yet in your network pad takes two different values. Could this cause a problem?
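In darknet cfgs, pad=1 does not mean one pixel of padding; it enables "same" padding of size // 2. A minimal sketch of the usual conversion (block, in_ch, out_ch and stride are illustrative names, not necessarily the repo's):

    import torch.nn as nn

    size = int(block['size'])                       # filter size from the cfg, e.g. 3 or 1
    pad = size // 2 if int(block.get('pad', 0)) else 0   # darknet pad=1 -> same-size padding
    conv = nn.Conv2d(in_ch, out_ch, kernel_size=size, stride=stride, padding=pad)

So a 3x3 layer ends up with padding 1 and a 1x1 layer with padding 0, which is why two values appear in the network even though the cfg always says pad=1.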

Getting detection output frame by frame

Hi,

Does anyone know of a way to get the detection information from the model frame by frame and then save that information to a file? I have been trying to figure this out but have not found a good solution yet. Would love to hear your ideas.

Warm Regards
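One possible approach (a sketch only; it assumes a loaded model and use_cuda flag as in demo.py, the do_detect signature seen in a traceback elsewhere on this page, a 608x608 model input, and the standard OpenCV video API; file names are illustrative):

    import cv2

    cap = cv2.VideoCapture('input.mp4')
    frame_id = 0
    with open('detections.txt', 'w') as f:
        while True:
            ret, frame = cap.read()
            if not ret:
                break
            sized = cv2.resize(frame, (608, 608))
            sized = cv2.cvtColor(sized, cv2.COLOR_BGR2RGB)
            boxes = do_detect(model, sized, 0.5, 0.4, use_cuda)
            for box in boxes:                # one line per box: frame id + box values
                f.write('%d %s\n' % (frame_id, ' '.join(str(v) for v in box)))
            frame_id += 1
    cap.release()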

Update schedule

Hi Xiaomo,
May I ask roughly how often you will update the yolov4 code?

Problems encountered during training

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [4, 3, 19, 19, 85]], which is output 0 of AsStridedBackward, is at version 6; expected version 3 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
I ran into this error while training on my own dataset; I had never seen it before. Is there a fix?

demo tensorflow

Hello, I have implemented the .onnx -> .pb conversion and run object detection with TensorFlow 2.2.0's compat.v1 graph. How should I proceed? Thank you!
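For anyone looking for that conversion step, a minimal sketch using the onnx-tf package (an assumption on my part; the issue author may have used a different route, and file names are illustrative):

    import onnx
    from onnx_tf.backend import prepare

    onnx_model = onnx.load('yolov4.onnx')   # model exported from this repo
    tf_rep = prepare(onnx_model)            # build a TensorFlow representation
    tf_rep.export_graph('yolov4.pb')        # write the TensorFlow graph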

Training stops by itself with the following error

Traceback (most recent call last):
  File "train.py", line 429, in <module>
    device=device, )
  File "train.py", line 305, in train
    bboxes_pred = model(images)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/pytorch-YOLOv4-master/models.py", line 417, in forward
    d3 = self.down3(d2)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/pytorch-YOLOv4-master/models.py", line 173, in forward
    x1 = self.conv1(input)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/pytorch-YOLOv4-master/models.py", line 58, in forward
    x = l(x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/pytorch-YOLOv4-master/models.py", line 10, in forward
    x = x * (torch.tanh(torch.nn.functional.softplus(x)))
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 4647) is killed by signal: Killed.

My CUDA version is 10.2 with 9 GB of GPU memory, and I set num_workers=1, but training still aborts on its own. Why?

Some parts of darknet2pytorch.py are unclear

In tool/darknet2pytorch.py, line 169 comes right after a continue statement, so won't this code never be executed?

    if self.loss:
        self.loss = self.loss + self.models[ind](x)
    else:
        self.loss = self.models[ind](x)

Shouldn't this block be placed below the yolo layer at line 176?

output bbox not correct

Thanks for your great work

But when I take a model trained on my own dataset with darknet and use it in your repo, the output bboxes are wrong:

I only get a short straight line like the one below, but the same weights work fine in darknet.

(screenshot: sendpix5)

Residual over dense?

Hello, I've read your code and found a few places that need refactoring and that differ from the official YOLO network. First, DownSample2 through DownSample5 are the same code; I suggest using one module for all of them (the CSP in the network is nice). Second, I could not find the DENSENET layer in your implementation; it should sit between the downsampling stages and the SPP. I hope we can find a way to deal with this problem.
Best regards,
Vadim.

Incorrect YOLO head

Hello!
First of all, sorry for the DDoS :), but there is one more thing that makes me curious: SAM.
(attached image) As you can see, there is a modified SAM in the YOLO layer which you haven't implemented yet. As I understand it, it should be used at x2, x10 and x18 in your implementation, because those are currently empty and unused.
Best regards,
Vadims.

Get NaN in box list during inference

My dataset has only two classes, so I changed the channel number from 255 to 21 in the head. Is this the only part that needs changing when training on your own dataset?
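For reference, 21 is consistent with the usual YOLO head formula (a quick check, not code from this repo); the classes= entries in the yolo layers of the cfg would normally be updated as well:

    num_classes = 2
    anchors_per_scale = 3
    filters = anchors_per_scale * (num_classes + 5)   # 4 box coords + 1 objectness + classes
    print(filters)  # 21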

Max pooling issue

Sorry for bothering you again :-)
nn.MaxPool2d is already enough to handle your cases.
In line 237 of darknet2pytorch.py
Change

            elif block['type'] == 'maxpool':
                pool_size = int(block['size'])
                stride = int(block['stride'])
                if stride > 1:
                    model = nn.MaxPool2d(pool_size, stride)
                else:
                    model = MaxPoolStride1(pool_size)

to

            elif block['type'] == 'maxpool':
                pool_size = int(block['size'])
                stride = int(block['stride'])
                model = nn.MaxPool2d(kernel_size=pool_size, stride=stride, padding=pool_size//2)
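One caveat (mine, not the issue author's): for the odd SPP kernels (5, 9, 13) at stride 1, padding=pool_size//2 reproduces MaxPoolStride1's same-size behaviour exactly, but for an even kernel such as size 2 with stride 2 the extra padding enlarges the output by one relative to nn.MaxPool2d(pool_size, stride). Applying the padding only when stride == 1 may be the safer merge.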

Runtime error when running the code

/pytorch/torch/csrc/autograd/python_anomaly_mode.cpp:57: UserWarning: Traceback of forward call that caused the error:
  File "/home/qw/github/ObjectDetect/pytorch-YOLOv4-master/train.py", line 453, in <module>
    device=device, )
  File "/home/qw/github/ObjectDetect/pytorch-YOLOv4-master/train.py", line 329, in train
    loss, loss_xy, loss_wh, loss_obj, loss_cls, loss_l2 = criterion(bboxes_pred, bboxes)
  File "/home/qw/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/qw/github/ObjectDetect/pytorch-YOLOv4-master/train.py", line 210, in forward
    output[..., np.r_[0:4, 5:n_ch]] *= tgt_mask

Epoch 1/500:   0%| | 0/103 [00:05<?, ?img/s]
Traceback (most recent call last):
  File "/home/qw/github/ObjectDetect/pytorch-YOLOv4-master/train.py", line 453, in <module>
    device=device, )
  File "/home/qw/github/ObjectDetect/pytorch-YOLOv4-master/train.py", line 331, in train
    loss.backward()
  File "/home/qw/.local/lib/python3.6/site-packages/torch/tensor.py", line 118, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/qw/.local/lib/python3.6/site-packages/torch/autograd/__init__.py", line 93, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [1, 3, 19, 19, 7]], which is output 0 of torch::autograd::CopySlices, is at version 7; expected version 4 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
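The anomaly trace points at the in-place masking in train.py. A minimal out-of-place rewrite of that line (a sketch; it assumes tgt_mask broadcasts as in the original and does not itself require gradients):

    import numpy as np

    # instead of: output[..., np.r_[0:4, 5:n_ch]] *= tgt_mask
    idx = np.r_[0:4, 5:n_ch]
    output = output.clone()                        # write into a fresh tensor
    output[..., idx] = output[..., idx] * tgt_mask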

-dir argument issue

(1) This problem is solved: the -dir path + the path in train.txt = the image path. The paths in train.txt were incomplete, so the images could not be imread; stepping through with a debugger makes this obvious.
(2) This program can train on your own dataset. You need to modify models.py, following the same approach as editing the cfg under darknet. coco_annotatin.py must be modified to point to your own image paths and names. If you run train.py on its own, set cfg.dataset_dir to your own path.

about CosineAnnealingWarmRestarts

I found a CosineAnnealingWarmRestarts scheduler in your training code, but I don't understand why you never use it. Or is it applied somewhere else in the code?
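For reference, a minimal sketch of how that scheduler is normally wired in (a generic example with an illustrative optimizer and loop, not the repo's actual training code):

    import torch
    from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
    scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=10, T_mult=2)

    for epoch in range(100):
        train_one_epoch(model, optimizer)   # hypothetical helper
        scheduler.step()                    # advances the cosine/restart schedule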

Please replace as many torch.tensor instances as possible with numpy arrays

I have run into many problems trying to convert your model into optimized models for inference.
I found many places in pytorch-YOLOv4/utils/utils.py using torch.tensor in ways that are problematic for torch.jit.trace to handle.
For example, in get_region_boxes(), please convert the tensors into numpy arrays, because this method is NOT directly part of the YOLOv4 model itself.
Please replace as many of these tensor instances as possible with numpy arrays in pytorch-YOLOv4/utils/utils.py.
(Don't implement model-independent logic with tensors; use numpy wherever possible.)
Thank you very much.
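A minimal sketch of the requested pattern (illustrative only, not a patch to the repo): leave the traced graph as soon as the raw network output is available, then post-process in numpy:

    # at the boundary between model and post-processing:
    output = output.detach().cpu().numpy()   # torch.jit.trace never sees what follows
    boxes = output[..., :4]                  # plain numpy slicing from here on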

demo.py issue with darknet weights

I'm using weights I trained on the original darknet repo, and just updated the yolov3-tiny.cfg file with 3 classes (and changed the filters in the prior layer to 24), but I am getting a mismatched tensor error. I printed out x1 and x2 and I get the respective values:
x1: torch.Size([1, 128, 30, 30])
x2: torch.Size([1, 256, 27, 27])

Any advice?

Traceback (most recent call last):
  File "demo.py", line 213, in <module>
    detect(cfgfile, weightfile,imgfile)
  File "demo.py", line 46, in detect
    boxes = do_detect(m, sized, 0.5, 0.4, use_cuda)
  File "/Users/vkrd/Documents/Projects/pytorch-YOLOv4/tool/utils.py", line 420, in do_detect
    list_boxes = model(img)
  File "/Users/vkrd/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/vkrd/Documents/Projects/pytorch-YOLOv4/tool/darknet2pytorch.py", line 144, in forward
    x = torch.cat((x1, x2), 1)
RuntimeError: Sizes of tensors must match except in dimension 1. Got 30 and 27 in dimension 2

your training files (train.txt and val.txt)

First, thank you so much for your great work.
I am currently trying to learn YOLOv4 using your implementation.
I am interested in training.
Can you share your train.txt and val.txt for me to reproduce your work?
Cfg.train_label = 'data/train.txt'
Cfg.val_label = 'data/val.txt'

Thanks!!

There is something wrong in loading dataset when training my own dataset.

File "F:/PYTHON_project/pytorch-YOLOv4/dataset.py", line 375, in getitem
cut_y, i, left_shift, right_shift, top_shift, bot_shift)
File "F:/PYTHON_project/pytorch-YOLOv4/dataset.py", line 215, in blend_truth_mosaic
bboxes = filter_truth(bboxes, left_shift, top_shift, cut_x, cut_y, 0, 0)
File "F:/PYTHON_project/pytorch-YOLOv4/dataset.py", line 184, in filter_truth
bboxes[:, 0] -= dx
IndexError: too many indices for array

Has anyone solved this problem? Can you tell me? Thank you!

bad effect

Why doesn't pytorch-yolov4 recognize the small object in the upper-right corner of dog.jpg?

Potential bug risk in data augmentation / processing

    def blend_truth_mosaic(out_img, img, bboxes, w, h, cut_x, cut_y, i_mixup, left_shift, right_shift, top_shift, bot_shift):
        if i_mixup == 0:
            bboxes = filter_truth(bboxes, left_shift, top_shift, cut_x, cut_y, 0, 0)
            out_img[:cut_y, :cut_x] = img[top_shift:top_shift + cut_y, left_shift:left_shift + cut_x]
        if i_mixup == 1:
            bboxes = filter_truth(bboxes, cut_x - right_shift, top_shift, w - cut_x, cut_y, cut_x, 0)
            out_img[:cut_y, cut_x:] = img[top_shift:top_shift + cut_y, cut_x - right_shift:w - right_shift]
        if i_mixup == 2:
            bboxes = filter_truth(bboxes, left_shift, cut_y - bot_shift, cut_x, h - cut_y, 0, cut_y)
            out_img[cut_y:, :cut_x] = img[cut_y - bot_shift:h - bot_shift, left_shift:left_shift + cut_x]
        if i_mixup == 3:
            bboxes = filter_truth(bboxes, cut_x - right_shift, cut_y - bot_shift, w - cut_x, h - cut_y, cut_x, cut_y)
            out_img[cut_y:, cut_x:] = img[cut_y - bot_shift:h - bot_shift, cut_x - right_shift:w - right_shift]

        return out_img, bboxes

===========================================================================
Suggested revision:
    def blend_truth_mosaic(out_img, img, bboxes, w, h, cut_x, cut_y, i_mixup, left_shift, right_shift, top_shift, bot_shift):
        left_shift = min(left_shift, w - cut_x)
        top_shift = min(top_shift, h - cut_y)
        right_shift = min(right_shift, cut_x)
        bot_shift = min(bot_shift, cut_y)

        if i_mixup == 0:
            bboxes = filter_truth(bboxes, left_shift, top_shift, cut_x, cut_y, 0, 0)
            out_img[:cut_y, :cut_x] = img[top_shift:top_shift + cut_y, left_shift:left_shift + cut_x]
        if i_mixup == 1:
            bboxes = filter_truth(bboxes, cut_x - right_shift, top_shift, w - cut_x, cut_y, cut_x, 0)
            out_img[:cut_y, cut_x:] = img[top_shift:top_shift + cut_y, cut_x - right_shift:w - right_shift]
        if i_mixup == 2:
            bboxes = filter_truth(bboxes, left_shift, cut_y - bot_shift, cut_x, h - cut_y, 0, cut_y)
            out_img[cut_y:, :cut_x] = img[cut_y - bot_shift:h - bot_shift, left_shift:left_shift + cut_x]
        if i_mixup == 3:
            bboxes = filter_truth(bboxes, cut_x - right_shift, cut_y - bot_shift, w - cut_x, h - cut_y, cut_x, cut_y)
            out_img[cut_y:, cut_x:] = img[cut_y - bot_shift:h - bot_shift, cut_x - right_shift:w - right_shift]

        return out_img, bboxes
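Why the clamps help (my reading of the suggested fix): if a randomly drawn shift exceeds the region being copied, e.g. left_shift > w - cut_x, then img[top_shift:top_shift + cut_y, left_shift:left_shift + cut_x] is silently truncated, its shape no longer matches out_img[:cut_y, :cut_x], and the assignment fails; filter_truth can likewise end up with an empty bbox array (compare the "too many indices for array" report above). Bounding each shift by the size of its target region keeps the source and destination windows the same shape.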

Would you mind inviting me to the development of your repository?

I think you have already contributed a lot to the open-source community on behalf of YOLOv4.
But I think we should tighten the relationship between your repository and the ONNX standard, which would also add to the popularity of your repository.

I will add some Python scripts here:

  1. A script to convert PyTorch to ONNX (see the sketch after this list)
  2. An inference demo running ONNX
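As a starting point for script 1, a minimal sketch (assumptions on my part: a Yolov4 class in models.py, a 608x608 input, and illustrative file names; the export flags are generic, not the repo's):

    import torch
    from models import Yolov4   # assumed model class from this repo

    model = Yolov4(n_classes=80)
    model.load_state_dict(torch.load('yolov4.pth', map_location='cpu'))
    model.eval()

    dummy = torch.randn(1, 3, 608, 608)   # one RGB 608x608 image
    torch.onnx.export(model, dummy, 'yolov4.onnx',
                      input_names=['input'], output_names=['output'],
                      opset_version=11)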

Activation functions

I see that in the cfg file some convolutional layers use the linear activation, but I don't seem to see it handled anywhere in your demo code.

input images for training need to be normalized

I noticed that the input image for inference in do_detect() is normalized (dividing each element by 255.0), but the input images for training are not normalized.
After normalizing the input images, fine-tuning seems to work better.

Basically, I loaded yolov4.pth and fine-tuned the model for one epoch with a few images (no data augmentation, very small lr) to verify the functionality. Without input normalization it couldn't predict at all, but with normalization it predicts similarly to the converted yolov4.pth without fine-tuning.
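A one-line sketch of the suggested fix (where exactly to apply it in the training pipeline is up to the author; the tensor layout is assumed):

    # match do_detect()'s preprocessing at training time:
    images = images.float() / 255.0   # scale 8-bit pixels into [0, 1]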

About the provided .pth

Dear Author:

Thank you. May I know what the difference is between yolov4.pth and yolov4.conv.137.pth?
