ayooshkathuria / pytorch-yolo-v3

A PyTorch implementation of the YOLO v3 object detection algorithm

Python 100.00%

pytorch-yolo-v3's Introduction

A PyTorch implementation of a YOLO v3 Object Detector

[UPDATE] : This repo serves as a driver code for my research. I just graduated college, and am very busy looking for research internship / fellowship roles before eventually applying for a masters. I won't have the time to look into issues for the time being. Thank you.

This repository contains code for an object detector based on YOLOv3: An Incremental Improvement, implemented in PyTorch. The code is based on the official code of YOLO v3, as well as a PyTorch port of the original code by marvis. One of the goals of this code is to improve upon the original port by removing redundant parts of the code (the official code is basically a full-blown deep learning library, and includes stuff like sequence models, which are not used in YOLO). I've also tried to keep the code minimal, and to document it as well as I can.

Tutorial for building this detector from scratch

If you want to understand how to implement this detector by yourself from scratch, you can go through this very detailed 5-part tutorial series I wrote on Paperspace. Perfect for someone who wants to move from beginner to intermediate PyTorch skills.

Implement YOLO v3 from scratch

As of now, the code only contains the detection module, but you should expect the training module soon. :)

Requirements

Python 3.5
OpenCV
PyTorch 0.4

Using PyTorch 0.3 will break the detector.

Detection Example

Running the detector

On single or multiple images

Clone the repo and cd into its directory. The first thing you need to do is get the weights file. This time around, for v3, the authors have supplied a weights file only for COCO here; download it and place it in your repo directory. Or, if you're on Linux, you could just type

wget https://pjreddie.com/media/files/yolov3.weights 
python detect.py --images imgs --det det

The --images flag defines the directory to load images from, or a single image file (it will figure it out), and --det is the directory to save images to. Other settings, such as batch size (the --bs flag) and the object confidence threshold, can be tweaked with flags that can be looked up with

python detect.py -h

Speed Accuracy Tradeoff

You can change the resolution of the input image with the --reso flag. The default value is 416. Whatever value you choose, remember it should be a multiple of 32 and greater than 32. Weird things will happen if it isn't. You've been warned.

python detect.py --images imgs --det det --reso 320
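The constraint on --reso can be checked before the detector is launched. The following is a minimal sketch; `check_reso` and `nearest_valid_reso` are hypothetical helpers and are not part of the repo.

```python
def check_reso(reso):
    """Return reso as an int if it is a multiple of 32 and greater than 32."""
    reso = int(reso)
    if reso <= 32 or reso % 32 != 0:
        raise ValueError(
            "--reso must be a multiple of 32 and greater than 32, got {}".format(reso)
        )
    return reso


def nearest_valid_reso(reso):
    """Round to the nearest multiple of 32, never going below 64."""
    return max(64, int(round(int(reso) / 32.0)) * 32)


print(check_reso("416"))        # 416
print(nearest_valid_reso(300))  # 288
```

Rejecting a bad value early gives a readable message instead of a shape error deep inside the network.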

On Video

For this, you should run the file video_demo.py with the --video flag specifying the video file. The video file should be in .avi format, since OpenCV only accepts .avi as the input format.

python video_demo.py --video video.avi

Tweakable settings can be seen with the -h flag.

Speeding up Video Inference

To speed up video inference, you can try using the video_demo_half.py file instead, which does all the inference with 16-bit half-precision floats instead of 32-bit floats. I haven't seen big improvements, but I attribute that to having an older card (Tesla K80, Kepler arch). If you have one of the cards with fast float16 support, try it out and, if possible, benchmark it.

On a Camera

Same as the video module, but you don't have to specify the video file since the feed will be taken from your camera. To be precise, the feed will be taken from what OpenCV recognises as camera 0. The default image resolution is 160 here, though you can change it with the --reso flag.

python cam_demo.py

You can easily tweak the code to use different weights files, available at the YOLO website.

NOTE: The scales feature has been disabled for better refactoring.

Detection across different scales

YOLO v3 makes detections across different scales, each of which specialises in detecting objects of different sizes, depending on whether it captures coarse features, fine-grained features or something in between. You can experiment with these scales using the --scales flag.

python detect.py --scales 1,3


pytorch-yolo-v3's Issues

How to load pre-trained model

Hi, I do like your code, which helps me understand the YOLO framework. But it is regrettable that there is no training process. Now I'm trying to write the training code myself, but I don't know how to load the pre-trained model given by the official YOLO authors, darknet53.conv.74. I tried using the load_weights() method of Darknet directly but failed. The error is reported as RuntimeError: the given numpy array has zero-sized dimensions. Zero-sized dimensions are not supported in Pytorch. I guess this error happens when the loading step encounters a BN layer. Can you help me with this? Thanks.

Where is your loss defined?

How can i show the test images with yolo boxes on it ?

I'm on Windows and I read that I had to change line 306 of detect.py, and that's what I've done, but how or where can I see the image with YOLO boxes on it?

what is the function of "img_ = img[:,:,::-1].transpose((2,0,1)).copy()" code in prep_image func

Hello, i found a line

img_ = img[:,:,::-1].transpose((2,0,1)).copy()

Is this code for transposing BGR to RGB ?

If so, it should be changed.

Let me know.

thanks
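For reference, the line in question does two separate things, which a tiny NumPy array makes visible. This is a sketch on made-up pixel values, not code from the repo; the remark about the copy rests on the fact that torch.from_numpy does not accept arrays with negative strides.

```python
import numpy as np

# A 2x2 "image" in OpenCV layout: height x width x channels, channels in BGR order.
img = np.zeros((2, 2, 3), dtype=np.uint8)
img[..., 0] = 10   # blue
img[..., 1] = 20   # green
img[..., 2] = 30   # red

rgb = img[:, :, ::-1]            # reverse the channel axis: BGR -> RGB
chw = rgb.transpose((2, 0, 1))   # H x W x C -> C x H x W, the layout PyTorch expects
img_ = chw.copy()                # materialise a contiguous array with positive strides

print(img_.shape)      # (3, 2, 2)
print(img_[:, 0, 0])   # [30 20 10]
```

So the slice `::-1` is the BGR to RGB step, and the transpose only reorders axes; neither changes pixel values.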

Issue with printing on windows

Hey I ran the detect.py script on windows machine and no images were being output. I found the error, it's on line 302 of detect.py:

det_names = pd.Series(imlist).apply(lambda x: "{}/det_{}".format(args.det,x.split("/")[-1]))

This works on Linux and Mac machines but not on Windows; it's an easy fix though. I tried to push a PR but didn't realize I couldn't. The fix is simple: add import platform to the top of detect.py and change line 302 to

if platform.system() == 'Windows':
    det_names = pd.Series(imlist).apply(lambda x: "{}/det_{}".format(args.det,x.split("\\")[-1]))
else:
    det_names = pd.Series(imlist).apply(lambda x: "{}/det_{}".format(args.det,x.split("/")[-1]))
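An alternative that avoids branching on the platform is to let the standard library extract the file name. This is a sketch only; `det_name` is a hypothetical helper, and it relies on `ntpath.basename` treating both "\" and "/" as separators on every operating system.

```python
import ntpath


def det_name(det_dir, image_path):
    """Build the output path for one image, whichever separator the input path uses."""
    # ntpath.basename splits on both "\\" and "/" regardless of the host OS.
    return "{}/det_{}".format(det_dir, ntpath.basename(image_path))


print(det_name("det", "imgs/dog.jpg"))        # det/det_dog.jpg
print(det_name("det", "E:\\imgs\\dog.jpg"))   # det/det_dog.jpg
```

One caveat: on Linux a backslash is a legal character inside a file name, so this would mis-split such names. For the image folders discussed here that is unlikely to matter.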

letter box image

Hi, I just need to use the letterbox function from the code. It is giving me an error, as I am confused about what inp_dim should be. Is it the shape of the input image, or some other value? It gives an error both ways.
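The geometry behind letterboxing can be sketched without OpenCV. This assumes inp_dim is the (width, height) of the network input, for example (416, 416) to match the default --reso, rather than the shape of the source image; `letterbox_geometry` is a hypothetical helper and not the repo's function.

```python
def letterbox_geometry(img_w, img_h, inp_dim):
    """Return (new_w, new_h, pad_left, pad_top) for fitting an image into inp_dim."""
    w, h = inp_dim
    if w * img_h <= h * img_w:
        # Width is the limiting side: fill it and scale the height to match.
        new_w, new_h = w, img_h * w // img_w
    else:
        # Height is the limiting side.
        new_w, new_h = img_w * h // img_h, h
    # The resized image is centred; the rest of the canvas is padding.
    return new_w, new_h, (w - new_w) // 2, (h - new_h) // 2


print(letterbox_geometry(768, 576, (416, 416)))   # (416, 312, 0, 52)
```

The aspect ratio is preserved, and the padding is what later has to be subtracted when mapping boxes back to the original image.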

output issues between windows and linux

It works fine on Linux; the output images are saved in the det folder.
However, no output images are saved in the det folder on Windows, even though it detects objects in the images.
The detection part runs correctly, but saving the output images to the folder fails.
How can I solve this problem?

License missing

Hi, what's the license of this code? Is it under MIT as pytorch-yolo2 by marvis?

Does this repo support Python 2.7? If not, how can I change the code to adapt it to Python 2.7?

Is PyTorch's YOLO v3 faster than the C language YOLO v3?

only the first row of data in the names file showed

Hello, I have a little problem. I set count=4, the model used is tiny-yolo, and I have 6 kinds of data. When I was testing, only the first row of data in the names file showed. Why? My cfg file is shown below:
[net]

# Testing

#batch=1
#subdivisions=1

# Training

batch=1
subdivisions=1
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.00001
burn_in=1000
max_batches = 500200
policy=steps
steps=400000,450000
scales=.1,.1

[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=1

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

###########

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=33
activation=linear

[yolo]
mask = 3,4,5
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
classes=6
num=6
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

[route]
layers = -4

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1, 8

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=33
activation=linear

[yolo]
mask = 0,1,2
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
classes=6
num=6
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1
small_object=1

The test results are shown below:
1-NG.bmp predicted in 0.112 seconds
Objects Detected: ok ok ok ok ok ok ok ok ok ok ok ok ok ok

Please help me, thanks a lot.

error in Different batch size in detect.py

It seems the code does not work when I change the batch size in detect.py.

Traceback (most recent call last):
  File "/home/dong/PycharmProjects/SIMD/pytorch-yolo-v3/detect.py", line 200, in <module>
    prediction = write_results(prediction, confidence, num_classes, nms = True, nms_conf = nms_thesh)
  File "/home/dong/PycharmProjects/SIMD/pytorch-yolo-v3/util.py", line 155, in write_results
    img_classes = unique(image_pred_[:,-1])
IndexError: too many indices for tensor of dimension 1

Sorry, I am a rookie. I downloaded your program, and this problem occurs after running it. I'd like to ask you a question. Thank you very much!

Traceback (most recent call last):
File "D:/deep_learn/pytorch-yolo-v3-master/pytorch-yolo-v3-master/video_demo_half.py", line 100, in <module>
model.load_weights(args.weightsfile)
File "D:\deep_learn\pytorch-yolo-v3-master\pytorch-yolo-v3-master\darknet.py", line 367, in load_weights
fp = open(weightfile, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'yolov3.weights'

When could you pull the train code to this repo?

When I run detect.py, I run into this problem. What should I do?

function create_modules() bug

the darknet.py line 209:
x["layers"] = x["layers"].split(',')
is not right.
if you print this function output you will get:
---> 59 x["layers"] = x["layers"].split(",")
AttributeError: 'list' object has no attribute 'split'

if split x["layers"], how to do following code:
start = int(x["layers"][0])
and
end = int(x["layers"][1])
no [0],[1] index.
I think this line should be deleted; however, if I delete this line and run the following commands:
blocks = parse_cfg("cfg/yolov3.cfg")
print(create_modules(blocks))
the output gets error as well.
---> 61 start = int(x["layers"][0])
62 #end, if there exists one.
63 try:

ValueError: invalid literal for int() with base 10: '-'
Can somebody help?
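Both errors above are about the type of the "layers" value: calling split on something that is already a list fails, and indexing an unsplit string such as "-4" yields the single character "-", which int() rejects. A minimal sketch that accepts either form follows; `parse_route_layers` is a hypothetical helper, not the repo's code.

```python
def parse_route_layers(value):
    """Turn a route block's "layers" value into a list of ints."""
    if isinstance(value, str):
        value = value.split(",")          # "-1, 8" -> ["-1", " 8"]
    return [int(str(v).strip()) for v in value]


print(parse_route_layers("-4"))        # [-4]
print(parse_route_layers("-1, 8"))     # [-1, 8]
print("-4"[0])                         # "-", the character that int() choked on
```

With the value normalised to a list, start is the first element and end, if present, is the second.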

RuntimeError: invalid argument 2: size '[1 x 18 x 1369]' is invalid for input with 23328 elements at ..\aten\src\TH\THStorage.cpp:84

I am stuck on this error, please help...

2018-09-26 12:39:49 epoch 2, processed 200 samples, lr 0.000010
torch.Size([1, 18, 18, 18])
D:\Softwares\anacond33\lib\site-packages\torch\nn\modules\upsampling.py:122: UserWarning: nn.Upsampling is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.Upsampling is deprecated. Use nn.functional.interpolate instead.")
torch.Size([1, 18, 36, 36])

RuntimeError Traceback (most recent call last)
in ()
----> 1 train(2)

in train(epoch)
42
43
---> 44 output = model(data)
45
46

D:\Softwares\anacond33\lib\site-packages\torch\nn\modules\module.py in __call__(self, *input, **kwargs)
475 result = self._slow_forward(*input, **kwargs)
476 else:
--> 477 result = self.forward(*input, **kwargs)
478 for hook in self._forward_hooks.values():
479 hook_result = hook(self, input, result)

D:\Softwares\anacond33\lib\site-packages\torch\nn\parallel\data_parallel.py in forward(self, *inputs, **kwargs)
119 inputs, kwargs = self.scatter(inputs, kwargs, self.device_ids)
120 if len(self.device_ids) == 1:
--> 121 return self.module(*inputs[0], **kwargs[0])
122 replicas = self.replicate(self.module, self.device_ids[:len(inputs)])
123 outputs = self.parallel_apply(replicas, inputs, kwargs)

~\Aandom\yolo_pytorch\pytorch_yolo_darknet_arc.py in forward(self, x, CUDA)
386 x = x.data
387 #print(x.shape)
--> 388 x = predict_transform(x, inp_dim, anchors, num_classes, CUDA)
389
390 if type(x) == int:

~\Aandom\yolo_pytorch\utils.py in predict_transform(prediction, inp_dim, anchors, num_classes, CUDA)
72
73 print(prediction.shape)
---> 74 prediction = prediction.view(batch_size, bbox_attrs*num_anchors, grid_size*grid_size)
75 prediction = prediction.transpose(1,2).contiguous()
76 prediction = prediction.view(batch_size, grid_size*grid_size*num_anchors, bbox_attrs)

RuntimeError: invalid argument 2: size '[1 x 18 x 1369]' is invalid for input with 23328 elements at ..\aten\src\TH\THStorage.cpp:84
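The numbers in the message can be checked by hand. The tensor printed just before the failure is [1, 18, 36, 36], which holds 23328 elements, while the view asks for 18 x 1369, and 1369 is 37 x 37. The sketch below assumes that grid_size is derived from inp_dim through an integer stride instead of being read from the tensor; the inp_dim values 288 and 300 are illustrative and do not come from the report.

```python
def derived_grid_size(inp_dim, feature_size):
    """Grid size obtained by going through an integer stride (assumed derivation)."""
    stride = inp_dim // feature_size      # integer stride, rounded down
    return inp_dim // stride


feature_size = 36                  # from torch.Size([1, 18, 36, 36]) in the log
actual = 18 * feature_size ** 2    # elements really present in the tensor
requested = 18 * 1369              # elements the view asks for (1369 == 37 * 37)
print(actual, requested)           # 23328 24642

print(derived_grid_size(288, feature_size))   # 36, consistent with the tensor
print(derived_grid_size(300, feature_size))   # 37, one cell too many per side
```

Under that assumption, an input size that is not a multiple of 32 is enough to produce exactly this kind of off-by-one grid.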

yolov3-tiny doesn't work

I have the cfg and weights file. I tried to run it, but it gives this error:

File "object_detection/darknet.py", line 323, in forward
x = torch.cat((map1, map2), 1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 416 and 832 in dimension 2 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:111

bounding box coordinate

Hi @ayooshkathuria ,
In detect.py, I see that line 272 outputs the bounding box coordinates for every object detected.
What exactly is the order of the coordinates? Is it [image width height x-center y-center prob prob class]?

Thanks
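A hedged sketch of how one output row could be unpacked follows. Only the first and last positions are confirmed by code quoted elsewhere on this page (`int(x[0]) == im_id` and `classes[int(x[-1])]`); the middle layout of two box corners, objectness and class score is an assumption, and `Detection`, `to_detection` and `box_size` are hypothetical names.

```python
from collections import namedtuple

# Assumed row layout: image index, top-left corner, bottom-right corner,
# objectness, score of the best class, index of the best class.
Detection = namedtuple(
    "Detection", "image_index x1 y1 x2 y2 objectness class_score class_index"
)


def to_detection(row):
    """Convert one 8-value row into a named record."""
    d = Detection(*[float(v) for v in row])
    return d._replace(image_index=int(d.image_index), class_index=int(d.class_index))


def box_size(det):
    """Width and height of the box, derived from its two corners."""
    return det.x2 - det.x1, det.y2 - det.y1


det = to_detection([0, 10.0, 20.0, 110.0, 220.0, 0.98, 0.95, 16])
print(det.class_index, box_size(det))   # 16 (100.0, 200.0)
```

If the assumption about the middle columns is wrong for your version of the code, only the field names in `Detection` need to change.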

Cannot capture source

Hi, I'm trying to use your code to run on a video; my video is named chicken.avi. But when I run python video_demo.py --video chicken.avi, the result is
Traceback (most recent call last):
File "video_demo.py", line 119, in <module>
assert cap.isOpened(), 'Cannot capture source'
AssertionError: Cannot capture source
Does it mean that I have the wrong code, or is it some other problem? Can anybody help me? Please!!

path consist of "\\" can cause bug

In the line 302, det_names = pd.Series(imlist).apply(lambda x: "{}/det_{}".format(args.det,x.split("/")[-1]))

This can be a problem. In my case, it doesn't split the path, because of the "/".

But it works well after changing "/" to "\".

Please note this issue

Have problem running the demo.py

I run the code on a Linux server with Pytorch 0.4.0. And I get the following error:

I use the py4 branch. Please tell me how to run this.

model predict question

Hello, the model I trained with Darknet can't detect objects when running on your code. But the weight file given by official Darknet can be detected. What's the reason?

There are two bugs in detect.py

Unresolved reference 'load_classes'

Unresolved reference"write_results"

about the training code

Hello, when will the training code be published?

How to get bounding box dimensions along with the probabilities of detection?

SyntaxError: only named arguments may follow *expression

I tried running your repo on my system running Ubuntu 16.04, with Python 3.5 and PyTorch 0.4. I am running into the following error when I simply try running detect.py

nivedithak@nivii-Sum18:~/Downloads/pytorch-yolo-v3-master$ python detect.py 
  File "detect.py", line 29
    fwd = nn.Sequential(self.linear_1, *self.middle, self.output)
SyntaxError: only named arguments may follow *expression
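This particular message is what interpreters older than Python 3.5 report when a positional argument follows a starred one in a call, which suggests that the `python` command resolved to an older interpreter than the Python 3.5 named in the report. The sketch below shows the same call shape on a toy function; `collect` and `middle` are made-up names.

```python
import sys


def collect(*args):
    """Return the positional arguments as a list."""
    return list(args)


middle = [2, 3]

# A positional argument after *middle is valid syntax only on Python 3.5 and later.
print(collect(1, *middle, 4))   # [1, 2, 3, 4]

# Shows which interpreter actually ran this file.
print(sys.version_info[:2])
```

Since the failure is a SyntaxError raised at compile time, a version check inside detect.py itself would never get to run; invoking the script with an explicit python3 command is the more direct test.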

What is the movie in readme?

What is the movie in the README?
The one starring Anne Hathaway.

Refactor for usability & maintainability

Love what you've done here (and especially the in-depth tutorial on the Paperspace blog!). I've been looking for a performant pytorch implementation of YOLOv3 and this seems to be the closest thing that fits the bill.

One suggestion: Right now the repo feels more like a proof of concept than a well organized library (one example: the simple task of figuring out how to plug in an image and get a list of predictions out to use in another python app isn't super straightforward).

Would love to see this progress towards a more refined library (and if I have some time I'll certainly try to help out with any refactoring I can).

Thanks for all your work on this!

Want to detect only cars

What should I do if I want to detect only cars?
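One way to approach this without retraining is to filter the detections after the fact. The sketch relies on the last value of each output row being an index into the class names list, as in the `classes[int(x[-1])]` expression quoted in a traceback on this page; `keep_classes` is a hypothetical helper and the rows below are made-up data.

```python
def keep_classes(rows, classes, wanted):
    """Keep only rows whose class index (last value) names a wanted class."""
    wanted_ids = {i for i, name in enumerate(classes) if name in wanted}
    return [row for row in rows if int(row[-1]) in wanted_ids]


classes = ["person", "bicycle", "car"]
rows = [
    [0, 10, 20, 110, 220, 0.9, 0.8, 2],   # a car
    [0, 30, 40, 60, 90, 0.9, 0.7, 0],     # a person
]
print(keep_classes(rows, classes, {"car"}))
```

The network still evaluates every class, so this changes what is reported rather than how fast detection runs.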

train code

@ayooshkathuria, hi, thanks a lot for your work. I am waiting for the training code. Could you tell me when it will be available, please? Because I want to be lazy. I hope this request doesn't bother you.

Train branch

Hi,
What is the status of the train branch? I'd be down for contributing if you wish.

ValueError: not enough values to unpack (expected 2, got 1)

Hi, I ran into trouble during chapter 2 of the tutorial,
when I did this part:

Testing the code

You can test your code by typing the following lines at the end of darknet.py and running the file.

blocks = parse_cfg("cfg/yolov3.cfg")
print(create_modules(blocks))

An error occurred:

File "darknet.py", line 147, in <module>
blocks = parse_cfg("cfg/yolov3.cfg")
File "darknet.py", line 26, in parse_cfg
key, value = line.split("=")
ValueError: not enough values to unpack (expected 2, got 1)
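The message means that some line reaching `line.split("=")` contained no "=" at all. One possible cause, an assumption here since the cfg file itself is not shown, is a whitespace-only line or non-cfg content in the file. The sketch below is a tolerant parser for Darknet-style text; `parse_cfg_text` is a hypothetical function, not the tutorial's parse_cfg.

```python
def parse_cfg_text(text):
    """Parse Darknet-style cfg text into a list of blocks (dicts)."""
    blocks = []
    block = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                                  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            block = {"type": line[1:-1].strip()}      # start a new block
            blocks.append(block)
            continue
        key, sep, value = line.partition("=")
        if not sep or block is None:
            # Report the offending line instead of failing on tuple unpacking.
            raise ValueError("unexpected cfg line: {!r}".format(raw))
        block[key.strip()] = value.strip()
    return blocks


sample = "[net]\n# Testing\nbatch=1\n\n[route]\nlayers = -1, 8\n   \n"
print(parse_cfg_text(sample))
```

Using partition instead of split means a malformed line produces an error that names the line, which makes a damaged or wrongly downloaded cfg file easy to spot.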

REALLY NEED HELP, thanks

RuntimeError: Input type (CUDAFloatTensor) and weight type (CUDAHalfTensor) should be the same

When I run video_demo.py, I run into this problem. Can you teach me how to deal with it?

Training

Thanks for creating this, it looks great! I would love to use it, but I would need the possibility to train according to my needs - can you probably give an estimate when training will be available?

Thanks and kind regards
Ernst

why detect.py isn't saving predicted images with boxes in 'det'?

Though 'det' is automatically created, images with predicted boxes are not saved in it. Can you please help out?

is this detection only?

will it do the backward pass and train as well?

Would you please test the code on a single image?

RuntimeError: invalid argument 2: size '[18 x 256 x 1 x 1]' is invalid for input with 4607 elements

Hi, I'm trying to use your code to do some detection work.
It works when I use yolov3.cfg and yolov3.weights.
But when I switch the cfg file and weights to my own, an error occurs.
I only changed the number of classes to 1, and the error is here:

Traceback (most recent call last):
  File "D:/GitHub/Graduation/YOLO_v3_self/detect.py", line 109, in <module>
    model.load_weights(args.weightsfile)
  File "D:\GitHub\Graduation\YOLO_v3_self\darknet.py", line 425, in load_weights
    conv_weights = conv_weights.view_as(conv.weight.data)
  File "C:\software\Anaconda3\envs\my_dev\lib\site-packages\torch\tensor.py", line 230, in view_as
    return self.view(tensor.size())
RuntimeError: invalid argument 2: size '[18 x 256 x 1 x 1]' is invalid for input with 4607 elements at ..\src\TH\THStorage.c:41

Is there some solution?
Thanks!
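The sizes in the message follow from simple arithmetic. A convolution feeding a [yolo] block has (classes + 5) filters per anchor, which matches the cfg quoted earlier on this page (filters=33 with classes=6 and three masked anchors), and gives 18 for a single class. The sketch below uses hypothetical helper names.

```python
def yolo_head_filters(num_classes, anchors_per_scale=3):
    """Filters of the conv layer feeding a [yolo] block: (classes + 5) per anchor."""
    return (num_classes + 5) * anchors_per_scale


def conv_weight_count(out_channels, in_channels, kernel_size):
    """Number of weights in a square convolution kernel bank, biases excluded."""
    return out_channels * in_channels * kernel_size * kernel_size


print(yolo_head_filters(1))            # 18
print(yolo_head_filters(6))            # 33
print(conv_weight_count(18, 256, 1))   # 4608
```

The loader therefore needed 4608 values for this layer and found 4607 left in the file, so the weights file and the cfg disagree somewhere before this point, or the file is laid out differently from what the loader expects.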

error in darknet.py

anchors = x["anchors"].split(",")
TypeError: string indices must be integers, not str

Error when trying to convert darknet.py to ONNX

Hi!

I'm trying to create a script to convert the darknet.py Pytorch model to ONNX model using CPU based on this example:

# onnx-convert.py

from darknet import Darknet

import torch 
import torch.onnx
from torch.autograd import Variable
import torchvision

cfgfile = './cfg/yolov3.cfg'
weightfile =  './yolov3.weights'
imgfile = './imgs/dog.jpg'
resolution = 416

model = Darknet(cfgfile)
model.load_weights(weightfile)

dummy_input = Variable(torch.randn(1, 3, resolution, resolution))
torch.onnx.export(model, dummy_input, "yolo.onnx")

But when I run python onnx-convert.py, the following error happens:

TypeError: forward() missing 1 required positional argument: 'CUDA'

I changed the Darknet#forward(self, x, CUDA) member signature to Darknet#forward(self, x, CUDA = False). Now the following error happens:

RuntimeError: invalid argument 2: size '[1 x 255 x 2809]' is invalid for input with 689520 elements at /pytorch/aten/src/TH/THStorage.c:41

I changed my script to include following condition:

from darknet import Darknet

...
model = Darknet(cfgfile)
model.net_info["height"] = resolution
...
torch.onnx.export(model, dummy_input, "yolo.onnx")

The following error happens:
RuntimeError: /pytorch/torch/csrc/jit/tracer.h:120: getTracingState: Assertion state failed.

Can someone help me?

Test questions

Why can your code perform detection with the official weights file, while it cannot detect anything with the weights file that I trained myself?

no outputs

S E:\condaDev\pytorch-yolo-v3-master> python detect.py --images imgs --det det
Loading network.....
Network successfully loaded
C:\Users\Max\Anaconda3\envs\Pytorch\lib\site-packages\torch\nn\modules\upsampling.py:122: UserWarning: nn.Upsampling is deprecated. Use nn.functional.interpolate instead.
warnings.warn("nn.Upsampling is deprecated. Use nn.functional.interpolate instead.")
E:\condaDev\pytorch-yolo-v3-master\imgs\dog.jpg predicted in 1.411 seconds
Objects Detected: bicycle truck dog

E:\condaDev\pytorch-yolo-v3-master\imgs\eagle.jpg predicted in 1.386 seconds
Objects Detected: bird

E:\condaDev\pytorch-yolo-v3-master\imgs\giraffe.jpg predicted in 1.414 seconds
Objects Detected: zebra giraffe giraffe

E:\condaDev\pytorch-yolo-v3-master\imgs\herd_of_horses.jpg predicted in 1.420 seconds
Objects Detected: horse horse horse horse

E:\condaDev\pytorch-yolo-v3-master\imgs\img1.jpg predicted in 1.406 seconds
Objects Detected: person dog

E:\condaDev\pytorch-yolo-v3-master\imgs\img2.jpg predicted in 1.410 seconds
Objects Detected: train

E:\condaDev\pytorch-yolo-v3-master\imgs\img3.jpg predicted in 1.407 seconds
Objects Detected: car car car car car car car truck traffic light

E:\condaDev\pytorch-yolo-v3-master\imgs\img4.jpg predicted in 1.402 seconds
Objects Detected: chair chair chair clock

E:\condaDev\pytorch-yolo-v3-master\imgs\messi.jpg predicted in 1.401 seconds
Objects Detected: person person person sports ball

E:\condaDev\pytorch-yolo-v3-master\imgs\person.jpg predicted in 1.542 seconds
Objects Detected: person dog horse

SUMMARY

Task : Time Taken (in seconds)

Reading addresses : 0.000
Loading batch : 1.614
Detection (11 images) : 15.668
Output Processing : 0.000
Drawing Boxes : 0.005
Average time_per_img : 1.571

no outputs found in det folder

Calling the model object?

Working on making your detect.py a reusable module, trying to have the script treat images passed as single images instead of lists of images from a directory. Going well so far. The issue I'm on right now: what is this doing?

with torch.no_grad(): prediction = self.model(Variable(batch), self.CUDA)

What I'm confused about is what self.model(Variable(batch), self.CUDA) is doing. It looks like you're calling the self.model object, but i can't see what that does within the Darknet class that defines it.
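Calling an object like a function goes through its `__call__` method, and the traceback quoted elsewhere on this page shows the module's `__call__` running `result = self.forward(*input, **kwargs)`. The toy classes below imitate that dispatch without PyTorch; `TinyModule` is a simplified stand-in, and the real torch.nn.Module also runs hooks around the forward call.

```python
class TinyModule(object):
    """Simplified stand-in for torch.nn.Module."""

    def __call__(self, *args, **kwargs):
        # Calling the object forwards every argument to forward().
        return self.forward(*args, **kwargs)

    def forward(self, *args, **kwargs):
        raise NotImplementedError


class Doubler(TinyModule):
    def forward(self, x, CUDA):
        # CUDA is accepted only to mirror the call in the question; it is unused here.
        return [2 * v for v in x]


model = Doubler()
print(model([1, 2, 3], False))   # [2, 4, 6]
```

So `self.model(Variable(batch), self.CUDA)` ends up in the forward method defined on the Darknet class, with the same two arguments.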

Error when loading own trained weight file

Hi,

I tested loading my own trained weights which I got using darknet's official guide. Basically, I substituted yolov3.weights with my own yolov3-own.weights as the default --weights argument in the code.

This is console's output:

Loading network.....
Traceback (most recent call last):

  File "<ipython-input-3-1d1acba7b8a9>", line 1, in <module>
    runfile('/home/raggot/Projects/test/detect.py', wdir='/home/raggot/Projects/test')

  File "/usr/local/lib/python3.5/dist-packages/spyder/utils/site/sitecustomize.py", line 705, in runfile
    execfile(filename, namespace)

  File "/usr/local/lib/python3.5/dist-packages/spyder/utils/site/sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "/home/raggot/Projects/test/detect.py", line 89, in <module>
    model.load_weights(args.weightsfile)

  File "/home/raggot/Projects/test/darknet.py", line 448, in load_weights
    conv_weights = conv_weights.view_as(conv.weight.data)

  File "/usr/local/lib/python3.5/dist-packages/torch/tensor.py", line 230, in view_as
    return self.view(tensor.size())

RuntimeError: invalid argument 2: size '[256 x 128 x 3 x 3]' is invalid for input with 37347 elements at /pytorch/aten/src/TH/THStorage.c:41

What is the problem here? What are the 37347 elements the error refers to?

Thanks.

can I use this code with yolov3-openimages.weights

Thank you for releasing this repo. Can I use this code with yolov3-openimages.weights (available here - wget https://pjreddie.com/media/files/yolov3-openimages.weights)?

what would be the changes that I would need to make?

thanks,

Problem when running video_demo_half.py

At first, I was surprised to find that video_demo.py supports mp4 video.
But when I run video_demo_half.py, an error occurs:
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same
Maybe there is another set of 16-bit weights which has not been released?

Tensors have different dimensions

Hi,
firstly, thank you for your work on this repo. I tried to run your code, but I get this exception:

Traceback (most recent call last):
  File "/home/lukas/dev/pytorch-yolo-v3/detect.py", line 183, in <module>
    prediction = write_results(prediction, confidence, num_classes, nms = True, nms_conf = nms_thesh)
  File "/home/lukas/dev/pytorch-yolo-v3/util.py", line 189, in write_results
    output = torch.cat(seq, 0)
RuntimeError: invalid argument 0: Tensors must have same number of dimensions: got 1 and 2 at /tmp/pip-w41ywlv_-build/aten/src/THC/generic/THCTensorMath.cu:102

I tried to debug what is happening in the code, but it is not very clear to me. I only noticed that in my case, one tensor's size is 7 and the other is (7,1). I am also adding a screen.

So I have modified the 7 sized tensor into (7,1) dimension but then I get another exception from further code.

Traceback (most recent call last):
  File "/opt/pycharm-2018.1/helpers/pydev/pydevd.py", line 1664, in <module>
    main()
  File "/opt/pycharm-2018.1/helpers/pydev/pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/opt/pycharm-2018.1/helpers/pydev/pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/opt/pycharm-2018.1/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/lukas/dev/pytorch-yolo-v3/detect.py", line 213, in <module>
    objs = [classes[int(x[-1])] for x in output if int(x[0]) == im_id]
  File "/home/lukas/dev/pytorch-yolo-v3/detect.py", line 213, in <listcomp>
    objs = [classes[int(x[-1])] for x in output if int(x[0]) == im_id]
IndexError: list index out of range

Do you know what I am doing wrong, please? It is my first time working with PyTorch, so I am not very experienced and I don't know how to fix it myself.

Using Ubuntu 16.04, Python 3.6, CUDA 9.0, PyTorch 0.4.
Thanks

Number of Images inference

Hi,
Thank you for your work. It is really easier to use PyTorch instead of darknet.
Nevertheless, I tried to run this command on a folder with thousands of images:
python3 detect.py --images img --det det --reso 320 --weights yolov3.weights
And got a Killed error:

Loading network.....
Network successfully loaded
Killed

I don't know why it works for a few images and not for more. I have tried to use a batch size of 1 image but I still get the same error.
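A bare "Killed" on Linux usually comes from the kernel's out-of-memory killer. If the script reads every image into memory before detection starts, which is an assumption since the log does not show it, then the batch size would not change the peak memory use. One workaround under that assumption is to split the folder into groups and process one group at a time; `chunked` is a hypothetical helper.

```python
def chunked(paths, size):
    """Yield successive lists of at most `size` paths."""
    chunk = []
    for path in paths:
        chunk.append(path)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk          # the final, possibly shorter, group


paths = ["img_{}.jpg".format(i) for i in range(5)]
for group in chunked(paths, 2):
    print(group)
```

Each group can then be loaded, run through the detector and released before the next one is read, so memory use depends on the group size and not on the folder size.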
Thanks in advance

Images are labelled in compressed clusters when using own cfg and weights

Hi,

After I successfully loaded my own yolov3-voc-own.weights obtained from training the network with Pascal VOC, referred to the yolov3-voc.cfg I used for training file, and modified classes names and number (from coco.names to voc.names and from 80 to 20) I was surprised at the results.

Please have a look at the following images
Herd of horses: https://ibb.co/iO8WFo
Messi/Inista: https://ibb.co/nMNaao
Guy with poncho and dog: https://ibb.co/j5bU1T

In all of them, there is a pattern of identifying the right combination of objects (incorrectly, but clearly finding something) but in the wrong location, and condensed. My guess is that the recognition happens in one of the 'zoomed' versions of the image, and its identified anchors are not correctly remapped to the original size of the image.

Is this at all reasonable? Do you think this could be a bug in the 'interpretation' of the .cfg file? Or perhaps I'm missing another piece of the puzzle?

Another possible point of code that could possibly be related to this is the input argument of scales, where I feed 1,2,3 as default. Could it be that it should be matched with some properties of my network?

Thanks in advance!
