Comments (5)
Could you please provide details on how many anchors you are trying to predict? A screenshot of the runtime error details would also help.
This line reshapes the prediction feature map into a form that lets the operations be vectorised over all bounding boxes. It is bound to break if you switch to a different number of anchors, because the code is written with the assumption that you have three anchors per cell; with any other number the reshaping arithmetic no longer works out, and it throws an error. I've described it in Part 3 of the tutorial.
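A minimal sketch of the shape arithmetic behind that reshape, written in plain Python rather than with tensors. The function name `flattened_prediction_shape` and the example values (a 13 x 13 grid, depth 255, 80 classes) are assumptions chosen for illustration, not code taken from the tutorial.

```python
def flattened_prediction_shape(batch, depth, grid, num_anchors, num_classes):
    """Return the shape a detection feature map takes once every box has its own row.

    The map arrives as (batch, depth, grid, grid). Reshaping it to
    (batch, grid * grid * num_anchors, 5 + num_classes) only works if
    depth equals num_anchors * (5 + num_classes).
    """
    bbox_attrs = 5 + num_classes          # x, y, w, h, objectness + class scores
    expected_depth = num_anchors * bbox_attrs
    if depth != expected_depth:
        raise ValueError(
            f"depth {depth} does not match {num_anchors} anchors x "
            f"{bbox_attrs} attributes = {expected_depth}"
        )
    return (batch, grid * grid * num_anchors, bbox_attrs)


# Three anchors per cell, as the code assumes: the arithmetic works out.
shape = flattened_prediction_shape(batch=1, depth=255, grid=13, num_anchors=3, num_classes=80)
print(shape)  # (1, 507, 85)

# Asking for five anchors against the same 255-deep map raises ValueError.
```

The point of the check is that the anchor count is tied to the depth of the feature map, so changing one without the other makes the reshape impossible.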
Why don't you give me a detailed description, and we can figure something out :).
from yolo_v3_tutorial_from_scratch.
I don't think we can do much with predicting more than three anchors, or with using different anchors at all, without retraining the network. We use the official weights file, whose weights only work for the anchors given in the cfg file.
Secondly, changing the number of anchors changes the number of bounding boxes a cell can predict. This changes the depth of the prediction feature map, which is B * (5 + C), where B is the number of anchors (bounding boxes) each cell predicts and C is the number of classes. In that case, we can't load the official weights file, as we have changed the architecture and therefore the number of weights in the network.
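To make the weight mismatch concrete, here is a rough sketch that computes B * (5 + C) and the parameter count of a 1 x 1 convolution feeding a detection layer. The helper names are hypothetical, and the 1024 input channels and 80 classes are assumptions about the standard configuration, so treat the numbers as illustrative.

```python
def detection_depth(num_anchors, num_classes):
    """Depth of the prediction feature map: B * (5 + C)."""
    return num_anchors * (5 + num_classes)


def conv1x1_param_count(in_channels, out_channels):
    """Weights plus biases of a 1 x 1 convolution with a bias term."""
    return in_channels * out_channels + out_channels


IN_CHANNELS = 1024   # assumed channel count entering the detection convolution
NUM_CLASSES = 80     # assumed number of classes

depth_three = detection_depth(3, NUM_CLASSES)   # 255
depth_five = detection_depth(5, NUM_CLASSES)    # 425

params_three = conv1x1_param_count(IN_CHANNELS, depth_three)  # 261375
params_five = conv1x1_param_count(IN_CHANNELS, depth_five)    # 435625

print(depth_three, params_three)
print(depth_five, params_five)
```

A weights file written for the three-anchor layer holds fewer values than the five-anchor layer expects, which is why it can no longer be loaded as-is.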
I'm working on the training code, though I don't get much time for it because my undergraduate thesis is up for presentation in May. I can consider making training with a different number of anchors an option in the training module, though. However, anchors are generated using the dimension clustering method, so you will probably have to run K-means clustering on the ground truth boxes in your dataset to generate a different number of anchors.
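A minimal, pure-Python sketch of what clustering ground truth box dimensions could look like. It assumes boxes are given as (width, height) pairs and uses IoU between boxes aligned at a common corner as the similarity, which is how dimension clusters are usually described; the function names and the toy boxes are made up for illustration and are not part of this repo.

```python
import random


def iou_wh(box, centroid):
    """IoU of two boxes compared by width and height only (aligned at one corner)."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=100, seed=0):
    """Cluster (width, height) pairs into k anchors, assigning by highest IoU."""
    if k > len(boxes):
        raise ValueError("k cannot exceed the number of boxes")
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                mean_w = sum(b[0] for b in cluster) / len(cluster)
                mean_h = sum(b[1] for b in cluster) / len(cluster)
                new_centroids.append((mean_w, mean_h))
            else:
                # Keep the old centroid if no box was assigned to it.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Sort by area so the smallest anchors come first.
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Toy (width, height) pairs in pixels, invented for this example.
toy_boxes = [(12, 15), (14, 18), (11, 13), (60, 45), (58, 50),
             (65, 40), (200, 180), (210, 190), (190, 170)]
anchors = kmeans_anchors(toy_boxes, k=3)
print(anchors)
```

On a real dataset the boxes would first be scaled to the network input resolution, and the resulting anchors would still require retraining the network.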
Do let me know if any part is unclear. I can link resources where you can read further about YOLO.
Hi, thanks for the quick reply! The reason for using more than three anchors is that the default 6,7,8 seem to correspond to the coarsest level, and if we want to detect small objects, it may be better to use anchors like 0,1,2 (correct me if I am wrong). If we want to detect a wide range of objects, using more than 3 anchors might help.
Your explanation above is very clear; I can see that for now it's best to use 3 anchors. It's a good idea to run K-means if we want to use it for a specific dataset. Again, thanks so much for your tutorial!
@moshanATucsd If you look at the cfg file, there are three detection layers, with progressively larger sizes. The anchors 0,1,2 are used for detection at the 3rd detection layer (defined by mask), which is upsampled and concatenated with layer 36, and this helps detect smaller objects. To give you an idea, this is what the final architecture looks like.
Here's a blog post I wrote over at Medium explaining the changes in YOLO v3. https://towardsdatascience.com/yolo-v3-object-detection-53fb7d3bfe6b.
In this post, I've compared what the different detection layers detect. Also, this repo is meant only for the tutorial, so I've kept it short, and it won't be updated with training code. The ever-evolving code for YOLO v3 lives in my other repo. In the readme, you can find a --scales flag to be used with detect.py; with it you can choose which detection layer you want to use for detections, and isolate a layer if you want to see what each one predicts.
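The mask-to-anchor mapping discussed in this thread can be sketched as follows. The anchor list is the one from the standard yolov3.cfg as far as I recall it, so check it against your own cfg file; the helper name `anchors_for_mask` is hypothetical.

```python
# Anchor (width, height) pairs, assumed to match the standard yolov3.cfg.
ANCHORS = [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),
           (59, 119), (116, 90), (156, 198), (373, 326)]

# Mask per detection layer, in the order the layers appear in the cfg file.
MASKS = {
    "detection layer 1 (coarsest grid)": [6, 7, 8],
    "detection layer 2": [3, 4, 5],
    "detection layer 3 (finest grid)": [0, 1, 2],
}


def anchors_for_mask(anchors, mask):
    """Pick the anchors a detection layer uses, given its mask indices."""
    return [anchors[i] for i in mask]


for name, mask in MASKS.items():
    print(name, anchors_for_mask(ANCHORS, mask))

small_object_anchors = anchors_for_mask(ANCHORS, MASKS["detection layer 3 (finest grid)"])
```

Each layer still predicts exactly three boxes per cell; the mask only decides which three of the nine anchors that layer uses, so the small anchors 0,1,2 are already in play at the finest grid.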
@ayooshkathuria Thanks for your detailed explanation and the blog post! It makes things more clear to me now. I really appreciate your help!