Comments (5)

ayooshkathuria commented on May 14, 2024

Could you please tell me how many anchors you are trying to predict? A screenshot of the runtime error details would also help.

This line reshapes the prediction feature map into a form that lets the operations be vectorised over all bounding boxes. It is bound to break if you switch to a different number of anchors, since the code is written with the assumption that you have three anchors per cell; with any other number the reshaping calculations won't work, and it'll throw an error. I've described it in Part 3 of the tutorial.
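To make the failure mode concrete, here is a minimal sketch of the shape bookkeeping behind that reshape. It is plain Python rather than the tutorial's actual PyTorch code, and the function name `flattened_prediction_shape` is hypothetical. It assumes the detection feature map has shape (batch, channels, grid, grid) and is flattened to one row per bounding box.

```python
def flattened_prediction_shape(batch, channels, grid, num_anchors, num_classes):
    """Return the shape after flattening a detection map to one row per box.

    Hypothetical helper: it only mirrors the arithmetic of the reshape,
    it does not touch any tensor.
    """
    bbox_attrs = 5 + num_classes           # x, y, w, h, objectness + class scores
    expected = num_anchors * bbox_attrs    # depth the reshape relies on
    if channels != expected:
        raise ValueError(
            f"feature map has {channels} channels, but {num_anchors} anchors "
            f"with {num_classes} classes need {expected}"
        )
    return (batch, grid * grid * num_anchors, bbox_attrs)


# Three anchors, 80 classes, 13 x 13 grid: the reshape is consistent.
print(flattened_prediction_shape(1, 255, 13, 3, 80))

# Five anchors against the same 255-channel map: the sizes no longer match.
try:
    flattened_prediction_shape(1, 255, 13, 5, 80)
except ValueError as err:
    print("reshape would fail:", err)
```

In the real code the mismatch would surface as a runtime error from the tensor reshape rather than a `ValueError`, but the cause is the same arithmetic.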

Why don't you gimme a detailed description, and we can figure something out :).

from yolo_v3_tutorial_from_scratch.

ayooshkathuria commented on May 14, 2024

I don't think we can do much with predicting more than three anchors, or using different anchors at all, without retraining the network. We use the official weights file, whose weights were trained for the anchors given in the cfg file.

Secondly, changing the number of anchors basically changes the number of bounding boxes a cell can predict. This changes the depth of the prediction feature map, which is B * (5 + C), where B is the number of anchors (bounding boxes) each cell predicts and C is the number of classes. In that case, we can't load the official weights file, as we have changed the architecture and therefore the number of weights in the network.
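As a rough illustration of that arithmetic, the sketch below computes the depth B * (5 + C) and the parameter count of a 1x1 convolution (with bias) that produces it. The helper names are hypothetical, and the input channel count of 1024 is an assumption chosen for the example, not a value taken from this thread.

```python
def detection_depth(num_anchors, num_classes):
    """Depth of the prediction feature map: B * (5 + C)."""
    return num_anchors * (5 + num_classes)


def conv1x1_param_count(in_channels, out_channels):
    """Weights plus biases of a 1x1 convolution."""
    return in_channels * out_channels + out_channels


IN_CHANNELS = 1024   # assumed input depth of the conv feeding the detection layer
NUM_CLASSES = 80     # COCO

for num_anchors in (3, 5):
    depth = detection_depth(num_anchors, NUM_CLASSES)
    params = conv1x1_param_count(IN_CHANNELS, depth)
    print(f"B={num_anchors}: depth={depth}, conv parameters={params}")
```

Because the parameter counts differ, a weights file written for B = 3 no longer lines up with a network built for any other B.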

I'm working on the training code, though I don't get a lot of time owing to my undergraduate thesis, which is up for presentation in May. I can consider making training with a different number of anchors an option in the training module, though. However, anchors are generated using the dimension clustering method, so you will probably have to run K-means clustering on the ground truth boxes in your dataset to generate a different number of anchors.

Do let me know if you can't grasp any part. I can link resources where you can read further about YOLO.
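For reference, dimension clustering can be sketched as K-means over (width, height) pairs that assigns each box to the centroid it overlaps most (highest IoU), rather than the nearest one in Euclidean distance. The code below is a minimal, unoptimised sketch under that assumption; the function names, the sample boxes, and the random seeding are all made up for illustration and are not taken from the tutorial repo.

```python
import random


def iou_wh(box, anchor):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iterations=100, seed=0):
    """Cluster (width, height) pairs into k anchors, assigning by highest IoU."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: iou_wh(box, centroids[i]))
            clusters[best].append(box)
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                new_centroids.append((w, h))
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    # Sort by area so that index 0 is the smallest anchor, as in the cfg file.
    return sorted(centroids, key=lambda a: a[0] * a[1])


# Made-up ground truth box sizes in pixels, purely for illustration.
sample_boxes = [(10, 12), (11, 13), (12, 10), (50, 60), (55, 58),
                (52, 62), (200, 180), (210, 190), (190, 200)]
print(kmeans_anchors(sample_boxes, k=3))
```

On a real dataset you would feed in the widths and heights of all labelled boxes, scaled to the network input size, and pick k to match the number of anchors you plan to train with.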

moshanATucsd commented on May 14, 2024

Hi, thanks for the quick reply! The reason for using more than three anchors is that the default 6,7,8 seem to correspond to the coarsest level, and if we want to detect small objects, it may be better to use anchors like 0,1,2 (correct me if I am wrong). If we want to detect a wide range of objects, maybe using more than 3 anchors will help.

Your explanation above is very clear; I can see that for now it's best to use 3 anchors. It's a good idea to run K-means if we want to use it for a specific dataset. Again, thanks so much for your tutorial!

ayooshkathuria commented on May 14, 2024

@moshanATucsd If you look at the cfg file, there are three detection layers, with progressively larger sizes. The anchors 0,1,2 are used for detection at the 3rd detection layer (defined by mask), which is upsampled and concatenated with layer 36, and it helps detect smaller objects. To give you an idea, this is what the final architecture looks like.

[image: final YOLO v3 architecture]
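The mask-to-anchor mapping can also be spelled out in a few lines. This is a sketch, not code from the repo: the anchor and mask values are the ones I recall from the standard yolov3.cfg, so check them against your own cfg file, and the 416 x 416 input size and the helper name `anchors_per_layer` are assumptions.

```python
# Anchor (width, height) pairs, smallest first, as recalled from yolov3.cfg.
ANCHORS = [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),
           (59, 119), (116, 90), (156, 198), (373, 326)]

# One entry per detection layer, in the order they appear in the cfg.
DETECTION_LAYERS = [
    {"mask": [6, 7, 8], "stride": 32},  # 1st detection layer, coarsest grid
    {"mask": [3, 4, 5], "stride": 16},  # 2nd detection layer
    {"mask": [0, 1, 2], "stride": 8},   # 3rd detection layer, finest grid
]


def anchors_per_layer(anchors, layers, input_size=416):
    """Return (grid size, selected anchors) for each detection layer."""
    result = []
    for layer in layers:
        grid = input_size // layer["stride"]
        selected = [anchors[i] for i in layer["mask"]]
        result.append((grid, selected))
    return result


for grid, selected in anchors_per_layer(ANCHORS, DETECTION_LAYERS):
    print(f"{grid} x {grid} grid uses anchors {selected}")
```

Each layer still predicts exactly three boxes per cell; the mask only decides which three of the nine anchors it uses.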

Here's a blog post I wrote over at Medium explaining the changes in YOLO v3: https://towardsdatascience.com/yolo-v3-object-detection-53fb7d3bfe6b

In this post, I've compared what the different detection layers detect. Also, this repo only exists for the tutorial, so I've kept it short, and it won't be updated with training code. The ever-evolving code for YOLO v3 lives in my other repo here. In the readme, you can find a --scales flag to be used with detect.py; with it you can choose which detection layer to use for detections, and isolate a single layer if you want to see what each one predicts.

moshanATucsd commented on May 14, 2024

@ayooshkathuria Thanks for your detailed explanation and the blog post! It makes things more clear to me now. I really appreciate your help!
