loss function about synthtext (closed)

ankush-me commented on August 27, 2024
loss function

from synthtext.

Comments (9)

ankush-me commented on August 27, 2024

Hi,
The loss function is exactly the same as the one in eq 2 of the following version of the YOLO paper:
https://arxiv.org/pdf/1506.02640v1.pdf

sebastiangonsal commented on August 27, 2024

But that loss does not take into account the orientation of the boxes used in your paper, does it?

ankush-me commented on August 27, 2024

Those are two additional L2 loss terms on the sine and cosine of the orientation angle.
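
For readers following along, here is a minimal sketch of what two such terms could look like. It assumes the network outputs one sine and one cosine value per box and that the target angle is given in radians. The name `orientation_l2_loss` and its arguments are hypothetical and are not taken from the SynthText code or the paper.

```python
import math


def orientation_l2_loss(pred_sin, pred_cos, true_angle):
    """Sum of two squared-error terms, one on sin and one on cos of the angle.

    pred_sin, pred_cos: hypothetical network outputs for one box.
    true_angle: ground-truth orientation in radians (an assumption here).
    """
    sin_term = (pred_sin - math.sin(true_angle)) ** 2
    cos_term = (pred_cos - math.cos(true_angle)) ** 2
    return sin_term + cos_term


# Example: compare a prediction for angle 0.3 against targets of 0.3 and 1.0.
angle = 0.3
print(orientation_l2_loss(math.sin(angle), math.cos(angle), 0.3))
print(orientation_l2_loss(math.sin(angle), math.cos(angle), 1.0))
```

Regressing the sine and cosine rather than the raw angle keeps the target continuous when the angle wraps around a full turn. In a full loss these two terms would be added to the YOLO-style terms, presumably only for cells that contain an object.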

sebastiangonsal commented on August 27, 2024

Thanks. My loss oscillates a lot from iteration to iteration. Was your loss already small in the early iterations? I am seeing a loss of 150 per image after 170 iterations with a batch size of 16.

MansourTrabelsi commented on August 27, 2024

Hello, could you please help me?
I get this error:
Traceback (most recent call last):
  File "gen.py", line 19, in <module>
    from synthgen import *
  File "/home/mansour/SynthText/synthgen.py", line 20, in <module>
    import text_utils as tu
  File "/home/mansour/SynthText/text_utils.py", line 12, in <module>
    from pygame import freetype
ImportError: cannot import name freetype

sebastiangonsal commented on August 27, 2024

@MansourTrabelsi This is unrelated to this issue.

sebastiangonsal commented on August 27, 2024

@ankush-me I am using a batch size of 16, images of size 512, and a grid of 32 for each image. For grid boxes that do not contain an object, I use a weight that starts at 0.01 and increases to 1.0. I increase the weight geometrically, as you noted in another issue. However, I still end up with zero probabilities for all boxes in the image. Can you think of anything that might help?
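
To make the schedule concrete, this is a rough sketch of a geometric ramp from 0.01 to 1.0, written as an illustration only. The function `no_object_weight` and the `total_steps` parameter are hypothetical names; the thread does not state the actual rate or the number of steps used.

```python
def no_object_weight(step, total_steps, start=0.01, end=1.0):
    """Geometric interpolation of the no-object weight from start to end.

    The weight is multiplied by a constant factor at every step, so it
    reaches `end` after `total_steps` steps and stays there afterwards.
    """
    if total_steps <= 0:
        return end
    # Fraction of the ramp completed, clamped to [0, 1].
    t = min(max(step / total_steps, 0.0), 1.0)
    return start * (end / start) ** t


# Print the weight at a few points of a hypothetical 10000-step ramp.
for step in (0, 2500, 5000, 7500, 10000):
    print(step, no_object_weight(step, total_steps=10000))
```

With this form, a larger `total_steps` gives a slower rate of increase, and a smaller `start` lowers the initial weight.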

ankush-me commented on August 27, 2024

@sebastiangonsal I did not notice any oscillation. First, it helps to also monitor performance during training by calculating the precision and recall of the predictions in every batch.
Second, if the probability is going to zero, there could be two reasons:

(1) A common error when implementing FCRN is that the axes of the labels and the images do not correspond, i.e., the x and y axes are either transposed or flipped. I would first make sure that this is not the case.
(2) The rate of increase is too high. I do not recall the rate at which I increased the weight, but maybe try decreasing it. I do remember that if the rate was too high, the network would collapse to all zeros right at the beginning and would not recover (i.e., recall would go to zero). Are you experiencing that? If so, try lowering the initial 0.01 further. If recall is non-zero in the beginning, then try increasing the weight at a lower rate.
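
The per-batch monitoring suggested above could be sketched as follows. This is a simplified, cell-wise version that assumes the predictions and labels have been flattened into one probability and one 0/1 label per grid cell; it does not match predicted boxes to ground-truth boxes by overlap. The name `batch_precision_recall` and the 0.5 threshold are assumptions.

```python
def batch_precision_recall(pred_probs, labels, threshold=0.5):
    """Cell-wise precision and recall for one batch.

    pred_probs: predicted object probabilities, one per grid cell.
    labels: ground-truth values (1 if the cell contains an object, else 0).
    """
    tp = fp = fn = 0
    for prob, label in zip(pred_probs, labels):
        detected = prob >= threshold
        if detected and label:
            tp += 1
        elif detected and not label:
            fp += 1
        elif not detected and label:
            fn += 1
    # Report 0.0 instead of dividing by zero when nothing is detected
    # or when the batch contains no positive cells.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# A collapsed network that predicts zero everywhere shows up as zero recall.
print(batch_precision_recall([0.0, 0.0, 0.0, 0.0], [1, 0, 1, 0]))
```

Logging these two numbers next to the loss makes the collapse described in (2) visible early: recall drops to zero while the loss may still look reasonable.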

ankush-me commented on August 27, 2024

@MansourTrabelsi Please check #24.
