can Fully-Convolutional Regression Network (FCRN) algorithm be open source? about synthtext HOT 15 CLOSED

ankush-me commented on July 24, 2024

Can the Fully-Convolutional Regression Network (FCRN) algorithm be open-sourced?

Comments (15)

cjnolet commented on July 24, 2024

+1, it would certainly be nice to see this code. It is described in the paper, but I wasn't able to extract enough detail to recreate it, so I've been using Jaderberg's approach exclusively.

Jayhello commented on July 24, 2024

@cjnolet What about Jaderberg's approach? Where can I find it? Thank you.

cjnolet commented on July 24, 2024

I've implemented it myself using Keras and Ankush's scripts. Unfortunately, I don't have it in open source yet, but it's in the works. Jaderberg does a great job of explaining his CNNs in his dissertation. I found better performance by using the same network topology for the false positive filter that Jaderberg used for his regression CNN (except that I applied softmax to the output and used binary cross entropy as the loss).

biggerlambda commented on July 24, 2024

@cjnolet Please consider posting a synopsis of your steps on GitHub. It would be helpful.

cjnolet commented on July 24, 2024

Ankush, I just re-read your paper, and after doing the regression CNN for the bounding box refinement, I think I have a better grasp of the CNN that you've built for the initial bounding box proposals. If I understand correctly, the input image is passed in all at once and the convolutional filters generate outputs for every cell in the image at the same time. Each cell's output proposes both the likelihood that the cell contains a word and the relative dimensions/coordinates/rotation of that word.

Also, if I understand correctly, you use a standard mean squared error loss function, modified slightly to ignore the first 6 outputs if the 'c' output marks a negative exemplar. (Actually, could I get away with just zeroing out the other 6 parameters instead of providing my own modified loss function?) If this is correct, I don't think it would take me long to build and train on my own. I'm having some major performance issues (speed, not recall) using EdgeBoxes and SelectiveSearch, and if I can achieve good recall with your FCRN, I think that's definitely the way to go.

Thanks again for your input. Your work has been phenomenal thus far and very important to my current daily tasking. I'm hoping my company can give me the proper approval to open source what I've done so far (which basically uses your work and Max's work, but with Unicode character sets).

cjnolet commented on July 24, 2024

Ankush,

I believe I have this network coded up in Keras how you described it in the paper. One thing that would be nice would be to know if I coded up the loss layer properly.

Would you mind providing some quick pseudocode in this ticket for how you determined the final loss and the weighted loss? I believe your final loss was multiplied by 0.01 if c == 0 (and also only the c values were used in the calculation of the loss by that point) where the weighting was increased during each epoch. If c == 1, I believe you calculated the loss as usual. Does this sound right?

Thank you!

ankush-me commented on July 24, 2024

@Jayhello : We are in the process of sharing the FCRN; it might still take a few weeks.

@cjnolet : Your understanding of FCRN and the loss is correct. FCRN is a fully-convolutional network which takes in the whole image, and for each cell regresses a confidence score and bounding box dimensions.

The loss is again as you describe in your second comment: L2 loss on all parameters (confidence + bb-params), with no loss on bb-params if c==0. Further, the confidence score losses for c==0 cells are weighted down initially (to balance the negatives against the small number of positives), and this weight is gradually increased to 1 during training. For the positive cells, the L2 loss applies to both confidence and bb-params.
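For readers following along, the loss described above can be sketched in plain NumPy. This is only an illustration, not the authors' implementation: the function names `fcrn_loss` and `neg_weight_schedule`, the channel-first layout with the confidence at index 6, the normalisation by batch size, and the linear ramp are all assumptions. The thread states only that the negative weight starts small (0.01 is the value mentioned earlier) and is gradually increased to 1.

```python
import numpy as np

def fcrn_loss(y_pred, y_true, neg_weight):
    """Masked L2 loss for arrays of shape (N, 7, H, W).

    Channels 0-5 hold the bounding-box parameters and channel 6 holds the
    confidence c (1 for positive cells, 0 for negative cells). This layout
    is an assumption made for the sketch.
    """
    sq_err = (y_pred - y_true) ** 2
    pos = y_true[:, 6:7, :, :]           # (N, 1, H, W): 1 on positive cells

    weights = np.empty_like(sq_err)
    weights[:, :6] = pos                 # bb loss: kept on positives, zero on negatives
    weights[:, 6:7] = np.where(pos > 0, 1.0, neg_weight)  # score loss: discounted on negatives

    # Dividing by the batch size is an arbitrary choice for this sketch.
    return float((weights * sq_err).sum() / y_true.shape[0])

def neg_weight_schedule(epoch, num_epochs, start=0.01):
    """Hypothetical linear ramp of the negative-score weight from `start` to 1."""
    frac = min(1.0, epoch / max(1, num_epochs - 1))
    return start + (1.0 - start) * frac
```

A training loop would then call `fcrn_loss(pred, target, neg_weight_schedule(epoch, num_epochs))` once per batch; any other monotone schedule that ends at 1 would fit the description equally well.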

Jayhello commented on July 24, 2024

Thank you very much! If you implement it, can you give us a notice?

ankush-me commented on July 24, 2024

Certainly.

cjnolet commented on July 24, 2024

I got the approval to release my Keras code for my FCRN + Loss Function and I've posted it here:

https://github.com/cjnolet/textbox_proposals_fcrn

I'm hoping that some of you who want to use it could let me know whether you think the math looks correct (given how Theano calculates the gradients behind the scenes). Ankush, I don't know if you are familiar with Keras, but the loss function is right underneath the imports, and it would be nice to get quick feedback from you as well.

If you want to use this, it assumes you have a directory of h5py dbs, each with a group called "/data", where each record's data is the numpy array of the (greyscaled) 512x512 input image and an attribute called "label" holds a 16x16x7 tensor representing the training label. (Note that it's 16x16x7 only because it was easier for me to create this and transpose it than it was to create the 7x16x16 from the start.)
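As a rough illustration of that label layout, the sketch below builds a 16x16x7 label and transposes it to 7x16x16. Several details are assumptions and not taken from the repository: the helper name `make_label`, assigning a word to the cell that contains its centre, and the placeholder 6-value pose vector. The only facts taken from the comment above are the 512x512 input, the 16x16 grid (so each cell covers 32x32 pixels), and the 7 values per cell.

```python
import numpy as np

IMAGE_SIZE = 512
GRID = 16
CELL = IMAGE_SIZE // GRID  # each cell covers 32x32 pixels

def make_label(words):
    """Build a (16, 16, 7) label from (cx, cy, pose) tuples.

    `pose` is a placeholder 6-value vector; the real encoding of the
    bounding-box parameters is not specified here. Index 6 is the
    confidence c.
    """
    label = np.zeros((GRID, GRID, 7), dtype=np.float32)
    for cx, cy, pose in words:
        row, col = int(cy) // CELL, int(cx) // CELL  # assumed: cell containing the centre
        label[row, col, :6] = pose
        label[row, col, 6] = 1.0
    return label

# One hypothetical word centred at pixel (x=100, y=40).
label_hwc = make_label([(100, 40, [0.1, 0.2, 0.3, 0.4, 1.0, 0.0])])

# Channel-last (16, 16, 7) to channel-first (7, 16, 16).
label_chw = np.transpose(label_hwc, (2, 0, 1))
```

Building the label channel-last and transposing once at the end keeps the per-cell assignment readable, which matches the motivation given in the comment.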

ankush-me commented on July 24, 2024

Hi @cjnolet, thank you for this contribution!

I looked through the loss function; I haven't worked with Keras before, so I can only guess that you are discounting the loss on bounding box coordinates for the negative cells here (https://github.com/cjnolet/textbox_proposals_fcrn/blob/master/train_model.py#L53). Is this correct?
Only the "score" loss on the negatives should be discounted, and the bounding box loss for negatives should be ignored.

Again, thanks for sharing.

cjnolet commented on July 24, 2024

Ankush, unfortunately, since Keras uses the Theano framework underneath, I was limited to implementing the loss function as a series of lazy tensor transformations rather than being able to apply a conditional directly. I think this is nice because it lets Theano figure out the gradient calculation; however, it does make the operations very hard to read.

Basically, the feature map at index 6 (the last one) in the y_true tensor is the 'c' matrix. I slice out the c-matrix, which should only contain values of 0 or 1, and I multiply it (element-wise) by feature maps 0-5 in the tensor containing the squared loss values. This zeros out all the pose loss values that correspond to negative c-values and keeps all the pose loss values that correspond to positive c-values in the c-matrix.

Next, I change all the 0's in the c-matrix to the discount amount (0.01 for the first iteration, increasing in subsequent iterations) and multiply the feature map at index 6 of the squared loss tensor by the new c-matrix, so that the loss values that correspond to a positive confidence value are kept unchanged while the loss values that correspond to negative confidence values are discounted. Then the feature maps are concatenated back together into a 16x16x7 tensor and returned.

I *believe* this is correct. I've been training throughout the day, and it looks like it's learning and getting decent validation accuracy. I'll report back if I have any significant issues. Thanks again, Ankush, for checking this out! I hope others might be able to use this. I mostly just wanted to get this onto GitHub so that if anybody finds errors, they can be fixed.

ankush-me commented on July 24, 2024

Thanks @cjnolet. Good to know that it is training with this loss.

Can you please help me understand the code again:

c = y_true[i, 6, :,:].reshape((1, delta, delta))   # The last feature map in the true vals is the 'c' matrix
final_c = (c * loss[i,6,:,:])
c = T.set_subtensor(c[(c<=0.0).nonzero()], d.get_value())
# Element-wise multiply of the c feature map against all feature maps in the loss
final_loss_parts = [(c * loss[i, j, :, :].reshape((1, delta, delta))).reshape((1, delta, delta)) for j in range(0, 6)]

1. On line 2 above, you multiply c with the final channel of the loss (i.e., the score), which zeros out the score loss for the negatives. Shouldn't you multiply by the discount factor (d in your code) instead? Note that the negatives in c are set to the discount factor only later, on line 3 above.
2. On line 4, you multiply the bounding-box losses corresponding to the negatives by d. Shouldn't you multiply by 0 (or just c)?
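To make the two points concrete, here is a small NumPy sketch (not the repository's Theano code) that compares the per-cell weights implied by the quoted snippet with the weights the questions above suggest. The function names are hypothetical, and `c` and `d` are plain arrays and floats standing in for the Theano tensors of the same names.

```python
import numpy as np

def weights_as_posted(c, d):
    """Weights implied by the quoted snippet.

    The score channel is multiplied by the original c (zero on negatives),
    and the bounding-box channels by c with its zeros replaced by d.
    """
    score_w = c.copy()
    bb_w = np.where(c <= 0.0, d, c)
    return bb_w, score_w

def weights_as_suggested(c, d):
    """Weights following the suggestion: the roles of the two masks swap."""
    bb_w = c.copy()                     # bb loss ignored on negatives
    score_w = np.where(c <= 0.0, d, c)  # score loss discounted on negatives
    return bb_w, score_w

c = np.array([[1.0, 0.0]])  # one positive cell, one negative cell
d = 0.01                    # discount factor

posted_bb, posted_score = weights_as_posted(c, d)
fixed_bb, fixed_score = weights_as_suggested(c, d)
```

For the negative cell, the posted version gives the score loss a weight of 0 and the bounding-box loss a weight of `d`, while the suggested version gives the score loss a weight of `d` and the bounding-box loss a weight of 0.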

cjnolet commented on July 24, 2024

Ankush, you are exactly right. That was an oversight on my part, lol. My solution is already confusing as it is but, as you pointed out, I was multiplying by the wrong 'c' matrices. My understanding was correct but my implementation was not. I will fix this asap. Thanks again for looking at this code! I really appreciate you taking the time to foster a community around this!

ankush-me commented on July 24, 2024

Thank you for the clarification, and again for sharing your code!
