recurrent-attention-cnn's Issues

Is the code and model link still available?

I tried to get the model and source code from the link in the README, but the link was not accessible. If someone has already obtained the source code, would you mind giving me a backup?

A question about training with rank loss

@Jianlong-Fu Hi Jianlong, I have your train_val_cnn.prototxt and code (including rank_loss2_layer), but when I start training, it does not work. It just stalls at:
I0524 11:50:19.434263 13879 net.cpp:242] This network produces output accuracy
I0524 11:50:19.434267 13879 net.cpp:242] This network produces output loss1
I0524 11:50:19.434268 13879 net.cpp:242] This network produces output loss2
I0524 11:50:19.434273 13879 net.cpp:242] This network produces output loss3
I0524 11:50:19.434274 13879 net.cpp:242] This network produces output rank
I0524 11:50:19.434320 13879 net.cpp:255] Network initialization done.
I0524 11:50:19.434532 13879 solver.cpp:56] Solver scaffolding done.
I0524 11:50:19.438776 13879 caffe.cpp:249] Starting Optimization
I0524 11:50:19.438781 13879 solver.cpp:273] Solving RA_CNN
I0524 11:50:19.438783 13879 solver.cpp:274] Learning Rate Policy: step

What could be the problem? I really need your help. Thank you very much!

Vanishing gradient issue in APN

I am trying to re-implement this experiment in pytorch.
However, the weights of the APN (Attention Proposal Network) are not updated because the gradients are extremely small. I think this issue comes from the logistic function in eq. (5): in its flat regions the gradient is almost zero.
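The saturation can be checked numerically: away from the point where the argument of the logistic is zero, its derivative collapses. A minimal NumPy sketch (the slope k = 10 is an assumed value, not necessarily the one used in the paper):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Assumed steep slope k for the boxcar-like mask of eq. (5)
k = 10.0
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])

# d/dx sigmoid(k * x) = k * s * (1 - s)
s = sigmoid(k * x)
grad = k * s * (1.0 - s)
print(grad)  # only x = 0 has a sizeable gradient; elsewhere it is ~0
```

At x = 0 the derivative is k/4 = 2.5, but one unit away it is already below 1e-3, which would explain the frozen APN weights.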

In the paper, the authors pretrained the APN using the last convolutional features. Did you record the performance without this initialization?

Thank you.

Implementation in pytorch

Hi

I am working on reproducing this paper in PyTorch, but I am stuck at pre-training the APN network.

The original code doesn't give details about learning the APN network (step 2), nor about the convergence condition: if the loss fluctuates forever, when should I stop training?

Has anyone made progress in reproducing this? The test code alone is of no use for reproducing the results. How can we try RA-CNN on other public datasets?

If anyone is interested in reproducing this, please contact me; we can discuss the training details further.

Has anyone achieved the reported performance? I managed to run it, but only reached 0.78 accuracy.

To prepare the lmdb data, I first resized the short side to 448 while keeping the original aspect ratio in Matlab, and saved the images to disk. Then I used the official Caffe tools to generate test_shortside448.lmdb. However, it failed in the end.
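For reference, the short-side resize described above amounts to the following size computation; a Python sketch of the dimension arithmetic only (the actual resampling was done in Matlab):

```python
def short_side_resize(w, h, target=448):
    """New (width, height) after scaling so the shorter side equals
    `target` while keeping the original aspect ratio."""
    scale = target / min(w, h)
    return round(w * scale), round(h * scale)

print(short_side_resize(500, 375))  # -> (597, 448)
```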

I tried two different settings: class labels starting from 0 and class labels starting from 1. Neither worked. Is there any other detail I missed?


 I0427 20:00:13.632690  7424 net.cpp:255] Network initialization done.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 1064471629
I0427 20:00:15.885128  7424 caffe.cpp:291] Running for 50 iterations.
I0427 20:00:16.072052  7424 caffe.cpp:314] Batch 0, accuracy1+2 = 0
I0427 20:00:16.072088  7424 caffe.cpp:314] Batch 0, accuracy1+2+3 = 0
...
I0427 20:00:18.361052  7424 caffe.cpp:314] Batch 49, accuracy1+2 = 0
I0427 20:00:18.361081  7424 caffe.cpp:314] Batch 49, accuracy1+2+3 = 0
I0427 20:00:18.361088  7424 caffe.cpp:319] Loss: 0
I0427 20:00:18.361097  7424 caffe.cpp:331] accuracy1+2 = 0
I0427 20:00:18.361104  7424 caffe.cpp:331] accuracy1+2+3 = 0.02

Original implementation code

Thanks for sharing. I have downloaded the file from Google Drive; it is hard to obtain for a student in China. How can I get the training code? (I couldn't find it in the file. Can you tell me the location? I am not familiar with Caffe.) Thank you all the same.

Paper's VGG-19 accuracy question

Hi, first of all thanks for your great work!

In your paper you cite the VGG-19 [27] model and state that it achieves 77.8% accuracy on the CUB-200-2011 dataset. Can you please give some more info about this? Are you referring to the model trained only on ImageNet? Or a model fine-tuned by you, or fine-tuned by someone else? Is it the Caffe model?

And if you did train it, can you share some details such as batch size, learning rate, number of training epochs, and data augmentation?

Thanks,
Andrea

Missing Windows files under the RA_CNN_caffe folder

I downloaded the relevant files, but compiling Caffe fails with: error LNK2001: unresolved external symbol "public: __cdecl caffe::SolverRegisterer::SolverRegisterer(class std::basic_string<char,struct std: :char_traits,class std::allocator > const &,class caffe::Solver * (__cdecl*)(class caffe::SolverParameter const &))" (??0?$SolverRegisterer@ M@caffe@@qeaa@AEBV?$basic_string@DU?$char_traits@D@std@@v?$allocator@D@2@@std@@P6APEAV?$Solver@M@1@AEBVSolverParameter@1@@z @z) in D:\caffe\Recurrent-Attention-CNN\RA_CNN_caffe\windows\classification\adadelta_solver.obj.
Is this because I am missing a related Windows file?

Implement RA-CNN in tensorflow

How do I implement formula (4) in the paper?
If we have three tensors tx, ty, and tl, each with shape (None, 1), how do we get the corresponding mask M?

In python, we can use two loops as:

for x in tx:
    for y in ty:
        mask[x, y] = hx(x, y)

Then how can this be implemented in TensorFlow?
Thank you.
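Instead of explicit loops, eq. (4) can be evaluated with broadcasting; the same arithmetic carries over to TensorFlow tensors. A NumPy sketch for a single sample, assuming the mask is the product of shifted logistic "boxcars" with an assumed slope k = 10:

```python
import numpy as np

def sigmoid(x, k=10.0):
    # Steep logistic used to approximate a step function (k is assumed)
    return 1.0 / (1.0 + np.exp(-k * x))

def attention_mask(tx, ty, tl, size):
    """Boxcar-style mask: product of differences of shifted sigmoids,
    high inside the square [tx - tl, tx + tl] x [ty - tl, ty + tl]."""
    xs = np.arange(size)[:, None]   # column of x coordinates
    ys = np.arange(size)[None, :]   # row of y coordinates
    mx = sigmoid(xs - (tx - tl)) - sigmoid(xs - (tx + tl))
    my = sigmoid(ys - (ty - tl)) - sigmoid(ys - (ty + tl))
    return mx * my                  # broadcasting -> (size, size) mask

m = attention_mask(20.0, 20.0, 5.0, 40)
print(m[20, 20], m[0, 0])  # ~1 inside the attended square, ~0 outside
```

For a batch of shape (None, 1) parameters, the same expressions work if the coordinate grids are given an extra leading axis so they broadcast against the batch dimension.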

Equation 7 in the paper confused me

Equation 7 in the paper confuses me. If the || is interpreted as absolute value, then the linear map coefficients get larger as the indices i and j get larger. How should I interpret this equation?

The effect of softmax loss and rank loss

When training the APN layers, it seems that the rank loss tries to put the three scales' softmax losses in descending order and to enlarge the gaps among them. On the contrary, when training only the convolutional/classification layers with the sum of the softmax losses, the softmax loss at every scale tends to become equal, which means the gaps among them are narrowed.

Is this reasonable for training? Although each training stage only updates its corresponding parameters, I still doubt whether these two stages cancel each other out.
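For concreteness, the pairwise reading of the rank loss described above, applied to the true-class probabilities of two adjacent scales, can be sketched as follows (the margin value is an assumption, not taken from the paper):

```python
import numpy as np

def rank_loss(p_coarse, p_fine, margin=0.05):
    """Hinge-style ranking loss between adjacent scales: zero only when
    the finer scale's true-class probability exceeds the coarser one by
    at least `margin`, otherwise it pushes the fine scale upward."""
    return np.maximum(0.0, p_coarse - p_fine + margin)

print(rank_loss(0.6, 0.9))  # finer scale already more confident -> 0.0
print(rank_loss(0.8, 0.7))  # ordering violated -> positive loss
```

Under this form, the rank loss only acts when the ordering is violated by more than the margin, so the two training stages do not strictly cancel: the softmax stage can equalize losses while the rank stage only enforces the ordering constraint.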

Understanding the pipeline in the paper

I plan to reproduce RA-CNN in TensorFlow, but a few of the intermediate steps seem hard to implement. Here is my understanding of the whole pipeline; I hope someone can check whether it is correct:
1. Classify with a plain VGG and fine-tune it until the classification accuracy stops improving.
2. Freeze the VGG parameters and feed the output of its last convolutional layer into the APN, which outputs the three localization parameters.
3. Crop the image, keep classifying with the VGG frozen in step 1, and optimize the APN using the loss from equation (8) (which is actually computed by the VGG), updating the three localization parameters.
4. Repeat steps 2 and 3 until the VGG's classification accuracy stops improving.
5. Repeat steps 1-4.
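The alternating schedule in the steps above can be sketched as a skeleton; `train_classifier`, `train_apn`, and `evaluate` are hypothetical stand-ins for the actual training code, and the stopping rule is one possible interpretation of "until accuracy stops improving":

```python
def alternate_train(train_classifier, train_apn, evaluate,
                    max_rounds=10, tol=1e-4):
    """Alternate the two optimization phases until accuracy plateaus."""
    best = evaluate()
    history = [best]
    for _ in range(max_rounds):
        train_classifier()       # step 1: fine-tune VGG, APN frozen
        train_apn()              # steps 2-3: optimize APN, VGG frozen
        acc = evaluate()
        history.append(acc)
        if acc - best <= tol:    # steps 4-5: stop when no improvement
            break
        best = acc
    return history
```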

Some questions about the APN crop

I have some questions about the APN crop. (tx, ty) is the center point of the attended region and tl is half of its side length; below, "tl" also denotes the top-left corner of the region and "br" its bottom-right corner. The code computes the x coordinates of the cropped region's corners as (tx(tl), tx(br)) and the y coordinates as (ty(tl), ty(br)):
tx(tl) = tx - tl, ty(tl) = ty - tl
tx(br) = tx + tl, ty(br) = ty + tl
but I think it should really be:
tx(tl) = tx - tl, ty(tl) = ty + tl
tx(br) = tx + tl, ty(br) = ty - tl
Could you tell me where my reasoning goes wrong?
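One likely resolution (an assumption about the source of the confusion): image and feature-map coordinates put the origin at the top-left with y increasing downward, so the top edge has the smaller y value and the code's version is consistent. The "corrected" version assumes the mathematical convention with y pointing up. A minimal sketch:

```python
def crop_box(tx, ty, tl):
    """Corners of the attended square in image coordinates (origin at
    the top-left, y grows downward): smaller y is higher in the image,
    so the top-left corner uses ty - tl, not ty + tl."""
    x_tl, y_tl = tx - tl, ty - tl   # top-left corner
    x_br, y_br = tx + tl, ty + tl   # bottom-right corner
    return (x_tl, y_tl), (x_br, y_br)

print(crop_box(100, 100, 30))  # -> ((70, 70), (130, 130))
```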
