recurrent-attention-cnn's Issues
Is the code and model link still available?
I tried to get the model and source code from the link in the README, but the link is not accessible. If someone has the source code, would you mind giving me a backup?
Original implementation code
Thanks for your sharing. I have downloaded the file from Google Drive, which is quite hard for a student in China. How can I get the training code? (I couldn't find it in the file — can you tell me its location? I am not familiar with Caffe.) Thank you all the same.
Some questions about training and testing
When will you release the solver.prototxt file for training, and the synset.txt file for CUB-200 so that we can test the model?
Vanishing gradient issue in APN
I am trying to re-implement this experiment in pytorch.
However, the weights of the APN (Attention Proposal Network) are not updated because the gradients are extremely small.
I think this issue comes from the logistic function in Eq. (5): its flat regions make the gradients almost zero.
In the paper, the authors pretrain the APN using the last CNN features. Did you record the performance without this initialization?
Thank you.
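For anyone hitting the same issue, the flat-gradient behaviour is easy to check numerically. A minimal sketch (the slope factor k and the sample points are my own choices, not taken from the released code):

```python
import math

def h(x, k=10.0):
    """Logistic function with slope parameter k, as in Eq. (5)."""
    return 1.0 / (1.0 + math.exp(-k * x))

def dh(x, k=10.0):
    """Analytic derivative: h'(x) = k * h(x) * (1 - h(x))."""
    s = h(x, k)
    return k * s * (1.0 - s)

# The gradient is large only in a narrow band around the box boundary
# (x = 0 here); one unit away it has already collapsed toward zero.
print(dh(0.0))  # at the boundary: k/4 = 2.5
print(dh(1.0))  # away from it: ~4.5e-4
```

This is why pretraining the APN matters: if the initial attention box is far from a useful region, almost no gradient reaches the APN weights.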
The effect of softmax loss and rank loss
When training the APN layers, the rank loss tries to put the three per-scale softmax losses in descending order and to enlarge the gaps between them. On the contrary, when training only the convolutional/classification layers with the sum of softmax losses, the softmax loss at every scale tends toward the same value, which means the gaps between them are narrowed.
Is this reasonable for training? Although each training stage only updates its own parameters, I still doubt whether these two stages cancel each other out.
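For reference, the inter-scale ranking objective being discussed can be sketched as a pairwise hinge over the true-class probabilities at consecutive scales (the margin value here is a placeholder of mine, not necessarily the one used in the paper):

```python
def rank_loss(p_true, margin=0.05):
    """Pairwise ranking loss over scales: penalize scale s whenever the
    true-class probability at scale s+1 does not beat scale s by at
    least `margin`. `p_true[s]` is p_t at scale s, coarse to fine."""
    return sum(max(0.0, p_true[s] - p_true[s + 1] + margin)
               for s in range(len(p_true) - 1))

print(rank_loss([0.5, 0.7, 0.9]))  # finer scales already better: 0.0
print(rank_loss([0.9, 0.5, 0.4]))  # finer scales worse: positive loss
```

So the rank loss only pushes the finer scale to be *better* by a margin; it does not by itself force the softmax losses apart once the ordering holds.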
Implementation in pytorch
Hi
I am working on a PyTorch implementation to reproduce this paper, but I am stuck on pre-training the APN.
The original code gives no details about learning the APN network (step 2), nor about the convergence condition: if the loss fluctuates forever, when should I stop training?
Has anyone made progress reproducing this? The test code alone is useless for reproducing the results.
How can we try RA-CNN on other public datasets?
If anyone is interested in reproducing this, please contact me so we can discuss the training details further.
Understanding the paper's training pipeline
I plan to reproduce RA-CNN in TensorFlow, but a few of the intermediate steps seem hard to implement. Here is my understanding of the whole procedure; I hope someone can check whether it is correct:
1. Fine-tune a plain VGG for classification until accuracy stops improving.
2. Freeze the VGG parameters and feed the output of the last convolutional layer into the APN, which outputs the three localization parameters.
3. Crop the image and classify it with the frozen VGG from step 1; use the loss computed by Eq. (8) (which is actually produced by the VGG) as the APN loss and optimize the three localization parameters.
4. Repeat steps 2-3 until the VGG classification accuracy stops improving.
5. Repeat steps 1-4.
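The alternating schedule in steps 4-5 above can be sketched as plain control flow (the step functions and the patience-based stopping criterion are placeholders of mine; the paper only says to alternate until both losses converge):

```python
def alternating_train(cls_step, apn_step, patience=2):
    """Alternate the two stages until classification accuracy stops
    improving. `cls_step` trains the VGG/classification layers with the
    softmax losses and returns validation accuracy; `apn_step` trains
    the APN with the rank loss while the VGG is frozen."""
    best, stale = 0.0, 0
    while stale < patience:
        apn_step()        # stage 2-3: update APN, VGG frozen
        acc = cls_step()  # stage 1: update VGG/cls, APN frozen
        if acc > best:
            best, stale = acc, 0
        else:
            stale += 1
    return best
```

In a real run `cls_step` and `apn_step` would each perform one epoch (or a few) of their respective optimization.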
Where can I download the original implementation code?
The OneDrive code link is dead. Is there anywhere else I can get the code?
Paper's VGG-19 accuracy question
Hi, first of all thanks for your great work!
In your paper you cite the VGG-19 [27] model and state that it achieves 77.8% accuracy on the CUB-200-2011 dataset. Can you please give some more information about this? Are you referring to the model trained only on ImageNet, to a model fine-tuned by you, or to one fine-tuned by someone else? Is it the Caffe model?
And if you trained it yourself, can you share some details such as batch size, learning rate, number of epochs, and data augmentation?
Thanks,
Andrea
Equation 7 in the paper confused me
Equation 7 in the paper confuses me. If the |·| is interpreted as absolute value, it seems the linear map coefficients get larger as the indices i and j grow. How should I interpret this equation?
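One reading that resolves this (my own interpretation, not confirmed by the authors): the |·| terms *are* absolute values, but their arguments involve only the fractional parts {i/λ} and {j/λ}, so each coefficient stays in [0, 1] no matter how large i and j get. Eq. (7) is just bilinear interpolation written out elementwise. A minimal sketch in pure Python (the index clamping at the border is my own choice):

```python
def amplify(x, lam):
    """Up-sample a 2-D list `x` by integer factor `lam` following
    Eq. (7): out[i][j] = sum over the 4 neighbours (a, b in {0, 1}) of
    |1 - a - {i/lam}| * |1 - b - {j/lam}| * x[i//lam + a][j//lam + b]."""
    H, W = len(x), len(x[0])
    out = [[0.0] * (W * lam) for _ in range(H * lam)]
    for i in range(H * lam):
        fi = i / lam - i // lam          # fractional part {i/lam}
        for j in range(W * lam):
            fj = j / lam - j // lam      # fractional part {j/lam}
            v = 0.0
            for a in (0, 1):
                for b in (0, 1):
                    m = min(i // lam + a, H - 1)  # clamp at the border
                    n = min(j // lam + b, W - 1)
                    v += abs(1 - a - fi) * abs(1 - b - fj) * x[m][n]
            out[i][j] = v
    return out

print(amplify([[0, 2], [4, 6]], 2)[1][1])  # average of all 4 pixels: 3.0
```

Note that for a = 0 the weight is 1 - {i/λ} and for a = 1 it is {i/λ}, i.e. the standard bilinear weights.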
Some problems with the APN
I have a question about the APN crop. (tx, ty) is the center point of the attention region and tl is half its side length; the subscript tl denotes the top-left corner of the region and br the bottom-right corner. The x and y coordinates of the two corners of the cropped region are given in the paper as:
tx(tl) = tx - tl,  ty(tl) = ty - tl
tx(br) = tx + tl,  ty(br) = ty + tl
but I think they should really be:
tx(tl) = tx - tl,  ty(tl) = ty + tl
tx(br) = tx + tl,  ty(br) = ty - tl
Could you tell me where my reasoning goes wrong?
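A likely resolution (my assumption about the paper's convention, not confirmed by the authors): Eq. (3) uses image coordinates, where the y axis points *downward*, so the top-left corner has the smaller y value and the paper's signs are consistent. The swapped version would only hold for a mathematical y-up axis. A minimal sketch:

```python
def crop_corners(tx, ty, tl):
    """Corner coordinates of the attended square in image coordinates
    (x grows rightward, y grows DOWNWARD), with (tx, ty) the center and
    tl half the side length, as in Eq. (3) of the paper."""
    top_left = (tx - tl, ty - tl)      # smaller y = higher in the image
    bottom_right = (tx + tl, ty + tl)  # larger y = lower in the image
    return top_left, bottom_right

print(crop_corners(100, 100, 30))  # ((70, 70), (130, 130))
```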
Implement RA-CNN in TensorFlow
How do we implement formula (4) in the paper?
If we have the three tensors tx, ty and tl, each with shape (None, 1), how do we get the corresponding mask M?
In plain Python we could use two loops over the pixel coordinates:
for y in range(H):
for x in range(W):
mask[y, x] = m(x, y)  # m from Eq. (4)
Then how do we implement this in TensorFlow?
Thank you.
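One way to avoid the loops entirely: the mask of Eq. (4) is separable, M(x, y) = m_x(x) · m_y(y), where each factor is a difference of two shifted logistic functions. A sketch in plain Python below; in TensorFlow the same thing follows from broadcasting a (W,)-shaped row profile against an (H, 1)-shaped column profile (e.g. built with tf.range and tf.sigmoid), so no explicit loop is needed.

```python
import math

def sig(z):
    return 1.0 / (1.0 + math.exp(-z))

def attention_mask(H, W, tx, ty, tl, k=10.0):
    """Boxcar-like mask of Eq. (4): (almost) 1 inside the square with
    center (tx, ty) and half-width tl, (almost) 0 outside. k controls
    how sharp the edges are."""
    mx = [sig(k * (x - (tx - tl))) - sig(k * (x - (tx + tl))) for x in range(W)]
    my = [sig(k * (y - (ty - tl))) - sig(k * (y - (ty + tl))) for y in range(H)]
    # outer product of the two 1-D profiles gives the 2-D mask
    return [[my[y] * mx[x] for x in range(W)] for y in range(H)]

M = attention_mask(100, 100, tx=50, ty=50, tl=20)
print(M[50][50])  # inside the box: ~1.0
print(M[0][0])    # outside the box: ~0.0
```

With batched (None, 1) tensors, the broadcast just gains a leading batch dimension.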
AttentionCrop Layer: Where can we find it?
Hi Jianlong,
You used an AttentionCrop layer, but I think you have not released it. Can you please make it available?
Thank you
Data augmentation?
Hello, can you tell us some details about the data augmentation?
thanks.
code and model
I can't open the address you have given us, @Jianlong-Fu: https://1drv.ms/u/s!Ak3_TuLyhThpkxifVPt-w8e-axc5
The question about training with rank loss
@Jianlong-Fu Hi, Jianlong, I have your train_val_cnn.prototxt and code (including rank_loss2_layer), but when I train, it does not work.
It just stays at:
I0524 11:50:19.434263 13879 net.cpp:242] This network produces output accuracy
I0524 11:50:19.434267 13879 net.cpp:242] This network produces output loss1
I0524 11:50:19.434268 13879 net.cpp:242] This network produces output loss2
I0524 11:50:19.434273 13879 net.cpp:242] This network produces output loss3
I0524 11:50:19.434274 13879 net.cpp:242] This network produces output rank
I0524 11:50:19.434320 13879 net.cpp:255] Network initialization done.
I0524 11:50:19.434532 13879 solver.cpp:56] Solver scaffolding done.
I0524 11:50:19.438776 13879 caffe.cpp:249] Starting Optimization
I0524 11:50:19.438781 13879 solver.cpp:273] Solving RA_CNN
I0524 11:50:19.438783 13879 solver.cpp:274] Learning Rate Policy: step
What is the problem, please? I really need your help. Thank you very much!
Why can we not achieve the performance mentioned in the original paper?
Missing Windows under RA_CNN_caffe folder
I downloaded the relevant files, but when compiling Caffe it reports: error LNK2001: unresolved external symbol "public: __cdecl caffe::SolverRegisterer::SolverRegisterer(class std::basic_string<char,struct std::char_traits,class std::allocator > const &,class caffe::Solver * (__cdecl*)(class caffe::SolverParameter const &))" (??0?$SolverRegisterer@M@caffe@@qeaa@AEBV?$basic_string@DU?$char_traits@D@std@@v?$allocator@D@2@@std@@P6APEAV?$Solver@M@1@AEBVSolverParameter@1@@z@z) D:\caffe\Recurrent-Attention-CNN\RA_CNN_caffe\windows\classification\adadelta_solver.obj.
Is this because I am missing a related Windows file?
Has anyone achieved the reported performance? I ran it successfully but only got 0.78 accuracy
To prepare the lmdb data, I first resize the short side to 448 while keeping the original aspect ratio in Matlab, and save the images to disk. Then I use the official Caffe tool to generate test_shortside448.lmdb.
However, I failed in the end.
I tried two different settings: class labels starting from 0 and class labels starting from 1. Neither worked. Is there any other detail I missed?
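The resize arithmetic described above (short side to 448, aspect ratio preserved) comes down to:

```python
def short_side_resize(w, h, short=448):
    """New (width, height) that scales the shorter side to `short`
    while keeping the aspect ratio, mirroring the Matlab preprocessing
    described above."""
    scale = short / min(w, h)
    return round(w * scale), round(h * scale)

print(short_side_resize(800, 600))  # (597, 448)
```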
I0427 20:00:13.632690 7424 net.cpp:255] Network initialization done.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 1064471629
I0427 20:00:15.885128 7424 caffe.cpp:291] Running for 50 iterations.
I0427 20:00:16.072052 7424 caffe.cpp:314] Batch 0, accuracy1+2 = 0
I0427 20:00:16.072088 7424 caffe.cpp:314] Batch 0, accuracy1+2+3 = 0
...
I0427 20:00:18.361052 7424 caffe.cpp:314] Batch 49, accuracy1+2 = 0
I0427 20:00:18.361081 7424 caffe.cpp:314] Batch 49, accuracy1+2+3 = 0
I0427 20:00:18.361088 7424 caffe.cpp:319] Loss: 0
I0427 20:00:18.361097 7424 caffe.cpp:331] accuracy1+2 = 0
I0427 20:00:18.361104 7424 caffe.cpp:331] accuracy1+2+3 = 0.02
where is the code?
I am sorry, but this link "https://1drv.ms/u/s!Ak3_TuLyhThpkxQE4tw96xNUiBbn" only provides the model and deploy.prototxt, and this net has a new layer named "AttentionCrop", so without the code we cannot use it. It would be best if the author released the source code. Thank you.