Comments (10)
@yuleichin His code uses the gradient of the sigmoid function during backpropagation, because a direct crop is non-differentiable. The forward pass doesn't matter; you can crop directly there.
from recurrent-attention-cnn.
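To make the trick concrete, here is a minimal sketch (my own, not code from the repo; names are illustrative): the crop box is expressed as a product of differences of steep sigmoids, so gradients with respect to the box parameters flow through the sigmoid even though a hard slice would carry no gradient.

```python
import torch

def boxcar_mask(h, w, tx, ty, tl, k=10.0):
    """Differentiable 2D boxcar: ~1 inside the square centered at (tx, ty)
    with half-length tl, ~0 outside. k is the sigmoid steepness."""
    xs = torch.arange(w, dtype=torch.float32)
    ys = torch.arange(h, dtype=torch.float32)
    # Difference of two shifted sigmoids approximates a 1D boxcar.
    mx = torch.sigmoid(k * (xs - (tx - tl))) - torch.sigmoid(k * (xs - (tx + tl)))
    my = torch.sigmoid(k * (ys - (ty - tl))) - torch.sigmoid(k * (ys - (ty + tl)))
    return my[:, None] * mx[None, :]  # (h, w)

# Gradients w.r.t. the box parameters exist because of the sigmoids:
tx = torch.tensor(16.0, requires_grad=True)
mask = boxcar_mask(32, 32, tx, torch.tensor(16.0), torch.tensor(8.0))
mask.sum().backward()
```

A hard slice `img[..., y0:y1, x0:x1]` yields no gradient for `tx`; multiplying by this mask does.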
Hey, I'm not sure whether you ended up implementing it; the three localizations are applied to the current batch.
I'd like to ask you a question: in the paper, the mask is there to make the operation differentiable, but TensorFlow does automatic differentiation, so it seems this isn't actually needed. I'm not sure how you implemented it; could we discuss? Thanks.
In the end I still didn't implement it, but you were right that I had the batch workflow wrong before.
Which step is the mask you're referring to? I may not have realized it was a mask. I haven't read Mask R-CNN, so I don't know whether the mask you mean is the same as that one.
I have a question about using the sigmoid function to crop an image region. Some implementations take the three predicted coordinates, index the region of the original image directly with [tx-tl:tx+tl, ty-tl:ty+tl], and then call nn.resample to interpolate. If that is how it is implemented, why does the paper still need the difference of sigmoid functions to determine the crop region?
In other words, is this crop operation a real crop? If it is, why not just index directly?
If instead the points outside the target region are multiplied by a boxcar function and set to very small values, and the result is then interpolated and enlarged from the target region's coordinates, then the PyTorch implementations online are wrong.
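As a sketch of the point being debated (my own reading, with illustrative names): the forward pass can hard-slice the masked image and bilinearly upsample it, while the gradient for (tx, ty, tl) still flows through the soft mask, since slice indices themselves carry no gradient.

```python
import torch
import torch.nn.functional as F

def attention_crop_zoom(img, tx, ty, tl, out_size, k=10.0):
    """img: (B, C, H, W). Multiply by the sigmoid-difference mask
    (differentiable in tx, ty, tl), then hard-slice the box and
    bilinearly upsample it to out_size x out_size."""
    B, C, H, W = img.shape
    xs = torch.arange(W, dtype=img.dtype)
    ys = torch.arange(H, dtype=img.dtype)
    mx = torch.sigmoid(k * (xs - (tx - tl))) - torch.sigmoid(k * (xs - (tx + tl)))
    my = torch.sigmoid(k * (ys - (ty - tl))) - torch.sigmoid(k * (ys - (ty + tl)))
    masked = img * (my[:, None] * mx[None, :])
    # The slice bounds are plain ints: no gradient flows through them,
    # only through the mask multiplication above.
    x0, x1 = int((tx - tl).item()), int((tx + tl).item())
    y0, y1 = int((ty - tl).item()), int((ty + tl).item())
    region = masked[:, :, y0:y1, x0:x1]
    return F.interpolate(region, size=(out_size, out_size),
                         mode='bilinear', align_corners=False)
```

So the answer to "is it a real crop?" under this reading is: yes in the forward direction, but the sigmoid mask is what lets the APN's box parameters receive a gradient.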
@bluemandora Aha, so that's the key trick!
My understanding is that the sigmoid in the paper has a scaling factor k on x that is set very large, which makes the sigmoid approximate a step function; it uses soft attention to approximate the effect of hard attention. Through this modified sigmoid, M pushes the attention coefficients of points inside the box toward 1 and those outside the box toward 0, so applying this attention to that scale's input image produces a crop-like effect.
After the attention is applied, the positions outside the box are mapped to 0, and bilinear interpolation is used to restore the image to full size, presumably using tx, ty and tl.
What I don't understand is step 2, which says the initial square is determined from the image's activations in the last layer of VGG. How exactly is that done? Also, the batch question discussed above isn't mentioned in the paper, so is the model actually trained batch-wise or not?
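On the step-2 question, one plausible reading (an assumption on my part, not confirmed by the repo) is to reduce the last conv activations to a heat map, threshold it, and take the bounding square of the strong responses as the initial (tx, ty, tl). The function name and the threshold fraction are illustrative.

```python
import torch

def init_square_from_activations(feat, frac=0.5):
    """feat: (C, H, W) last-conv activations. Sum over channels, keep
    locations above frac * max response, return the tight bounding square."""
    heat = feat.sum(dim=0)
    ys, xs = torch.nonzero(heat > frac * heat.max(), as_tuple=True)
    ty = (ys.min() + ys.max()).float() / 2  # box center (row)
    tx = (xs.min() + xs.max()).float() / 2  # box center (col)
    tl = torch.maximum(ys.max() - ys.min(), xs.max() - xs.min()).float() / 2
    return tx, ty, tl
```

The resulting square would then be scaled from feature-map coordinates back to input-image coordinates before initializing the APN.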
@Asichurter Hey, I'm also confused about this part. Also, for the alternating training: do I first run a prediction at every scale, then fuse the fc values, then predict again? I'm still a bit lost on these points; could you give me some pointers?
@HiIcy Following step 3 in Section 3.4 of the paper, you should first freeze the zooming APN network (freeze only its parameters; it still runs forward) and optimize VGG and the fc layers with the classification loss at each scale. Then freeze the VGG and fc parameters, run another forward pass, and optimize the APN with the rank loss.
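The alternation described above can be sketched with toy modules standing in for VGG+fc and the APN (all names, sizes, and the placeholder losses are my own, not the paper's code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Linear(8, 4)   # stand-in for VGG + fc classifier
apn = nn.Linear(8, 3)        # stand-in for the APN (predicts tx, ty, tl)
opt_cls = torch.optim.SGD(backbone.parameters(), lr=0.1)
opt_apn = torch.optim.SGD(apn.parameters(), lr=0.1)

def set_trainable(module, flag):
    for p in module.parameters():
        p.requires_grad_(flag)

def step_classification(x, y):
    # Phase A: APN frozen (still usable in forward), classification loss
    # updates only the backbone/fc parameters.
    set_trainable(apn, False); set_trainable(backbone, True)
    loss = F.cross_entropy(backbone(x), y)
    opt_cls.zero_grad(); loss.backward(); opt_cls.step()

def step_apn(x):
    # Phase B: backbone frozen, a rank-style loss (placeholder here)
    # updates only the APN parameters.
    set_trainable(backbone, False); set_trainable(apn, True)
    loss = apn(x).pow(2).mean()  # placeholder for the pairwise rank loss
    opt_apn.zero_grad(); loss.backward(); opt_apn.step()
```

Each phase still runs the frozen module forward; only the optimizer's parameter group changes.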
@Asichurter Ah, sorry! I roughly get it now: the parameters are frozen alternately for optimization. But the paper seems to say the final prediction should fuse the scales, while the online PyTorch implementation appears to have that line commented out. What's your take?
I'm also puzzled by this. All the code I've seen predicts from each scale separately, whereas the original paper means to fuse the scales first and then predict.
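For what it's worth, the fusion being discussed can be sketched as concatenating the per-scale fc features and classifying once from the joint descriptor (dimensions and names are illustrative assumptions):

```python
import torch
import torch.nn as nn

class ScaleFusion(nn.Module):
    """Concatenate the fc features of every scale and predict once."""
    def __init__(self, feat_dim, num_scales, num_classes):
        super().__init__()
        self.fc = nn.Linear(feat_dim * num_scales, num_classes)

    def forward(self, feats):  # feats: list of (B, feat_dim), one per scale
        return self.fc(torch.cat(feats, dim=1))

fusion = ScaleFusion(feat_dim=16, num_scales=3, num_classes=5)
logits = fusion([torch.rand(2, 16) for _ in range(3)])
```

The per-scale softmax predictions the implementations use instead would simply skip this fused classifier and average (or report) each scale's logits separately.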
Related Issues (20)
- Data augmentation? HOT 1
- Is the code and model link still available? HOT 6
- Anyone who achieved reported performance? Ran successfully, 0.78 accuracy gained HOT 8
- Vanishing gradient issue in APN HOT 1
- Implementation in pytorch HOT 22
- The question about training with rank loss . HOT 2
- code and model HOT 6
- some problem about APN HOT 1
- Why we can not achieve the performance mentioned in original paper?
- Missing Windows under RA_CNN_caffe folder
- Where can i download the original implement code? HOT 2
- original implement code
- where is the code? HOT 47
- Equation 7 in the paper confused me HOT 1
- AttentionCrop Layer: Where can we find it? HOT 4
- The effect of softmax loss and rank loss HOT 21
- Some questions about training and testing HOT 20
- Paper's VGG-19 accuracy question HOT 21
- Implement RA-CNN in tensorflow HOT 6