Comments (5)
Hi @chuong98 ,
The checkpoints for the BatchNorm and ReLU variants are poolformer_bn_s12.pth.tar and poolformer_relu_s12.pth.tar.
The identity token mixer still works because local information is sometimes enough for prediction. For example, a human face looks very different from the faces of other animals, so local features alone can already be discriminative.
from poolformer.
Hi @chuong98 ,
There is a simple way to implement it. You just need to change
self.token_mixer = Pooling(pool_size=pool_size)
to
self.token_mixer = nn.Identity()
The checkpoint is available here:
poolformer_id_s12.pth.tar
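A minimal sketch of that swap on a simplified block. The `Block` class, its `use_identity` flag, the 1x1-convolution MLP, and the exact form of the stand-in `Pooling` mixer (average pool minus input) are assumptions made for illustration, not the repository's actual code; drop path and layer scale are left out.

```python
import torch
import torch.nn as nn


class Pooling(nn.Module):
    """Stand-in pooling token mixer (assumed form: average pool minus input)."""

    def __init__(self, pool_size=3):
        super().__init__()
        self.pool = nn.AvgPool2d(
            pool_size, stride=1, padding=pool_size // 2, count_include_pad=False
        )

    def forward(self, x):
        return self.pool(x) - x


class Block(nn.Module):
    """Simplified block; drop path and layer scale are omitted for brevity."""

    def __init__(self, dim, pool_size=3, use_identity=False):
        super().__init__()
        self.norm1 = nn.GroupNorm(1, dim)
        # The one-line change discussed above: pooling -> identity.
        self.token_mixer = nn.Identity() if use_identity else Pooling(pool_size)
        self.norm2 = nn.GroupNorm(1, dim)
        self.mlp = nn.Sequential(
            nn.Conv2d(dim, 4 * dim, 1), nn.GELU(), nn.Conv2d(4 * dim, dim, 1)
        )

    def forward(self, x):
        x = x + self.token_mixer(self.norm1(x))
        return x + self.mlp(self.norm2(x))


x = torch.randn(2, 16, 8, 8)
pool_block = Block(16)
id_block = Block(16, use_identity=True)
print(pool_block(x).shape, id_block(x).shape)
```

Neither mixer in this sketch has learnable parameters, so the two blocks have the same parameter count; only the spatial mixing differs.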
from poolformer.
If the baseline implementation is:
    def forward(self, x):
        x = x + self.drop_path(self.token_mixer(self.norm1(x)))
        x = x + self.drop_path(self.mlp(self.norm2(x)))
        return x
Is one of the following implementations correct for the identity case?
Case A:
    def forward(self, x):
        x = x + self.drop_path(self.norm1(x))
        return x + self.drop_path(self.mlp(self.norm2(x)))
Case B:
    def forward(self, x):
        x = x + self.drop_path(self.norm1(x) - x)
        return x + self.drop_path(self.mlp(self.norm2(x)))
Case C:
    def forward(self, x):
        return x + self.drop_path(self.mlp(self.norm2(x)))
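One way to compare the three cases against the nn.Identity() swap suggested in this thread is a quick numerical check. The sketch below assumes stand-in layers (a GroupNorm for each norm, a single 1x1 convolution for the MLP, and a no-op drop path as at inference time); these are illustrative choices, not the repository's modules.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
dim = 8
norm1 = nn.GroupNorm(1, dim)
norm2 = nn.GroupNorm(1, dim)
mlp = nn.Conv2d(dim, dim, 1)   # stand-in for the real MLP
drop_path = nn.Identity()      # drop path is a no-op at inference
token_mixer = nn.Identity()    # the swap suggested in this thread


def baseline(x):
    x = x + drop_path(token_mixer(norm1(x)))
    return x + drop_path(mlp(norm2(x)))


def case_a(x):
    x = x + drop_path(norm1(x))
    return x + drop_path(mlp(norm2(x)))


def case_b(x):
    x = x + drop_path(norm1(x) - x)
    return x + drop_path(mlp(norm2(x)))


def case_c(x):
    return x + drop_path(mlp(norm2(x)))


x = torch.randn(2, dim, 4, 4)
with torch.no_grad():
    ref = baseline(x)
    matches = {
        "A": torch.allclose(ref, case_a(x)),
        "B": torch.allclose(ref, case_b(x)),
        "C": torch.allclose(ref, case_c(x)),
    }
print(matches)
```

Under these assumptions only Case A reproduces the baseline with an identity mixer, because token_mixer(norm1(x)) is then just norm1(x). Case B cancels the residual and Case C drops the first sub-block entirely.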
from poolformer.
Wonderful! I evaluated the checkpoint and got 74.336.
But when I measured the speed, the sp_12 was about 3x slower than ResNet 18. I inspected the model, and when I replaced GroupNorm with BatchNorm, the inference time dropped by half.
Would you mind releasing the checkpoints that use BatchNorm and/or ReLU? Thank you so much.
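For a speed comparison like the one described above, the normalization layers can be swapped programmatically. The helper below is a hypothetical sketch (the name `swap_groupnorm_for_batchnorm` and the toy model are not from the repository); the new BatchNorm layers are untrained, so the result is only useful for timing, not for accuracy.

```python
import torch
import torch.nn as nn


def swap_groupnorm_for_batchnorm(module):
    """Recursively replace every nn.GroupNorm with a fresh nn.BatchNorm2d.

    The new layers are untrained, so this is only meaningful for timing,
    not for accuracy.
    """
    for name, child in list(module.named_children()):
        if isinstance(child, nn.GroupNorm):
            setattr(module, name, nn.BatchNorm2d(child.num_channels))
        else:
            swap_groupnorm_for_batchnorm(child)
    return module


# Toy model standing in for a real network.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.GroupNorm(1, 16),
    nn.Sequential(nn.Conv2d(16, 16, 1), nn.GroupNorm(1, 16)),
)
model = swap_groupnorm_for_batchnorm(model).eval()

with torch.no_grad():
    out = model(torch.randn(1, 3, 32, 32))
print(out.shape)
```

Timing the forward pass before and after the swap, on the same input and device, gives a rough idea of how much the normalization layer contributes to latency.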
from poolformer.
Regarding using Identity instead of pooling, I can't explain why it works. Pooling is the only mechanism that captures spatial information and connects neighboring tokens, and now even the pooling is dropped. Can you share your thoughts?
from poolformer.