han-jia / unicorn-maml
PyTorch implementation of "How to Train Your MAML to Excel in Few-Shot Classification"
Hi again,
I'm interested in using pretrained weights in meta-learning, and I found that your MAML performance is notably higher than others'.
I tried to reproduce your results in my own code (Conv4, MiniImageNet, 5-way 1-shot), but the performance hardly improved (around 47~48) even with the pretrained weights (from here), an increased number of adaptation steps, and a higher learning rate.
So I tried again with your code, but I still got quite low performance, even with the script you provided.
For Res12, MiniImageNet, 5-way 1-shot:
python train_fsl.py --max_epoch 100 --way 5 --eval_way 5 --lr_scheduler step --model_class MAML --lr_mul 10 --backbone_class Res12 --dataset MiniImageNet --gpu 0 --query 15 --step_size 20 --gamma 0.1 --para_init './saves/initialization/miniimagenet/Res12-pre.pth' --lr 0.001 --shot 1 --eval_shot 1 --temperature 0.5 --gd_lr 0.05 --inner_iters 15
This script gave me 60.92, which is notably lower than the 64.42 reported in your paper.
For ConvNet, MiniImageNet, 5-way 1-shot:
python train_fsl.py --max_epoch 100 --way 5 --eval_way 5 --lr_scheduler step --model_class MAML --lr_mul 10 --backbone_class ConvNet --dataset MiniImageNet --gpu 9 --query 15 --step_size 20 --gamma 0.5 --para_init './saves/initialization/miniimagenet/con-pre.pth' --lr 0.002 --shot 1 --eval_shot 1 --temperature 0.5 --gd_lr 0.05 --inner_iters 15
This script gave me 45.84, which is notably lower than the 54.89 reported in your paper.
If you don't mind, could you please check the reproducibility of your results, and could you also release the code for the ConvNet backbone?
I noticed that in fsl_train.py (lines 201-202), you set both the mean and var entries of running_dict to a copy of the mean. Is that by design, or is it a mistake?
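For context, a common pattern when meta-training with BatchNorm is to snapshot the running statistics before adaptation and restore them afterwards. The helper names below (`snapshot_bn_stats`, `restore_bn_stats`) are hypothetical, not from the repo; this is only a sketch of what a correct snapshot/restore would look like, with the suspected bug shown in a comment:

```python
import torch
import torch.nn as nn

def snapshot_bn_stats(model):
    # Correct version: store independent copies of both mean and var.
    running_dict = {}
    for name, m in model.named_modules():
        if isinstance(m, nn.BatchNorm2d):
            running_dict[name] = {
                "mean": m.running_mean.clone(),  # copy of the running mean
                "var": m.running_var.clone(),    # copy of the running var
            }
            # The suspected bug would instead look like:
            # running_dict[name] = {"mean": m.running_mean.clone(),
            #                       "var": m.running_mean.clone()}  # var <- mean!
    return running_dict

def restore_bn_stats(model, running_dict):
    # Write the saved statistics back into the BatchNorm buffers in place.
    for name, m in model.named_modules():
        if isinstance(m, nn.BatchNorm2d):
            m.running_mean.copy_(running_dict[name]["mean"])
            m.running_var.copy_(running_dict[name]["var"])
```

If var were accidentally set to a copy of the mean, the restored network would normalize with a wrong variance estimate at evaluation time.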
Hi, again :)
Thank you for the wonderful work.
While reading your paper and its implementation, it seems that this work uses standard batch norm (instead of the transductive batch norm usually used in other MAML-based papers).
Have you tried transductive batch norm in your case? I was trying to use the pre-trained checkpoints when training my MAML, but I cannot reach the performance reported in your paper. I was wondering whether transductive batch norm was the cause.
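For readers unfamiliar with the distinction: transductive batch norm normalizes with the statistics of the current (query) batch even at evaluation time, which in PyTorch can be approximated by disabling running statistics. This is a standalone sketch of the difference, not code from the repo:

```python
import torch
import torch.nn as nn

# Transductive-style BN: no running stats, so batch statistics are used even in eval()
bn_transductive = nn.BatchNorm2d(4, track_running_stats=False)
# Conventional BN: stored running stats (initially mean=0, var=1) are used in eval()
bn_standard = nn.BatchNorm2d(4)

x = torch.randn(8, 4, 5, 5) * 3 + 2  # shifted/scaled so the two behave differently
bn_transductive.eval()
bn_standard.eval()

y_t = bn_transductive(x)  # normalized by this batch's own mean/var
y_s = bn_standard(x)      # normalized by the (untrained) running buffers
```

With transductive BN, `y_t` is centered near zero regardless of the input shift, while the conventional `y_s` is not; that difference in evaluation-time statistics is exactly what makes transductive BN results hard to compare with standard-BN results.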
Thank you very much for your time!
Best,
Jihoon
Hi, thank you for your great work!
I already saw the past issue about batch norm, but I'm still a little confused.
First of all, before you evaluate, you set the encoder attribute 'is_training' to True, and I don't understand why you set that attribute or what effect it has.
As far as I know, there is no attribute named 'is_training' in a PyTorch Module (nor in your encoder), and setting 'is_training' to True does not change the mode of the Module.
So if your purpose was to switch the encoder to 'train' mode, I think self.model.encoder.train() is the right way.
Secondly, if your purpose was to set the encoder to 'train' mode and it works as intended, then the encoder will never use its running mean and variance in either the training or the evaluation phase. In that case, there is no reason to store and reset those values.
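The first point is easy to verify in isolation: assigning a nonexistent `is_training` attribute merely creates a new attribute and does not flip PyTorch's `training` flag, whereas `.train()`/`.eval()` do (a minimal sketch, unrelated to the repo's code):

```python
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))

encoder.eval()
encoder.is_training = True  # just adds a new attribute; PyTorch ignores it
print(encoder.training)     # False: still in eval mode

encoder.train()             # the supported way to switch modes
print(encoder.training)     # True
```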
UNICORN-MAML/model/trainer/fsl_trainer.py
Line 152 in c66b9f3
Hi, thanks for sharing the code. zero_grad is usually called before backward. Is there a reason zero_grad is called after backward here, or does it really not matter?
Hi, thanks for sharing this impressive work. I noticed that in the dataloader implementations, the normalization for ConvNet and ResNet uses different values. Is there a reason for this?
I'm a huge fan of your excellent work!
Have you tried training UNICORN-MAML without the pre-trained weights? I found that without them, the training result is only about 26%. Could you please provide the parameter settings for this case?
Many thanks~
The error is: ImportError: cannot import name 'FewShotModel' from 'model.models.base' (/content/UNICORN-MAML/model/models/base.py). How can I solve this?
Hi, thanks again for this impressive work. I really learned a lot. I have a question about the batch norm in the convnet_maml implementation. It seems that the variable self.training is always True. In the implementation of nn.BatchNorm2d, the training flag is resolved as follows:
if self.training:
    bn_training = True
else:
    bn_training = (self.running_mean is None) and (self.running_var is None)
then F.batch_norm is applied.
https://pytorch.org/docs/stable/_modules/torch/nn/modules/batchnorm.html#BatchNorm2d
Since F.batch_norm is applied in the forward of convnet_maml, and self.training is always True, batch norm is always in training mode, isn't it?
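The consequence of training mode can be checked directly: with training=True, F.batch_norm normalizes by the current batch's statistics (updating the running buffers as a side effect) rather than by the stored buffers. A standalone check, assuming nothing about the repo's code:

```python
import torch
import torch.nn.functional as F

x = torch.randn(16, 4, 8, 8)
running_mean = torch.zeros(4)
running_var = torch.ones(4)

# training=True: normalize with batch statistics (and update the running buffers)
y = F.batch_norm(x, running_mean, running_var, training=True, momentum=0.1)

# Manual normalization with the batch's own mean and biased variance, as BN uses
mean = x.mean(dim=(0, 2, 3), keepdim=True)
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)
y_manual = (x - mean) / torch.sqrt(var + 1e-5)  # eps matches the BN default

print(torch.allclose(y, y_manual, atol=1e-4))  # True
```

So if self.training is always True, evaluation episodes are normalized by per-episode batch statistics, which is exactly the transductive behavior discussed in the earlier issue.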