
unicorn-maml's Issues

Reproducibility of the results

Hi again,

I'm interested in using pretrained weights in meta-learning, and I found that your MAML performance is considerably higher than that of other implementations.

I tried to reproduce your results in my own code (Conv4, MiniImageNet, 5-way 1-shot), but the performance hardly improved (around 47~48%) even with the pretrained weights (from here), an increased number of adaptation steps, and a higher learning rate.

So I tried again with your code, but I still got quite low performance, even with the scripts you provided.


For ResNet-12, MiniImageNet, 5-way 1-shot:

python train_fsl.py --max_epoch 100 --way 5 --eval_way 5 --lr_scheduler step --model_class MAML --lr_mul 10 --backbone_class Res12 --dataset MiniImageNet --gpu 0 --query 15 --step_size 20 --gamma 0.1 --para_init './saves/initialization/miniimagenet/Res12-pre.pth' --lr 0.001 --shot 1 --eval_shot 1  --temperature 0.5 --gd_lr 0.05 --inner_iters 15

This script gave me 60.92, which is quite a bit lower than the 64.42 reported in your paper.


For ConvNet, MiniImageNet, 5-way 1-shot:

python train_fsl.py --max_epoch 100 --way 5 --eval_way 5 --lr_scheduler step --model_class MAML --lr_mul 10 --backbone_class ConvNet  --dataset MiniImageNet --gpu 9 --query 15 --step_size 20 --gamma 0.5 --para_init './saves/initialization/miniimagenet/con-pre.pth' --lr 0.002 --shot 1 --eval_shot 1 --temperature 0.5 --gd_lr 0.05 --inner_iters 15 

This script gave me 45.84, which is quite a bit lower than the 54.89 reported in your paper.


If you don't mind, could you please check the reproducibility of your results? Also, could you release the code for the ConvNet backbone?

Issue in the code

I noticed that in the file fsl_train.py, at lines 201-202, you set both the mean and the var entries of running_dict to a copy of the mean. Is that by design, or is it a mistake?
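
For reference, I presume the pattern looks something like the sketch below (a reconstruction; the names bn and running_dict are my guesses, not the actual source):

import torch.nn as nn

bn = nn.BatchNorm2d(64)
running_dict = {}

# Suspected pattern at fsl_train.py lines 201-202: both entries copy the mean.
running_dict['mean'] = bn.running_mean.clone()
running_dict['var'] = bn.running_mean.clone()  # presumably meant bn.running_var

# Presumably intended version:
running_dict['var'] = bn.running_var.clone()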

CIFAR-FS & FC100 pretrained model

Hi, thanks for sharing the code. If we want to see the performance on CIFAR-FS and FC100, can we get access to the pretrained weights for those datasets?

Or should we simply train without a pretrained model?

Question about the model architecture (the batchnorm)

Hi, again :)

Thank you for the wonderful work.

While reading your paper and looking at the implementation, it seems that this work uses ordinary batch norm (instead of the transductive batch norm that is usually used in other MAML-based papers).

Have you tried transductive batch norm in your case? I was trying to use the pre-trained checkpoints when training my MAML, but I cannot reach the performance reported in your paper, and I was wondering whether transductive batch norm was the cause.
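
For context, the difference at evaluation time is roughly the following (a minimal PyTorch sketch, not the repository's actual code):

import torch
import torch.nn.functional as F

x = torch.randn(75, 64, 5, 5)  # e.g. features of a 5-way, 15-query batch
running_mean, running_var = torch.zeros(64), torch.ones(64)

# Standard (inductive) batch norm at test time: normalize each query with
# the running statistics accumulated during training.
y_standard = F.batch_norm(x, running_mean, running_var, training=False)

# Transductive batch norm at test time: normalize with the statistics of
# the current batch itself, so each query's prediction depends on the
# other queries in the episode.
y_transductive = F.batch_norm(x, None, None, training=True)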

Thank you very much for your time!
Best,
Jihoon

Questions about batchnorm

Hi, thank you for your great work!

I already saw the past issue about batch norm, but I'm still a little confused.

First of all, before evaluation you set the encoder attribute 'is_training' to True, and I don't understand why you set that attribute or what effect it has. As far as I know, there is no attribute named 'is_training' in a PyTorch Module (nor in your encoder), so setting 'is_training' to True does not change the module's mode. If your purpose was to switch the encoder into training mode, I think self.model.encoder.train() is the right way to do it.

Secondly, if your purpose was to set the encoder to training mode and that works as intended, then the encoder will never use its running mean and variance in either the training or the evaluation phase, and there is then no reason to store and reset those values.
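
To illustrate the first point (a minimal, self-contained example, not the repository's code):

import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(3, 64, 3), nn.BatchNorm2d(64))

encoder.eval()              # switch to evaluation mode
encoder.is_training = True  # only attaches a new attribute; the mode is unchanged
print(encoder.training)     # False -- still in eval mode

encoder.train()             # the supported way to switch modes
print(encoder.training)     # True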

Normalization of ResNet-12 and ConvNet

Hi, thanks for sharing this impressive work. I noticed that in the implementation of the dataloaders, the normalization for ConvNet and ResNet uses different values. Is there a reason for this?
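
For reference, I mean the channel statistics passed to the normalization transform in each dataloader, along the lines of this sketch (the numbers below are purely illustrative, not the repository's actual values):

from torchvision import transforms

# Illustrative only: two backbones normalizing with different statistics.
convnet_normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet stats
                                         std=[0.229, 0.224, 0.225])
resnet_normalize = transforms.Normalize(mean=[0.472, 0.453, 0.410],   # hypothetical
                                        std=[0.278, 0.268, 0.285])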

Question about training without pretrained weight

I'm a huge fan of your excellent work!

Have you tried training UNICORN-MAML without the pre-training weights? I found that without them the result is only around 26%. Could you please provide the parameter settings for this case?

Many thanks~

Running in Colab

The error shown is: ImportError: cannot import name 'FewShotModel' from 'model.models.base' (/content/UNICORN-MAML/model/models/base.py). How can I solve this?
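
One generic way to narrow this down (a diagnostic sketch, not a confirmed fix) is to check what the imported module actually exposes and where it was loaded from:

# Run from /content/UNICORN-MAML so the 'model' package resolves to the clone.
import model.models.base as base

print(base.__file__)  # confirm which file was actually imported
print([name for name in dir(base) if not name.startswith('_')])
# If 'FewShotModel' is missing from this list, the clone may be stale, or
# another 'model' package may be shadowing the repository's package.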

Batch norm training mode in convnet_maml

running_var=self._modules['{0}_{1}'.format(i,1)].running_var, training = self.is_training)

Hi, thanks again for this impressive work; I really learned a lot. I have a question about the batch norm in the implementation of convnet_maml. It seems that the flag self.is_training is always True. In the implementation of nn.BatchNorm2d, the training flag is controlled by:
if self.training:
    bn_training = True
else:
    bn_training = (self.running_mean is None) and (self.running_var is None)

then F.batch_norm is applied.
https://pytorch.org/docs/stable/_modules/torch/nn/modules/batchnorm.html#BatchNorm2d

Since F.batch_norm is applied in the forward of convnet_maml, and self.is_training is always True, the batch norm is always in training mode, isn't it?
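
To make the point concrete, here is a minimal sketch of the pattern in question (not the repository's actual module): an explicit flag, rather than nn.Module's built-in self.training, is forwarded to F.batch_norm, so calling .eval() has no effect on the normalization.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyBNBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.register_buffer('running_mean', torch.zeros(channels))
        self.register_buffer('running_var', torch.ones(channels))
        self.is_training = True  # if this flag is never flipped ...

    def forward(self, x):
        # ... F.batch_norm always uses (and updates) the batch statistics,
        # regardless of whether .train() or .eval() was called on the module.
        return F.batch_norm(x, self.running_mean, self.running_var,
                            training=self.is_training)

block = TinyBNBlock(4)
block.eval()                        # does not touch self.is_training
y = block(torch.randn(8, 4, 5, 5))  # still normalized with batch statistics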
