
unicorn-maml's Issues

Reproducibility of the results

Hi again,

I'm interested in using pretrained weights in meta-learning, and I found that your MAML performance is considerably higher than that of other implementations.

I tried to reproduce your results in my own code (Conv4, MiniImageNet, 5-way 1-shot), but the performance hardly improved (around 47~48%) even with the pretrained weights (from here), an increased number of adaptation steps, and a higher learning rate.

So I tried again with your code, but I still got quite low performance, even with the scripts you provided.


For ResNet-12, MiniImageNet, 5-way 1-shot:

python train_fsl.py --max_epoch 100 --way 5 --eval_way 5 --lr_scheduler step --model_class MAML --lr_mul 10 --backbone_class Res12 --dataset MiniImageNet --gpu 0 --query 15 --step_size 20 --gamma 0.1 --para_init './saves/initialization/miniimagenet/Res12-pre.pth' --lr 0.001 --shot 1 --eval_shot 1  --temperature 0.5 --gd_lr 0.05 --inner_iters 15

This script gave me 60.92, which is quite a bit lower than the 64.42 reported in your paper.


For ConvNet, MiniImageNet, 5-way 1-shot:

python train_fsl.py --max_epoch 100 --way 5 --eval_way 5 --lr_scheduler step --model_class MAML --lr_mul 10 --backbone_class ConvNet  --dataset MiniImageNet --gpu 9 --query 15 --step_size 20 --gamma 0.5 --para_init './saves/initialization/miniimagenet/con-pre.pth' --lr 0.002 --shot 1 --eval_shot 1 --temperature 0.5 --gd_lr 0.05 --inner_iters 15 

This script gave me 45.84, which is quite a bit lower than the 54.89 reported in your paper.


If you don't mind, could you please check the reproducibility of your results? Also, could you release the code for the ConvNet backbone?

Issue in the code

I noticed that in the file fsl_train.py, at lines 201-202, you set both the mean and the var entries of running_dict to a copy of the mean. Is that by design, or is it a mistake?
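
For reference, I presume the pattern looks something like the sketch below (a reconstruction; the names bn and running_dict are my guesses, not the actual source):

import torch.nn as nn

bn = nn.BatchNorm2d(64)
running_dict = {}

# Suspected pattern at fsl_train.py lines 201-202: both entries copy the mean.
running_dict['mean'] = bn.running_mean.clone()
running_dict['var'] = bn.running_mean.clone()  # presumably meant bn.running_var

# Presumably intended version:
running_dict['var'] = bn.running_var.clone()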

CIFAR-FS & FC100 pretrained model

Hi, thanks for sharing the code. If we want to see the performance on CIFAR-FS and FC100, can we get access to the pretrained weights for those datasets?

Or should we simply train without a pretrained model?

Question about the model architecture (the batchnorm)

Hi, again :)

Thank you for the wonderful work.

While reading your paper and looking at the implementation, it seems that this work uses ordinary batch norm (instead of the transductive batch norm that is usually used in other MAML-based papers).

Have you tried transductive batch norm in your case? I was trying to use the pre-trained checkpoints when training my MAML, but I cannot reach the performance reported in your paper, and I was wondering whether transductive batch norm was the cause.
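
For context, the difference at evaluation time is roughly the following (a minimal PyTorch sketch, not the repository's actual code):

import torch
import torch.nn.functional as F

x = torch.randn(75, 64, 5, 5)  # e.g. features of a 5-way, 15-query batch
running_mean, running_var = torch.zeros(64), torch.ones(64)

# Standard (inductive) batch norm at test time: normalize each query with
# the running statistics accumulated during training.
y_standard = F.batch_norm(x, running_mean, running_var, training=False)

# Transductive batch norm at test time: normalize with the statistics of
# the current batch itself, so each query's prediction depends on the
# other queries in the episode.
y_transductive = F.batch_norm(x, None, None, training=True)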

Thank you very much for your time!
Best,
Jihoon

Questions about batchnorm

Hi, thank you for your great work!

I already saw the past issue about batch norm, but I'm still a little confused.

First of all, before evaluation you set the encoder attribute 'is_training' to True, and I don't understand why you set that attribute or what effect it has. As far as I know, there is no attribute named 'is_training' in a PyTorch Module (nor in your encoder), so setting 'is_training' to True does not change the module's mode. If your purpose was to switch the encoder into training mode, I think self.model.encoder.train() is the right way to do it.

Secondly, if your purpose was to set the encoder to training mode and that works as intended, then the encoder will never use its running mean and variance in either the training or the evaluation phase, and there is then no reason to store and reset those values.
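
To illustrate the first point (a minimal, self-contained example, not the repository's code):

import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(3, 64, 3), nn.BatchNorm2d(64))

encoder.eval()              # switch to evaluation mode
encoder.is_training = True  # only attaches a new attribute; the mode is unchanged
print(encoder.training)     # False -- still in eval mode

encoder.train()             # the supported way to switch modes
print(encoder.training)     # True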

Normalization of ResNet-12 and ConvNet

Hi, thanks for sharing this impressive work. I noticed that in the implementation of the dataloaders, the normalization for ConvNet and ResNet uses different values. Is there a reason for this?
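
For reference, I mean the channel statistics passed to the normalization transform in each dataloader, along the lines of this sketch (the numbers below are purely illustrative, not the repository's actual values):

from torchvision import transforms

# Illustrative only: two backbones normalizing with different statistics.
convnet_normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet stats
                                         std=[0.229, 0.224, 0.225])
resnet_normalize = transforms.Normalize(mean=[0.472, 0.453, 0.410],   # hypothetical
                                        std=[0.278, 0.268, 0.285])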

Question about training without pretrained weight

I'm a huge fan of your excellent work!

Have you tried training UNICORN-MAML without the pre-training weights? I found that without them the result is only around 26%. Could you please provide the parameter settings for this case?

Many thanks~

Running in Colab

The error shown is: ImportError: cannot import name 'FewShotModel' from 'model.models.base' (/content/UNICORN-MAML/model/models/base.py). How can I solve this?
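
One generic way to narrow this down (a diagnostic sketch, not a confirmed fix) is to check what the imported module actually exposes and where it was loaded from:

# Run from /content/UNICORN-MAML so the 'model' package resolves to the clone.
import model.models.base as base

print(base.__file__)  # confirm which file was actually imported
print([name for name in dir(base) if not name.startswith('_')])
# If 'FewShotModel' is missing from this list, the clone may be stale, or
# another 'model' package may be shadowing the repository's package.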

Batch norm training mode in convnet_maml

running_var=self._modules['{0}_{1}'.format(i,1)].running_var, training = self.is_training)

Hi, thanks again for this impressive work; I really learned a lot. I have a question about the batch norm in the implementation of convnet_maml. It seems that the flag self.is_training is always True. In the implementation of nn.BatchNorm2d, the training flag is controlled by:
if self.training:
    bn_training = True
else:
    bn_training = (self.running_mean is None) and (self.running_var is None)

then F.batch_norm is applied.
https://pytorch.org/docs/stable/_modules/torch/nn/modules/batchnorm.html#BatchNorm2d

Since F.batch_norm is applied in the forward of convnet_maml, and self.is_training is always True, the batch norm is always in training mode, isn't it?
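
To make the point concrete, here is a minimal sketch of the pattern in question (not the repository's actual module): an explicit flag, rather than nn.Module's built-in self.training, is forwarded to F.batch_norm, so calling .eval() has no effect on the normalization.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyBNBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.register_buffer('running_mean', torch.zeros(channels))
        self.register_buffer('running_var', torch.ones(channels))
        self.is_training = True  # if this flag is never flipped ...

    def forward(self, x):
        # ... F.batch_norm always uses (and updates) the batch statistics,
        # regardless of whether .train() or .eval() was called on the module.
        return F.batch_norm(x, self.running_mean, self.running_var,
                            training=self.is_training)

block = TinyBNBlock(4)
block.eval()                        # does not touch self.is_training
y = block(torch.randn(8, 4, 5, 5))  # still normalized with batch statistics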
