
cnguyen10 / few_shot_meta_learning

238 stars · 2 watchers · 29 forks · 414 KB

Implementations of many meta-learning algorithms to solve the few-shot learning problem in PyTorch

License: MIT License

Languages: Python 78.31%, Jupyter Notebook 21.69%
Topics: meta-learning, maml, prototypical-network, pytorch, algorithms

few_shot_meta_learning's People

Contributors

cnguyen10


few_shot_meta_learning's Issues

Test in Platipus model

Hi Cuong,
Thanks for your great work.
I want to ask about the test function in Platipus.py: what is the meaning of eps_generator in the test function?

Question about the implementation of VAMPIRE

Hi Cuong,

I really appreciate your work on the VAMPIRE algorithm. I have some questions about the implementation in Vampire2.py.

  1. Why is the KL loss in the inner loop implemented as the KL divergence between q(\theta) and the standard Gaussian, instead of KL(q||p)? I cannot find the corresponding step in the original VAMPIRE paper. (A sketch of the quantity I mean follows this list.)
  2. Why does the global update (i.e., the validation() function in the code) also need a KL loss?
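
For concreteness, here is the quantity I mean in question 1: the closed-form KL between a diagonal Gaussian q(\theta) = N(mu, sigma^2) and the standard Gaussian N(0, I). This is my own sketch with illustrative names, not the code from Vampire2.py:

    import torch

    def kl_to_standard_normal(mu, log_sigma):
        """KL( N(mu, sigma^2) || N(0, I) ), summed over all weights."""
        # 0.5 * sum(sigma^2 + mu^2 - 1 - log(sigma^2)), with sigma = exp(log_sigma)
        return 0.5 * torch.sum(torch.exp(2 * log_sigma) + mu ** 2 - 1 - 2 * log_sigma)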

Thank you.

Loss is NaN in PLATIPUS

Hi,

when trying to run Platipus.py with the provided defaults:
python3 main.py --datasource SineLine --ml-algorithm platipus --num-models 4 --first-order --network-architecture FcNet --no-batchnorm --num-ways 1 --k-shot 5 --inner-lr 0.001 --meta-lr 0.001 --num-epochs 100 --resume-epoch 0 --train

the following error occurs:
Platipus.py, line 195: ValueError: Loss is NaN.

Do you have any idea what might cause this? It doesn't seem to be a config issue, since both the defaults and altered parameters produce the same result.
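
In case it helps to triage: the first things I tried were clipping the meta-gradients and lowering the learning rates. A generic guard, sketched here; this is not code from Platipus.py:

    import torch

    def clipped_step(loss, params, optimizer, max_norm=1.0):
        """Backward pass with gradient clipping, a common guard against exploding updates."""
        optimizer.zero_grad()
        loss.backward()
        torch.nn.utils.clip_grad_norm_(params, max_norm=max_norm)
        optimizer.step()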

Thanks!
Leon

Question about the initialization of theta0 in ABML

If I understand correctly, the distribution of theta0 is not a vanilla Normal distribution as written in the implementation details of the original paper; instead, it is a Normal distribution multiplied by a Gamma distribution. However, I am confused about how that is represented in your code, since you seem to initialize only the mean and log-sigma of theta0. How did you implement this part? I am struggling to implement an algorithm similar to ABML and really hope you can give some help. Thanks! :)
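
For reference, this is what I mean by "a Normal multiplied by a Gamma" (a Normal-Gamma hierarchy; my own sketch, not your code):

    import torch

    def sample_theta0(mu, a=1.0, b=1.0):
        """tau ~ Gamma(a, b) is a precision; theta0 ~ N(mu, 1/tau)."""
        tau = torch.distributions.Gamma(a, b).sample(mu.shape)
        return torch.distributions.Normal(mu, tau.rsqrt()).sample()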

Regression code

Hi, thank you for the wonderful code!

Are there any plans to release the code for the regression models?

Thanks!
Jihoon

Getting NaNs in ABML at about epoch 14

Thanks for publishing these implementations. I was running ABML on Omniglot for (20/5, 20/1, 5/5, and 5/1) n-way k-shot learning problems. On all four of these experiments I start getting NaNs for ABML around epoch 12-14. It's odd that they all fail at the same spot. I have looked through the code carefully and I cannot see anything that might directly cause this, but you may have a better idea since you implemented it...

Any ideas where this is coming from? Here are the flags I was using to train. I may have added some flags related to dataloading, but nothing that interferes with the core.

python main.py \
    --datasource=$DATASET \
    --ds-folder $ROOT \
    --run $RUN \
    --ml-algorithm=abml \
    --num-models=2 \
    --minibatch 16 \
    --no-batchnorm \
    --n-way=20 \
    --k-shot=5 \
    --v-shot=$VSHOT \
    --num-epochs=40 \
    --num-episodes-per-epoch 10000 \
    --resume-epoch=0 \
    --train
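
To localize the NaN I have been running with anomaly detection enabled and asserting finiteness after each loss computation; this is generic PyTorch, not code from this repo:

    import torch

    torch.autograd.set_detect_anomaly(True)  # backward() then raises at the op that produced the NaN

    def assert_finite(name, tensor):
        """Drop-in check after each inner- and outer-loop loss."""
        if not torch.isfinite(tensor).all():
            raise ValueError(f"{name} contains NaN/Inf")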

Some questions about this code.

  1. When calculating the calibration, why does it add some noise to the target in regression? Like this:

    outputs = outputs + (self.noise_std ** 2) * output_noises

    Should it instead be:

    outputs = outputs + self.noise_std * output_noises

  2. MAML can only produce one value for a given input, so how can it produce a reliability diagram like Fig. 2(c) in your paper? (My understanding of a reliability diagram is sketched after this list.)
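
My understanding of a reliability diagram, as a sketch (generic code, not from the paper): bin the predicted confidences and compare each bin's mean confidence against its empirical accuracy.

    import torch

    def reliability_bins(probs, labels, n_bins=10):
        """probs: (N, C) softmax outputs; labels: (N,) ground-truth classes."""
        conf, pred = probs.max(dim=1)
        correct = pred.eq(labels).float()
        edges = torch.linspace(0, 1, n_bins + 1)
        for lo, hi in zip(edges[:-1], edges[1:]):
            mask = (conf > lo) & (conf <= hi)
            if mask.any():
                print(f"({lo:.1f}, {hi:.1f}]: confidence {conf[mask].mean():.3f} vs accuracy {correct[mask].mean():.3f}")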

Thank you very much for your kind consideration. I look forward to your reply.

Potential Problem of the loss function in ABML

Hi Cuong,

I really appreciate your work, especially since this is the only implementation of ABML I could find.

However, when you calculate the loss for updating the meta-parameters here, it seems that you left a term behind (see the attached screenshot).

I hope you could check this out. Looking forward to your reply.

Loss function for implementation of BMAML

Hi. Thank you for uploading the code for all these algorithms.

For the implementation of Bayesian MAML (classification problem), you are using a simple cross-entropy loss. But Bayesian MAML uses the chaser loss. Specifically, for each task, I think we have to compute the chaser and the leader using SVGD, and then update the global parameters using the average of their difference over multiple particles (multiple models). This might be much trickier; my reading is sketched below.

Did I miss something in this implementation? Correct me if I am wrong or I missed something.
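
For reference, my reading of the chaser loss, as a sketch; svgd_steps is a hypothetical helper that runs a given number of SVGD updates and is not a function in this repo:

    import torch

    def chaser_loss(particles, support_set, joint_set, svgd_steps, n=1, s=1):
        """particles: list of parameter tensors; joint_set = support + query data."""
        chaser = svgd_steps(particles, support_set, steps=n)   # p_n on the support set
        leader = svgd_steps(chaser, joint_set, steps=s)        # p_{n+s} on support + query
        # stop-gradient on the leader so only the chaser is pulled toward it
        return sum(((c - l.detach()) ** 2).sum() for c, l in zip(chaser, leader))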

thanks,
Deep

Consultation about the code

Hello, I want to apply BMAML and PLATIPUS to multi-label sequence (1D) classification, and the code is really helpful! The difficulty I face now is that there are too many files, and I don't know in what order I should modify the code. Could you please give me some advice?

Models not training

Hi,

I like your repository and the code you have implemented. I am facing a couple of issues:

The dataset: I cannot find the dataset in the format you ask for. I have tried using the dataloader from torchmeta with your code, but the issue is that most runs do not go beyond 20.00% accuracy. Do you have any advice on what I might do?
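
For what it's worth, 20.00% is exactly chance level for 5-way classification, which usually points at a labeling problem. One check I have been running (my own sketch, assuming an episodic batch; not code from this repo) is that the labels in each episode are remapped to the contiguous range 0..n_way-1 before the loss is computed:

    import torch

    def remap_labels(labels):
        """Map the arbitrary class ids in one episode to 0..n_way-1."""
        classes = labels.unique(sorted=True)
        lookup = {c.item(): i for i, c in enumerate(classes)}
        return torch.tensor([lookup[l.item()] for l in labels])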

First-order approximation typo?

The inputs are always set to q_params, whether first_order is true or false. Is this a typo?

            if config['first_order']:
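                # first-order: retain_graph only keeps the graph alive for reuse; no higher-order derivatives are built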
                all_grads = torch.autograd.grad(
                    outputs=loss,
                    inputs=q_params,
                    retain_graph=config['train_flag']
                )
            else:
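                # full second-order: create_graph lets the meta-gradient flow through this gradient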
                all_grads = torch.autograd.grad(
                    outputs=loss,
                    inputs=q_params,
                    create_graph=config['train_flag']
                )

Platipus loss function potentially doesn't match paper

Hi,

thank you for the great implementation of meta learning algorithms.

We are trying to evaluate the PLATIPUS algorithm, and we noted that the paper requires a negative log-likelihood (NLL) loss (page 3, Section 3, Preliminaries), yet your adaptation step uses an MSE loss by default.

We were wondering if this is an adaptation of the original paper on your part, or if we are missing a crucial step where the NLL is calculated from the MSE. Later, the loss we suspect to be an MSE loss is logged to TensorBoard as an NLL again (Platipus.py, line 216). This seems especially important since in Platipus.py, line 184, the gradient is calculated on this loss. To our understanding, this gradient might differ significantly from the gradient intended in the paper, since it is computed on a different loss function.

Is this just an implementation trick to use the MSE instead of the NLL, or are we missing something? (The relationship we assumed is sketched below.)
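
For context, the relationship we assumed might be intended, in our own words: for a Gaussian likelihood with a fixed variance sigma^2, the NLL is the MSE scaled by 1/(2*sigma^2) plus a constant, so the gradients of the two losses are proportional. A sketch:

    import math
    import torch

    def gaussian_nll(pred, target, sigma=1.0):
        """NLL of N(target; pred, sigma^2) with fixed sigma: a scaled, shifted MSE."""
        mse = ((pred - target) ** 2).mean()
        return mse / (2 * sigma ** 2) + 0.5 * math.log(2 * math.pi * sigma ** 2)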

Thanks a lot in advance,
Leon

NaN loss when training with sine

Hi.
Thank you so much for sharing the code.
I cloned your repo and tried to run abml.py with the sine curve. I get a NaN loss and the code exits.
Please let me know if I need to change any hyperparameters for this task.
Thanks.
