grypesc/seed
ICLR2024 paper on Continual Learning
Home Page: https://arxiv.org/abs/2401.10191
License: MIT License
Hello, Thank you for your amazing work and code.
I used the SEED settings for 10 and 5 incremental steps, but I got significantly lower results than those reported in the paper.
I modified --num-tasks and --nc-first-task to run the experiments for T=6 (|C1|=50) and T=11 (|C1|=50), respectively:
python src/main_incremental.py --approach seed --gmms 1 --max-experts 5 --use-multivariate --nepochs 200 --tau 3 --batch-size 128 --num-workers 4 --datasets cifar100_icarl --num-tasks 6 --nc-first-task 50 --lr 0.05 --weight-decay 5e-4 --clipping 1 --alpha 0.99 --use-test-as-val --network resnet32 --extra-aug fetril --momentum 0.9 --exp-name exp_50+5x10 --seed 0
python src/main_incremental.py --approach seed --gmms 1 --max-experts 5 --use-multivariate --nepochs 200 --tau 3 --batch-size 128 --num-workers 4 --datasets cifar100_icarl --num-tasks 11 --nc-first-task 50 --lr 0.05 --weight-decay 5e-4 --clipping 1 --alpha 0.99 --use-test-as-val --network resnet32 --extra-aug fetril --momentum 0.9 --exp-name exp_50+10x5 --seed 0
I get avg_acc = 67.2 for T=6 (|C1|=50) and avg_acc = 66.6 for T=11 (|C1|=50), which is much lower than the results in the paper.
Could you please provide more details on the settings you used in the paper, and how I can reproduce the results?
Thanks.
In Table 3, SEED is shown to use 3.2 million parameters, whereas ResNet18 has 11.7 million parameters. Could you clarify which parameters "#Params." in the table refers to?
Dear authors,
Congratulations on your paper being accepted at ICLR 2024, and thank you for your effort to release the code to the community. After reading your paper, I think it is a good starting point for investigating ensembles of experts for CL. However, I have a question about how you select the expert for finetuning on a new task.
As depicted in Fig. 3, the distributions of task 3's new classes are compared with those of the old tasks t1 and t2 via the KL divergence. As a result, we would have to store the distributions of the old tasks. However, as indicated in the text below the figure, the distribution set Q_k only contains the distributions of the current task's classes 1 to C_t and ignores all classes from previous tasks. Then, in Eq. (2), the KL divergence is computed within the set Q_k, since both q_ik and q_jk are in Q_k. Therefore, we would not need to take the class distributions of previous tasks into account. Yet, looking at the code from line 188, it seems you do still consider the class distributions of prior tasks.
I am unsure how to interpret this correctly and look forward to your clarification. Feel free to correct me if I am wrong.
Best,
Cuong
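For what it is worth, here is how I currently read Eq. (2): each expert is scored by the mutual separation (pairwise KL divergence) of the new task's class distributions in that expert's latent space, and one expert is picked by that score. Below is a minimal sketch of that reading, assuming diagonal-covariance Gaussians; the names `kl_diag_gauss` and `select_expert` are hypothetical and not from the SEED codebase, and the argmax direction is my assumption, not a claim about the paper's actual criterion.

```python
import numpy as np

def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """Closed-form KL( N(mu_p, diag(var_p)) || N(mu_q, diag(var_q)) )."""
    d = mu_p.size
    return 0.5 * (np.sum(var_p / var_q)
                  + np.sum((mu_q - mu_p) ** 2 / var_q)
                  - d
                  + np.sum(np.log(var_q)) - np.sum(np.log(var_p)))

def select_expert(per_expert_class_dists):
    """per_expert_class_dists[k] is a list of (mu, var) pairs, one per new
    class, estimated in expert k's latent space (the set Q_k of Eq. (2)).
    Score each expert by the total pairwise KL between its new-class
    distributions; picking the maximum is an assumption made here."""
    scores = []
    for class_dists in per_expert_class_dists:
        total = 0.0
        for i, (mu_i, var_i) in enumerate(class_dists):
            for j, (mu_j, var_j) in enumerate(class_dists):
                if i != j:
                    total += kl_diag_gauss(mu_i, var_i, mu_j, var_j)
        scores.append(total)
    return int(np.argmax(scores))
```

Under this reading, only the current task's distributions enter the score, which is exactly why the prior-task distributions at line 188 of the code confused me; whether the paper maximizes or minimizes the score, and whether old classes should enter Q_k, is what I hope you can clarify.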