gmvandeven / class-incremental-learning

PyTorch implementation of a VAE-based generative classifier, as well as other class-incremental learning methods that do not store data (DGR, BI-R, EWC, SI, CWR, CWR+, AR1, the "labels trick", SLDA).

License: MIT License

Shell 1.18% Python 98.82%
continual-learning class-incremental-learning generative-classifier generative-classification variational-autoencoder elastic-weight-consolidation synaptic-intelligence cwr cwr-plus ar1

class-incremental-learning's Introduction

Class-Incremental Learning with Generative Classifiers

A PyTorch implementation of the CVPRW-2021 paper "Class-Incremental Learning with Generative Classifiers" (published version, preprint version). Besides an implementation of the VAE-based generative classifier explored in this paper, this repository also provides implementations of all class-incremental learning methods to which the generative classifier is compared (i.e., DGR, BI-R, EWC, SI, CWR, CWR+, AR1, the 'labels trick' & SLDA).

Installation & requirements

The current version of the code has been tested with Python 3.6.9 on a Linux operating system with the following versions of PyTorch and Torchvision:

  • pytorch 1.7.1
  • torchvision 0.8.2

Assuming Python and pip are set up, the Python packages used by this code can be installed using:

pip install -r requirements.txt
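
To check whether an environment matches the tested versions listed above, a small script along the following lines may help. It is only a sketch and not part of the repository: the helper name `report_versions` is hypothetical, and it assumes the packages expose a `__version__` attribute under the import names `torch` and `torchvision`.

```python
import importlib

# Versions the README reports as tested (import name -> version).
TESTED = {"torch": "1.7.1", "torchvision": "0.8.2"}


def report_versions(tested=TESTED):
    """Return {name: (tested_version, installed_version_or_None, matches)}."""
    report = {}
    for name, wanted in tested.items():
        try:
            module = importlib.import_module(name)
            installed = getattr(module, "__version__", None)
        except ImportError:
            installed = None
        # Builds such as "1.7.1+cu110" carry a local suffix, so compare the part before "+".
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (wanted, installed, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, matches) in report_versions().items():
        status = "ok" if matches else "differs from tested version"
        print(f"{name}: tested {wanted}, installed {installed} ({status})")
```

A mismatch does not necessarily mean the code will fail; it only indicates that the environment differs from the one the code was tested with.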

The code in this repository itself does not need to be installed, but a number of scripts might need to be made executable:

chmod +x main_*.py compare_*.py commands.sh preprocess_core50.py

Loading and pre-processing the CORe50 dataset

The MNIST, CIFAR-10 and CIFAR-100 datasets will be automatically downloaded when their benchmarks are run for the first time. By default, they will be downloaded to the folder ./store/datasets, but this can be changed with the option --data-dir. To use CORe50, the following command should be run first in order to download and pre-process the CORe50 dataset:

./preprocess_core50.py

More information about the CORe50 dataset can be found here: https://vlomonaco.github.io/core50/.

Running comparisons from the paper

The script commands.sh provides step-by-step instructions for re-running the experiments reported in the paper.

Although it is possible to run this script as is, it will take a long time, and it might be sensible to use fewer random seeds or to parallelize the experiments.
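
One way to spread the work is to launch one process per random seed. The following is a minimal sketch, not part of the repository. It assumes that `compare_all.py` accepts `--experiment`, `--seed` and `--n-seeds`, and that passing `--n-seeds=1` together with a given `--seed` runs only that seed; both assumptions should be checked against `./compare_all.py -h`. The helper names `build_commands` and `run_all` are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_commands(experiment, seeds, script="./compare_all.py"):
    """Build one command line per seed (flag semantics are assumed, see above)."""
    return [
        [script, f"--experiment={experiment}", f"--seed={seed}", "--n-seeds=1"]
        for seed in seeds
    ]


def run_all(commands, max_workers=2, dry_run=True):
    """Run the commands concurrently and return their exit codes in order.

    With dry_run=True nothing is executed and every exit code is reported as 0.
    """
    def run(cmd):
        if dry_run:
            print("would run:", " ".join(cmd))
            return 0
        return subprocess.run(cmd).returncode

    # Threads suffice here: each worker only waits for its child process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, commands))


if __name__ == "__main__":
    cmds = build_commands("MNIST", seeds=[1, 2, 3])
    print(run_all(cmds, max_workers=3, dry_run=True))
```

The number of workers would normally be chosen to match the available GPUs or memory, since each process trains its own models.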

NOTE: there is an issue with the implementation of BI-R in this repository, which causes the performance of BI-R and BI-R+SI on Split CIFAR-100 to be somewhat lower than reported in the accompanying paper. I am trying to figure out why this is the case. For now, the results reported for BI-R and BI-R+SI on Split CIFAR-100 can be reproduced using a different repository (the hyperlink is in the README on GitHub). Apologies!

Running custom experiments

With this code it is also possible to run custom class-incremental learning experiments.

Generative classifier

Individual experiments with the VAE-based generative classifier can be run with main_generative.py. The main options for this script are:

  • --experiment: which dataset? (MNIST|CIFAR10|CIFAR100|CORe50)
  • --iters: how many iterations per class?
  • --batch: what mini-batch size to use?

For information on further options: ./main_generative.py -h
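
As a rough illustration of what a generative classifier does in the class-incremental setting, the sketch below learns one density model per class and classifies a sample by picking the class whose model assigns it the highest likelihood. To stay self-contained it swaps the VAE for a diagonal Gaussian per class and assumes equal class priors. It is not the repository's implementation, and all names in it are hypothetical.

```python
import math


class GaussianGenerativeClassifier:
    """Toy generative classifier: one diagonal Gaussian per class."""

    def __init__(self, min_var=1e-3):
        self.min_var = min_var   # variance floor, avoids division by zero
        self.models = {}         # class label -> (means, variances)

    def learn_class(self, label, samples):
        """Fit the model of a single new class; other classes are not touched."""
        n, dim = len(samples), len(samples[0])
        means = [sum(s[d] for s in samples) / n for d in range(dim)]
        variances = [
            max(sum((s[d] - means[d]) ** 2 for s in samples) / n, self.min_var)
            for d in range(dim)
        ]
        self.models[label] = (means, variances)

    def log_likelihood(self, label, x):
        """log p(x | class=label) under that class's diagonal Gaussian."""
        means, variances = self.models[label]
        return sum(
            -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, means, variances)
        )

    def predict(self, x):
        """Return the class with the highest likelihood (equal priors assumed)."""
        return max(self.models, key=lambda label: self.log_likelihood(label, x))


if __name__ == "__main__":
    clf = GaussianGenerativeClassifier()
    # Classes arrive one at a time; data of earlier classes is never revisited.
    clf.learn_class("a", [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]])
    clf.learn_class("b", [[3.0, 3.1], [2.9, 3.0], [3.2, 2.8]])
    print(clf.predict([0.1, 0.0]), clf.predict([3.0, 3.0]))
```

Because each class has its own model, learning a new class cannot overwrite what was learned for earlier classes, which is why no stored data is needed in this toy version.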

Other class-incremental learning methods

Using main_cl.py, it is possible to run custom individual experiments with other class-incremental learning methods. It is also possible to combine some of the methods. The main options for this script are:

  • --experiment: which dataset? (MNIST|CIFAR10|CIFAR100|CORe50)
  • --tasks: how many tasks?
  • --iters: how many iterations per task?
  • --batch: what mini-batch size to use?

To run specific methods, the following can be used:

  • Synaptic intelligence (SI): ./main_cl.py --si --c=0.1
  • Elastic weight consolidation (EWC): ./main_cl.py --ewc --lambda=5000
  • Deep Generative Replay (DGR): ./main_cl.py --replay=generative
  • Brain-Inspired Replay (BI-R): ./main_cl.py --replay=generative --brain-inspired (there is an issue with the implementation of BI-R in this repository; to run BI-R, please instead use one of the two other repositories hyperlinked in the README on GitHub)
  • CopyWeights with Re-init (CWR): ./main_cl.py --cwr --freeze-after-first --freeze-fcE --freeze-convE
  • CWR+: ./main_cl.py --cwr-plus --freeze-after-first --freeze-fcE --freeze-convE
  • AR1: ./main_cl.py --cwr-plus --si --reg-only-hidden --c=0.1 --omega-max=0.1
  • The 'labels trick': ./main_cl.py --neg-samples=current
  • Streaming LDA: ./main_cl.py --slda

For information on further options: ./main_cl.py -h
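
When scripting several of these runs, it can be convenient to keep the method-specific flags in one place. The sketch below copies the flags from the list above into a dictionary and assembles the argument list for `./main_cl.py`. The helper `build_cl_command` and the short method keys are hypothetical, and the sketch assumes the general options can be passed in `--option=value` form.

```python
# Flags per method, copied from the list above (keys are made up for this sketch).
METHOD_FLAGS = {
    "si": ["--si", "--c=0.1"],
    "ewc": ["--ewc", "--lambda=5000"],
    "dgr": ["--replay=generative"],
    "bir": ["--replay=generative", "--brain-inspired"],  # see the BI-R note above
    "cwr": ["--cwr", "--freeze-after-first", "--freeze-fcE", "--freeze-convE"],
    "cwr+": ["--cwr-plus", "--freeze-after-first", "--freeze-fcE", "--freeze-convE"],
    "ar1": ["--cwr-plus", "--si", "--reg-only-hidden", "--c=0.1", "--omega-max=0.1"],
    "labels-trick": ["--neg-samples=current"],
    "slda": ["--slda"],
}


def build_cl_command(method, experiment=None, tasks=None, iters=None, batch=None):
    """Return the argument list for ./main_cl.py for one method."""
    if method not in METHOD_FLAGS:
        raise ValueError(f"unknown method {method!r}; choose from {sorted(METHOD_FLAGS)}")
    cmd = ["./main_cl.py"] + METHOD_FLAGS[method]
    # Only add the general options that were explicitly requested.
    for flag, value in (("experiment", experiment), ("tasks", tasks),
                        ("iters", iters), ("batch", batch)):
        if value is not None:
            cmd.append(f"--{flag}={value}")
    return cmd


if __name__ == "__main__":
    print(" ".join(build_cl_command("ewc", experiment="CIFAR100", tasks=10)))
```

The returned list can be handed to `subprocess.run` directly, which avoids shell quoting issues.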

Note that this repository only supports class-incremental learning methods that do not store data. PyTorch implementations of several methods that rely on a memory buffer with stored data (e.g., Experience Replay, iCaRL, A-GEM) can be found here: https://github.com/GMvandeVen/continual-learning.

On-the-fly plots during training

It is possible to track progress during training with on-the-fly plots. This feature requires visdom. Before running the experiments, the visdom server should be started from the command line:

python -m visdom.server

The visdom server is now alive and can be accessed at http://localhost:8097 in your browser (the plots will appear there). The flag --visdom should then be added when calling ./main_generative.py or ./main_cl.py to run the experiments with on-the-fly plots.

For more information on visdom see https://github.com/facebookresearch/visdom.
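
Before adding --visdom to a long run, it may be worth confirming that something is actually listening on the visdom port. The following standard-library sketch does that; the function name `visdom_is_up` is hypothetical, and the check only confirms that an HTTP server answers at the address, not that the server is visdom.

```python
import urllib.error
import urllib.request


def visdom_is_up(url="http://localhost:8097", timeout=2.0):
    """Return True if an HTTP server answers with status 200 at the given URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure or an HTTP error status.
        return False


if __name__ == "__main__":
    if visdom_is_up():
        print("visdom server reachable; --visdom can be used")
    else:
        print("no server found; start one with: python -m visdom.server")
```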

Citation

Please consider citing the accompanying paper if you use this code in your research:

@inproceedings{vandeven2021class,
    title     = {Class-Incremental Learning With Generative Classifiers},
    author    = {van de Ven, Gido M. and Li, Zhe and Tolias, Andreas S.},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2021},
    pages     = {3611-3620}
}

Acknowledgments

The research project from which this code originated has been supported by the Lifelong Learning Machines (L2M) program of the Defence Advanced Research Projects Agency (DARPA) via contract number HR0011-18-2-0025 and by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior/Interior Business Center (DoI/IBC) contract number D16PC00003. Disclaimer: views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA, IARPA, DoI/IBC, or the U.S. Government.

class-incremental-learning's Issues

Question regarding the online incremental learning

Hi,

Nice to meet you.
May I know whether this code is applicable to online incremental learning? I ask because your paper mentions online incremental learning.

Thanks if you could assist me :D

Question about model size

Hi Gido,

Does this method have the drawback that the model becomes too large for a large label space such as ImageNet?

Core50 result

compare_all.py --experiment=CORe50 --n-seeds=10 --seed=11 --single-epochs --batch=1 --fc-layers=2 --z-dim=200 --fc-units=1024 --lr=0.0001 --c=10 --lambda=10 --omega-max=0.1 --ar1-c=1. --dg-prop=0. --bir-c=0.01 --si-dg-prop=0.6

Running the provided code on the CORe50 dataset with the command above gives results that are inconsistent with those in the article (BI-R, Table 2).

About the task split when training the proposed method

In the paper, you mentioned that MNIST and CIFAR-10 are both split into 5 tasks, but I can't find the argument for the number of tasks (i.e., something like add_argument("--tasks", type=int....)) in the file options_gen_classifier.py. I did, however, find an argument called --tasks in the file options.py.
Could you please tell me the reason? Thank you.

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

Got this error:

internals>", line 200, in argmax
File "/home/19mkn1/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 1242, in argmax
return _wrapfunc(a, 'argmax', axis=axis, out=out, **kwds)
File "/home/19mkn1/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 54, in _wrapfunc
return _wrapit(obj, method, *args, **kwds)
File "/home/19mkn1/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 43, in _wrapit
result = getattr(asarray(obj), method)(*args, **kwds)
File "/home/19mkn1/.local/lib/python3.8/site-packages/torch/_tensor.py", line 678, in array
return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
srun: error: aurora: task 0: Exited with exit code 1

This error occurs in a Python script that is attempting to convert a PyTorch tensor to a NumPy array. However, the tensor is located on a CUDA device (i.e., GPU) rather than on the CPU. PyTorch doesn't support directly converting CUDA tensors to NumPy arrays because NumPy operates on CPU memory.

The error message specifically suggests using Tensor.cpu() to copy the tensor to host memory first, before converting it to a NumPy array. Here's how you can fix it:

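import torch

# Use the GPU when one is available; otherwise fall back to the CPU so the example still runs
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tensor = torch.tensor([1, 2, 3], device=device)

# Copy the tensor to host (CPU) memory; this is a no-op if it is already there
cpu_tensor = tensor.cpu()

# Now you can convert it to a NumPy array
numpy_array = cpu_tensor.numpy()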

By first moving the tensor to the CPU using cpu(), you can then safely convert it to a NumPy array.
