srebuffi / icarl

License: MIT License

Python 100.00%

icarl's Introduction

iCaRL: Incremental Classifier and Representation Learning

TensorFlow and Theano + Lasagne code for the paper https://arxiv.org/abs/1611.07725

Disclaimer

The code is now very outdated and is not supported by current TensorFlow, so this repo should be considered an indication of how we coded iCaRL rather than runnable code.

Abstract

A major open problem on the road to artificial intelligence is the development of incrementally learning systems that learn about more and more concepts over time from a stream of data. In this work, we introduce a new training strategy, iCaRL, that allows learning in such a class-incremental way: only the training data for a small number of classes has to be present at the same time and new classes can be added progressively. iCaRL learns strong classifiers and a data representation simultaneously. This distinguishes it from earlier works that were fundamentally limited to fixed data representations and therefore incompatible with deep learning architectures. We show by experiments on CIFAR-100 and ImageNet ILSVRC 2012 data that iCaRL can learn many classes incrementally over a long period of time where other strategies quickly fail.

If you consider citing us

@inproceedings{ rebuffi-cvpr2017,
   author = { Sylvestre-Alvise Rebuffi and Alexander Kolesnikov and Georg Sperl and Christoph H. Lampert },
   title = {{iCaRL:} Incremental Classifier and Representation Learning},
   booktitle = CVPR,
   year = 2017,
}


icarl's Issues

meta.mat file not found

Can anyone please help?

data

I was wondering if you could share the link to the data you used from the ImageNet website. I tried the 2014 version, but the formats do not match. It would be a great help if you could provide the data so that we can reproduce your great results as well.
Thanks

IndexError: index 0 is out of bounds for axis 0 with size 0

When running iCaRL, I got this error:

Traceback (most recent call last):
  File "iCaRL/main_resnet_tf.py", line 277, in <module>
    np.where(files_iter == files_protoset[iteration2 * nb_cl + iter_dico][i])[0][0])
IndexError: index 0 is out of bounds for axis 0 with size 0

I think that there should be either

ind_herding = []
for i in range(min(nb_protos_cl, len(files_protoset[iteration2 * nb_cl + iter_dico]))):
    try:
        ind_herding.append(
            np.where(files_iter == files_protoset[iteration2 * nb_cl + iter_dico][i])[0][0])
    except IndexError:
        # The exemplar file is not present in files_iter; skip it.
        pass
ind_herding = np.array(ind_herding)

or

ind_herding = np.where(np.in1d(files_iter, files_protoset[iteration2 * nb_cl + iter_dico]))

but the second one returns sorted indices, which is slightly different from your code.
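The ordering difference between the two alternatives above can be seen on toy data. This is only a sketch: the file names and the `protoset` list are made up for illustration, and `np.isin` is used as the newer spelling of `np.in1d`.

```python
import numpy as np

# Hypothetical stand-ins for files_iter and one entry of files_protoset.
files_iter = np.array(["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"])
protoset = ["d.jpg", "a.jpg", "missing.jpg", "c.jpg"]  # herding order; one file is absent

# Alternative 1: look up each exemplar in turn, skipping files that are absent.
ind_loop = []
for name in protoset:
    matches = np.where(files_iter == name)[0]
    if matches.size > 0:
        ind_loop.append(matches[0])
ind_loop = np.array(ind_loop)

# Alternative 2: membership mask; indices come back in ascending order.
ind_mask = np.where(np.isin(files_iter, protoset))[0]

print(ind_loop)  # [3 0 2]
print(ind_mask)  # [0 2 3]
```

Both variants select the same set of files, but only the loop keeps the order in which the exemplars were stored, which matters if later code relies on that order.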

Error : iCaRL-TheanoLasagne : Unknown parameter type: <class 'theano.tensor.var.TensorVariable'>

Running iCaRL-TheanoLasagne, I get the error: Unknown parameter type: <class 'theano.tensor.var.TensorVariable'>

train_fn = theano.function([input_var,target_var], loss, updates=updates)

Help please

Information

def parse_devkit_meta(devkit_path):
    meta_mat = scipy.io.loadmat(devkit_path+'/meta.mat')
    labels_dic = dict((m[0][1][0], m[0][0][0][0]-1) for m in meta_mat['synsets'] if m[0][0][0][0] >= 1 and m[0][0][0][0] <= 1000)
    label_names_dic = dict((m[0][1][0], m[0][2][0]) for m in meta_mat['synsets'] if m[0][0][0][0] >= 1 and m[0][0][0][0] <= 1000)
    label_names = [tup[1] for tup in sorted([(v,label_names_dic[k]) for k,v in labels_dic.items()], key=lambda x:x[0])]
    fval_ground_truth = open(devkit_path+'/data/ILSVRC2012_validation_ground_truth.txt','r')
    validation_ground_truth = [[int(line.strip()) - 1] for line in fval_ground_truth.readlines()]
    fval_ground_truth.close()

Please, what is the purpose of this function?

Questions about figure 4

Dear authors,

I have read the iCaRL paper and feel it is quite interesting. I have a few questions regarding Figure 4 and Table 1, and I would appreciate a response from the authors.

For Table 1(a), what is the memory budget for the reported accuracy? For example, in the case of 10 classes, iCaRL accuracy is 64.1% and hybrid1 accuracy is 59.3%, while in Fig. 4, when K=3000, the accuracy is roughly the same. Does that mean Table 1(a) is under the setting of K=3000?
In Fig. 4, when K=2000, the accuracy of iCaRL is roughly 63%, while in Fig. 2(a) top right the accuracy is roughly 50%, and Fig. 2 is also under the setting of K=2000. Why is there a 13% accuracy gap between these two results? Please correct me if I am wrong.

Again, I would appreciate a reply from the authors.

About Constructing the Exemplar Set

In iCaRL-Tensorflow/main_resnet_tf.py, line 213, while constructing the exemplar set, why do we choose the argmax? According to the paper, isn't it supposed to be argmin?
ind_max = np.argmax(tmp_t)
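One way to relate the two rules, as a sketch rather than a statement about the authors' intent: assume every feature vector x has unit L2 norm (this is an assumption, it is not visible in the quoted line), and let w_k denote the quantity that the surrounding loop keeps in w_t, which starts at mu and gains mu minus the picked sample after each step. The paper-style argmin over the distance to the class mean then reduces to an argmax over a dot product:

```latex
\begin{aligned}
p_k &= \operatorname*{argmin}_{x}\ \Bigl\| \mu - \tfrac{1}{k}\Bigl(x + \sum_{j<k} p_j\Bigr) \Bigr\|^2
     = \operatorname*{argmin}_{x}\ \| w_k - x \|^2,
     \qquad w_k = k\mu - \sum_{j<k} p_j \\
    &= \operatorname*{argmin}_{x}\ \bigl( \|w_k\|^2 - 2\langle w_k, x\rangle + \|x\|^2 \bigr)
     = \operatorname*{argmax}_{x}\ \langle w_k, x\rangle
     \quad \text{if } \|x\| = 1 \text{ for all } x .
\end{aligned}
```

Without the unit-norm assumption the term \|x\|^2 does not drop out, and the two rules can pick different samples.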

Problems about tflite

Hello,
I have a small question: can a model trained this way be converted into a tflite format suitable for microcontroller deployment? If so, how?

tensorflow.python.framework.errors_impl.InvalidArgumentError: Session was not created with a graph before Run()!

I met this error while running training code via python main_resnet_tf.py.

tensorflow.python.framework.errors_impl.InvalidArgumentError: Session was not created with a graph before Run()!

Traceback (most recent call last):
  File "main_resnet_tf.py", line 159, in <module>
    sess.run(tf.global_variables_initializer())

Running environment:
Python 3.6.13
Tensorflow 1.1.0
scipy 0.19.1
cuda 11.4

Maybe too much time has passed since the code was released, but I am looking forward to someone helping me. Thanks a lot. @srebuffi

What does devkit_path contain?

The meaning of some arguments

Hi~

Could you please explain the meaning of "group"? I cannot understand what "group" is for, and I cannot find the notation in the paper.

My guess is that you mean: "if we have 24 classes in total and I want to run 3 iterations, then the classes in each iteration are 1-8, 9-16 and 17-24".

And

Is the ImageNet class count of 1000 hard-coded? If I have a dataset that contains 24 classes, can I change 1000 to 24 here?

THX
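The poster's reading of "group" can be written out as a short sketch. It only illustrates that guess and is not confirmed by the repository; `nb_total` and `nb_groups` are hypothetical names, while `nb_cl` and `order` mirror identifiers that appear in main_resnet_tf.py, with an identity class order assumed.

```python
import numpy as np

nb_total = 24                   # total number of classes (the poster's example)
nb_groups = 3                   # number of incremental steps (hypothetical name)
nb_cl = nb_total // nb_groups   # classes added per step

order = np.arange(nb_total)     # class order; an identity order is assumed here

# Slice the class order into consecutive groups of nb_cl classes.
groups = [order[g * nb_cl:(g + 1) * nb_cl] for g in range(nb_groups)]
for g, classes in enumerate(groups):
    # Report 1-based class numbers to match the wording of the question.
    print(f"step {g + 1}: classes {classes[0] + 1}-{classes[-1] + 1}")
```

Under this reading, changing the dataset from 1000 to 24 classes would mean changing both the total class count and the group size so that their ratio stays an integer.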

Not getting 48% training accuracy in one epoch

Hello,

I'm trying to execute the code in this repository as suggested in the Nota Bene section of the readme, that is, changing the number of epochs to 1 and getting the accuracy for the first two batches.
I'm only getting 24% accuracy for the first batch (readme says it should be 48%), while I'm getting the expected 20% accuracy for the second batch.
I did not change any of the code except the three folders.
To make sure we are using the same data, my training set is made of 1281167 images.
I wonder if I should change the learning rate, if the reported 48% is wrong, or if there is some other issue.
I attach the partial output, any help would be appreciated!
https://pastebin.com/k2PvAWz3

Thanks

Subset of Classes Used for ImageNet-100

Hello,
Which subset of classes of the full ImageNet-1000 did you use for ImageNet-100? Is it the first 100 classes in alphabetical order of the folder names? Thanks!

batchnorm is applied after relu in the tensorflow implementation of ResNet18

In the tensorflow version of ResNet18, the residual block is implemented as follows

layer = conv(inp, 'resconv1'+nom, size=3, strides=first_stride, out_channels=out_num_filters, alpha=alpha, padding='SAME')
layer = batch_norm(layer, 'batch_norm_resconv1'+nom, phase=phase)
layer = conv(layer, 'resconv2'+nom, size=3, strides=[1, 1, 1, 1], out_channels=out_num_filters, apply_relu=False,alpha=alpha, padding='SAME')
layer = batch_norm(layer, 'batch_norm_resconv2'+nom, phase=phase)

(in utils_resnet.py:110-113)
where the first batch_norm layer is applied to the output of the first conv layer (which has apply_relu=True by default).
Can you tell me why batch_norm is applied after relu here, which contradicts the ordinary practice of applying batch_norm before relu?
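The two orderings discussed above can be compared on synthetic data. This sketch does not explain why the repository uses the order it does; it only shows how the outputs differ. The `relu` and `batch_norm` helpers here are simplified NumPy stand-ins, not the functions from utils_resnet.py.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def batch_norm(x, eps=1e-5):
    # Training-mode normalization over the batch axis, without learned scale/shift.
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

rng = np.random.default_rng(0)
pre_activation = rng.normal(size=(256, 4))  # stand-in for a conv output

relu_then_bn = batch_norm(relu(pre_activation))  # order described in the issue
bn_then_relu = relu(batch_norm(pre_activation))  # more common order

print(relu_then_bn.min() < 0)   # True: outputs are re-centred, so some are negative
print(bn_then_relu.min() >= 0)  # True: ReLU is last, so outputs are non-negative
```

With ReLU followed by batch norm, the next layer receives zero-mean inputs; with batch norm followed by ReLU, it receives non-negative inputs.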

Questions about herding procedure

I am a little confused about the implementation of exemplar selection with herding, in main_resnet_tf.py, lines 202 to 217:

    print('Exemplars selection starting ...')
    for iter_dico in range(nb_cl):
        ind_cl     = np.where(label_dico == order[iter_dico+itera*nb_cl])[0]
        D          = Dtot[:,ind_cl]
        files_iter = processed_files[ind_cl]
        mu         = np.mean(D,axis=1)
        w_t        = mu
        step_t     = 0
        while not(len(files_protoset[itera*nb_cl+iter_dico]) == nb_protos_cl) and step_t<1.1*nb_protos_cl:
            tmp_t   = np.dot(w_t,D)
            ind_max = np.argmax(tmp_t)
            w_t     = w_t + mu - D[:,ind_max]
            step_t  += 1
            if files_iter[ind_max] not in files_protoset[itera*nb_cl+iter_dico]:
              files_protoset[itera*nb_cl+iter_dico].append(files_iter[ind_max])

The key line of this code is w_t = w_t + mu - D[:,ind_max], which moves w_t in the direction of mu - D[:,ind_max], away from the current nearest sample. However, following Algorithm 4 in the paper, the final objective of herding is to select m samples such that the mean of the exemplars is closest to the class mean in feature space. I am not sure whether this code achieves the same goal through such an iteration. Can anyone explain or prove this? Thanks in advance : )
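A small numerical check can help with this question. The sketch below runs the quoted update on synthetic data and, at every step, compares its argmax pick with a greedy pick in the style of Algorithm 4 (the candidate that brings the running mean closest to mu). It assumes unit-norm feature columns, uses made-up sizes, and does not model the duplicate-skipping `if` of the original loop.

```python
import numpy as np

rng = np.random.default_rng(0)
D = rng.normal(size=(8, 50))        # 8-dim features for 50 samples (synthetic)
D /= np.linalg.norm(D, axis=0)      # assumption: unit-norm feature columns
mu = D.mean(axis=1)

w_t = mu.copy()
picked_sum = np.zeros_like(mu)      # running sum of the samples picked so far
agree = []
for k in range(1, 11):
    # Rule from the quoted code: maximise the dot product with w_t.
    ind_max = int(np.argmax(w_t @ D))
    # Greedy rule in the style of Algorithm 4: minimise the distance between mu
    # and the mean of (previous picks + candidate).
    cand_means = (picked_sum[:, None] + D) / k
    ind_min = int(np.argmin(np.linalg.norm(mu[:, None] - cand_means, axis=0)))
    agree.append(ind_max == ind_min)
    # Same update as the quoted code.
    w_t = w_t + mu - D[:, ind_max]
    picked_sum += D[:, ind_max]

print(all(agree))
```

The reason the picks coincide in this setting is that w_t equals k*mu minus the sum of the earlier picks at step k, so for unit-norm candidates maximising the dot product with w_t is the same as minimising the distance to it.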

Exemplar set size

In the paper, it's mentioned that K=20000 is used for the ImageNet dataset, but it isn't clarified whether K=20000 is used for both iILSVRC-small and iILSVRC-full. In this repo, it seems that you used K=2000 for 100 classes. Could you please elaborate on this?
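For reference while comparing these settings: the paper shares a memory budget K equally among the t classes seen so far, so each class keeps about m = K/t exemplars (up to rounding). The sketch below computes m for the budgets mentioned in the question; the function name is hypothetical and floor division is an assumed rounding choice.

```python
def exemplars_per_class(memory_budget: int, num_seen_classes: int) -> int:
    """Exemplars kept per class when a budget K is shared by t classes (m = K // t)."""
    if num_seen_classes <= 0:
        raise ValueError("num_seen_classes must be positive")
    return memory_budget // num_seen_classes

for K, t in [(2000, 10), (2000, 100), (20000, 100), (20000, 1000)]:
    print(f"K={K}, t={t}: m={exemplars_per_class(K, t)}")
```

Note that K=2000 with 100 classes and K=20000 with 1000 classes both end at 20 exemplars per class, whereas K=20000 with 100 classes would end at 200.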