zhmiao / openlongtailrecognition-oltr

Pytorch implementation for "Large-Scale Long-Tailed Recognition in an Open World" (CVPR 2019 ORAL)

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%
open-long-tail-recognition pytorch-implementation computer-vision long-tail open-set oltr cvpr2019 deep-learning

openlongtailrecognition-oltr's Introduction

Large-Scale Long-Tailed Recognition in an Open World

[Project] [Paper] [Blog]

Overview

Open Long-Tailed Recognition (OLTR) is the author's re-implementation of the long-tail recognizer described in:
"Large-Scale Long-Tailed Recognition in an Open World"
Ziwei Liu*, Zhongqi Miao*, Xiaohang Zhan, Jiayun Wang, Boqing Gong, Stella X. Yu (CUHK & UC Berkeley / ICSI), in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019, Oral Presentation

For further information, please contact Zhongqi Miao and Ziwei Liu.

Update notifications

  • 03/04/2020: We renamed all variables named selfatt to modulatedatt so that the attention module can be properly trained in the second stage for Places-LT. ImageNet-LT does not have this problem since its weights are not frozen. We have updated the results using the fixed code, and they are still better than reported. The weights are also updated. Thanks!
  • 02/11/2020: We updated the configuration files for the Places_LT dataset. The current results are a little higher than reported, even with the updated F-measure calculation. One important thing to note is that we have unfrozen the model weights for the first-stage training of Places-LT, which means it is not suitable for single-GPU training in most cases (we used four 1080 Ti GPUs in our implementation). However, for the second stage, since the memory and center loss do not currently support multiple GPUs, please switch back to single-GPU training. Thank you very much!
  • 01/29/2020: We updated the false-positive calculation in util.py so that the numbers are normal again. The F-measure numbers reported in the paper might be a little higher than the actual numbers for all baselines. We will update them as soon as possible. We have updated the new F-measure numbers in the table below. Thanks.
  • 12/19/2019: Updated modules with 'clone()' methods and set use_fc in the ImageNet-LT stage-1 config to False. Currently, the results for ImageNet-LT are comparable to the numbers reported in the paper (a little better), and the reproduced results are updated below. We also found a bug in Places-LT. We will update the code and reproduced results as soon as possible.
  • 08/05/2019: Fixed a bug in utils.py. Updated the re-implemented ImageNet-LT weights at the end of this page.
  • 05/02/2019: Fixed a bug in run_network.py so the models train properly. Updated the configuration file for ImageNet-LT stage 1 training so that the results from the paper can be reproduced.

Requirements

Data Preparation

NOTE: The Places-LT dataset has been updated since the first version. Please download it again if you have the first version.

  • First, please download ImageNet_2014 and Places_365 (the 256x256 version). Please also change data_root in main.py accordingly (see the sketch after the directory layout below).

  • Next, please download ImageNet-LT and Places-LT from here. Please put the downloaded files into the data directory like this:

data
  |--ImageNet_LT
    |--ImageNet_LT_open
    |--ImageNet_LT_train.txt
    |--ImageNet_LT_test.txt
    |--ImageNet_LT_val.txt
    |--ImageNet_LT_open.txt
  |--Places_LT
    |--Places_LT_open
    |--Places_LT_train.txt
    |--Places_LT_test.txt
    |--Places_LT_val.txt
    |--Places_LT_open.txt
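
For reference, a minimal sketch of the data_root change in main.py (the dict keys and paths below are assumptions; point them at wherever you downloaded the raw images):

    # In main.py: point data_root at the raw ImageNet_2014 and Places_365 images.
    # The exact keys and paths here are placeholders -- adjust them to your setup.
    data_root = {'ImageNet': '/path/to/ImageNet_2014',
                 'Places': '/path/to/Places_365_256x256'}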

Download Caffe Pre-trained Models for Places_LT Stage_1 Training

  • Caffe pretrained ResNet152 weights can be downloaded from here, and save the file to ./logs/caffe_resnet152.pth
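
If you want to sanity-check the downloaded weights before training, a quick snippet like the following should work (this assumes the .pth file stores a plain state dict; if it is wrapped under another key, index into it first):

    import torch

    # Load the Caffe-converted ResNet-152 weights on CPU and inspect a few entries.
    state = torch.load('./logs/caffe_resnet152.pth', map_location='cpu')
    print(type(state), len(state))
    print(list(state)[:5])  # first few parameter names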

Getting Started (Training & Testing)

ImageNet-LT

  • Stage 1 training:
python main.py --config ./config/ImageNet_LT/stage_1.py
  • Stage 2 training:
python main.py --config ./config/ImageNet_LT/stage_2_meta_embedding.py
  • Close-set testing:
python main.py --config ./config/ImageNet_LT/stage_2_meta_embedding.py --test
  • Open-set testing (thresholding):
python main.py --config ./config/ImageNet_LT/stage_2_meta_embedding.py --test_open
  • Testing on the stage 1 model:
python main.py --config ./config/ImageNet_LT/stage_1.py --test

Places-LT

  • Stage 1 training (At this stage, multi-GPU might be necessary since we are finetuning a ResNet-152.):
python main.py --config ./config/Places_LT/stage_1.py
  • Stage 2 training (At this stage, only single-GPU is supported, please switch back to single-GPU training.):
python main.py --config ./config/Places_LT/stage_2_meta_embedding.py
  • Close-set testing:
python main.py --config ./config/Places_LT/stage_2_meta_embedding.py --test
  • Open-set testing (thresholding):
python main.py --config ./config/Places_LT/stage_2_meta_embedding.py --test_open

Reproduced Benchmarks and Model Zoo (Updated on 03/05/2020)

ImageNet-LT Open-Set Setting

Backbone   Many-Shot  Medium-Shot  Few-Shot  F-Measure  Download
ResNet-10  44.2       35.2         17.5      44.6       model

Places-LT Open-Set Setting

Backbone    Many-Shot  Medium-Shot  Few-Shot  F-Measure  Download
ResNet-152  43.7       40.2         28.0      50.0       model

CAUTION

The current code was prepared for single-GPU use. Multi-GPU training can cause problems, except for the first stage of Places-LT.

License and Citation

This software is released under the BSD 3-Clause license.

@inproceedings{openlongtailrecognition,
  title={Large-Scale Long-Tailed Recognition in an Open World},
  author={Liu, Ziwei and Miao, Zhongqi and Zhan, Xiaohang and Wang, Jiayun and Gong, Boqing and Yu, Stella X.},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2019}
}

openlongtailrecognition-oltr's People

Contributors

953250587, drcege, liuziwei7, zhmiao


openlongtailrecognition-oltr's Issues

embedded_gaussian is not trained

if 'selfatt' not in param_name and 'fc' not in param_name:

Looking at this line, the Stage 1 config, and the Stage 2 config, I noticed that the embedded Gaussian non-local filtering is not trained at all, as "selfatt" is not present in ModulatedAttLayer. In Stage 1 we do not use the modulated attention:

feature_param = {'use_modulatedatt': False, 'use_fc': True, 'dropout': None,

In Stage 2 we use it:

feature_param = {'use_modulatedatt': True, 'use_fc': True, 'dropout': None,

we initialize it, but then fix all the convolution layer weights due to the weight-freezing code and the above if condition.

Should we change "selfatt" to "modulatedatt"?
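
For reference, the 03/04/2020 update note above addresses exactly this: after the rename, the attention parameters contain 'modulatedatt' and therefore match the freezing condition again. A minimal sketch of the intended behaviour (feature_model stands for the feature extractor module; this is an illustration, not the exact repository code):

    # Freeze the backbone, but keep the (modulated) attention module and fc trainable.
    # With the old name 'selfatt', the attention parameters were frozen by mistake.
    for param_name, param in feature_model.named_parameters():
        if 'modulatedatt' not in param_name and 'fc' not in param_name:
            param.requires_grad = False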

Training error for stage 1

Loading dataset from: /mnt/ImageryAnalysis/t0l/OpenLongTailRecognition-OLTR-master/OpenLongTailRecognition-OLTR-master/data/train_256_places365standard
{'criterions': {'PerformanceLoss': {'def_file': 'loss/SoftmaxLoss.py',
'loss_params': {},
'optim_params': None,
'weight': 1.0}},
'memory': {'centroids': False, 'init_centroids': False},
'networks': {'classifier': {'def_file': 'models/DotProductClassifier.py',
'optim_params': {'lr': 0.1,
'momentum': 0.9,
'weight_decay': 0.0005},
'params': {'dataset': 'Places_LT',
'in_dim': 512,
'num_classes': 365,
'stage1_weights': False}},
'feat_model': {'def_file': 'models/ResNet152Feature.py',
'fix': True,
'optim_params': {'lr': 0.01,
'momentum': 0.9,
'weight_decay': 0.0005},
'params': {'caffe': True,
'dataset': 'Places_LT',
'dropout': None,
'stage1_weights': False,
'use_fc': True,
'use_modulatedatt': False}}},
'training_opt': {'batch_size': 256,
'dataset': 'Places_LT',
'display_step': 10,
'feature_dim': 512,
'log_dir': './logs/Places_LT/stage1',
'num_classes': 365,
'num_epochs': 30,
'num_workers': 4,
'open_threshold': 0.1,
'sampler': None,
'scheduler_params': {'gamma': 0.1, 'step_size': 10}}}
Loading data from ./data/Places_LT/Places_LT_train.txt
Use data transformation: Compose(
RandomResizedCrop(size=(224, 224), scale=(0.08, 1.0), ratio=(0.75, 1.3333), interpolation=PIL.Image.BILINEAR)
RandomHorizontalFlip(p=0.5)
ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0)
ToTensor()
Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
)
No sampler.
Shuffle is True.
Loading data from ./data/Places_LT/Places_LT_val.txt
Use data transformation: Compose(
Resize(size=256, interpolation=PIL.Image.BILINEAR)
CenterCrop(size=(224, 224))
ToTensor()
Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
)
No sampler.
Shuffle is True.
Using 1 GPUs.
Loading Scratch ResNet 152 Feature Model.
Loading Caffe Pretrained ResNet 152 Weights.
Pretrained feature model weights path: ./logs/caffe_resnet152.pth
Freezing feature weights except for self attention weights (if exist).
Loading Dot Product Classifier.
Random initialized classifier weights.
Using steps for training.
Initializing model optimizer.
Loading Softmax Loss.
Phase: train
Traceback (most recent call last):

File "", line 1, in
runfile('/mnt/ImageryAnalysis/t0l/OpenLongTailRecognition-OLTR-master/OpenLongTailRecognition-OLTR-master/main.py', wdir='/mnt/ImageryAnalysis/t0l/OpenLongTailRecognition-OLTR-master/OpenLongTailRecognition-OLTR-master')

File "/home/t0l/anaconda3/envs/dl/lib/python3.5/site-packages/spyder_kernels/customize/spydercustomize.py", line 668, in runfile
execfile(filename, namespace)

File "/home/t0l/anaconda3/envs/dl/lib/python3.5/site-packages/spyder_kernels/customize/spydercustomize.py", line 108, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)

File "/mnt/ImageryAnalysis/t0l/OpenLongTailRecognition-OLTR-master/OpenLongTailRecognition-OLTR-master/main.py", line 62, in
training_model.train()

File "/mnt/ImageryAnalysis/t0l/OpenLongTailRecognition-OLTR-master/OpenLongTailRecognition-OLTR-master/run_networks.py", line 212, in train
phase='train')

File "/mnt/ImageryAnalysis/t0l/OpenLongTailRecognition-OLTR-master/OpenLongTailRecognition-OLTR-master/run_networks.py", line 137, in batch_forward
self.logits, self.direct_memory_feature = self.networks['classifier'](self.features, self.centroids)

File "/home/t0l/anaconda3/envs/dl/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)

File "/home/t0l/anaconda3/envs/dl/lib/python3.5/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
return self.module(*inputs[0], **kwargs[0])

File "/home/t0l/anaconda3/envs/dl/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)

File "models/DotProductClassifier.py", line 11, in forward
x = self.fc(x)

File "/home/t0l/anaconda3/envs/dl/lib/python3.5/site-packages/torch/nn/modules/module.py", line 477, in call
result = self.forward(*input, **kwargs)

File "/home/t0l/anaconda3/envs/dl/lib/python3.5/site-packages/torch/nn/modules/linear.py", line 55, in forward
return F.linear(input, self.weight, self.bias)

File "/home/t0l/anaconda3/envs/dl/lib/python3.5/site-packages/torch/nn/functional.py", line 1024, in linear
return torch.addmm(bias, input, weight.t())

RuntimeError: size mismatch, m1: [256 x 2048], m2: [512 x 365] at /opt/conda/conda-bld/pytorch_1533672544752/work/aten/src/THC/generic/THCTensorMathBlas.cu:249

Code Error

Hello,
When I run python main.py --config ./config/Places_LT/stage_2_meta_embedding.py, there is an error.

File "./models/MetaEmbeddingClassifier.py", line 33, in forward
dist_cur = torch.norm(x_expand - centroids_expand, 2, 2)
RuntimeError: The size of tensor a (365) must match the size of tensor b (122) at non-singleton dimension 1

Here, I print the shape of x_expand and centroids_expand.

torch.Size([86, 365, 512])
torch.Size([86, 122, 512])

Could you give some advice on how to solve this problem?

The calculation of open-set F-measure

Hi, I wonder if true positive, false positive and false negative are counted correctly.

for i in range(len(labels)):
    true_pos += 1 if preds[i] == labels[i] and labels[i] != -1 else 0
    false_pos += 1 if preds[i] != labels[i] and labels[i] != -1 and preds[i] != -1 else 0
    false_neg += 1 if preds[i] != labels[i] and labels[i] == -1 else 0

Here are some examples according to the above code:
(pairs of prediction and label)

  • class_a, class_a (TP)
  • class_b, class_a (FP)
  • -1, class_a (?)
  • class_a, -1 (FN)
  • -1, -1 (?)

I'm confused about

  • Why is the 2nd example counted as FP rather than FN? (FP means the label is negative but the prediction is positive, so what counts as positive here?)
  • Why is the 3rd example not counted as FN?
  • Is the last example TN or TP?
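
For comparison, one common convention treats the known (closed-set) classes as the positive side and the open-set label -1 as the negative side: a known-class sample predicted as -1 then counts as a false negative, and a -1/-1 pair is a true negative that does not enter the F-measure. A sketch of that counting (this is one possible convention, not necessarily what util.py intends):

    def open_set_f_measure(preds, labels):
        # Known classes are "positive"; the open-set label -1 is "negative".
        tp = fp = fn = 0
        for p, y in zip(preds, labels):
            if y != -1 and p == y:        # known sample recognized correctly
                tp += 1
            elif y == -1 and p != -1:     # open-set sample accepted as a known class
                fp += 1
            elif y != -1 and p != y:      # known sample missed (wrong class or rejected as -1)
                fn += 1
            # p == -1 and y == -1: true negative, not counted
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        return 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0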

Code Error for Training (Stage 1)

Hello,
When I run python main.py --config ./config/ImageNet_LT/stage_1.py, there is an error.

Loading Dot Product Classifier.
Traceback (most recent call last):
File "/media/Elements/OLTR/OpenLongTailRecognition-OLTR/main.py", line 55, in
training_model = model(config, data, test=False)
File "/media/Elements/OLTR/OpenLongTailRecognition-OLTR/run_networks.py", line 26, in init
self.init_models()
File "/media/Elements/OLTR/OpenLongTailRecognition-OLTR/run_networks.py", line 69, in init_models
self.networks[key] = source_import(def_file).create_model(*model_args)
File "./models/DotProductClassifier.py", line 16, in create_model
clf = DotProduct_Classifier(num_classes, feat_dim)
File "./models/DotProductClassifier.py", line 8, in init
self.fc = nn.Linear(feat_dim, num_classes)
File "/home/.local/lib/python3.5/site-packages/torch/nn/modules/linear.py", line 81, in init
self.reset_parameters()
File "/home/.local/lib/python3.5/site-packages/torch/nn/modules/linear.py", line 84, in reset_parameters
init.kaiming_uniform_(self.weight, a=math.sqrt(5))
File "/home/.local/lib/python3.5/site-packages/torch/nn/init.py", line 325, in kaiming_uniform_
std = gain / math.sqrt(fan)
ZeroDivisionError: float division by zero

Could you give some advice on how to solve this problem?

About the training time on ImageNet

It is wonderful work! I am curious about the training time on ImageNet. Would it be very long, since the model is trained from scratch?

Question about centroids update

Thank you for releasing the code for this awesome work.
I have a question about the centroids update. I have read the code and found that the centroids are only calculated once, at the beginning of model initialization in stage 2. Could you help me find out how the centroids are updated? I also wonder whether the centroids are correct, because the attention parameters are only just initialized at that point, so the features used to compute the centroids have not been learned.

Some errors in equation (7)

I think the v^meta term in Equation (7) is wrong; otherwise, the concept selector and Hallucinator networks will not be trained.

What's the purpose of the "train plain" phase?

Besides the "train" and "val" phases, there is also a "train plain" phase. What is the purpose of "train plain", and what actions are taken during it?

Missing very important baseline

Hi! Thanks for sharing your work. I'm wondering whether you have tried the baselines of class-balanced sampling and inverse-frequency sampling when training the model?

Running on windows

It seems there may be some problems running the code on Windows, as discussed in pytorch/pytorch#5858 (comment). So far we do not have a solution for this issue, and it appears to be a PyTorch issue. We will test it out when we have access to a Windows machine.

Calculation of equation(6)

I found some differences between e_x in Equation (6) of the paper and the implementation in the code; could you help check?

def forward(self, input, *args):
    norm_x = torch.norm(input, 2, 1, keepdim=True)
    ex = (norm_x / (1 + norm_x)) * (input / norm_x)   # squash the input norm, then normalize
    ew = self.weight / torch.norm(self.weight, 2, 1, keepdim=True)  # L2-normalize the weights
    return torch.mm(self.scale * ex, ew.t())
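
For reference, written out, the code above computes (this is a direct reading of the code and may differ in notation from Equation (6) in the paper):

    e_x = \frac{\lVert x \rVert}{1 + \lVert x \rVert} \cdot \frac{x}{\lVert x \rVert} = \frac{x}{1 + \lVert x \rVert}, \qquad
    e_w = \frac{w}{\lVert w \rVert_2}, \qquad
    \text{logit} = s \, \langle e_x, e_w \rangle

with s = self.scale, i.e. the input is squashed by its norm, the weights are L2-normalized, and the output is a scaled inner product of the two.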

ClassAwareSampler configuration

Hi, in both the Places and ImageNet configs, batch_size and number_samples_cls are set to 256 and 4, respectively. Does this mean each batch will only contain 256/4 = 64 categories?
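
For what it's worth, that reading matches the usual class-aware sampling scheme (pick classes first, then a fixed number of samples per class). A toy illustration of the arithmetic, not the repository's actual sampler code:

    import random

    batch_size = 256
    num_samples_cls = 4
    classes_per_batch = batch_size // num_samples_cls  # 64 distinct classes per batch

    def make_batch(samples_by_class):
        # Toy class-aware batch: draw 64 classes, then 4 samples from each.
        chosen = random.sample(list(samples_by_class), classes_per_batch)
        batch = []
        for c in chosen:
            batch += random.choices(samples_by_class[c], k=num_samples_cls)
        return batch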

What's the main purpose of training of stage_1.py

In my understanding, none of the modulated attention, dynamic meta-embedding, or cosine classifier is used in stage 1, so I have a question: what is the main purpose of the stage_1.py training? Is it just to fine-tune the ResNet-152?

Unable to reproduce the results of the paper

For ImageNet_LT, I just used the default config in the code, but I cannot reproduce the results in Table 3(a) of the paper.

  1. For stage 1, my results are (last few logs after training completed):
    Epoch: [30/30] Step: 440 Minibatch_loss_performance: 2.833 Minibatch_accuracy_micro: 0.438
    Epoch: [30/30] Step: 450 Minibatch_loss_performance: 2.886 Minibatch_accuracy_micro: 0.379
    Phase: val
    100%|██████████| 79/79 [01:40<00:00, 1.34it/s]
    Phase: val
    Evaluation_accuracy_micro_top1: 0.220
    Averaged F-measure: 0.175
    Many_shot_accuracy_top1: 0.427 Median_shot_accuracy_top1: 0.113 Low_shot_accuracy_top1: 0.007
    Training Complete.
    Best validation accuracy is 0.220 at epoch 30

The few/low-shot accuracy of 0.7% is better than the 0.4% of the Plain model in Table 3(a).


[Below is IMPORTANT!!!!!]
2. However, for stage 2, my results are (last few logs after training completed):
Epoch: [60/60] Step: 440 Minibatch_loss_feature: 0.569 Minibatch_loss_performance: 2.938 Minibatch_accuracy_micro: 0.566
Epoch: [60/60] Step: 450 Minibatch_loss_feature: 0.567 Minibatch_loss_performance: 2.845 Minibatch_accuracy_micro: 0.539
Phase: val
100%|██████████| 79/79 [01:34<00:00, 1.02it/s]
Phase: val
Evaluation_accuracy_micro_top1: 0.340
Averaged F-measure: 0.324
Many_shot_accuracy_top1: 0.401 Median_shot_accuracy_top1: 0.334 Low_shot_accuracy_top1: 0.197
Training Complete.
Best validation accuracy is 0.341 at epoch 48

However, the many-, median- and few/low-shot accuracies are 40.1%, 33.4% and 19.7%, which differ slightly from the 43.2%, 35.1% and 18.5% of the "Ours" model in Table 3(a).
I retrained several times, and the many-shot accuracy is always somewhat lower than 43.2%.


Are there any tricks not released?

Overall accuracy vs. F-measure

For the ImageNet-LT and Places_LT datasets, why is only the overall accuracy reported for the closed-set setting and only the F-measure for the open-set setting, rather than reporting both metrics in both settings? Such an incomplete comparison can easily raise doubts about whether the proposed algorithm works well.

Code error when using the Places dataset

When I use the Places dataset and run the command "python3 main.py --config ./config/Places_LT/stage_1.py", an error occurs. Can anyone help me? Thanks.

Traceback (most recent call last):
File "main.py", line 55, in
training_model = model(config, data, test=False)
File "/home/huang/OpenLongTailRecognition-OLTR-master2/run_networks.py", line 26, in init
self.init_models()
File "/home/huang/OpenLongTailRecognition-OLTR-master2/run_networks.py", line 69, in init_models
self.networks[key] = source_import(def_file).create_model(*model_args)
File "./models/DotProductClassifier.py", line 17, in create_model
clf = DotProduct_Classifier(num_classes, feat_dim)
File "./models/DotProductClassifier.py", line 9, in init
self.fc = nn.Linear(feat_dim, num_classes)
File "/home/huang/.virtualenvs/OLTR/lib/python3.5/site-packages/torch/nn/modules/linear.py", line 76, in init
self.weight = Parameter(torch.Tensor(out_features, in_features))
TypeError: new() received an invalid combination of arguments - got (str, bool), but expected one of:

  • (torch.device device)
  • (torch.Storage storage)
  • (Tensor other)
  • (tuple of ints size, torch.device device)
    didn't match because some of the arguments have invalid types: (str, bool)
  • (object data, torch.device device)
    didn't match because some of the arguments have invalid types: (str, bool)

Multiple-GPU support?

@liuziwei7 @zhmiao thanks for your amazing work, from the CAUTION

The current code was prepared using single GPU. The use of multi-GPU can cause problems.

and the error is:

File "./models/MetaEmbeddingClassifier.py", line 48, in forward
    memory_feature = torch.matmul(values_memory, keys_memory)
RuntimeError: size mismatch, m1: [16 x 7], m2: [4 x 512] at /pytorch/aten/src/THC/generic/THCTensorMathBlas.cu:268
# for 2 GPUs:
torch.Size([16, 7])
torch.Size([3, 512])
torch.Size([16, 7])
torch.Size([4, 512])

# for 1 GPU:
torch.Size([32, 7])
torch.Size([7, 512])

Is there any way to support multiple GPUs?
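
For context, the shape mismatch is consistent with how nn.DataParallel behaves: it scatters every positional tensor argument of forward along dim 0, so a (num_classes, feat_dim) centroids tensor passed as an argument gets split across GPUs just like the batch. A minimal sketch reproducing the effect (requires at least two GPUs; illustrative only):

    import torch
    import torch.nn as nn

    class Toy(nn.Module):
        def forward(self, x, centroids):
            # Each replica prints only the slice of the inputs it receives.
            print('x:', x.shape, 'centroids:', centroids.shape)
            return x

    model = nn.DataParallel(Toy()).cuda()
    x = torch.randn(32, 7).cuda()
    centroids = torch.randn(7, 512).cuda()
    model(x, centroids)
    # With 2 GPUs each replica sees x of shape (16, 7) and centroids of shape
    # (4, 512) / (3, 512), matching the sizes reported above. Keeping the
    # centroids as a module attribute/buffer instead of a forward argument
    # avoids the scatter.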

Regarding the datasets

Hi,
Thank you again for your code release. I am puzzled by the following issues, which I'm hoping you can help me with :
-> Places-LT has 62.5K examples, differing from the 184.5K images reported in the paper. Is the mistake in the paper or in the released dataset?
-> I am unable to reproduce the dataset statistics for ImageNet-LT and Places-LT using Zipf's law (discrete Pareto distribution: https://en.wikipedia.org/wiki/Pareto_distribution, https://en.wikipedia.org/wiki/Zipf%27s_law) with alpha=6 (which seems rather high). Moreover, the log-log plot is not completely linear in my opinion:
(log-log plots for ImageNet-LT and Places-LT were attached here)

One thing about the normalization

Thanks for sharing the code. I have one question about the squashing function + cross-entropy loss. Do you have any experiments using softmax + cross-entropy loss, or other normalization methods?

What's the difference between the datasets used to calculate the closed-set and open-set Many-, Medium- and Few-shot numbers?

@zhmiao After carefully reading the code, I found that the known-class data from both train and test are used to calculate the Many-, Medium- and Few-shot numbers in both the closed-set and open-set settings, so why are the results slightly different? E.g., the many-shot accuracy is 44.7 in the closed-set setting but 44.6 in the open-set setting. I am really confused about this; could you help explain it? Thanks.

A question about the initialization of memory M.

Hi, it was great to have a look at your paper recently.
What confuses me is this: at the beginning of stage 2, you use the average representation of each class as the memory. How about just using the fc parameters trained in stage 1 (we could also use a cosine classifier in stage 1)? Maybe that would be more conclusive? Is there any difference between the two approaches?
Thanks!

About the center loss implementation

I am not very sure about the center loss implementation. Why do we need to implement the attracting loss using DiscCentroidsLossFunc (a torch.autograd.Function)? Can we just implement it the same way as the repelling loss?

About visual memory M (centroids) update problem

Thanks for the awesome work.

But I cannot find the code that updates the centroids.
In line 45 of run_networks.py:

if self.memory['init_centroids']:
    self.criterions['FeatureLoss'].centroids.data = self.centroids_cal(self.data['train_plain'])

this code is used for centroid initialization.

In the paper (Section 3.1, paragraph "Learning Visual Memory M"), the centroids are updated in two steps.
Could you kindly give me more hints about how to realize the second step, the propagation step that alternately updates the direct features and the centroids?

Thanks a lot.
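
For reference, the initialization pointed to above boils down to taking the per-class mean of the direct features over the train_plain split; a simplified sketch of that computation (not the exact centroids_cal code):

    import torch

    def compute_centroids(features, labels, num_classes):
        # features: (N, feat_dim) direct features, labels: (N,) class ids.
        # Memory M is initialized with the mean direct feature of each class.
        centroids = torch.zeros(num_classes, features.size(1))
        for c in range(num_classes):
            mask = labels == c
            if mask.any():
                centroids[c] = features[mask].mean(dim=0)
        return centroids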

Can you explain the implementation of the disccentroidloss?

Hi, thanks for your work.
I am confused about the implementation of the DiscCentroidsLoss; could you share the exact mathematical formulation of this loss function? I think the implementation differs from the details in the paper (Equation 9).

Thanks.

Code on MS1M-LT dataset

Hi zhmiao,

Thanks for your code.
I am wondering if you are planning to update the code (configs) for reproducing the results on the MS1M-LT dataset. Thanks.

Training Time

Hello,

How long does the model need to train?

Stronger model for ImageNet

I'm wondering if you have tried a stronger model on ImageNet, since ResNet-10's performance is comparatively lower than that of larger models like ResNet-50?

“Modulated Attention” no use?

In ImageNet_LT, I tested stage 1 with 'use_selfatt': True or False.
The training results are:
#True
Evaluation_accuracy_micro_top1: 0.219
Averaged F-measure: 0.175
Many_shot_accuracy_top1: 0.422 Median_shot_accuracy_top1: 0.114 Low_shot_accuracy_top1: 0.006
Training Complete.
Best validation accuracy is 0.219 at epoch 30
Done
ALL COMPLETED.

#False
Evaluation_accuracy_micro_top1: 0.220
Averaged F-measure: 0.175
Many_shot_accuracy_top1: 0.427 Median_shot_accuracy_top1: 0.113 Low_shot_accuracy_top1: 0.007
Training Complete.
Best validation accuracy is 0.220 at epoch 30
Done
ALL COMPLETED.

Reproducing Plain Model Baseline Accuracies

Hi,
Thank you for releasing the code for your paper. Can you please clarify how to reproduce the accuracies for the plain model baseline on ImageNet-LT in Table 3 of your paper? I'm running the following commands:
-> python main.py --config ./config/ImageNet_LT/stage_1.py
-> python main.py --config ./config/ImageNet_LT/stage_1.py --test
which gives me the following output:
Evaluation_accuracy_micro_top1: 0.119
Averaged F-measure: 0.108
Many_shot_accuracy_top1: 0.148 Median_shot_accuracy_top1: 0.112 Low_shot_accuracy_top1: 0.062

From the paper, the numbers should be:
Evaluation_accuracy_micro_top1: 0.209
Many_shot_accuracy_top1: 0.409 Median_shot_accuracy_top1: 0.107 Low_shot_accuracy_top1: 0.004

Stage 1 is simply the baseline ResNet-10 training on the entire dataset, right? Or am I missing something?

Reproduce model results

Thanks for the inspiring work and code :)

I'm having trouble reproducing the results (the plain model as well as the final model, on both datasets). I used the default settings without any alterations. Can you shed some light on these results (perhaps this is caused by the hyper-parameters), and would it be OK for you to provide the trained models for both stage 1 and stage 2?

The results I have reproduced are as follows:

  1. ImageNet-LT

Stage1(close-setting):
Evaluation_accuracy_micro_top1: 0.204
Averaged F-measure: 0.160
Many_shot_top1: 0.405; Median_shot_top1: 0.099; Low_shot_top1: 0.006

Stage1(open-setting):
Open-set Accuracy: 0.178
Evaluation_accuracy_micro_top1: 0.199
Averaged F-measure: 0.291
Many_shot_top1: 0.396; Median_shot_top1: 0.096; Low_shot_top1: 0.006

Stage2(close-setting):
Evaluation_accuracy_micro_top1: 0.339
Averaged F-measure: 0.322
Many_shot_top1: 0.411; Median_shot_top1: 0.330; Low_shot_top1: 0.167

Stage2(open-setting):
Open-set Accuracy: 0.245
Evaluation_accuracy_micro_top1: 0.327
Averaged F-measure: 0.455
Many_shot_top1: 0.398; Median_shot_top1: 0.318; Low_shot_top1: 0.159

  2. Places-LT

Stage1(close-setting):
Evaluation_accuracy_micro_top1: 0.268
Averaged F-measure: 0.248
Many_shot_top1: 0.442; Median_shot_top1: 0.221; Low_shot_top1: 0.058

Stage1(open-setting):
Open-set Accuracy: 0.018
Evaluation_accuracy_micro_top1: 0.267
Averaged F-measure: 0.373
Many_shot_top1: 0.441; Median_shot_top1: 0.219; Low_shot_top1: 0.057

Stage2(close-setting):
Evaluation_accuracy_micro_top1: 0.349
Averaged F-measure: 0.338
Many_shot_top1: 0.387; Median_shot_top1: 0.355; Low_shot_top1: 0.263

Stage2(open-setting):
Open-set Accuracy: 0.120
Evaluation_accuracy_micro_top1: 0.342
Averaged F-measure: 0.477
Many_shot_top1: 0.382; Median_shot_top1: 0.349; Low_shot_top1: 0.254

hyper-parameters

Hi, I'm curious whether you could give some insight into how to decide the value of the hyper-parameter scale (currently set to 16) in the CosNormClassifier: is it set empirically, or is there a theoretical way to set it? The same question applies to the parameter scale (currently set to 10) in the meta-embedding for reachability.

comparisons

Hello, reading your code carefully (for ImageNet-LT), it seems that the plain model was trained for 30 epochs while your own model was trained for 90 epochs (30 in stage 1 + 60 in stage 2). Could you please confirm this? Moreover, were all the comparisons (focal loss, etc.) performed with 30 or 90 epochs?

Thanks!
