wvangansbeke / unsupervised-classification

SCAN: Learning to Classify Images without Labels, incl. SimCLR. [ECCV 2020]

Home Page: https://arxiv.org/abs/2005.12320

License: Other

Python 100.00%
unsupervised-learning image-classification self-supervised-learning clustering eccv-2020 eccv2020 representation-learning contrastive-learning simclr moco

unsupervised-classification's Introduction

Learning to Classify Images without Labels

This repo contains the PyTorch implementation of our paper:

SCAN: Learning to Classify Images without Labels

Wouter Van Gansbeke, Simon Vandenhende, Stamatios Georgoulis, Marc Proesmans and Luc Van Gool.

  • Accepted at ECCV 2020 (Slides). Watch the explanation of our paper by Yannic Kilcher on YouTube.

  • 🏆 SOTA on 4 benchmarks. Check out Papers With Code for Image Clustering or Unsup. Classification.

  • Related works:

    • 🆕 Interested in unsupervised semantic segmentation? Check out our new preprint: MaskDistill.
    • 🆕 Interested in representation learning? Check out our NeurIPS'21 paper and code.
    • 🆕 More on unsupervised semantic segmentation? Check out our ICCV'21 paper: MaskContrast.
    • 📜 Looking for influential papers in self-supervised learning? Check out this reading list.


Contents

  1. Introduction
  2. Prior Work
  3. Installation
  4. Training
  5. Model Zoo
  6. Tutorial
  7. Citation

🆕 Tutorial section has been added, check out TUTORIAL.md.

🆕 Prior work section has been added, check out Prior Work.

Introduction

Can we automatically group images into semantically meaningful clusters when ground-truth annotations are absent? The task of unsupervised image classification remains an important and open challenge in computer vision. Several recent approaches have tried to tackle this problem in an end-to-end fashion. In this paper, we deviate from recent works and advocate a two-step approach where feature learning and clustering are decoupled.

We outperform state-of-the-art methods by large margins, in particular +26.6% on CIFAR10, +25.0% on CIFAR100-20 and +21.3% on STL10 in terms of classification accuracy. Our method is the first to perform well on ImageNet (1000 classes). Check out the benchmarks on the Papers-with-code website for Image Clustering and Unsupervised Image Classification.

Prior Work

  • Train set/test set: We would like to point out that most prior work in unsupervised classification uses both the train and test set during training. We believe this is bad practice and therefore propose to train only on the train set. The final numbers should be reported on the test set (see table 3 of our paper). This also allows us to directly compare with supervised and semi-supervised methods in the literature. We encourage future work to do the same. We observe around a 2% improvement over the reported numbers when including the test set.

  • Reproducibility: We noticed that prior work is very sensitive to initialization, so we don't think reporting a single number is fair. We report our results as the mean and standard deviation over 10 runs.

Please follow the instructions below to perform semantic clustering with SCAN.

Installation

The code runs with recent PyTorch versions, e.g. 1.4. Assuming Anaconda, the most important packages can be installed as:

conda install pytorch=1.4.0 torchvision=0.5.0 cudatoolkit=10.0 -c pytorch
conda install matplotlib scipy scikit-learn   # For evaluation and confusion matrix visualization
conda install faiss-gpu                       # For efficient nearest neighbors search 
conda install pyyaml easydict                 # For using config files
conda install termcolor                       # For colored print statements

We refer to the requirements.txt file for an overview of the packages in the environment we used to produce our results.

Training

Setup

The following files need to be adapted in order to run the code on your own machine:

  • Change the file paths to the datasets in utils/mypath.py, e.g. /path/to/cifar10.
  • Specify the output directory in configs/env.yml. All results will be stored under this directory.

Our experimental evaluation includes the following datasets: CIFAR10, CIFAR100-20, STL10 and ImageNet. The ImageNet dataset should be downloaded separately and saved to the path described in utils/mypath.py. Other datasets will be downloaded automatically and saved to the correct path when missing.
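
For illustration only, a hypothetical sketch of such a dataset-path mapping is shown below; the actual utils/mypath.py in this repository may be organized differently, so adapt the real file rather than this sketch.

# Hypothetical sketch of a dataset-path mapping; the real utils/mypath.py may differ.
# Adapt the paths to the locations on your own machine.
DATASET_ROOTS = {
    'cifar-10': '/path/to/cifar10',
    'cifar-20': '/path/to/cifar20',
    'stl-10': '/path/to/stl10',
    'imagenet': '/path/to/imagenet',
}

def db_root_dir(database):
    """Return the root directory for the given dataset name."""
    if database not in DATASET_ROOTS:
        raise ValueError('Unknown dataset: {}'.format(database))
    return DATASET_ROOTS[database]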

Train model

The configuration files can be found in the configs/ directory. The training procedure consists of the following steps:

  • STEP 1: Solve the pretext task i.e. simclr.py
  • STEP 2: Perform the clustering step i.e. scan.py
  • STEP 3: Perform the self-labeling step i.e. selflabel.py

For example, run the following commands sequentially to perform our method on CIFAR10:

python simclr.py --config_env configs/your_env.yml --config_exp configs/pretext/simclr_cifar10.yml
python scan.py --config_env configs/your_env.yml --config_exp configs/scan/scan_cifar10.yml
python selflabel.py --config_env configs/your_env.yml --config_exp configs/selflabel/selflabel_cifar10.yml

Remarks

The provided hyperparameters are identical for CIFAR10, CIFAR100-20 and STL10. However, fine-tuning the hyperparameters can further improve the results. We list the most important hyperparameters of our method below:

  • Entropy weight: Can be adapted when the number of clusters changes. In general, try to avoid imbalanced clusters during training (see the sketch after this list for how this term enters the loss).
  • Confidence threshold: When every cluster contains a sufficiently large number of confident samples, it can be beneficial to increase the threshold. This generally helps to decrease the noise. The ablation can be found in the paper.
  • Number of neighbors in SCAN: The dependency on this hyperparameter is rather small as shown in the paper.
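
For reference, here is a minimal sketch (not the repo's losses.py) of how the entropy weight enters a SCAN-style objective: the consistency term pulls an image and its mined neighbor towards the same cluster, while the entropy term, scaled by entropy_weight, pushes the average cluster assignment towards uniformity to discourage imbalanced clusters. The default value below is only illustrative.

import torch
import torch.nn.functional as F

def scan_style_loss(anchor_logits, neighbor_logits, entropy_weight=5.0):
    """Sketch of a SCAN-style objective: consistency minus weighted entropy."""
    anchor_prob = F.softmax(anchor_logits, dim=1)
    neighbor_prob = F.softmax(neighbor_logits, dim=1)
    # Consistency: the dot product between an image's and its neighbor's
    # cluster probabilities should be high.
    similarity = (anchor_prob * neighbor_prob).sum(dim=1)
    consistency_loss = -torch.log(similarity.clamp(min=1e-8)).mean()
    # Entropy of the mean assignment; maximizing it discourages collapsing
    # all samples into a few clusters.
    mean_prob = anchor_prob.mean(dim=0)
    entropy = -(mean_prob * torch.log(mean_prob.clamp(min=1e-8))).sum()
    return consistency_loss - entropy_weight * entropy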

Model Zoo

Pretext tasks

We perform the instance discrimination task in accordance with the scheme from SimCLR on CIFAR10, CIFAR100 and STL10. Pretrained models can be downloaded from the links listed below. On ImageNet, we use the pretrained weights provided by MoCo and transfer them to be compatible with our code repository.

Dataset Download link
CIFAR10 Download
CIFAR100 Download
STL10 Download

Clustering

We provide the following pretrained models after training with the SCAN-loss, and after the self-labeling step. The best models can be found here and we further refer to the paper for the averages and standard deviations.

Dataset        Step           ACC    NMI    ARI    Download link
CIFAR10        SCAN-loss      81.6   71.5   66.5   Download
CIFAR10        Self-labeling  88.3   79.7   77.2   Download
CIFAR100       SCAN-loss      44.0   44.9   28.3   Download
CIFAR100       Self-labeling  50.7   48.6   33.3   Download
STL10          SCAN-loss      79.2   67.3   61.8   Download
STL10          Self-labeling  80.9   69.8   64.6   Download
ImageNet-50    SCAN-loss      75.1   80.5   63.5   Download
ImageNet-50    Self-labeling  76.8   82.2   66.1   Download
ImageNet-100   SCAN-loss      66.2   78.7   54.4   Download
ImageNet-100   Self-labeling  68.9   80.8   57.6   Download
ImageNet-200   SCAN-loss      56.3   75.7   44.1   Download
ImageNet-200   Self-labeling  58.1   77.2   47.0   Download

Result ImageNet

We also train SCAN on ImageNet for 1000 clusters. We use 10 clusterheads and finally take the head with the lowest loss. The accuracy (ACC), normalized mutual information (NMI), adjusted mutual information (AMI) and adjusted Rand index (ARI) are computed:

Method ACC NMI AMI ARI Download link
SCAN (ResNet50) 39.9 72.0 51.2 27.5 Download
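
NMI, AMI and ARI can be computed with scikit-learn once cluster assignments and ground-truth labels are available; below is a small sketch with toy arrays (the arrays are placeholders, not results from the repo).

import numpy as np
from sklearn.metrics import (adjusted_mutual_info_score, adjusted_rand_score,
                             normalized_mutual_info_score)

# Placeholder ground-truth labels and predicted cluster assignments.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([1, 1, 0, 0, 2, 2])

print('NMI:', normalized_mutual_info_score(y_true, y_pred))
print('AMI:', adjusted_mutual_info_score(y_true, y_pred))
print('ARI:', adjusted_rand_score(y_true, y_pred))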

Evaluation

Pretrained models from the model zoo can be evaluated using the eval.py script. For example, the model on cifar-10 can be evaluated as follows:

python eval.py --config_exp configs/scan/scan_cifar10.yml --model $MODEL_PATH 

Visualizing the prototype images is easily done by setting the --visualize_prototypes flag. For example, on cifar-10, combining the flag with the evaluation command above gives something like:
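
python eval.py --config_exp configs/scan/scan_cifar10.yml --model $MODEL_PATH --visualize_prototypes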

Similarly, you might want to have a look at the clusters found on ImageNet (as shown at the top). First download the model (link in table above) and then execute the following command:

python eval.py --config_exp configs/scan/imagenet_eval.yml --model $MODEL_PATH_IMAGENET 

Tutorial

If you want to see another (more detailed) example for STL-10, check out TUTORIAL.md. It provides a detailed guide and includes visualizations and log files with the training progress.

Citation

If you find this repo useful for your research, please consider citing our paper:

@inproceedings{vangansbeke2020scan,
  title={Scan: Learning to classify images without labels},
  author={Van Gansbeke, Wouter and Vandenhende, Simon and Georgoulis, Stamatios and Proesmans, Marc and Van Gool, Luc},
  booktitle={Proceedings of the European Conference on Computer Vision},
  year={2020}
}

For any enquiries, please contact the main authors.

License

This software is released under a creative commons license which allows for personal and research use only. For a commercial license please contact the authors. You can view a license summary here.

Acknowledgements

This work was supported by Toyota, and was carried out at the TRACE Lab at KU Leuven (Toyota Research on Automated Cars in Europe - Leuven).

unsupervised-classification's People

Contributors

masaishi, wvangansbeke, yk


unsupervised-classification's Issues

About the SCANLoss

The SCANLoss in losses.py is
total_loss = consistency_loss - self.entropy_weight * entropy_loss
However, it looks different from the loss equation in the paper. [screenshot of the loss equation from the paper]

Question about results

In the comparison table, is the ACC reported for the supervised method also the cluster accuracy (calculated after matching), or is it the classification accuracy calculated using the actual labels? By the way, do you have results for CIFAR-100 instead of CIFAR-100-20? Thanks!

Are we passing the corresponding exact label to Unsupervised learning here?

In the code of the CIFAR10 class here, it takes targets and data. These targets are the list of categories in numeric format, and __getitem__ returns a numpy array of the corresponding image values. According to the definition of unsupervised learning, this shouldn't be the case. Given the dataset, CIFAR10 (labelled), I think this kind of implementation works.

Question:
How can I pass the targets for my custom dataset? Can I just pass random numbers, for example for a 3-class problem (0, 1, 2)?

I do know how many classes there are. I'm a little confused by the implementation; any help would be appreciated.

AttributeError: 'ClusteringModel' object has no attribute 'module'

Hello

An error occurred while trying to run the test code.

The data sets were cifar10 and cifar20.

The simclr.py code works fine, but the scan.py code does not.

What is the problem?

'''
Make prediction on validation set ...
Evaluate based on SCAN loss ...
{'scan': [{'entropy': 2.980037212371826, 'consistency': 2.5082473754882812, 'total_loss': -0.4717898368835449}], 'lowest_loss_head': 0, 'lowest_loss': -0.4717898368835449}
New lowest loss on validation set: 10000.0000 -> -0.4718
Lowest loss head is 0
Traceback (most recent call last):
  File "scan.py", line 139, in <module>
    main()
  File "scan.py", line 111, in main
    torch.save({'model': model.module.state_dict(), 'head': best_loss_head}, p['scan_model'])
  File "/home/ubuntu/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 576, in __getattr__
    type(self).__name__, name))
AttributeError: 'ClusteringModel' object has no attribute 'module'
'''

AttributeError exception on ClusteringModel

torch.save({'model': model.module.state_dict(), 'head': best_loss_head}, p['scan_model'])

This line was causing the following AttributeError exception for me:

Evaluate based on SCAN loss ...
{'scan': [{'entropy': 5.282749176025391, 'consistency': 5.058314800262451, 'total_loss': -0.22443437576293945}, {'entropy': 5.275367736816406, 'consistency': 5.054675579071045, 'total_loss': -0.22069215774536133}, {'entropy'
: 5.274453163146973, 'consistency': 5.050076007843018, 'total_loss': -0.22437715530395508}, {'entropy': 5.277739524841309, 'consistency': 5.065070152282715, 'total_loss': -0.21266937255859375}, {'entropy': 5.24068021774292, 
'consistency': 4.836734294891357, 'total_loss': -0.4039459228515625}, {'entropy': 5.279136657714844, 'consistency': 5.064798355102539, 'total_loss': -0.2143383026123047}, {'entropy': 5.278388500213623, 'consistency': 5.06134
5100402832, 'total_loss': -0.21704339981079102}, {'entropy': 5.279128074645996, 'consistency': 5.065242767333984, 'total_loss': -0.21388530731201172}, {'entropy': 5.277782440185547, 'consistency': 5.065756320953369, 'total_l
oss': -0.21202611923217773}, {'entropy': 5.276066303253174, 'consistency': 5.05104923248291, 'total_loss': -0.22501707077026367}], 'lowest_loss_head': 4, 'lowest_loss': -0.4039459228515625}
New lowest loss on validation set: 10000.0000 -> -0.4039
Lowest loss head is 4
Traceback (most recent call last):
  File "scan.py", line 139, in <module>
    main()
  File "scan.py", line 111, in main
    torch.save({'model': model.module.state_dict(), 'head': best_loss_head}, p['scan_model'])
  File "/home/ubuntu/anaconda3/envs/Unsupervised-Classification/lib/python3.6/site-packages/torch/nn/modules/module.py", line 576, in __getattr__
    type(self).__name__, name))
AttributeError: 'ClusteringModel' object has no attribute 'module'

Removing .module seems to fix the issue.
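
A hedged sketch of one way to make that save call robust whether or not the model is wrapped in torch.nn.DataParallel (which is what adds the .module attribute); the variable names are taken from the traceback above, so this fragment only makes sense inside scan.py.

import torch

# model, best_loss_head and p are the names used in scan.py according to the
# traceback above; unwrap the model only if DataParallel wrapped it.
model_to_save = model.module if hasattr(model, 'module') else model
torch.save({'model': model_to_save.state_dict(), 'head': best_loss_head}, p['scan_model'])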

some questions

Hello, I really appreciate your work, and while looking over your code I found some questions.
First, your paper says that when computing the SCAN loss, using an image's augmented version together with its neighbors should work well, but in your code I only see one neighbor being used. In my opinion, if one image has one augmented image and one neighbor, then a batch of size m should be compared m-to-2m when computing the SCAN loss, but I only see m-to-m.
Second, I don't see the code for how the neighbors are selected. I guess it works like KNN?

Failing to unzip downloaded pretrained weights

I downloaded the pretrained weights and tried to unzip them. However, I get these errors.

tar -xvf scan_cifar-10.pth.tar 
tar: This does not look like a tar archive
tar: Skipping to next header
tar: Exiting with failure status due to previous errors

Cannot run eval using model

Hello,

First of all thank you for sharing your research to the public. This is interesting work!

I am new to this and please forgive me if I am asking a simple question.

I ran your scripts and it all worked as expected. At the end of the self-labeling step my result is this:

... tail end of the results ..
Epoch 200/200

Adjusted learning rate to 0.00010
Train ...
Epoch: [199][ 0/50] Loss 7.1724e-02 (7.1724e-02)
Epoch: [199][25/50] Loss 5.5752e-02 (6.9237e-02)
Evaluate ...
{'ACC': 0.8724, 'ARI': 0.751001763292155, 'NMI': 0.7818230363066652, 'ACC Top-5': 0.9878, 'hungarian_match': [(0, 9), (1, 8), (2, 7), (3, 4), (4, 1), (5, 2), (6, 3), (7, 0), (8, 6), (9, 5)]}
Checkpoint ...
Evaluate model at the end
{'ACC': 0.8724, 'ARI': 0.751001763292155, 'NMI': 0.7818230363066652, 'ACC Top-5': 0.9878, 'hungarian_match': [(0, 9), (1, 8), (2, 7), (3, 4), (4, 1), (5, 2), (6, 3), (7, 0), (8, 6), (9, 5)]}

In my results directory (./results/cifar-10/selflabel/) I have 3 files: checkpoint.pth.tar, confusion_matrix.png and model.pth.tar.

I tried running 'eval' using this !python3 eval.py --config_exp configs/scan/scan_cifar10.yml --visualize_prototypes --model ./results/cifar-10/selflabel/model.pth.tar but got this error:

Load model weights ...
Traceback (most recent call last):
  File "eval.py", line 145, in <module>
    main()
  File "eval.py", line 52, in main
    model.load_state_dict(state_dict['model'])
KeyError: 'model'

I tried the same approach with the self-label model from the Model Zoo Google Drive share. What am I doing wrong here? I want to see the 'visualize_prototypes' functionality.

Cluster accuracy

Could you explain how the cluster accuracy (ACC) is calculated? It is also not well defined in the paper. Thanks.
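
For reference, clustering accuracy is commonly computed by matching predicted clusters to ground-truth classes with the Hungarian algorithm and then counting agreements; a minimal sketch (not the repo's implementation) is:

import numpy as np
from scipy.optimize import linear_sum_assignment

def cluster_accuracy(y_true, y_pred):
    """Hungarian-matched clustering accuracy (sketch, not the repo's code)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    n = max(y_true.max(), y_pred.max()) + 1
    # Count how often cluster p coincides with class t.
    cost = np.zeros((n, n), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        cost[p, t] += 1
    # Maximize the total number of matched samples.
    row_ind, col_ind = linear_sum_assignment(-cost)
    return cost[row_ind, col_ind].sum() / y_true.size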

Runtime error

Hi again,

when running simclr.py I receive the following error:
RuntimeError: Given input size: (512x3x3). Calculated output size: (512-3x-3). Output size is too small.

What does this mean? Does this have something to do with the size of my images? Or with the size value I specify in my .yml file for the transformation?
How can I solve this error?
Thank you so much for your fast answers and your patience!!

Tutorial for custom datasets:

It would be of great help if a tutorial on running custom datasets were written.
Right now there's a lot of complexity due to the dataset/dataloader classes. It would make it easier for people from different domains to experiment with SCAN.

Clarification on moco.py

Hi, just wanted to highlight a possible issue. In moco.py at line 50:
train_dataloader = get_val_dataloader(p, train_dataset)
Shouldn't get_train_dataloader be used instead of get_val_dataloader?

Thank you.

Why use ResNet18 at SimCLR step?

SCAN is a surprising work, but I keep wondering why ResNet18 was selected. Google Research offers a standard ResNet50 model for SimCLR.
What would the effect on clustering be if I used the standard ResNet50 instead of ResNet18?

train on whole imagenet

Hello
I would like to run an experiment and train on the whole ImageNet dataset, not only the 50/100/200-class subsets.
Any idea how I can train a MoCo model from scratch, and for how many epochs it should be trained?
Thanks for sharing the code!

How to start training unlabelled data from scratch

Hi, thanks for an amazing paper and code.

I have a question about how to start with a brand new dataset consisting entirely of unlabeled images.

Following your suggestions in other issues, I:

  • changed configs/env.yml,
  • created configs/pretext/simclr_*.yml,
  • updated utils/common_config,
  • created a new class for the custom dataset.

I wonder how I should deal with the validation step, as I don't have any labels in place. Could I just remove the val_db_name: entry in the pretext yml file, or do I need to make other changes? Any advice is greatly appreciated.

How to test the SCAN trained model on a full image?

Hi @wvangansbeke, thank you for sharing your work! Your results look amazing!

To train SCAN, each input image must be a tightly cropped image containing a single object. So if there are multiple objects in an image, then in order to test a SCAN-trained model on it, one needs to first run a 2D object detector to detect all objects, crop them, and then run the SCAN model on each crop. Right?

pretrained weights

Hello, could you release the pretrained ResNet18 weights? I want to train on my own datasets.

About number of classes

Is it possible in any way to get an idea or estimate of the optimal number of classes/clusters from the pretext step, without specifying num_classes, and then to use that number during the second step where the actual clustering takes place? This use case is for a custom dataset where the actual number of clusters is not known beforehand. Any help would be appreciated. Thank you!

pretext + kmeans

Hi~ I am following your beautiful work, but I ran into some questions and hope you can help me.

I am using your CIFAR-10 pretext model (https://drive.google.com/file/d/1Cl5oAcJKoNE5FSTZsBSAKLcyA5jXGgTT/view) to extract the 128-dimensional features and run the k-means algorithm, but I get much lower performance than your paper (acc: 20% vs 60%). However, when I first run t-SNE on the 128-dimensional features to obtain new 2-dimensional features and then apply k-means, I get an accuracy similar to your paper. Have you encountered the same issue?
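
For what it's worth, here is a minimal sketch of the pretext + k-means pipeline described above (the feature and label files are hypothetical placeholders, not artifacts shipped with the repo):

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import normalized_mutual_info_score

# Hypothetical arrays: (N, 128) pretext features and N ground-truth labels.
features = np.load('cifar10_pretext_features.npy')
labels = np.load('cifar10_labels.npy')

# L2-normalize the features before clustering.
features = features / np.linalg.norm(features, axis=1, keepdims=True)
pred = KMeans(n_clusters=10, n_init=10, random_state=0).fit_predict(features)
print('NMI:', normalized_mutual_info_score(labels, pred))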

Error: Mask in MaskedCrossEntropyLoss is all zeros

I get an error in the third (self-labeling) step. The error occurs at these lines:

def forward(self, input, target, mask, weight, reduction='mean'):
    if not (mask != 0).any():
        raise ValueError('Mask in MaskedCrossEntropyLoss is all zeros.')

Any suggestions?

AssertionError in the SCAN loss part

I have run the simclr.py part.
In scan.py, there is an assertion error.

Unsupervised-Classification/utils/common_config.py", line 89, in get_model
    assert(set(missing[1]) == {
AssertionError

The error happens in this part:

assert(set(missing[1]) == {
                'contrastive_head.0.weight', 'contrastive_head.0.bias', 
                'contrastive_head.2.weight', 'contrastive_head.2.bias'}
                or set(missing[1]) == {
                'contrastive_head.weight', 'contrastive_head.bias'})

I printed out missing[1] and found that it contains many more weights, including the backbone weights, which causes the error.
I commented out the assert and the performance wasn't good:

{'ACC': 0.4313, 'ARI': 0.22229977451486435, 'NMI': 0.3281780571616542, 'ACC Top-5': 0.8845, 'hungarian_match': [(0, 0), (1, 2), (2, 1), (3, 8), (4, 6), (5, 3), (6, 9), (7, 4), (8, 7), (9, 5)]}

Does the weight loading part have an error? Or was the simclr.py result wrong, which then causes the weight loading error?

Thanks!

imagenet100

Hi
Where can I find the ImageNet-100 version?

Using a custom dataset

Hi @wvangansbeke, thank you for publishing your work.

I am totally new to the world of computer vision and came across your work by browsing the 'self-supervised-learning' GitHub tag.

I would like to cluster a custom dataset I collected online for a personal project and was wondering whether it is currently possible to adapt your code to run on it. Any guidance would be appreciated; I will study the code in more detail in the coming days.

Many thanks.

Harry

Running Imagenet100 Exp

Hello,
I want to run an experiment on ImageNet-100 using the config file "moco_imagenet100.yml".
However, running it as "python simclr.py --config_env configs/env.yml --config_exp configs/pretext/moco_imagenet100.yml"
gives the following error:
if p['augmentation_strategy'] == 'standard':
KeyError: 'augmentation_strategy'

CUDA error while training on new dataset

I was trying to run simclr.py on a custom dataset. I was using colab.
The changes I made are:

  • Added custom dataloader file in data/ folder
  • Added new config files in all the three folders inside config/ similar to CIFAR_10
  • Set the target to 255 in __getitem__ and the class to 'unlabeled', as my dataset has no labels (a sketch of such a dataset class follows after the repo link below)
  • Few tweaks in utils/common_config.py

The code can be found in this repo: https://github.com/k4rth33k/AvantariSolution
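
As referenced above, here is a minimal sketch of an unlabeled dataset class along these lines (class name, dictionary keys and file layout are hypothetical; check the classes in data/ for the exact format the repo's dataloaders expect):

import os
from PIL import Image
from torch.utils.data import Dataset

class UnlabeledImageDataset(Dataset):
    """Hypothetical unlabeled dataset: every image gets the dummy target 255."""

    def __init__(self, root, transform=None):
        self.paths = sorted(os.path.join(root, f) for f in os.listdir(root))
        self.transform = transform
        self.classes = ['unlabeled']

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        img = Image.open(self.paths[index]).convert('RGB')
        if self.transform is not None:
            img = self.transform(img)
        # Dummy target since no labels exist; the returned keys may need to
        # match whatever the repo's dataloaders expect.
        return {'image': img, 'target': 255, 'meta': {'index': index}}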

Here are the logs:

{'setup': 'simclr', 'backbone': 'resnet50', 'model_kwargs': {'head': 'mlp', 'features_dim': 128}, 'train_db_name': 'animals', 'val_db_name': 'animals', 'num_classes': 10, 'criterion': 'simclr', 'criterion_kwargs': {'temperature': 0.1}, 'epochs': 500, 'optimizer': 'sgd', 'optimizer_kwargs': {'nesterov': False, 'weight_decay': 0.0001, 'momentum': 0.9, 'lr': 0.4}, 'scheduler': 'cosine', 'scheduler_kwargs': {'lr_decay_rate': 0.1}, 'batch_size': 512, 'num_workers': 8, 'augmentation_strategy': 'simclr', 'augmentation_kwargs': {'random_resized_crop': {'size': 32, 'scale': [0.2, 1.0]}, 'color_jitter_random_apply': {'p': 0.8}, 'color_jitter': {'brightness': 0.4, 'contrast': 0.4, 'saturation': 0.4, 'hue': 0.1}, 'random_grayscale': {'p': 0.2}, 'normalize': {'mean': [0.4914, 0.4822, 0.4465], 'std': [0.2023, 0.1994, 0.201]}}, 'transformation_kwargs': {'crop_size': 32, 'normalize': {'mean': [0.4914, 0.4822, 0.4465], 'std': [0.2023, 0.1994, 0.201]}}, 'pretext_dir': '/content/drive/My Drive/Datasets/AvantariDataset/results/animals/pretext', 'pretext_checkpoint': '/content/drive/My Drive/Datasets/AvantariDataset/results/animals/pretext/checkpoint.pth.tar', 'pretext_model': '/content/drive/My Drive/Datasets/AvantariDataset/results/animals/pretext/model.pth.tar', 'topk_neighbors_train_path': '/content/drive/My Drive/Datasets/AvantariDataset/results/animals/pretext/topk-train-neighbors.npy', 'topk_neighbors_val_path': '/content/drive/My Drive/Datasets/AvantariDataset/results/animals/pretext/topk-val-neighbors.npy'}
Retrieve model
Model is ContrastiveModel
Model parameters: 27.97M
ContrastiveModel(
  (backbone): ResNet(
    (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (layer1): Sequential(
      (0): Bottleneck(
        (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (downsample): Sequential(
          (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
          (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): Bottleneck(
        (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
      )
      (2): Bottleneck(
        (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
      )
    )
    (layer2): Sequential(
      (0): Bottleneck(
        (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (downsample): Sequential(
          (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): Bottleneck(
        (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
      )
      (2): Bottleneck(
        (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
      )
      (3): Bottleneck(
        (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
      )
    )
    (layer3): Sequential(
      (0): Bottleneck(
        (conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (downsample): Sequential(
          (0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): Bottleneck(
        (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
      )
      (2): Bottleneck(
        (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
      )
      (3): Bottleneck(
        (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
      )
      (4): Bottleneck(
        (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
      )
      (5): Bottleneck(
        (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
      )
    )
    (layer4): Sequential(
      (0): Bottleneck(
        (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (downsample): Sequential(
          (0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): Bottleneck(
        (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
      )
      (2): Bottleneck(
        (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
      )
    )
    (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
    (fc): Identity()
  )
  (contrastive_head): Sequential(
    (0): Linear(in_features=2048, out_features=2048, bias=True)
    (1): ReLU()
    (2): Linear(in_features=2048, out_features=128, bias=True)
  )
)
Set CuDNN benchmark
Retrieve dataset
Train transforms: Compose(
    RandomResizedCrop(size=(32, 32), scale=(0.2, 1.0), ratio=(0.75, 1.3333), interpolation=PIL.Image.BILINEAR)
    RandomHorizontalFlip(p=0.5)
    RandomApply(
    p=0.8
    ColorJitter(brightness=[0.6, 1.4], contrast=[0.6, 1.4], saturation=[0.6, 1.4], hue=[-0.1, 0.1])
)
    RandomGrayscale(p=0.2)
    ToTensor()
    Normalize(mean=[0.4914, 0.4822, 0.4465], std=[0.2023, 0.1994, 0.201])
)
Validation transforms: Compose(
    CenterCrop(size=(32, 32))
    ToTensor()
    Normalize(mean=[0.4914, 0.4822, 0.4465], std=[0.2023, 0.1994, 0.201])
)
tcmalloc: large alloc 3353354240 bytes == 0x60818000 @  0x7f22e3e8c1e7 0x59221c 0x4cde80 0x566c02 0x5a4df1 0x4dde06 0x5eaed5 0x4dec87 0x5ebb8e 0x50a5aa 0x50c1f4 0x507f24 0x509202 0x594b01 0x54a17f 0x5517c1 0x5a9eec 0x50a783 0x50cfd6 0x507f24 0x509c50 0x50a64d 0x50cfd6 0x509918 0x50a64d 0x50c1f4 0x507f24 0x50b053 0x634dd2 0x634e87 0x63863f
tcmalloc: large alloc 3353354240 bytes == 0x128e4e000 @  0x7f22e3e8c1e7 0x59221c 0x5eae96 0x4dec87 0x5ebb8e 0x50a5aa 0x50c1f4 0x507f24 0x509202 0x594b01 0x54a17f 0x5517c1 0x5a9eec 0x50a783 0x50cfd6 0x507f24 0x509c50 0x50a64d 0x50cfd6 0x509918 0x50a64d 0x50c1f4 0x507f24 0x50b053 0x634dd2 0x634e87 0x63863f 0x6391e1 0x4b0dc0 0x7f22e3a89b97 0x5b26fa
Dataset contains 4264/474 train/val samples
Build MemoryBank
tcmalloc: large alloc 3353354240 bytes == 0x7f21981fe000 @  0x7f22e3e8c1e7 0x59221c 0x4cde80 0x566c02 0x5a4df1 0x4dde06 0x5eaed5 0x4dec87 0x5ebb8e 0x50a5aa 0x50c1f4 0x507f24 0x509202 0x594b01 0x54a17f 0x5517c1 0x5a9eec 0x50a783 0x50cfd6 0x507f24 0x509c50 0x50a64d 0x50cfd6 0x509918 0x50a64d 0x50c1f4 0x507f24 0x50b053 0x634dd2 0x634e87 0x63863f
tcmalloc: large alloc 3353354240 bytes == 0x7f20d03fc000 @  0x7f22e3e8c1e7 0x59221c 0x5eae96 0x4dec87 0x5ebb8e 0x50a5aa 0x50c1f4 0x507f24 0x509202 0x594b01 0x54a17f 0x5517c1 0x5a9eec 0x50a783 0x50cfd6 0x507f24 0x509c50 0x50a64d 0x50cfd6 0x509918 0x50a64d 0x50c1f4 0x507f24 0x50b053 0x634dd2 0x634e87 0x63863f 0x6391e1 0x4b0dc0 0x7f22e3a89b97 0x5b26fa
Retrieve criterion
Criterion is SimCLRLoss
Retrieve optimizer
SGD (
Parameter Group 0
    dampening: 0
    lr: 0.4
    momentum: 0.9
    nesterov: False
    weight_decay: 0.0001
)
No checkpoint file at /content/drive/My Drive/Datasets/AvantariDataset/results/animals/pretext/checkpoint.pth.tar
Starting main loop
Epoch 0/500
---------------
Adjusted learning rate to 0.40000
Train ...
Epoch: [0][0/8]	Loss 7.0342e+00 (7.0342e+00)
Fill memory bank for kNN...
Fill Memory Bank [0/9]
Evaluate ...
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [32,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [33,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [34,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [35,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [36,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [37,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [38,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [39,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [40,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [41,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [42,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [43,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [44,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [45,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [46,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [47,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [48,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [49,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [50,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [51,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [52,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [53,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [54,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [55,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [56,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [57,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [58,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [59,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [60,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [61,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [62,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [98,0,0], thread: [63,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [0,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [1,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [2,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [3,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [4,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [5,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [6,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [7,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [8,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [9,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [10,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [11,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [12,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [13,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [14,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [15,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [16,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [17,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [18,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [19,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [20,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [21,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [22,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [23,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [24,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [25,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [26,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [27,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [28,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [29,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [30,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [59,0,0], thread: [31,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [85,0,0], thread: [0,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [85,0,0], thread: [1,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [85,0,0], thread: [2,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [85,0,0], thread: [3,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [85,0,0], thread: [4,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [85,0,0], thread: [5,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [85,0,0], thread: [6,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [85,0,0], thread: [7,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [85,0,0], thread: [8,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [85,0,0], thread: [9,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [85,0,0], thread: [10,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:208: operator(): block: [85,0,0], thread: [11,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
[... the same assertion is repeated for many more blocks and threads ...]
Traceback (most recent call last):
  File "AvantariSolution/simclr.py", line 153, in <module>
    main()
  File "AvantariSolution/simclr.py", line 119, in main
    top1 = contrastive_evaluate(val_dataloader, model, memory_bank_base)
  File "/usr/local/lib/python3.6/dist-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "/content/AvantariSolution/utils/evaluate_utils.py", line 29, in contrastive_evaluate
    top1.update(acc1.item(), images.size(0))
RuntimeError: CUDA error: device-side assert triggered
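
For reference, this kind of device-side assert is usually triggered by an index (often a label or target) that falls outside [0, num_classes). A minimal way to localize it, sketched below under the assumption that some gather/scatter-style op receives a bad index, is to run with CUDA_LAUNCH_BLOCKING=1 so the failing call is reported at the right Python line, or to reproduce the suspect op on CPU, where it raises a readable error instead of a device-side assert:

    # Debugging sketch (not specific to this repo):
    #   CUDA_LAUNCH_BLOCKING=1 python simclr.py --config_env ... --config_exp ...
    # makes the failing kernel surface at the offending line. Alternatively,
    # reproduce the suspect indexing op on CPU; 12 below is deliberately out of bounds.
    import torch

    logits = torch.randn(8, 10)                                   # hypothetical [batch, num_classes] scores
    index = torch.tensor([[0], [3], [9], [12], [1], [2], [4], [5]])
    try:
        logits.gather(1, index)                                   # same op family as ScatterGatherKernel
    except RuntimeError as e:
        print("out-of-bounds index reproduced on CPU:", e)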

Experiment about SOTA Comparison

You have done great work! I have a question about the experiments.
As you mention in your paper, Unsupervised Feature Learning via Non-Parametric Instance Discrimination is used as a representation learning method, but that paper also seems to present unsupervised classification results on CIFAR-10. Why didn't you include it when comparing with SOTA? Is it because the task it addresses is not exactly equivalent to yours?

KeyError: 'temperature'

Dear Author:
I tried to run
python3 scan.py --config_env configs/cifer_env.yml --config_exp configs/scan/scan_cifar10.yml
but this error occurred

File "moco.py", line 57, in main
    memory_bank_train = MemoryBank(len(train_dataset), 2048, p['num_classes'], p['temperature'])
KeyError: 'temperature'

Later, I looked through the p dictionary but did not find a 'temperature' key. The p dictionary is as follows:

    {'setup': 'scan', 'criterion': 'scan', 'criterion_kwargs': {'entropy_weight': 5.0},
     'update_cluster_head_only': False, 'num_heads': 1, 'backbone': 'resnet18',
     'train_db_name': 'cifar-10', 'val_db_name': 'cifar-10', 'num_classes': 10, 'num_neighbors': 20,
     'augmentation_strategy': 'ours',
     'augmentation_kwargs': {'crop_size': 32,
                             'normalize': {'mean': [0.4914, 0.4822, 0.4465], 'std': [0.2023, 0.1994, 0.201]},
                             'num_strong_augs': 4,
                             'cutout_kwargs': {'n_holes': 1, 'length': 16, 'random': True}},
     'transformation_kwargs': {'crop_size': 32,
                               'normalize': {'mean': [0.4914, 0.4822, 0.4465], 'std': [0.2023, 0.1994, 0.201]}},
     'optimizer': 'adam', 'optimizer_kwargs': {'lr': 0.0001, 'weight_decay': 0.0001},
     'epochs': 50, 'batch_size': 128, 'num_workers': 8, 'scheduler': 'constant',
     'pretext_dir': './output/cifar-10/pretext',
     'pretext_checkpoint': './output/cifar-10/pretext/checkpoint.pth.tar',
     'pretext_model': './output/cifar-10/pretext/model.pth.tar',
     'topk_neighbors_train_path': './output/cifar-10/pretext/topk-train-neighbors.npy',
     'topk_neighbors_val_path': './output/cifar-10/pretext/topk-val-neighbors.npy',
     'scan_dir': './output/cifar-10/scan',
     'scan_checkpoint': './output/cifar-10/scan/checkpoint.pth.tar',
     'scan_model': './output/cifar-10/scan/model.pth.tar',
     'selflabel_dir': './output/cifar-10/selflabel',
     'selflabel_checkpoint': './output/cifar-10/selflabel/checkpoint.pth.tar',
     'selflabel_model': './output/cifar-10/selflabel/model.pth.tar'}

Do you know what is going on? Thank you.
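
For what it's worth, the dictionary printed above is a SCAN config (its criterion_kwargs only contains entropy_weight), while the traceback comes from moco.py, which expects a pretext-style config where a temperature is defined. A small hedged guard, purely as a sketch and not part of the repo, would make such a mismatch obvious:

    def check_pretext_config(p):
        """Raise a clearer error when a SCAN config is passed to a pretext script."""
        required_keys = ['temperature']
        missing = [k for k in required_keys if k not in p]
        if missing:
            raise KeyError(f"Config is missing {missing}; moco.py/simclr.py expect a pretext "
                           f"config, not the SCAN config shown above.")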

Regarding SCAN loss

Hi,

I was checking your code for the SCAN loss and it states:

def forward(self, anchors, neighbors):
    """
    input:
        - anchors: logits for anchor images w/ shape [b, num_classes]
        - neighbors: logits for neighbor images w/ shape [b, num_classes]
    output:
        - Loss
    """

I'm not sure if I'm missing the obvious, but I don't understand why neighbors has shape [batch, num_classes]. If we consider a batch of shape [128, num_classes] with 5 neighbors for each data point, shouldn't the batch be [k*batch, num_classes]? Is this a memory strategy to deal with large values of k?

Thanks.
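
For context, the reason both tensors can have shape [b, num_classes] is visible in the neighbors dataset: each anchor is paired with one randomly sampled neighbor per __getitem__, so the collated neighbor batch keeps the same leading dimension b as the anchors. A simplified sketch of that mechanism (names are illustrative, not the repo's exact class):

    import numpy as np

    class NeighborsDatasetSketch:
        """Pairs every anchor with ONE randomly chosen precomputed neighbor."""

        def __init__(self, dataset, knn_indices):
            self.dataset = dataset        # base dataset returning one sample per index
            self.indices = knn_indices    # [N, k] array of nearest-neighbor indices

        def __len__(self):
            return len(self.dataset)

        def __getitem__(self, index):
            anchor = self.dataset[index]
            neighbor_index = np.random.choice(self.indices[index], 1)[0]
            neighbor = self.dataset[neighbor_index]
            return {'anchor': anchor, 'neighbor': neighbor}

So k only controls which indices can be sampled, not the batch size; over many epochs each anchor gets paired with different members of its neighborhood, which approximates the sum over all k neighbors in expectation.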

embedding space

It looks like amazing work. However, I feel like I need some help. I have an unlabeled dataset, and I want to get a general understanding of it by looking at a higher-level representation of each image in the embedding space. Where could I check the embedding of each image to visualize the clusters? Thanks for your help.
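
In case it helps, here is a minimal visualization sketch, assuming the per-image embeddings have been exported as an [N, D] array (for example by saving the filled memory bank's features); the file names below are hypothetical:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.manifold import TSNE

    features = np.load('features.npy')        # hypothetical [N, D] embeddings
    clusters = np.load('cluster_labels.npy')  # hypothetical [N] predicted cluster per image

    coords = TSNE(n_components=2, init='pca', perplexity=30).fit_transform(features)
    plt.scatter(coords[:, 0], coords[:, 1], c=clusters, s=4, cmap='tab10')
    plt.title('Embedding space (t-SNE)')
    plt.savefig('embedding_tsne.png', dpi=150)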

First step moco

Hello, thanks for your excellent work! I use moco.py, but the accuracy of the top-50 nearest neighbors on the train/val set is NaN. Have you met this before? What can I do to avoid it?
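
A quick sanity check (a sketch, not a confirmed fix) is to verify that the stored features are finite before the kNN evaluation; a NaN accuracy usually means the pretext training itself produced NaNs, e.g. after a diverging loss:

    import torch

    def check_features(feats):
        """feats: [N, D] tensor, e.g. the filled memory bank's features (hypothetical handle)."""
        print('all finite:', torch.isfinite(feats).all().item())
        print('norm range:', feats.norm(dim=1).min().item(), feats.norm(dim=1).max().item())

If anything is non-finite, lowering the learning rate or double-checking the input normalization is usually the first thing to try.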

How can I train on a custom dataset?

I reviewed the paper and went through almost every part of the code. I executed the code on the example datasets (STL-10 and CIFAR-10).

Now, I'm trying to adapt the code to my custom dataset.

It looks like there is nowhere to run training on a custom dataset. The custom_dataset.py script exists, but it is used for data augmentation.

Am I missing something, or is the code designed only to be checked on the most common datasets?
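
For reference, training on a custom dataset typically means adding a dataset class and returning it wherever the repo builds datasets from the config's train_db_name. A hedged sketch, with hypothetical key names that should be matched to the repo's own CIFAR/STL wrappers:

    from torchvision import datasets

    class CustomImages(datasets.ImageFolder):
        """Unlabeled images laid out as root/<anything>/<image>.jpg."""

        def __getitem__(self, index):
            img, target = super().__getitem__(index)
            # The training code consumes dict-style samples; copy the exact key
            # names from the repo's own dataset classes (these are hypothetical).
            return {'image': img, 'target': target, 'meta': {'index': index}}

You would then instantiate it with the same transforms as the existing datasets, add a branch for it where the datasets are built, and write a matching config file.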

Some questions about the third step self-labeling

Hello!
I am trying to reproduce the results in the paper on CIFAR and STL-10, and the code works really well on CIFAR-10 and CIFAR-20, as reported in your paper. However, when I try STL-10, I find that the clustering accuracy drops during the entire self-labeling training.
Then I tried some new datasets. For example, I extracted 10 classes from MiniImageNet (let's call it mini-10) and 5 classes from CUB-200-2011 (cub-5). I observed the same phenomenon on mini-10: although the accuracy at the end of step 2 reaches above 60%, it still drops rapidly in the third step. With an accuracy of 65% at the end of step 2 on cub-5, step 3 simply cannot run, because no probability above the 0.99 threshold can be found to produce pseudo-labels.
I wonder if you have ever encountered such problems and how to solve them. I greatly appreciate your answer.
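
As a diagnostic (a hypothetical helper, not repo code), it can help to count how many samples actually clear the confidence threshold before self-labeling starts; if the count is near zero, as on cub-5, the 0.99 threshold is simply too strict for that model and dataset:

    import torch

    @torch.no_grad()
    def count_confident(model, dataloader, threshold=0.99, device='cuda'):
        n_confident, n_total = 0, 0
        for batch in dataloader:
            images = batch['image'].to(device)     # assumes dict-style batches
            logits = model(images)                 # adapt if the model returns a list of head outputs
            probs = torch.softmax(logits, dim=1)
            n_confident += (probs.max(dim=1).values > threshold).sum().item()
            n_total += images.size(0)
        return n_confident, n_total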

Some questions about the clustering step

Hi,

According to the paper, you freeze the weights of the contrastive backbone and only update the weights of the clustering heads. However, in this implementation it seems that by default both the backbone and the heads are trained (see here and here).

Furthermore, the entropy weight is set to 5 according to the paper, but in the implementation the default is 2 (see here).

I would greatly appreciate it if you could let me know which options are correct! I've tried training with the backbone unfrozen, which leads to very fast convergence; that always seems a bit suspicious, hence my asking.
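
For completeness, a small sketch of overriding the parsed config so it matches the settings reported in the paper (the key names appear in the configs; whether these are the intended defaults is exactly the question above):

    # Hedged sketch: apply after the config dictionary p has been parsed.
    p['update_cluster_head_only'] = True            # freeze the backbone, train only the clustering heads
    p['criterion_kwargs']['entropy_weight'] = 5.0   # entropy weight quoted in the paper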

About KNN in SCAN

Hi, thanks for sharing your great work! I have a concern about the KNN in SCAN training.

In Eq. 2 of your paper, you calculate the loss by maximizing the similarities between each anchor and its k nearest neighbors. However, in your code, it seems that you only maximize the similarity between each anchor and one of its randomly selected neighbors, as in the dataloader snippet below.

neighbor_index = np.random.choice(self.indices[index], 1)[0]
neighbor = self.dataset.__getitem__(neighbor_index)

I am not sure if my understanding is correct.

Thanks.

overclustering

Hello, I want to know how I could overcluster a dataset. Do I just change num_classes in the yml file?
For example, to cluster CIFAR-10 into 20 classes? Thanks

The Implementation result of pretext task + kmeans

Hi, thanks for your nice work. It has inspired me a lot.
I want to reproduce the result of the pretext task + k-means on CIFAR-10 (65% ACC in the paper).
First, I downloaded the checkpoint from here: https://drive.google.com/file/d/1Cl5oAcJKoNE5FSTZsBSAKLcyA5jXGgTT/view
Then I added some code to eval.py as follows:

        print('Fill Memory Bank')
        fill_memory_bank(dataloader, model, memory_bank)

        if not args.simclr_kmeans:
            print('Mine the nearest neighbors')
            for topk in [1, 5, 20]: # Similar to Fig 2 in paper
                _, acc = memory_bank.mine_nearest_neighbors(topk)
                print('Accuracy of top-{} nearest neighbors on validation set is {:.2f}'.format(topk, 100*acc))
        else:
            head = 0
            print(memory_bank.features.cpu().shape)
            kmeans = KMeans(n_clusters=config['num_classes'], random_state=0).fit(memory_bank.features.cpu())
            kmeans = torch.from_numpy(kmeans.labels_).cuda()
        predictions = [{'predictions': kmeans, 'probabilities': 1, 'targets': memory_bank.targets}]
            clustering_stats = hungarian_evaluate_me(head, predictions, dataset.classes,
                                                  compute_confusion_matrix=True)
            print(clustering_stats)

But I get the following result, which is far lower than the 65% in the paper:

{'ACC': 0.3647, 'ARI': 0.13848755246278868, 'NMI': 0.2627059928586838, 'hungarian_match': [(0, 2), (1, 1), (2, 8), (3, 3), (4, 5), (5, 9), (6, 0), (7, 4), (8, 6), (9, 7)]}

Maybe I made some mistake in the calculation; can you tell me where I went wrong? Many thanks for your time!
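
One thing worth ruling out (a sketch, not a confirmed explanation for the gap) is the normalization of the features fed to k-means; SimCLR embeddings are trained with cosine similarities, so clustering unnormalized vectors can behave very differently:

    import torch
    from sklearn.cluster import KMeans

    feats = torch.nn.functional.normalize(memory_bank.features.cpu(), dim=1)   # L2-normalize first
    labels = KMeans(n_clusters=config['num_classes'], random_state=0).fit_predict(feats.numpy())

It is also worth verifying that the memory bank was filled with the evaluation (non-augmented) transforms.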

train on custom dataset?

Hi, thanks for this great project.
I have a dataset of images that show a dotted font on a white background. The images have different kinds of artifacts, like pollution, scratches, and so on; often there are several artifacts in one image. I want to cluster the dataset according to the different kinds of artifacts. Do you think this is possible with this code?

Backbone not frozen during SCAN training

Hi there,

Thanks for contributing such a great work to the community.

It is mentioned in the supplementary materials of the paper (Clustering Step of ImageNet) that the weights of the backbone are frozen during SCAN training. From the code, this doesn't seem to apply to training on the smaller datasets. Is it the longer runtime required for ImageNet that leads to freezing the weights during SCAN training? How does the performance compare if the weights are frozen in the small-dataset case?

If the weights are not frozen during SCAN training, doesn't that mean the extracted embeddings might deviate from the memory stored during pretext training? I assume such deviation would affect the original kNN results, and I speculate that continually updating the neighbor memory would result in better performance in the SCAN training step.

New dataset

Hi Wouter!
First of all congratulations for your work, it’s really interesting :-)

I’m trying to use your algorithm with a new dataset: it has ∼4000 images, ∼60 classes, and the image size is not fixed. I use “random_resized_crop” to 224 for the whole dataset (the images are always larger than 224).

1 - With 224 as the image size I have to decrease the batch size (from 512 to 32) because of memory problems. Would you advise decreasing the image size so that the model can be trained with batch size 512 (or at least 256), or is it better to keep image size 224 with batch size 32? (Referring to page 6 of your paper, in the “Implementation details” section I read: “..we approximate the dataset statistics by sampling batches of sufficiently large size.”)

2 - If it is better to keep image size 224 with batch size 32, should I change anything else (for example, the learning rate, etc.)?

3 - I notice that you use a different implementation of ResNet-18 for CIFAR and STL (max pooling and average pooling). Is this just a change due to dimensionality problems, or is there a different reason? Should I also create a custom ResNet-18 for my new dataset?

4 - Why did you use MoCo for ImageNet and SimCLR for CIFAR/STL?

Thank you for your time.

Attributing clusters to the correct classes

Hello @wvangansbeke! Congratulations on your work, and thanks for sharing a very organized code! I believe it is a very important (but often overlooked) step of research.

I'm playing with your code on a custom dataset, and I wondered if there is any code that "aligns" the clusters with the correct classes, or if it must be done manually... For example, on CIFAR-10, you use the labeled data to evaluate the unsupervised procedure's performance. How do you know that cluster 3 (for example) corresponds to the class "truck"? Also, when there are more clusters than necessary (using the example of the paper, 20 for CIFAR-10), how do you decide which clusters to merge to compose the classes?

Thanks again!
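
In case a concrete reference helps, the alignment used for evaluation is standard Hungarian matching over a cluster-vs-class count matrix (the repo's hungarian_evaluate reports the resulting hungarian_match). A hedged sketch of that step:

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def match_clusters_to_classes(cluster_ids, class_ids, num_clusters, num_classes):
        """Return a {cluster index: class index} mapping that maximizes overlap."""
        cost = np.zeros((num_clusters, num_classes))
        for c, y in zip(cluster_ids, class_ids):
            cost[c, y] -= 1                           # more overlap -> lower cost
        rows, cols = linear_sum_assignment(cost)
        return dict(zip(rows.tolist(), cols.tolist()))

Without any labels, the mapping has to be done by inspecting prototype images per cluster; and when overclustering (e.g. 20 clusters for CIFAR-10), clusters left unassigned by the matching are usually merged into the class that dominates their members.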

Some files in the repo are not used, such as models.py and losses.py!

Dear author:
I am a loyal fan of yours. I recently discovered that your code is open source, and it has attracted many stars in a short period of time. But I have some doubts, if you have time for them: I find that the models.py and losses.py files in the code are not imported by any other files and stand alone. What is their function? Why did you do this?

Does the parameter num_classes in the simclr config have any significance for the actual clustering?

Hi,
I am an ML student trying to sort a custom unlabelled dataset using this technique. I noticed that in the pretext step (using SimCLR in my case) we have the parameter num_classes in the config file. As I understand it, the actual clusters are made in the second step, so I wanted to ask whether this parameter has any significant effect on the overall clustering. Also, just to be sure, the accuracy metrics for SimCLR are based on the ground truth, yes?
Thank you for the papers and the code. It's really cool :)

edit: grammar error

Data Imbalance is a problem.

Hi. I have tried the code on datasets after introducing class imbalance. The model is not able to perform well on such datasets. Also, if the number of classes is high (more than 50), the model is not able to outperform k-means.
Why is that, even if we try to balance the classes in the self-labeling step?
Is it a weakness of the entropy term in the SCAN loss?
Your work is really good. But for real-world datasets with a high number of classes and a skewed distribution, how should it be modified? Thanks
