hendrycks / ss-ood


Self-Supervised Learning for OOD Detection (NeurIPS 2019)

License: MIT License

Languages: Python 99.87%, Shell 0.13%
Topics: robustness, out-of-distribution-detection, uncertainty, self-supervised-learning, self-supervised, ml-safety


ss-ood's Issues

Can you provide parameters for multiclass_ood/train_auxiliary?

Hi.
There seem to be no parameters provided to actually reproduce your results and the baseline for the multiclass_ood setting.

Can you provide the specific number of epochs and the other parameters needed to reproduce the results?

The default value for epochs is 5, but that does not seem to be right.

Why bx * 2 - 1?

Before being fed into the network, the input bx becomes bx * 2 - 1. I would like to know why. Thanks!
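
For context, a minimal sketch of what that scaling does, assuming bx holds images produced by torchvision's ToTensor() (pixel values in [0, 1]); bx * 2 - 1 simply rescales them to [-1, 1], the zero-centered range the networks in this repository are fed:

import torch

bx = torch.rand(4, 3, 32, 32)   # images in [0, 1], e.g. from ToTensor()
net_input = bx * 2 - 1          # linearly rescaled to [-1, 1], centered at zero
print(net_input.min().item(), net_input.max().item())  # roughly -1 and 1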

Could I use this trick in normal model training?

Hi,

Thanks for providing this great paper and code. I am trying to use the method you proposed in a normal classification task with a dataset such as CIFAR-10. To make sure that I have understood the paper correctly, I feel I had better ask you for some guidance:

Suppose my baseline CIFAR-10 classification model is WideResNet 28-1, and I use a batch size of 256 with a cosine annealing learning-rate scheduler. The initial learning rate is thus 0.2. The augmentation is horizontal flipping and random cropping after padding with 4 pixels. Apart from these standard settings, I also use mixup to train the model.

The question is: what is the most suitable way to add self-supervision to the above training procedure? Here is my assumption: I should add a new 4-way classification fc head in parallel with the 10-way classification head. The total loss should then become L_10 + 0.5 * L_4 according to the paper. As for the dataset, I first apply the usual horizontal flip and random cropping, and then rotate each cropped and flipped image by (0, 90, 180, 270) degrees, making the batch size 256 × 4 = 1024. Since the batch size is amplified, I should also amplify the learning rate to 0.2 × 4 = 0.8. As for the mixup part, I should mix the 10-way classification labels as well as the 4-way rotation labels and then use cross-entropy to compute each loss respectively.

Is this the correct way to use your method in normal classification?
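
For concreteness, a hedged sketch of the setup described above. The names RotAuxNet, backbone, fc_cls, and fc_rot are hypothetical, and taking the 10-way loss only on the unrotated images is just one reasonable choice, not necessarily the authors' exact recipe:

import torch
import torch.nn as nn
import torch.nn.functional as F

class RotAuxNet(nn.Module):
    # Classifier with an auxiliary 4-way rotation head (hypothetical sketch).
    def __init__(self, backbone, feat_dim, num_classes=10):
        super().__init__()
        self.backbone = backbone                         # e.g. a WideResNet feature extractor
        self.fc_cls = nn.Linear(feat_dim, num_classes)   # 10-way classification head
        self.fc_rot = nn.Linear(feat_dim, 4)             # 0/90/180/270 rotation head

    def forward(self, x):
        feat = self.backbone(x)
        return self.fc_cls(feat), self.fc_rot(feat)

def rotate_batch(x):
    # Stack the 0/90/180/270-degree rotations of a batch (batch size becomes 4x).
    rots = [torch.rot90(x, k, dims=(2, 3)) for k in range(4)]
    labels = torch.arange(4).repeat_interleave(x.size(0))
    return torch.cat(rots, dim=0), labels

# Training-step sketch (x, y are images and 10-way labels after flip/crop/mixup):
#   x_rot, y_rot = rotate_batch(x)
#   logits_cls, logits_rot = model(x_rot * 2 - 1)
#   loss = F.cross_entropy(logits_cls[:x.size(0)], y) + 0.5 * F.cross_entropy(logits_rot, y_rot)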

Questions about the model

Thank you for sharing your work. I have two questions.

First, I found an undefined class 'BAM' and an undefined type 'BAM' in lines 118-121 of \models\cbam\model_resnet.py. Do you mean 'CBAM' here?

Second, when I tried to train one-class OOD on ResidualNet with depth 18 or 34, which use the BasicBlock defined in your code, the model ran fine. However, when using ResidualNet with depth 50 or 101, which are based on the Bottleneck, the model seems to have a problem: "out" and "residual" in line 96 of \models\cbam\model_resnet.py do not have the same shape. Here is the traceback:
Traceback (most recent call last):
  File "train.py", line 224, in <module>
    train()
  File "train.py", line 164, in train
    x = net(2 * data - 1)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/workplace/ss-ood-master/models/cbam/model_resnet.py", line 169, in forward
    x = self.layer1(x)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 541, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/workplace/ss-ood-master/models/cbam/model_resnet.py", line 96, in forward
    out += residual
RuntimeError: The size of tensor a (256) must match the size of tensor b (64) at non-singleton dimension 1
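
For reference, a minimal sketch (not the repository's code) of how a Bottleneck shortcut is normally projected so the two tensors match; the 256-vs-64 channel mismatch above is consistent with this 1x1 projection missing or not being applied for the depth-50/101 configuration:

import torch
import torch.nn as nn

class BottleneckSketch(nn.Module):
    # Hypothetical Bottleneck: channels expand 4x, so the identity shortcut
    # must be projected with a 1x1 convolution whenever shapes differ,
    # otherwise "out += residual" fails exactly as in the traceback above.
    expansion = 4

    def __init__(self, in_planes, planes, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_planes, planes, 1, bias=False)
        self.conv2 = nn.Conv2d(planes, planes, 3, stride=stride, padding=1, bias=False)
        self.conv3 = nn.Conv2d(planes, planes * self.expansion, 1, bias=False)
        self.downsample = None
        if stride != 1 or in_planes != planes * self.expansion:
            self.downsample = nn.Conv2d(in_planes, planes * self.expansion, 1,
                                        stride=stride, bias=False)

    def forward(self, x):
        residual = x
        out = torch.relu(self.conv1(x))
        out = torch.relu(self.conv2(out))
        out = self.conv3(out)
        if self.downsample is not None:
            residual = self.downsample(x)   # e.g. projects 64 -> 256 channels
        return torch.relu(out + residual)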

Question about the loss function (train.py, line 168)

Firstly, thank you for releasing your codes. It's very helpful for my research :)

I wonder whether the objective function in train.py (line 168) covers only the rotation and translation classes, because in your paper the highest score on ImageNet was obtained by training with RotNet + Translation + Self-attention + Resize.

I hope you can answer my question soon!

Question about logits, pen = model(adv_bx * 2 - 1)

Hi,
Thanks for your paper and code. However, I got an error when running adversary = attacks.PGD(epsilon=8./255, num_steps=10, step_size=2./255).cpu() in the adversarial folder.
The error was:
logits, pen = model(adv_bx * 2 - 1)
ValueError: too many values to unpack (expected 2)

I don't know whether I did something wrong in the code. I only changed .cuda() to .cpu() because I am using the CPU version.
Can anyone help me solve this problem?
Thanks so much.
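
For anyone hitting the same error: "too many values to unpack" at that line usually means the network's forward() returns a single logits tensor, which Python then tries to unpack along the batch dimension, while attacks.py expects a (logits, penultimate) pair. A hedged, hypothetical wrapper that provides such a two-output interface looks like this:

import torch.nn as nn

class TwoOutputWrapper(nn.Module):
    # Hypothetical sketch: adapts a single-output network to the
    # (logits, penultimate) interface the attack code expects.
    def __init__(self, net):
        super().__init__()
        self.net = net

    def forward(self, x):
        logits = self.net(x)
        # If the backbone exposes penultimate features, return them here instead;
        # returning logits twice merely satisfies the unpacking.
        return logits, logits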

Accuracy for Common Corruptions.

Hi Hendrycks!
May I ask one more thing?
Your reply about the adversarial setting helped me a lot.

Can you tell me how to change corruption strengths for CIFAR-10-C?
I downloaded Cifar-10-C from the link provided in https://github.com/hendrycks/robustness.
However, there seems to be nothing related to corruption strengths.
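
For what it's worth, CIFAR-10-C ships all five severities inside each corruption's .npy file: 10,000 test images per severity, concatenated in order of increasing severity, with labels.npy repeating the CIFAR-10 test labels. A particular strength can therefore be selected by slicing (hedged sketch; the file paths are assumptions):

import numpy as np

images = np.load('CIFAR-10-C/gaussian_noise.npy')  # shape (50000, 32, 32, 3), severities 1..5 stacked
labels = np.load('CIFAR-10-C/labels.npy')          # shape (50000,)

severity = 3                                       # corruption strength in 1..5
lo, hi = (severity - 1) * 10000, severity * 10000
x_sev, y_sev = images[lo:hi], labels[lo:hi]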

Can you provide your exact performance in the common corruption setting? Also, did you report the best accuracy or the accuracy from the last epoch?

If the CIFAR-10-C dataset contains all five corruption strengths, then epoch 86 seems to be the most similar to your results. However, the clean accuracy does not seem to reach 95.5% at all with WRN 40-2. Can you explain how to reproduce your settings?

corrupt acc :  [53.946000000000005, 64.344, 57.306000000000004, 67.526, 58.382, 84.642, 79.41, 79.276, 71.054, 79.024, 83.966, 83.89, 93.55799999999999, 91.96600000000001, 82.48, 73.512, 83.636, 87.90599999999999, 81.658]
Epoch  86 |Time   150 |Tr Loss 0.0477 |Te Loss 0.884 |Test acc  |corrupt mean acc 76.71


corrupt acc :  [51.488, 62.674, 57.668, 66.06, 57.348, 84.75399999999999, 78.518, 78.544, 71.41199999999999, 78.74799999999999, 84.206, 85.49600000000001, 93.95599999999999, 92.78999999999999, 84.552, 72.418, 83.418, 87.768, 80.138]
Epoch 100 |Time   150 |Tr Loss 0.0261 |Te Loss 1.010 |Test acc  |corrupt mean acc 76.42

Thanks a lot, again!

Reproducing robustness results for CIFAR-10 via auxiliary rotation task

Hi,

I found your research paper very interesting.

However, when I was implementing your paper, I was unable to reproduce the results for CIFAR-10 with the following configs:
Network: WRN 40-2
Training loss = CE(adv) + 0.5 * Loss_rotation
Adversarial perturbation creation loss = cross-entropy(x, y) + Loss_rotation
SGD, learning rate = 0.1, momentum = 0.9, nesterov = True, batch size = 128, with cosine annealing for 205 epochs,
i.e.

optimizer = torch.optim.SGD(
    [{'params': model.parameters()}, {'params': rotate_classifier.parameters()}],
    lr=0.1, nesterov=True, momentum=0.9, weight_decay=0.0005)

scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lr_lambda=lambda step: cosine_annealing(
        step,
        205 * len(base_loader),
        1,            # since lr_lambda computes a multiplicative factor
        1e-6 / 0.1))
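
For completeness, a hedged sketch of the cosine_annealing helper the lambda above refers to, as it is typically defined in the repository's training scripts (please verify against your local copy); it decays a multiplicative factor from lr_max (= 1) down to lr_min (= 1e-6 / 0.1) over total_steps optimizer steps:

import numpy as np

def cosine_annealing(step, total_steps, lr_max, lr_min):
    # Multiplicative factor for LambdaLR: starts at lr_max, ends at lr_min.
    return lr_min + (lr_max - lr_min) * 0.5 * (1 + np.cos(step / total_steps * np.pi))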

I am getting the following result:
Test accuracy: 72.4425 | Rotation accuracy: 80.5675 | Adversarial accuracy (PGD-10, cross-entropy loss only): 10.42

Can you please list the hyperparameters again, in particular the learning-rate scheduler and the number of training epochs you used to obtain the results?

Thanks

training epochs

I was wondering how you know how many epochs to use during training. It seems that increasing the number of epochs actually worsens performance. However, the training and test losses do not seem to correlate with the AUC measured during testing. Any advice? Thank you!

Application of the Method on Signal Data

Hi, I want to detect OOD samples while performing one-class classification. However, my data does not consist of images but of signals such as sound. Could this method be applied to my scenario, or do you have any suggestions for it? I am not sure how I could rotate the signals. Thanks in advance.

How can I reproduce your ImageNet AUROC?

First and foremost, I really appreciate you sharing the code.
I am now trying to reproduce the ImageNet AUROC of 85.7%, but I don't know the exact parameter settings (e.g. learning rate, epochs, etc.).
Also, is the ImageNet data downloaded from your GitHub enough for training? Please let me know, thanks!

No training data provided

I found the link to download the test data, but the training data is not provided. Does that mean I should download the ImageNet dataset and use 'symlink_to_data.py' to create the training data?

Thanks.

Question about Adversarial Training + Rotations

Dan,

I think I know the answer to this question, but I will ask it anyway. Do you compute the adversarial examples on the batch that contains all of the rotated versions of the data, or do you compute the adversarial examples on the 0° rotation batch and then compute the rotated versions? If it is the former, do you incur a roughly 4x increase in training time because the effective batch size is 4x bigger?

Thanks in advance,
Nate

Largely deviating AUROC scores in self-implemented MSP baseline (Multi-Class OoD Detection)

Hi, really interesting publication!
I tried to reproduce the results of your Multi-Class OoD Detector with rotation head compared to the vanilla MSP baseline.
The AUROC scores of the rotation network were quite similar in my self-trained implementation: Gaussian OoD 99.38%, CIFAR-100 OoD 90.65%.

My issue is with the vanilla MSP baseline, because I get a very large deviation of more than 30% in AUROC scores from your baseline (Gaussian OoD: 65.41%, CIFAR-100 OoD: 52.38%).

Now I am trying to figure out what the issue with my implementation is and would like to ask you for some more details about the (training) setup of your vanilla MSP baseline.
Specifically: what exactly is the model architecture, what training data (including perturbations) and loss function do you use, and what hyperparameters did you have?
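
For reference, the vanilla MSP baseline scores a test input by the maximum softmax probability of the classifier, with OOD inputs expected to receive lower maxima; a hedged sketch (the model and loader names are assumptions):

import torch
import torch.nn.functional as F

@torch.no_grad()
def msp_scores(model, loader, device='cuda'):
    # Maximum softmax probability per example; higher means "more in-distribution".
    scores = []
    for x, _ in loader:
        logits = model(x.to(device))
        scores.append(F.softmax(logits, dim=1).max(dim=1).values.cpu())
    return torch.cat(scores)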

Best Regards and already thank you in advance!
Marc Alexander

wrn_prime not found in adversarial robustness

Hi,
Thanks for the wonderful paper and open-sourcing the code.
However, I ran into two issues with the adversarial robustness code.

  1. File "/ss-ood/adversarial/attacks.py", line 62, in forward
     logits, pen = model(adv_bx*2-1)
     ValueError: too many values to unpack (expected 2)

  2. from models.wrn_prime import WideResNet fails because wrn_prime is not found.
     I guess there is a model whose code is not included in the repository.

Reproducing CIFAR-10 One-class OOD Detector with OE

Firstly many thanks for sharing your work.

I found the paper very interesting and wanted to see if I could reproduce some of the results, especially regarding the one-class OOD detector on CIFAR-10 using OE. I was just wondering if this would be made available at any point.

Kind regards,
Se

Signs of the two score components for OOD detection

Hi,

in the scoring formula on page 7 of the paper, shouldn't the KL divergence of the classifier prediction from uniform be small for OOD inputs, and the rotation CE be large on OOD, since the rotation head has not been trained to predict the original rotation on OOD inputs? I.e., one of the terms should have a minus sign, right?

If I read it correctly, the code uses different signs for those terms:

classification_loss = -1 * kl_div(class_uniform_dist, classification_smax)
rot_one_hot = torch.zeros_like(rot_smax).scatter_(1, target_rots.unsqueeze(1).cuda(), 1)
rot_loss = kl_div(rot_one_hot, rot_smax)

where KL is positive CE minus the constant entropy of U.

Question for Adversarial Attack Implementation

Hi Hendrycks.
It is a pleasure to review your work on OOD.
Meanwhile, I have a question about the adversarial attack.

adversary = attacks.PGD(epsilon=8./255, num_steps=20, step_size=2./255).cuda()

This code at line 125 of ss-ood/adversarial/train.py seems to run 20-step PGD for adversarial training + auxiliary rotations (I changed num_steps to 20 to match your reported setting).

However, the training log looks strange.

Epoch  57 | Time   607 | Train Loss 1.6344 | Test Loss 0.843 | Test Error 27.96
Epoch  58 | Time   605 | Train Loss 1.6214 | Test Loss 0.829 | Test Error 26.39
Epoch  59 | Time   594 | Train Loss 1.5835 | Test Loss 0.808 | Test Error 25.44
Epoch  60 | Time   591 | Train Loss 1.6008 | Test Loss 0.802 | Test Error 24.35
Epoch  61 | Time   602 | Train Loss 1.6263 | Test Loss 0.796 | Test Error 26.33
Epoch  62 | Time   611 | Train Loss 1.5923 | Test Loss 0.790 | Test Error 24.62
Epoch  63 | Time   597 | Train Loss 1.5906 | Test Loss 0.783 | Test Error 25.35
Epoch  64 | Time   607 | Train Loss 1.6116 | Test Loss 0.811 | Test Error 25.18

Your reported accuracy is 50.4% for 20-step PGD, but this test error is far too low; it looks similar to the clean error.
Can you please explain how you ran the code to generate the 20-step PGD and 100-step PGD results?

Thanks for your great work!

Inconsistent pixel range values

In the adversarial training code, the input to the model is in the range (-1, 1). However, in the attack code the pixel values are clipped to the range (0, 1). This seems like a bug to me, unless I am missing something.
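
One hedged reading that makes the two ranges consistent: the attack iterates and clips entirely in [0, 1] pixel space, and the rescaling to [-1, 1] happens only at the moment the perturbed batch is fed to the network. A hypothetical sketch of that pattern (the two-output model interface is an assumption, as elsewhere on this page):

import torch
import torch.nn.functional as F

def pgd_sketch(model, bx, by, epsilon=8/255, step_size=2/255, num_steps=10):
    # Hypothetical PGD loop: adv_bx stays in [0, 1] the whole time, so the
    # clamp to (0, 1) is consistent; only the model input is rescaled.
    adv_bx = bx.clone()
    for _ in range(num_steps):
        adv_bx.requires_grad_(True)
        logits, _ = model(adv_bx * 2 - 1)        # rescale only at the model input
        loss = F.cross_entropy(logits, by)
        grad = torch.autograd.grad(loss, adv_bx)[0]
        with torch.no_grad():
            adv_bx = adv_bx + step_size * grad.sign()
            adv_bx = torch.min(torch.max(adv_bx, bx - epsilon), bx + epsilon)
            adv_bx = adv_bx.clamp(0, 1)          # clip in [0, 1] pixel space
    return adv_bx.detach()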
