
RobustBench: a standardized adversarial robustness benchmark

Francesco Croce* (University of Tübingen), Maksym Andriushchenko* (EPFL), Vikash Sehwag* (Princeton University), Edoardo Debenedetti* (EPFL), Nicolas Flammarion (EPFL), Mung Chiang (Purdue University), Prateek Mittal (Princeton University), Matthias Hein (University of Tübingen)

Leaderboard: https://robustbench.github.io/

Paper: https://arxiv.org/abs/2010.09670

❗Note❗: if you experience problems with the automatic downloading of the models from Google Drive, install the latest version of RobustBench via pip install git+https://github.com/RobustBench/robustbench.git.

News

  • May 2022: We have extended the common corruptions leaderboard on ImageNet with 3D Common Corruptions (ImageNet-3DCC). The ImageNet-3DCC evaluation is interesting since (1) it includes more realistic corruptions and (2) it can be used to assess the generalization of existing models, which may have overfitted to ImageNet-C. For a quickstart, click here. Note that the entries in the leaderboard are still sorted according to ImageNet-C performance.

  • May 2022: We fixed the preprocessing issue for ImageNet corruption evaluations: previously we resized to 256x256 and centrally cropped to 224x224, which is unnecessary since the ImageNet-C images are already 224x224 (see this issue). Note that this changed the ranking between the top-1 and top-2 entries.

Main idea

The goal of RobustBench is to systematically track the real progress in adversarial robustness. There are already more than 3'000 papers on this topic, but it is still often unclear which approaches really work and which only lead to overestimated robustness. We start by benchmarking Linf, L2, and common-corruption robustness, since these are the most studied settings in the literature.

Evaluating robustness to Lp perturbations is in general not straightforward and requires adaptive attacks (Tramer et al., (2020)). Thus, in order to establish a reliable standardized benchmark, we need to impose some restrictions on the defenses we consider. In particular, we accept only defenses that (1) have, in general, non-zero gradients wrt the inputs, (2) have a fully deterministic forward pass (i.e. no randomness), and (3) do not have an optimization loop. Defenses that violate these three principles often only make gradient-based attacks harder but do not substantially improve robustness (Carlini et al., (2019)), except those that come with concrete provable guarantees (e.g. Cohen et al., (2019)).
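The first two restrictions can be partially sanity-checked in code. Below is a minimal, non-exhaustive sketch of our own (not part of the RobustBench API), assuming a PyTorch classifier; the absence of an optimization loop (restriction 3) generally has to be verified by inspecting the defense itself.

import torch

def check_defense_restrictions(model, x):
    """Rough sanity checks for restrictions (1) and (2) above."""
    model.eval()
    # (1) Non-zero gradients w.r.t. the inputs.
    x = x.clone().requires_grad_(True)
    model(x).sum().backward()
    assert x.grad is not None and x.grad.abs().sum() > 0, 'zero gradients wrt the inputs'
    # (2) Fully deterministic forward pass: two runs must agree exactly.
    with torch.no_grad():
        out1, out2 = model(x), model(x)
    assert torch.equal(out1, out2), 'forward pass is not deterministic'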

To prevent potential overadaptation of new defenses to AutoAttack, we also welcome external evaluations based on adaptive attacks, especially where AutoAttack flags a potential overestimation of robustness. For each model, we are interested in the best known robust accuracy and see AutoAttack and adaptive attacks as complementary to each other.

RobustBench consists of two parts:

  • a website https://robustbench.github.io/ with the leaderboard based on many recent papers
  • a collection of the most robust models, Model Zoo, which are easy to use for any downstream application (see the tutorial below after FAQ 👇)

FAQ

Q: How does the RobustBench leaderboard differ from the AutoAttack leaderboard? 🤔
A: The AutoAttack leaderboard was the starting point of RobustBench. Now only the RobustBench leaderboard is actively maintained.

Q: How does the RobustBench leaderboard differ from robust-ml.org? 🤔
A: robust-ml.org focuses on adaptive evaluations, whereas we provide a standardized benchmark. Adaptive evaluations have been very useful (e.g., see Tramer et al., 2020), but they are also very time-consuming and, by definition, not standardized. Instead, we argue that one can estimate robustness accurately without adaptive attacks in most cases, but for this one has to impose some restrictions on the considered models. However, we do welcome adaptive evaluations and are always interested in showing the best known robust accuracy.

Q: How is it related to libraries like foolbox / cleverhans / advertorch? 🤔
A: These libraries provide implementations of different attacks. Besides the standardized benchmark, RobustBench additionally provides a repository of the most robust models. So you can start using the robust models in one line of code (see the tutorial below 👇).

Q: Why is Lp-robustness still interesting? 🤔
A: There are numerous interesting applications of Lp-robustness that span transfer learning (Salman et al. (2020), Utrera et al. (2020)), interpretability (Tsipras et al. (2018), Kaur et al. (2019), Engstrom et al. (2019)), security (Tramèr et al. (2018), Saadatpanah et al. (2019)), generalization (Xie et al. (2019), Zhu et al. (2019), Bochkovskiy et al. (2020)), robustness to unseen perturbations (Xie et al. (2019), Kang et al. (2019)), and stabilization of GAN training (Zhong et al. (2020)).

Q: What about verified adversarial robustness? 🤔
A: We mostly focus on defenses which improve empirical robustness, given the lack of clarity regarding which approaches really improve robustness and which only make some particular attacks unsuccessful. However, we do not restrict submissions of verifiably robust models (e.g., we have Zhang et al. (2019) in our CIFAR-10 Linf leaderboard). For methods targeting verified robustness, we encourage the readers to check out Salman et al. (2019) and Li et al. (2020).

Q: What if I have a better attack than the one used in this benchmark? 🤔
A: We will be happy to add a better attack or any adaptive evaluation that would complement our default standardized attacks.

Model Zoo: quick tour

The goal of our Model Zoo is to simplify the usage of robust models as much as possible. Check out our Colab notebook here 👉 RobustBench: quick start for a quick introduction. It is also summarized below 👇.

First, install the latest version of RobustBench (recommended):

pip install git+https://github.com/RobustBench/robustbench.git

or the latest stable version of RobustBench (it is possible that automatic downloading of the models may not work):

pip install git+https://github.com/RobustBench/[email protected]

Now let's load CIFAR-10 and a quite robust CIFAR-10 model, Carmon2019Unlabeled, which achieves 59.53% robust accuracy evaluated with AutoAttack under eps=8/255:

from robustbench.data import load_cifar10

x_test, y_test = load_cifar10(n_examples=50)

from robustbench.utils import load_model

model = load_model(model_name='Carmon2019Unlabeled', dataset='cifar10', threat_model='Linf')

Let's try to evaluate the robustness of this model. We can use any library we like for this. For example, Foolbox implements many different attacks. Let's start with a simple PGD attack:

!pip install -q foolbox
import foolbox as fb
fmodel = fb.PyTorchModel(model, bounds=(0, 1))

_, advs, success = fb.attacks.LinfPGD()(fmodel, x_test.to('cuda:0'), y_test.to('cuda:0'), epsilons=[8/255])
print('Robust accuracy: {:.1%}'.format(1 - success.float().mean()))
>>> Robust accuracy: 58.0%

Wonderful! Can we do better with a more accurate attack?

Let's try to evaluate its robustness with a cheap version of AutoAttack from ICML 2020 that runs only 2 of the 4 attacks (APGD-CE and APGD-DLR):

# autoattack is installed as a dependency of robustbench, so there is no need to install it separately
from autoattack import AutoAttack
adversary = AutoAttack(model, norm='Linf', eps=8/255, version='custom', attacks_to_run=['apgd-ce', 'apgd-dlr'])
adversary.apgd.n_restarts = 1
x_adv = adversary.run_standard_evaluation(x_test, y_test)
>>> initial accuracy: 92.00%
>>> apgd-ce - 1/1 - 19 out of 46 successfully perturbed
>>> robust accuracy after APGD-CE: 54.00% (total time 10.3 s)
>>> apgd-dlr - 1/1 - 1 out of 27 successfully perturbed
>>> robust accuracy after APGD-DLR: 52.00% (total time 17.0 s)
>>> max Linf perturbation: 0.03137, nan in tensor: 0, max: 1.00000, min: 0.00000
>>> robust accuracy: 52.00%

Note that for our standardized evaluation of Linf-robustness we use the full version of AutoAttack, which is slower but more accurate (for that, just use adversary = AutoAttack(model, norm='Linf', eps=8/255)).

What about other types of perturbations? Is Lp-robustness useful there? We can evaluate the available models on more general perturbations. For example, let's take images corrupted by fog from CIFAR-10-C with the highest severity level (5). Do Linf-robust models perform better on them?

from robustbench.data import load_cifar10c
from robustbench.utils import clean_accuracy

corruptions = ['fog']
x_test, y_test = load_cifar10c(n_examples=1000, corruptions=corruptions, severity=5)

for model_name in ['Standard', 'Engstrom2019Robustness', 'Rice2020Overfitting',
                   'Carmon2019Unlabeled']:
    model = load_model(model_name, dataset='cifar10', threat_model='Linf')
    acc = clean_accuracy(model, x_test, y_test)
    print(f'Model: {model_name}, CIFAR-10-C accuracy: {acc:.1%}')
>>> Model: Standard, CIFAR-10-C accuracy: 74.4%
>>> Model: Engstrom2019Robustness, CIFAR-10-C accuracy: 38.8%
>>> Model: Rice2020Overfitting, CIFAR-10-C accuracy: 22.0%
>>> Model: Carmon2019Unlabeled, CIFAR-10-C accuracy: 31.1%

As we can see, all these Linf-robust models perform considerably worse than the standard model on this type of corruption. This curious phenomenon was first noticed in Adversarial Examples Are a Natural Consequence of Test Error in Noise and explained from the frequency perspective in A Fourier Perspective on Model Robustness in Computer Vision.

However, on average adversarial training does help on CIFAR-10-C. One can check this easily by loading all corruption types via load_cifar10c(n_examples=1000, severity=5) and repeating the evaluation on them, as sketched below.
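A minimal sketch of that check, reusing the same helpers as above (omitting the corruptions argument loads all corruption types, as stated above):

from robustbench.data import load_cifar10c
from robustbench.utils import clean_accuracy, load_model

x_test, y_test = load_cifar10c(n_examples=1000, severity=5)  # all corruption types
for model_name in ['Standard', 'Carmon2019Unlabeled']:
    model = load_model(model_name, dataset='cifar10', threat_model='Linf')
    acc = clean_accuracy(model, x_test, y_test)
    print(f'Model: {model_name}, CIFAR-10-C accuracy over all corruptions: {acc:.1%}')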

*New*: Evaluating robustness of ImageNet models against 3D Common Corruptions (ImageNet-3DCC)

3D Common Corruptions (3DCC) is a recent benchmark by Kar et al. (CVPR 2022) that uses scene geometry to generate realistic corruptions. You can evaluate the robustness of a standard ResNet-50 against ImageNet-3DCC by following these steps:

  1. Download the data from here using the provided tool. The data will be saved into a folder named ImageNet-3DCC.

  2. Run the sample evaluation script to obtain accuracies and save them in a pickle file:

import pickle

import torch
from robustbench.data import load_imagenet3dcc
from robustbench.utils import clean_accuracy, load_model

corruptions_3dcc = ['near_focus', 'far_focus', 'bit_error', 'color_quant',
                    'flash', 'fog_3d', 'h265_abr', 'h265_crf',
                    'iso_noise', 'low_light', 'xy_motion_blur', 'z_motion_blur']  # 12 corruptions in ImageNet-3DCC

device = torch.device("cuda:0")
model_name = 'Standard_R50'
model = load_model(model_name, dataset='imagenet', threat_model='corruptions').to(device)
data_dir = '<PATH_IMAGENET_3DCC>'  # replace with the path to your ImageNet-3DCC folder
accuracies = {}
for corruption in corruptions_3dcc:
    for s in [1, 2, 3, 4, 5]:  # 5 severity levels
        x_test, y_test = load_imagenet3dcc(n_examples=5000, corruptions=[corruption], severity=s, data_dir=data_dir)
        acc = clean_accuracy(model, x_test.to(device), y_test.to(device), device=device)
        print(f'Model: {model_name}, ImageNet-3DCC corruption: {corruption} severity: {s} accuracy: {acc:.1%}')
        accuracies[(corruption, s)] = acc

with open('imagenet_3dcc_accuracies.pkl', 'wb') as f:  # file name is arbitrary
    pickle.dump(accuracies, f)

Model Zoo

In order to use a model, you just need to know its ID, e.g. Carmon2019Unlabeled, and to run:

from robustbench import load_model

model = load_model(model_name='Carmon2019Unlabeled', dataset='cifar10', threat_model='Linf')

which automatically downloads the model (all models are defined in model_zoo/models.py).
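Once loaded, the model behaves like a regular PyTorch module, so a quick sanity check is straightforward. For example (our own snippet, reusing the helpers shown earlier):

from robustbench import load_model
from robustbench.data import load_cifar10
from robustbench.utils import clean_accuracy

model = load_model(model_name='Carmon2019Unlabeled', dataset='cifar10', threat_model='Linf')
x_test, y_test = load_cifar10(n_examples=100)
acc = clean_accuracy(model, x_test, y_test)
print(f'Clean accuracy: {acc:.1%}')  # should be roughly the reported 89.69%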

Reproducing the evaluation of models from the Model Zoo can be done directly from the command line. Here is an example of an evaluation of the Salman2020Do_R18 model with AutoAttack on ImageNet for eps=4/255=0.0156862745:

python -m robustbench.eval --n_ex=5000 --dataset=imagenet --threat_model=Linf --model_name=Salman2020Do_R18 --data_dir=/tmldata1/andriush/imagenet --batch_size=128 --eps=0.0156862745
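An analogous CIFAR-10 evaluation (a hypothetical example of ours; the dataset is downloaded automatically, and --data_dir is just where it is stored) would look like:

python -m robustbench.eval --n_ex=1000 --dataset=cifar10 --threat_model=Linf --model_name=Carmon2019Unlabeled --data_dir=/path/to/data --batch_size=256 --eps=0.0313725490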

The CIFAR-10, CIFAR-10-C, CIFAR-100, and CIFAR-100-C datasets are downloaded automatically. However, the ImageNet datasets should be downloaded manually due to their licensing:

  • ImageNet: Obtain the download link here (requires just signing up from an academic email, the approval system there is automatic and happens instantly) and then follow the instructions here to extract the validation set in a pytorch-compatible format into folder val.
  • ImageNet-C: Please visit here for the instructions.
  • ImageNet-3DCC: Download the data from here using the provided tool. The data will be saved into a folder named ImageNet-3DCC.

In order to use the models from the Model Zoo, you can find all available model IDs in the tables below. Note that the full leaderboard contains a few more models which either we have not yet added to the Model Zoo or whose authors don't want them to appear in the Model Zoo.

CIFAR-10

Linf, eps=8/255

# Model ID Paper Clean accuracy Robust accuracy Architecture Venue
1 Peng2023Robust Robust Principles: Architectural Design Principles for Adversarially Robust CNNs 93.27% 71.07% RaWideResNet-70-16 BMVC 2023
2 Wang2023Better_WRN-70-16 Better Diffusion Models Further Improve Adversarial Training 93.25% 70.69% WideResNet-70-16 ICML 2023
3 Bai2024MixedNUTS MixedNUTS: Training-Free Accuracy-Robustness Balance via Nonlinearly Mixed Classifiers 95.19% 69.71% ResNet-152 + WideResNet-70-16 arXiv, Feb 2024
4 Bai2023Improving_edm Improving the Accuracy-Robustness Trade-off of Classifiers via Adaptive Smoothing 95.23% 68.06% ResNet-152 + WideResNet-70-16 + mixing network SIMODS 2024
5 Cui2023Decoupled_WRN-28-10 Decoupled Kullback-Leibler Divergence Loss 92.16% 67.73% WideResNet-28-10 arXiv, May 2023
6 Wang2023Better_WRN-28-10 Better Diffusion Models Further Improve Adversarial Training 92.44% 67.31% WideResNet-28-10 ICML 2023
7 Rebuffi2021Fixing_70_16_cutmix_extra Fixing Data Augmentation to Improve Adversarial Robustness 92.23% 66.56% WideResNet-70-16 arXiv, Mar 2021
8 Gowal2021Improving_70_16_ddpm_100m Improving Robustness using Generated Data 88.74% 66.10% WideResNet-70-16 NeurIPS 2021
9 Gowal2020Uncovering_70_16_extra Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples 91.10% 65.87% WideResNet-70-16 arXiv, Oct 2020
10 Huang2022Revisiting_WRN-A4 Revisiting Residual Networks for Adversarial Robustness: An Architectural Perspective 91.58% 65.79% WideResNet-A4 arXiv, Dec. 2022
11 Rebuffi2021Fixing_106_16_cutmix_ddpm Fixing Data Augmentation to Improve Adversarial Robustness 88.50% 64.58% WideResNet-106-16 arXiv, Mar 2021
12 Rebuffi2021Fixing_70_16_cutmix_ddpm Fixing Data Augmentation to Improve Adversarial Robustness 88.54% 64.20% WideResNet-70-16 arXiv, Mar 2021
13 Kang2021Stable Stable Neural ODE with Lyapunov-Stable Equilibrium Points for Defending Against Adversarial Attacks 93.73% 64.20% WideResNet-70-16, Neural ODE block NeurIPS 2021
14 Xu2023Exploring_WRN-28-10 Exploring and Exploiting Decision Boundary Dynamics for Adversarial Robustness 93.69% 63.89% WideResNet-28-10 ICLR 2023
15 Gowal2021Improving_28_10_ddpm_100m Improving Robustness using Generated Data 87.50% 63.38% WideResNet-28-10 NeurIPS 2021
16 Pang2022Robustness_WRN70_16 Robustness and Accuracy Could Be Reconcilable by (Proper) Definition 89.01% 63.35% WideResNet-70-16 ICML 2022
17 Rade2021Helper_extra Helper-based Adversarial Training: Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-off 91.47% 62.83% WideResNet-34-10 OpenReview, Jun 2021
18 Sehwag2021Proxy_ResNest152 Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness? 87.30% 62.79% ResNest152 ICLR 2022
19 Gowal2020Uncovering_28_10_extra Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples 89.48% 62.76% WideResNet-28-10 arXiv, Oct 2020
20 Huang2021Exploring_ema Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks 91.23% 62.54% WideResNet-34-R NeurIPS 2021
21 Huang2021Exploring Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks 90.56% 61.56% WideResNet-34-R NeurIPS 2021
22 Dai2021Parameterizing Parameterizing Activation Functions for Adversarial Robustness 87.02% 61.55% WideResNet-28-10-PSSiLU arXiv, Oct 2021
23 Pang2022Robustness_WRN28_10 Robustness and Accuracy Could Be Reconcilable by (Proper) Definition 88.61% 61.04% WideResNet-28-10 ICML 2022
24 Rade2021Helper_ddpm Helper-based Adversarial Training: Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-off 88.16% 60.97% WideResNet-28-10 OpenReview, Jun 2021
25 Rebuffi2021Fixing_28_10_cutmix_ddpm Fixing Data Augmentation to Improve Adversarial Robustness 87.33% 60.73% WideResNet-28-10 arXiv, Mar 2021
26 Sridhar2021Robust_34_15 Improving Neural Network Robustness via Persistency of Excitation 86.53% 60.41% WideResNet-34-15 ACC 2022
27 Sehwag2021Proxy Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness? 86.68% 60.27% WideResNet-34-10 ICLR 2022
28 Wu2020Adversarial_extra Adversarial Weight Perturbation Helps Robust Generalization 88.25% 60.04% WideResNet-28-10 NeurIPS 2020
29 Sridhar2021Robust Improving Neural Network Robustness via Persistency of Excitation 89.46% 59.66% WideResNet-28-10 ACC 2022
30 Zhang2020Geometry Geometry-aware Instance-reweighted Adversarial Training 89.36% 59.64% WideResNet-28-10 ICLR 2021
31 Carmon2019Unlabeled Unlabeled Data Improves Adversarial Robustness 89.69% 59.53% WideResNet-28-10 NeurIPS 2019
32 Gowal2021Improving_R18_ddpm_100m Improving Robustness using Generated Data 87.35% 58.50% PreActResNet-18 NeurIPS 2021
33 Chen2024Data_WRN_34_20 Data filtering for efficient adversarial training 86.10% 58.09% WideResNet-34-20 Pattern Recognition 2024
34 Addepalli2021Towards_WRN34 Scaling Adversarial Training to Large Perturbation Bounds 85.32% 58.04% WideResNet-34-10 ECCV 2022
35 Addepalli2022Efficient_WRN_34_10 Efficient and Effective Augmentation Strategy for Adversarial Training 88.71% 57.81% WideResNet-34-10 NeurIPS 2022
36 Chen2021LTD_WRN34_20 LTD: Low Temperature Distillation for Robust Adversarial Training 86.03% 57.71% WideResNet-34-20 arXiv, Nov 2021
37 Rade2021Helper_R18_extra Helper-based Adversarial Training: Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-off 89.02% 57.67% PreActResNet-18 OpenReview, Jun 2021
38 Jia2022LAS-AT_70_16 LAS-AT: Adversarial Training with Learnable Attack Strategy 85.66% 57.61% WideResNet-70-16 arXiv, Mar 2022
39 Debenedetti2022Light_XCiT-L12 A Light Recipe to Train Robust Vision Transformers 91.73% 57.58% XCiT-L12 arXiv, Sep 2022
40 Chen2024Data_WRN_34_10 Data filtering for efficient adversarial training 86.54% 57.30% WideResNet-34-10 Pattern Recognition 2024
41 Debenedetti2022Light_XCiT-M12 A Light Recipe to Train Robust Vision Transformers 91.30% 57.27% XCiT-M12 arXiv, Sep 2022
42 Sehwag2020Hydra HYDRA: Pruning Adversarially Robust Neural Networks 88.98% 57.14% WideResNet-28-10 NeurIPS 2020
43 Gowal2020Uncovering_70_16 Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples 85.29% 57.14% WideResNet-70-16 arXiv, Oct 2020
44 Rade2021Helper_R18_ddpm Helper-based Adversarial Training: Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-off 86.86% 57.09% PreActResNet-18 OpenReview, Jun 2021
45 Cui2023Decoupled_WRN-34-10 Decoupled Kullback-Leibler Divergence Loss 85.31% 57.09% WideResNet-34-10 arXiv, May 2023
46 Chen2021LTD_WRN34_10 LTD: Low Temperature Distillation for Robust Adversarial Training 85.21% 56.94% WideResNet-34-10 arXiv, Nov 2021
47 Gowal2020Uncovering_34_20 Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples 85.64% 56.82% WideResNet-34-20 arXiv, Oct 2020
48 Rebuffi2021Fixing_R18_ddpm Fixing Data Augmentation to Improve Adversarial Robustness 83.53% 56.66% PreActResNet-18 arXiv, Mar 2021
49 Wang2020Improving Improving Adversarial Robustness Requires Revisiting Misclassified Examples 87.50% 56.29% WideResNet-28-10 ICLR 2020
50 Jia2022LAS-AT_34_10 LAS-AT: Adversarial Training with Learnable Attack Strategy 84.98% 56.26% WideResNet-34-10 arXiv, Mar 2022
51 Wu2020Adversarial Adversarial Weight Perturbation Helps Robust Generalization 85.36% 56.17% WideResNet-34-10 NeurIPS 2020
52 Debenedetti2022Light_XCiT-S12 A Light Recipe to Train Robust Vision Transformers 90.06% 56.14% XCiT-S12 arXiv, Sep 2022
53 Sehwag2021Proxy_R18 Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness? 84.59% 55.54% ResNet-18 ICLR 2022
54 Hendrycks2019Using Using Pre-Training Can Improve Model Robustness and Uncertainty 87.11% 54.92% WideResNet-28-10 ICML 2019
55 Pang2020Boosting Boosting Adversarial Training with Hypersphere Embedding 85.14% 53.74% WideResNet-34-20 NeurIPS 2020
56 Cui2020Learnable_34_20 Learnable Boundary Guided Adversarial Training 88.70% 53.57% WideResNet-34-20 ICCV 2021
57 Zhang2020Attacks Attacks Which Do Not Kill Training Make Adversarial Learning Stronger 84.52% 53.51% WideResNet-34-10 ICML 2020
58 Rice2020Overfitting Overfitting in adversarially robust deep learning 85.34% 53.42% WideResNet-34-20 ICML 2020
59 Huang2020Self Self-Adaptive Training: beyond Empirical Risk Minimization 83.48% 53.34% WideResNet-34-10 NeurIPS 2020
60 Zhang2019Theoretically Theoretically Principled Trade-off between Robustness and Accuracy 84.92% 53.08% WideResNet-34-10 ICML 2019
61 Cui2020Learnable_34_10 Learnable Boundary Guided Adversarial Training 88.22% 52.86% WideResNet-34-10 ICCV 2021
62 Addepalli2022Efficient_RN18 Efficient and Effective Augmentation Strategy for Adversarial Training 85.71% 52.48% ResNet-18 NeurIPS 2022
63 Chen2020Adversarial Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning 86.04% 51.56% ResNet-50 (3x ensemble) CVPR 2020
64 Chen2020Efficient Efficient Robust Training via Backward Smoothing 85.32% 51.12% WideResNet-34-10 arXiv, Oct 2020
65 Addepalli2021Towards_RN18 Scaling Adversarial Training to Large Perturbation Bounds 80.24% 51.06% ResNet-18 ECCV 2022
66 Sitawarin2020Improving Improving Adversarial Robustness Through Progressive Hardening 86.84% 50.72% WideResNet-34-10 arXiv, Mar 2020
67 Engstrom2019Robustness Robustness library 87.03% 49.25% ResNet-50 GitHub, Oct 2019
68 Zhang2019You You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle 87.20% 44.83% WideResNet-34-10 NeurIPS 2019
69 Andriushchenko2020Understanding Understanding and Improving Fast Adversarial Training 79.84% 43.93% PreActResNet-18 NeurIPS 2020
70 Wong2020Fast Fast is better than free: Revisiting adversarial training 83.34% 43.21% PreActResNet-18 ICLR 2020
71 Ding2020MMA MMA Training: Direct Input Space Margin Maximization through Adversarial Training 84.36% 41.44% WideResNet-28-4 ICLR 2020
72 Standard Standardly trained model 94.78% 0.00% WideResNet-28-10 N/A

L2, eps=0.5

# Model ID Paper Clean accuracy Robust accuracy Architecture Venue
1 Wang2023Better_WRN-70-16 Better Diffusion Models Further Improve Adversarial Training 95.54% 84.97% WideResNet-70-16 arXiv, Feb 2023
2 Wang2023Better_WRN-28-10 Better Diffusion Models Further Improve Adversarial Training 95.16% 83.68% WideResNet-28-10 arXiv, Feb 2023
3 Rebuffi2021Fixing_70_16_cutmix_extra Fixing Data Augmentation to Improve Adversarial Robustness 95.74% 82.32% WideResNet-70-16 arXiv, Mar 2021
4 Gowal2020Uncovering_extra Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples 94.74% 80.53% WideResNet-70-16 arXiv, Oct 2020
5 Rebuffi2021Fixing_70_16_cutmix_ddpm Fixing Data Augmentation to Improve Adversarial Robustness 92.41% 80.42% WideResNet-70-16 arXiv, Mar 2021
6 Rebuffi2021Fixing_28_10_cutmix_ddpm Fixing Data Augmentation to Improve Adversarial Robustness 91.79% 78.80% WideResNet-28-10 arXiv, Mar 2021
7 Augustin2020Adversarial_34_10_extra Adversarial Robustness on In- and Out-Distribution Improves Explainability 93.96% 78.79% WideResNet-34-10 ECCV 2020
8 Sehwag2021Proxy Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness? 90.93% 77.24% WideResNet-34-10 ICLR 2022
9 Augustin2020Adversarial_34_10 Adversarial Robustness on In- and Out-Distribution Improves Explainability 92.23% 76.25% WideResNet-34-10 ECCV 2020
10 Rade2021Helper_R18_ddpm Helper-based Adversarial Training: Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-off 90.57% 76.15% PreActResNet-18 OpenReview, Jun 2021
11 Rebuffi2021Fixing_R18_cutmix_ddpm Fixing Data Augmentation to Improve Adversarial Robustness 90.33% 75.86% PreActResNet-18 arXiv, Mar 2021
12 Gowal2020Uncovering Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples 90.90% 74.50% WideResNet-70-16 arXiv, Oct 2020
13 Sehwag2021Proxy_R18 Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness? 89.76% 74.41% ResNet-18 ICLR 2022
14 Wu2020Adversarial Adversarial Weight Perturbation Helps Robust Generalization 88.51% 73.66% WideResNet-34-10 NeurIPS 2020
15 Augustin2020Adversarial Adversarial Robustness on In- and Out-Distribution Improves Explainability 91.08% 72.91% ResNet-50 ECCV 2020
16 Engstrom2019Robustness Robustness library 90.83% 69.24% ResNet-50 GitHub, Sep 2019
17 Rice2020Overfitting Overfitting in adversarially robust deep learning 88.67% 67.68% PreActResNet-18 ICML 2020
18 Rony2019Decoupling Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses 89.05% 66.44% WideResNet-28-10 CVPR 2019
19 Ding2020MMA MMA Training: Direct Input Space Margin Maximization through Adversarial Training 88.02% 66.09% WideResNet-28-4 ICLR 2020
20 Standard Standardly trained model 94.78% 0.00% WideResNet-28-10 N/A

Common Corruptions

# Model ID Paper Clean accuracy Robust accuracy Architecture Venue
1 Diffenderfer2021Winning_LRR_CARD_Deck A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness 96.56% 92.78% WideResNet-18-2 NeurIPS 2021
2 Diffenderfer2021Winning_LRR A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness 96.66% 90.94% WideResNet-18-2 NeurIPS 2021
3 Diffenderfer2021Winning_Binary_CARD_Deck A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness 95.09% 90.15% WideResNet-18-2 NeurIPS 2021
4 Kireev2021Effectiveness_RLATAugMix On the effectiveness of adversarial training against common corruptions 94.75% 89.60% ResNet-18 arXiv, Mar 2021
5 Hendrycks2020AugMix_ResNeXt AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty 95.83% 89.09% ResNeXt29_32x4d ICLR 2020
6 Modas2021PRIMEResNet18 PRIME: A Few Primitives Can Boost Robustness to Common Corruptions 93.06% 89.05% ResNet-18 arXiv, Dec 2021
7 Hendrycks2020AugMix_WRN AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty 95.08% 88.82% WideResNet-40-2 ICLR 2020
8 Kireev2021Effectiveness_RLATAugMixNoJSD On the effectiveness of adversarial training against common corruptions 94.77% 88.53% PreActResNet-18 arXiv, Mar 2021
9 Diffenderfer2021Winning_Binary A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness 94.87% 88.32% WideResNet-18-2 NeurIPS 2021
10 Rebuffi2021Fixing_70_16_cutmix_extra_L2 Fixing Data Augmentation to Improve Adversarial Robustness 95.74% 88.23% WideResNet-70-16 arXiv, Mar 2021
11 Kireev2021Effectiveness_AugMixNoJSD On the effectiveness of adversarial training against common corruptions 94.97% 86.60% PreActResNet-18 arXiv, Mar 2021
12 Kireev2021Effectiveness_Gauss50percent On the effectiveness of adversarial training against common corruptions 93.24% 85.04% PreActResNet-18 arXiv, Mar 2021
13 Kireev2021Effectiveness_RLAT On the effectiveness of adversarial training against common corruptions 93.10% 84.10% PreActResNet-18 arXiv, Mar 2021
14 Rebuffi2021Fixing_70_16_cutmix_extra_Linf Fixing Data Augmentation to Improve Adversarial Robustness 92.23% 82.82% WideResNet-70-16 arXiv, Mar 2021
15 Addepalli2022Efficient_WRN_34_10 Efficient and Effective Augmentation Strategy for Adversarial Training 88.71% 80.12% WideResNet-34-10 CVPRW 2022
16 Addepalli2021Towards_WRN34 Towards Achieving Adversarial Robustness Beyond Perceptual Limits 85.32% 76.78% WideResNet-34-10 arXiv, Apr 2021
17 Standard Standardly trained model 94.78% 73.46% WideResNet-28-10 N/A

CIFAR-100

Linf, eps=8/255

# Model ID Paper Clean accuracy Robust accuracy Architecture Venue
1 Wang2023Better_WRN-70-16 Better Diffusion Models Further Improve Adversarial Training 75.22% 42.67% WideResNet-70-16 arXiv, Feb 2023
2 Bai2024MixedNUTS MixedNUTS: Training-Free Accuracy-Robustness Balance via Nonlinearly Mixed Classifiers 83.08% 41.80% ResNet-152 + WideResNet-70-16 arXiv, Feb 2024
3 Cui2023Decoupled_WRN-28-10 Decoupled Kullback-Leibler Divergence Loss 73.85% 39.18% WideResNet-28-10 arXiv, May 2023
4 Wang2023Better_WRN-28-10 Better Diffusion Models Further Improve Adversarial Training 72.58% 38.83% WideResNet-28-10 ICML 2023
5 Bai2023Improving_edm Improving the Accuracy-Robustness Trade-off of Classifiers via Adaptive Smoothing 85.21% 38.72% ResNet-152 + WideResNet-70-16 + mixing network SIMODS 2024
6 Gowal2020Uncovering_extra Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples 69.15% 36.88% WideResNet-70-16 arXiv, Oct 2020
7 Bai2023Improving_trades Improving the Accuracy-Robustness Trade-off of Classifiers via Adaptive Smoothing 80.18% 35.15% ResNet-152 + WideResNet-70-16 + mixing network SIMODS 2024
8 Debenedetti2022Light_XCiT-L12 A Light Recipe to Train Robust Vision Transformers 70.76% 35.08% XCiT-L12 arXiv, Sep 2022
9 Rebuffi2021Fixing_70_16_cutmix_ddpm Fixing Data Augmentation to Improve Adversarial Robustness 63.56% 34.64% WideResNet-70-16 arXiv, Mar 2021
10 Debenedetti2022Light_XCiT-M12 A Light Recipe to Train Robust Vision Transformers 69.21% 34.21% XCiT-M12 arXiv, Sep 2022
11 Pang2022Robustness_WRN70_16 Robustness and Accuracy Could Be Reconcilable by (Proper) Definition 65.56% 33.05% WideResNet-70-16 ICML 2022
12 Cui2023Decoupled_WRN-34-10_autoaug Decoupled Kullback-Leibler Divergence Loss 65.93% 32.52% WideResNet-34-10 arXiv, May 2023
13 Debenedetti2022Light_XCiT-S12 A Light Recipe to Train Robust Vision Transformers 67.34% 32.19% XCiT-S12 arXiv, Sep 2022
14 Rebuffi2021Fixing_28_10_cutmix_ddpm Fixing Data Augmentation to Improve Adversarial Robustness 62.41% 32.06% WideResNet-28-10 arXiv, Mar 2021
15 Jia2022LAS-AT_34_20 LAS-AT: Adversarial Training with Learnable Attack Strategy 67.31% 31.91% WideResNet-34-20 arXiv, Mar 2022
16 Addepalli2022Efficient_WRN_34_10 Efficient and Effective Augmentation Strategy for Adversarial Training 68.75% 31.85% WideResNet-34-10 NeurIPS 2022
17 Cui2023Decoupled_WRN-34-10 Decoupled Kullback-Leibler Divergence Loss 64.08% 31.65% WideResNet-34-10 arXiv, May 2023
18 Cui2020Learnable_34_10_LBGAT9_eps_8_255 Learnable Boundary Guided Adversarial Training 62.99% 31.20% WideResNet-34-10 ICCV 2021
19 Sehwag2021Proxy Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness? 65.93% 31.15% WideResNet-34-10 ICLR 2022
20 Chen2024Data_WRN_34_10 Data filtering for efficient adversarial training 64.32% 31.13% WideResNet-34-10 Pattern Recognition 2024
21 Pang2022Robustness_WRN28_10 Robustness and Accuracy Could Be Reconcilable by (Proper) Definition 63.66% 31.08% WideResNet-28-10 ICML 2022
22 Jia2022LAS-AT_34_10 LAS-AT: Adversarial Training with Learnable Attack Strategy 64.89% 30.77% WideResNet-34-10 arXiv, Mar 2022
23 Chen2021LTD_WRN34_10 LTD: Low Temperature Distillation for Robust Adversarial Training 64.07% 30.59% WideResNet-34-10 arXiv, Nov 2021
24 Addepalli2021Towards_WRN34 Scaling Adversarial Training to Large Perturbation Bounds 65.73% 30.35% WideResNet-34-10 ECCV 2022
25 Cui2020Learnable_34_20_LBGAT6 Learnable Boundary Guided Adversarial Training 62.55% 30.20% WideResNet-34-20 ICCV 2021
26 Gowal2020Uncovering Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples 60.86% 30.03% WideResNet-70-16 arXiv, Oct 2020
27 Cui2020Learnable_34_10_LBGAT6 Learnable Boundary Guided Adversarial Training 60.64% 29.33% WideResNet-34-10 ICCV 2021
28 Rade2021Helper_R18_ddpm Helper-based Adversarial Training: Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-off 61.50% 28.88% PreActResNet-18 OpenReview, Jun 2021
29 Wu2020Adversarial Adversarial Weight Perturbation Helps Robust Generalization 60.38% 28.86% WideResNet-34-10 NeurIPS 2020
30 Rebuffi2021Fixing_R18_ddpm Fixing Data Augmentation to Improve Adversarial Robustness 56.87% 28.50% PreActResNet-18 arXiv, Mar 2021
31 Hendrycks2019Using Using Pre-Training Can Improve Model Robustness and Uncertainty 59.23% 28.42% WideResNet-28-10 ICML 2019
32 Addepalli2022Efficient_RN18 Efficient and Effective Augmentation Strategy for Adversarial Training 65.45% 27.67% ResNet-18 NeurIPS 2022
33 Cui2020Learnable_34_10_LBGAT0 Learnable Boundary Guided Adversarial Training 70.25% 27.16% WideResNet-34-10 ICCV 2021
34 Addepalli2021Towards_PARN18 Scaling Adversarial Training to Large Perturbation Bounds 62.02% 27.14% PreActResNet-18 ECCV 2022
35 Chen2020Efficient Efficient Robust Training via Backward Smoothing 62.15% 26.94% WideResNet-34-10 arXiv, Oct 2020
36 Sitawarin2020Improving Improving Adversarial Robustness Through Progressive Hardening 62.82% 24.57% WideResNet-34-10 arXiv, Mar 2020
37 Rice2020Overfitting Overfitting in adversarially robust deep learning 53.83% 18.95% PreActResNet-18 ICML 2020

Corruptions

# Model ID Paper Clean accuracy Robust accuracy Architecture Venue
1 Diffenderfer2021Winning_LRR_CARD_Deck A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness 79.93% 71.08% WideResNet-18-2 NeurIPS 2021
2 Diffenderfer2021Winning_Binary_CARD_Deck A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness 78.50% 69.09% WideResNet-18-2 NeurIPS 2021
3 Modas2021PRIMEResNet18 PRIME: A Few Primitives Can Boost Robustness to Common Corruptions 77.60% 68.28% ResNet-18 arXiv, Dec 2021
4 Diffenderfer2021Winning_LRR A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness 78.41% 66.45% WideResNet-18-2 NeurIPS 2021
5 Diffenderfer2021Winning_Binary A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness 77.69% 65.26% WideResNet-18-2 NeurIPS 2021
6 Hendrycks2020AugMix_ResNeXt AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty 78.90% 65.14% ResNeXt29_32x4d ICLR 2020
7 Hendrycks2020AugMix_WRN AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty 76.28% 64.11% WideResNet-40-2 ICLR 2020
8 Addepalli2022Efficient_WRN_34_10 Efficient and Effective Augmentation Strategy for Adversarial Training 68.75% 56.95% WideResNet-34-10 CVPRW 2022
9 Gowal2020Uncovering_extra_Linf Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples 69.15% 56.00% WideResNet-70-16 arXiv, Oct 2020
10 Addepalli2021Towards_WRN34 Towards Achieving Adversarial Robustness Beyond Perceptual Limits 65.73% 54.88% WideResNet-34-10 OpenReview, Jun 2021
11 Addepalli2021Towards_PARN18 Towards Achieving Adversarial Robustness Beyond Perceptual Limits 62.02% 51.77% PreActResNet-18 OpenReview, Jun 2021
12 Gowal2020Uncovering_Linf Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples 60.86% 49.46% WideResNet-70-16 arXiv, Oct 2020

ImageNet

Note: the values (even clean accuracy) might fluctuate slightly depending on the versions of the installed packages, e.g. torchvision.

Linf, eps=4/255

# Model ID Paper Clean accuracy Robust accuracy Architecture Venue
1 Liu2023Comprehensive_Swin-L A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking 78.92% 59.56% Swin-L arXiv, Feb 2023
2 Bai2024MixedNUTS MixedNUTS: Training-Free Accuracy-Robustness Balance via Nonlinearly Mixed Classifiers 81.48% 58.50% ConvNeXtV2-L + Swin-L arXiv, Feb 2024
3 Liu2023Comprehensive_ConvNeXt-L A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking 78.02% 58.48% ConvNeXt-L arXiv, Feb 2023
4 Singh2023Revisiting_ConvNeXt-L-ConvStem Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models 77.00% 57.70% ConvNeXt-L + ConvStem NeurIPS 2023
5 Liu2023Comprehensive_Swin-B A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking 76.16% 56.16% Swin-B arXiv, Feb 2023
6 Singh2023Revisiting_ConvNeXt-B-ConvStem Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models 75.90% 56.14% ConvNeXt-B + ConvStem NeurIPS 2023
7 Liu2023Comprehensive_ConvNeXt-B A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking 76.02% 55.82% ConvNeXt-B arXiv, Feb 2023
8 Singh2023Revisiting_ViT-B-ConvStem Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models 76.30% 54.66% ViT-B + ConvStem NeurIPS 2023
9 Singh2023Revisiting_ConvNeXt-S-ConvStem Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models 74.10% 52.42% ConvNeXt-S + ConvStem NeurIPS 2023
10 Singh2023Revisiting_ConvNeXt-T-ConvStem Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models 72.72% 49.46% ConvNeXt-T + ConvStem NeurIPS 2023
11 Peng2023Robust Robust Principles: Architectural Design Principles for Adversarially Robust CNNs 73.44% 48.94% RaWideResNet-101-2 BMVC 2023
12 Singh2023Revisiting_ViT-S-ConvStem Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models 72.56% 48.08% ViT-S + ConvStem NeurIPS 2023
13 Debenedetti2022Light_XCiT-L12 A Light Recipe to Train Robust Vision Transformers 73.76% 47.60% XCiT-L12 arXiv, Sep 2022
14 Debenedetti2022Light_XCiT-M12 A Light Recipe to Train Robust Vision Transformers 74.04% 45.24% XCiT-M12 arXiv, Sep 2022
15 Debenedetti2022Light_XCiT-S12 A Light Recipe to Train Robust Vision Transformers 72.34% 41.78% XCiT-S12 arXiv, Sep 2022
16 Chen2024Data_WRN_50_2 Data filtering for efficient adversarial training 68.76% 40.60% WideResNet-50-2 Pattern Recognition 2024
17 Salman2020Do_50_2 Do Adversarially Robust ImageNet Models Transfer Better? 68.46% 38.14% WideResNet-50-2 NeurIPS 2020
18 Salman2020Do_R50 Do Adversarially Robust ImageNet Models Transfer Better? 64.02% 34.96% ResNet-50 NeurIPS 2020
19 Engstrom2019Robustness Robustness library 62.56% 29.22% ResNet-50 GitHub, Oct 2019
20 Wong2020Fast Fast is better than free: Revisiting adversarial training 55.62% 26.24% ResNet-50 ICLR 2020
21 Salman2020Do_R18 Do Adversarially Robust ImageNet Models Transfer Better? 52.92% 25.32% ResNet-18 NeurIPS 2020
22 Standard_R50 Standardly trained model 76.52% 0.00% ResNet-50 N/A

Corruptions (ImageNet-C & ImageNet-3DCC)

# Model ID Paper Clean accuracy Robust accuracy Architecture Venue
1 Tian2022Deeper_DeiT-B Deeper Insights into the Robustness of ViTs towards Common Corruptions 81.38% 67.55% DeiT Base arXiv, Apr 2022
2 Tian2022Deeper_DeiT-S Deeper Insights into the Robustness of ViTs towards Common Corruptions 79.76% 62.91% DeiT Small arXiv, Apr 2022
3 Erichson2022NoisyMix_new NoisyMix: Boosting Robustness by Combining Data Augmentations, Stability Training, and Noise Injections 76.90% 53.28% ResNet-50 arXiv, Feb 2022
4 Hendrycks2020Many The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization 76.86% 52.90% ResNet-50 ICCV 2021
5 Erichson2022NoisyMix NoisyMix: Boosting Robustness by Combining Data Augmentations, Stability Training, and Noise Injections 76.98% 52.47% ResNet-50 arXiv, Feb 2022
6 Hendrycks2020AugMix AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty 77.34% 49.33% ResNet-50 ICLR 2020
7 Geirhos2018_SIN_IN ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness 74.98% 45.76% ResNet-50 ICLR 2019
8 Geirhos2018_SIN_IN_IN ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness 77.56% 42.00% ResNet-50 ICLR 2019
9 Geirhos2018_SIN ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness 60.08% 39.92% ResNet-50 ICLR 2019
10 Standard_R50 Standardly trained model 76.72% 39.48% ResNet-50 N/A
11 Salman2020Do_50_2_Linf Do Adversarially Robust ImageNet Models Transfer Better? 68.64% 36.09% WideResNet-50-2 NeurIPS 2020
12 AlexNet ImageNet Classification with Deep Convolutional Neural Networks 56.24% 21.12% AlexNet NeurIPS 2012

Notebooks

We host all the notebooks at Google Colab:

  • RobustBench: quick start: a quick tutorial to get started that illustrates the main features of RobustBench.
  • RobustBench: json stats: various plots based on the jsons from model_info (robustness over venues, robustness vs accuracy, etc).

Feel free to suggest a new notebook based on the Model Zoo or the jsons from model_info. We are very interested in collecting new insights about benefits and tradeoffs between different perturbation types.

How to contribute

Contributions to RobustBench are very welcome! You can help improve RobustBench in several ways:

  • Are you an author of a recent paper focusing on improving adversarial robustness? Consider adding new models (see the instructions below 👇).
  • Do you have in mind some better standardized attack? Do you want to extend RobustBench to other threat models? We'll be glad to discuss that!
  • Do you have an idea how to make the existing codebase better? Just open a pull request or create an issue and we'll be happy to discuss potential changes.

Adding a new evaluation

In case you have a new (potentially adaptive) evaluation that leads to a lower robust accuracy than AutoAttack, we will be happy to add it to the leaderboard. The easiest way is to open an issue with the "New external evaluation(s)" template and fill in all the fields.

Adding a new model

Public model submission (Leaderboard + Model Zoo)

The easiest way to add new models to the leaderboard and/or to the Model Zoo is to open an issue with the "New Model(s)" template and fill in all the fields.

The following sections give some tips on how to prepare the claim.

Claim

The claim can be computed in the following way (example for cifar10, Linf threat model):

import torch

from robustbench import benchmark
from my_robust_model import MyRobustModel  # your own module defining the model

threat_model = "Linf"  # one of {"Linf", "L2", "corruptions"}
dataset = "cifar10"  # one of {"cifar10", "cifar100", "imagenet"}

model = MyRobustModel()
model_name = "<Name><Year><FirstWordOfTheTitle>"
device = torch.device("cuda:0")

clean_acc, robust_acc = benchmark(model, model_name=model_name, n_examples=10000, dataset=dataset,
                                  threat_model=threat_model, eps=8/255, device=device,
                                  to_disk=True)

In particular, the to_disk argument, if True, generates a json file at the path model_info/<dataset>/<threat_model>/<Name><Year><FirstWordOfTheTitle>.json which is structured in the following way (example from model_info/cifar10/Linf/Rice2020Overfitting.json):

{
  "link": "https://arxiv.org/abs/2002.11569",
  "name": "Overfitting in adversarially robust deep learning",
  "authors": "Leslie Rice, Eric Wong, J. Zico Kolter",
  "additional_data": false,
  "number_forward_passes": 1,
  "dataset": "cifar10",
  "venue": "ICML 2020",
  "architecture": "WideResNet-34-20",
  "eps": "8/255",
  "clean_acc": "85.34",
  "reported": "58",
  "autoattack_acc": "53.42"
}

The only difference is that the generated json will have only the fields "clean_acc" and "autoattack_acc" (for "Linf" and "L2" threat models) or "corruptions_acc" (for the "corruptions" threat model) already specified. The other fields have to be filled manually.

If the given threat_model is corruptions, we also save unaggregated results on the different combinations of corruption types and severities in this csv file (for CIFAR-10).

For ImageNet benchmarks, users should specify which preprocessing to use (e.g. resize and crop to the needed resolution). Some preprocessings are already defined in robustbench.data.PREPROCESSINGS and can be used by specifying the key as the preprocessing parameter of benchmark. Otherwise, it's possible to pass an arbitrary torchvision transform (or torchvision-compatible transform), e.g.:

from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor()
])
clean_acc, robust_acc = benchmark(model, model_name=model_name, n_examples=10000, dataset=dataset,
                                  threat_model=threat_model, eps=8/255, device=device,
                                  to_disk=True, preprocessing=transform)

Model definition

In case you want to add a model to the Model Zoo yourself, you should also open a PR with the new model(s) you would like to add. All the models of each <dataset> are saved in robustbench/model_zoo/<dataset>.py. Each file contains a dictionary for every threat model, where the keys are the identifiers of the models and the values are either class constructors, for models that modify standard architectures, or lambda functions that return the constructed model.

If your model uses a standard architecture (e.g., WideResNet), applies no normalization to the input, and does nothing else differently from the standard architecture, consider adding it as a lambda function, e.g.

('Cui2020Learnable_34_10', {
    'model': lambda: WideResNet(depth=34, widen_factor=10, sub_block1=True),
    'gdrive_id': '16s9pi_1QgMbFLISVvaVUiNfCzah6g2YV'
})

If your model is a standard architecture but you need to do something differently (e.g. apply normalization), consider inheriting from the class defined in wide_resnet.py or resnet.py. For example:

class Rice2020OverfittingNet(WideResNet):
    def __init__(self, depth, widen_factor):
        super(Rice2020OverfittingNet, self).__init__(depth=depth, widen_factor=widen_factor,
                                                     sub_block1=False)
        self.mu = torch.Tensor([0.4914, 0.4822, 0.4465]).float().view(3, 1, 1).cuda()
        self.sigma = torch.Tensor([0.2471, 0.2435, 0.2616]).float().view(3, 1, 1).cuda()

    def forward(self, x):
        x = (x - self.mu) / self.sigma
        return super(Rice2020OverfittingNet, self).forward(x)

If instead you need to create a new architecture, please put it in robustbench/model_zoo/architectures/<my_architecture>.py.

Model checkpoint

You should also add your model's entry to the corresponding <threat_model> dict in the file robustbench/model_zoo/<dataset>.py. For instance, if your model is robust against common corruptions on CIFAR-10 (i.e. CIFAR-10-C), you should add it to the common_corruptions dict in robustbench/model_zoo/cifar10.py.

The entry should also contain the Google Drive ID of your PyTorch checkpoint so that it can be downloaded automatically from Google Drive:

('Rice2020Overfitting', {
    'model': Rice2020OverfittingNet(34, 20),
    'gdrive_id': '1vC_Twazji7lBjeMQvAD9uEQxi9Nx2oG-',
})

Private model submission (leaderboard only)

In case you want to keep your checkpoints private for some reason, you can also submit your claim by opening an issue with the same "New Model(s)" template, specifying that the submission is private, and sharing the checkpoints with the email address [email protected]. In this case, we will add your model to the leaderboard but not to the Model Zoo, and we will not share your checkpoints publicly.

License of the models

By default, the models are released under the MIT license, but you can also tell us if you want to release your model under a customized license.

Automatic tests

In order to run the tests, run:

  • python -m unittest discover tests -t . -v for fast testing
  • RUN_SLOW=true python -m unittest discover tests -t . -v for slower testing

For example, one can test whether the clean accuracy on 200 examples exceeds some threshold (70%), or whether the clean accuracy on 10'000 examples for each model matches the values from the jsons located at robustbench/model_info.

Note that one can specify some configurations like batch_size, data_dir, model_dir in tests/config.py for running the tests.

Citation

Would you like to reference the RobustBench leaderboard, or are you using models from the Model Zoo?
Then consider citing our whitepaper:

@inproceedings{croce2021robustbench,
  title     = {RobustBench: a standardized adversarial robustness benchmark},
  author    = {Croce, Francesco and Andriushchenko, Maksym and Sehwag, Vikash and Debenedetti, Edoardo and Flammarion, Nicolas and Chiang, Mung and Mittal, Prateek and Hein, Matthias},
  booktitle = {Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year      = {2021},
  url       = {https://openreview.net/forum?id=SSKZPJCt7B}
}

Contact

Feel free to contact us about anything related to RobustBench by creating an issue, a pull request or by email at [email protected].


robustbench's Issues

Multi GPU error on DMPreActResNet

In robustbench/model_zoo/architectures/dm_wide_resnet.py (lines 279-280):

self.mean_cuda = self.mean.cuda()
self.std_cuda = self.std.cuda()

These lines force the two tensors onto cuda:0, so with multiple GPUs I get an error like this:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:3 and cuda:0!

  • the code should be changed from the deprecated .cuda() to .to(device) (see the sketch below)
  • device should be a global parameter that a user can specify
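A possible fix along these lines (a sketch of ours, not the repository's actual patch) is to register the normalization statistics as buffers, so they automatically follow the module across devices:

import torch
import torch.nn as nn

class NormalizedModel(nn.Module):
    """Wraps a classifier and normalizes inputs without hard-coding a device."""
    def __init__(self, model, mean, std):
        super().__init__()
        self.model = model
        # Buffers move together with the module on .to(device), so no .cuda()
        # calls (and no implicit pinning to cuda:0) are needed.
        self.register_buffer('mean', torch.tensor(mean).view(1, -1, 1, 1))
        self.register_buffer('std', torch.tensor(std).view(1, -1, 1, 1))

    def forward(self, x):
        return self.model((x - self.mean) / self.std)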

Training parameters of models in robustbench

Hi,

I might have missed this detail in your paper, but is there any consistency of training parameters across the models in RobustBench, e.g. epochs trained, optimizer, momentum, LR schedule, etc.? Or would I have to check the original papers of the methods to find the training parameters they used?

Thanks

dropout rate of the Standard network

There is an argument to set the dropout rate in the implementation of the WideResNet class, and its default value is zero. I am wondering if I can modify this argument and set it to a non-zero value.

class BasicBlock(nn.Module):
    def __init__(self, in_planes, out_planes, stride, dropRate=0.0):
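For illustration, a hypothetical snippet (assuming the WideResNet constructor in robustbench/model_zoo/architectures/wide_resnet.py exposes the same dropRate argument; note that the pretrained Standard checkpoint was trained with dropRate=0.0, so a non-zero value mainly makes sense for models you train yourself):

from robustbench.model_zoo.architectures.wide_resnet import WideResNet

# Hypothetical: a WideResNet-28-10 with non-zero dropout for your own training.
model = WideResNet(depth=28, widen_factor=10, dropRate=0.3)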

OSError: [Errno 101] Network is unreachable

There are some errors when I use your robustness toolbox, and I need some help. In my machine's browser, https://drive.google.com can be accessed.
source code:

from robustbench.utils import load_model
from robustbench.eval import benchmark

model = load_model(model_name='Rebuffi2021Fixing_70_16_cutmix_extra',
                   dataset='cifar10',
                   threat_model='Linf')

clean_acc, robust_acc = benchmark(model,
                                  dataset='cifar10',
                                  threat_model='Linf')
print(clean_acc, robust_acc)

error:

Download started: path=models/cifar10/Linf/Rebuffi2021Fixing_70_16_cutmix_extra.pt (gdrive_id=1qKDTp6IJ1BUXZaRtbYuo_t0tuDl_4mLg)

Traceback (most recent call last):
  File "/home/ams/anaconda3/envs/tf/lib/python3.8/site-packages/urllib3/connection.py", line 174, in _new_conn
    conn = connection.create_connection(
  File "/home/ams/anaconda3/envs/tf/lib/python3.8/site-packages/urllib3/util/connection.py", line 96, in create_connection
    raise err
  File "/home/ams/anaconda3/envs/tf/lib/python3.8/site-packages/urllib3/util/connection.py", line 86, in create_connection
    sock.connect(sa)
OSError: [Errno 101] Network is unreachable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ams/anaconda3/envs/tf/lib/python3.8/site-packages/urllib3/connectionpool.py", line 699, in urlopen
    httplib_response = self._make_request(
  File "/home/ams/anaconda3/envs/tf/lib/python3.8/site-packages/urllib3/connectionpool.py", line 382, in _make_request
    self._validate_conn(conn)
  File "/home/ams/anaconda3/envs/tf/lib/python3.8/site-packages/urllib3/connectionpool.py", line 1010, in _validate_conn
    conn.connect()
  File "/home/ams/anaconda3/envs/tf/lib/python3.8/site-packages/urllib3/connection.py", line 358, in connect
    conn = self._new_conn()
  File "/home/ams/anaconda3/envs/tf/lib/python3.8/site-packages/urllib3/connection.py", line 186, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f07c1761cd0>: Failed to establish a new connection: [Errno 101] Network is unreachable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 4, in <module>
    model = load_model(model_name='Rebuffi2021Fixing_70_16_cutmix_extra',
  File "/home/ams/anaconda3/envs/tf/lib/python3.8/site-packages/robustbench/utils.py", line 122, in load_model
    download_gdrive(models[model_name]['gdrive_id'], model_path)
  File "/home/ams/anaconda3/envs/tf/lib/python3.8/site-packages/robustbench/utils.py", line 49, in download_gdrive
    response = session.get(url_base, params={'id': gdrive_id}, stream=True)
  File "/home/ams/anaconda3/envs/tf/lib/python3.8/site-packages/requests/sessions.py", line 555, in get
    return self.request('GET', url, **kwargs)
  File "/home/ams/anaconda3/envs/tf/lib/python3.8/site-packages/requests/sessions.py", line 542, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/ams/anaconda3/envs/tf/lib/python3.8/site-packages/requests/sessions.py", line 655, in send
    r = adapter.send(request, **kwargs)
  File "/home/ams/anaconda3/envs/tf/lib/python3.8/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='docs.google.com', port=443): Max retries exceeded with url: /uc?export=download&id=1qKDTp6IJ1BUXZaRtbYuo_t0tuDl_4mLg (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f07c1761cd0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))



How can I resolve this error?
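
One workaround, sketched under the assumption that the machine running the script has no outbound network access: download the checkpoint on a machine that does, and copy it to the path load_model expects, since load_model only contacts Google Drive when the file is missing. The expected layout is models/<dataset>/<threat_model>/<model_name>.pt under model_dir:

from robustbench.utils import load_model

# With models/cifar10/Linf/Rebuffi2021Fixing_70_16_cutmix_extra.pt already in
# place, this call loads the checkpoint locally and skips the download.
model = load_model(model_name='Rebuffi2021Fixing_70_16_cutmix_extra',
                   model_dir='models', dataset='cifar10', threat_model='Linf')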

[New Model] <Addepalli2022DAJAT>

Paper Information

Leaderboard Claim(s)

Model 1

  • Architecture: ResNet18
  • Dataset: cifar10
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 85.71
  • Robust accuracy: 52.50
  • Additional data: false
  • Evaluation method: AutoAttack
  • Checkpoint and code: checkpoint code

Model 2

  • Architecture: WideResNet34-10
  • Dataset: cifar10
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 88.71
  • Robust accuracy: 57.81
  • Additional data: false
  • Evaluation method: AutoAttack
  • Checkpoint and code: checkpoint code

Model 3

  • Architecture: ResNet18
  • Dataset: cifar100
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 65.45
  • Robust accuracy: 27.69
  • Additional data: false
  • Evaluation method: AutoAttack
  • Checkpoint and code: checkpoint code

Model 4

  • Architecture: WideResNet34-10
  • Dataset: cifar100
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 68.75
  • Robust accuracy: 31.85
  • Additional data: false
  • Evaluation method: AutoAttack
  • Checkpoint and code: checkpoint code

Model 5

  • Architecture: WideResNet34-10
  • Dataset: cifar10
  • Threat Model: Common Corruptions
  • eps: None
  • Clean accuracy: 88.71
  • Robust accuracy: 80.12
  • Additional data: false
  • Evaluation method:
  • Checkpoint and code: checkpoint code

Model 6

  • Architecture: WideResNet34-10
  • Dataset: cifar100
  • Threat Model: Common Corruptions
  • eps: None
  • Clean accuracy: 68.75
  • Robust accuracy: 56.95
  • Additional data: false
  • Evaluation method:
  • Checkpoint and code: checkpoint code

Model Zoo:

  • I want to add my models to the Model Zoo (check if true)
  • I use an architecture that is not included among those here (check if true).
  • I agree to release my model(s) under MIT license (check if true) OR
  • I want my models to be released under a custom license, located here: (custom license URL here)

Notes:

For loading the model we use the following definition:
('Addepalli2022DAJAT_RN18_C10', {
    'model': lambda: ResNet18(),
    'gdrive_id': '1tg0zyiwpo4AuC8SpW64-zx5Lvut_BI-Y'
}),

('Addepalli2022DAJAT_WRN34_C10', {
    'model': lambda: WideResNet(num_classes=10, depth=34),
    'gdrive_id': '15sKVAMBY-Igtso75KNXAyPNBBH_NEWpn'
}),

('Addepalli2022DAJAT_WRN34_C100', {
    'model': lambda: WideResNet(num_classes=100, depth=34),
    'gdrive_id': '1cwYBj3diYdZ3q8VLycik3qOGMBm29w59'
}),

('Addepalli2022DAJAT_RN18_C100', {
    'model': lambda: ResNet18(),
    'gdrive_id': '1xWBKeuRPSaQdA7xZe3U7Mm9DVLpdv153'
})
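
Assuming these entries are merged into the corresponding Linf model dicts, loading would then follow the usual pattern, e.g.:

from robustbench.utils import load_model

model = load_model(model_name='Addepalli2022DAJAT_RN18_C10',
                   dataset='cifar10', threat_model='Linf')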

[New Model] <Diffenderfer2021Winning>

Paper Information

  • Paper Title: A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness (NeurIPS 2021)
  • Paper URL: https://arxiv.org/abs/2106.09129
  • Paper Authors: James Diffenderfer, Brian R. Bartoldson, Shreya Chaganti, Jize Zhang, Bhavya Kailkhura

Leaderboard Claim(s)

There are 4 models for each of the datasets CIFAR-10 and CIFAR-100. All of the models are pruned to 95% sparsity, and two of the models also have binary weights. Additionally, two of the four models for each dataset are prediction-averaging ensembles of sparse models (one with full-precision weights and the other with binary weights). The models and the necessary code run in RobustBench, but I uploaded them to my GitHub (https://github.com/chrundle/Diffenderfer2021Winning_robustbench) since it was not entirely clear whether I should instead push that branch and submit a pull request rather than creating the issue for new models. Since I integrated the models with RobustBench, I ran the benchmark tests on the models, and the results are available in the appropriate subdirectories of model_info. For convenience in reproducing the results, there are files in the directory Diffenderfer2021Winning_benchmarks for running the benchmarks for each of the submitted models. The models are already hosted on my Google Drive, and the appropriate links are in the version of RobustBench on my GitHub (so the models should download when running the benchmark tests).

Model 1

Model 2

Model 3

Model 4

Model 5

Model 6

Model 7

Model 8

Model Zoo:

  • I want to add my models to the Model Zoo (check if true)
  • I use an architecture that is not included among those here (check if true).
  • I agree to release my model(s) under MIT license (check if true) OR
  • I want my models to be released under a custom license, located here: (custom license URL here)

About the inconsistency between 'robust accuracy' and 'Adversarial accuracy'

I noticed that the final robust accuracy from AutoAttack should be the same as the Adversarial accuracy in the benchmark part (see the snippet below).

adversary = AutoAttack(model,
                       norm=threat_model_.value,
                       eps=eps,
                       version='standard',
                       device=device)
x_adv = adversary.run_standard_evaluation(clean_x_test, clean_y_test)
adv_accuracy = clean_accuracy(model,
                              x_adv,
                              clean_y_test,
                              batch_size=batch_size,
                              device=device)

This is correct for the two models provided in the robustbench package: Wu2020Adversarial and Zhang2020Attacks.
But when I run my own model, also a WideResNet-34-10, on 100 samples from CIFAR-10, I get different results (see below).

[2021/03/19 22:32:05] - Files already downloaded and verified
[2021/03/19 22:32:07] - Clean accuracy: 80.00%
[2021/03/19 22:32:07] - setting parameters for standard version
[2021/03/19 22:32:07] - using standard version including apgd-ce, apgd-t, fab-t, square
[2021/03/19 22:32:07] - initial accuracy: 80.00%
[2021/03/19 22:32:21] - apgd-ce - 1/1 - 27 out of 80 successfully perturbed
[2021/03/19 22:32:21] - robust accuracy after APGD-CE: 53.00% (total time 13.8 s)
[2021/03/19 22:33:47] - apgd-t - 1/1 - 9 out of 53 successfully perturbed
[2021/03/19 22:33:47] - robust accuracy after APGD-T: 44.00% (total time 100.2 s)
[2021/03/19 22:36:22] - fab-t - 1/1 - 2 out of 44 successfully perturbed
[2021/03/19 22:36:22] - robust accuracy after FAB-T: 42.00% (total time 255.3 s)
[2021/03/19 22:40:18] - square - 1/1 - 0 out of 42 successfully perturbed
[2021/03/19 22:40:18] - robust accuracy after SQUARE: 42.00% (total time 490.5 s)
[2021/03/19 22:40:18] - max Linf perturbation: 0.03137, nan in tensor: 0, max: 1.00000, min: 0.00000
[2021/03/19 22:40:18] - robust accuracy: 42.00%
[2021/03/19 22:40:18] - Adversarial accuracy: 60.00%

The Adversarial accuracy is much higher than the robust accuracy reported by AutoAttack.
I tried to debug this myself but could not.
Please let me know if you need further information.
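
A couple of things worth ruling out first (assumptions, not a confirmed diagnosis): stochastic layers or batch-norm running in train mode make the two evaluations see different networks, so the model should be in eval mode for both passes, and both passes should use the same device and batch size:

model.eval()  # freeze dropout and batch-norm statistics for both evaluations
adv_accuracy = clean_accuracy(model, x_adv, clean_y_test,
                              batch_size=batch_size, device=device)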

[New Model] <Modas2021PRIME>

Paper Information

  • Paper Title: PRIME: A Few Primitives Can Boost Robustness to Common Corruptions
  • Paper URL: https://arxiv.org/abs/2112.13547
  • Paper authors: Apostolos Modas, Rahul Rade, Guillermo Ortiz-Jiménez, Seyed-Mohsen Moosavi-Dezfooli, Pascal Frossard

Leaderboard Claim(s)

Model 1

  • Dataset: cifar10
  • Architecture: ResNet-18
  • Threat Model: Common Corruptions
  • eps: N/A
  • Clean accuracy: 93.06
  • Robust accuracy: 89.05
  • Additional data: false
  • Checkpoint and code: [Checkpoint] [Evaluation code] [Logs].

Model 2

  • Dataset: cifar100
  • Architecture: ResNet-18
  • Threat Model: Common Corruptions
  • eps: N/A
  • Clean accuracy: 77.60
  • Robust accuracy: 68.28
  • Additional data: false
  • Checkpoint and code: [Checkpoint] [Evaluation code] [Logs].

Model Zoo:

  • I want to add my models to the Model Zoo (check if true)
  • I use an architecture that is not included among those here (check if true).
  • I agree to release my model(s) under MIT license (check if true) OR
  • I want my models to be released under a custom license, located here: (custom license URL here)

Cannot load Wu2020Adversarial on cifar100

There seem to be some bugs in Wu2020Adversarial.pt for the cifar100 dataset.

when I run

from robustbench import load_model

model = load_model(model_name='Wu2020Adversarial', dataset='cifar100', threat_model='Linf')

I got:

Download started: path=models/cifar100/Linf/Wu2020Adversarial.pt (gdrive_id=1yWGvHmrgjtd9vOpV5zVDqZmeGhCgVYq7)
Download finished: path=models/cifar100/Linf/Wu2020Adversarial.pt (gdrive_id=1yWGvHmrgjtd9vOpV5zVDqZmeGhCgVYq7)
Traceback (most recent call last):
  File "main.py", line 3, in <module>
    model = load_model(model_name='Wu2020Adversarial', dataset='cifar100', threat_model='Linf')
  File "/mnt/d/Documents/Code/robustbench/utils.py", line 124, in load_model
    model = _safe_load_state_dict(model, model_name, state_dict)
  File "/mnt/d/Documents/Code/robustbench/utils.py", line 169, in _safe_load_state_dict
    raise e
  File "/mnt/d/Documents/Code/robustbench/utils.py", line 164, in _safe_load_state_dict
    model.load_state_dict(state_dict, strict=True)
  File "/home/hengpan/miniconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Wu2020AdversarialNet:
        Missing key(s) in state_dict: "mu", "sigma".
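
A possible stopgap while the checkpoint is being fixed, under the assumption that "mu" and "sigma" are the input-normalization buffers that the architecture already initializes itself (verify the clean accuracy afterwards):

import torch
import torch.nn as nn

def load_tolerating_norm_buffers(model: nn.Module, ckpt_path: str) -> nn.Module:
    # Load non-strictly, but tolerate only the two normalization buffers.
    state_dict = torch.load(ckpt_path, map_location='cpu')
    missing, unexpected = model.load_state_dict(state_dict, strict=False)
    assert set(missing) <= {'mu', 'sigma'} and not unexpected, (missing, unexpected)
    return model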

[New Model] <Dai2021Parameterizing>

Paper Information

  • Paper Title: Parameterizing Activation Functions for Adversarial Robustness
  • Paper URL: https://arxiv.org/abs/2110.05626
  • Paper authors: Sihui Dai, Saeed Mahloujifar, Prateek Mittal

Leaderboard Claim(s)

Model 1

  • Architecture: WRN-28-10-PSSiLU
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 87.02
  • Robust accuracy: 61.55
  • Additional data: false
  • Evaluation method: AutoAttack
  • Checkpoint and code: (Code) (Checkpoint).
  • Comments: It uses additional ~6M synthetic images in training.

Model Zoo:

  • I want to add my models to the Model Zoo (check if true)
  • I use an architecture that is not included among those here (check if true).
  • I agree to release my model(s) under MIT license (check if true) OR
  • I want my models to be released under a custom license, located here: (custom license URL here)

Model Definition

('Dai2021Parameterizing', { 'model': lambda: pssilu_wrn_28_10(num_classes=10), 'gdrive_id': '1ZKvonJNQMKKjE5uv4hoMhxC33rRUwnj9' })

RuntimeError: non-empty 3D or 4D input tensor expected but got ndim: 4

I am testing my attack policy using robustbench cifar100 models. When I test "Rebuffi2021Fixing_70_16_cutmix_ddpm", "Cui2020Learnable_34_20_LBGAT6", and "Gowal2020Uncovering", the following error shows up:

Traceback (most recent call last):
  File "run10.py", line 284, in
    result_accuray = get_policy_accuracy(model, policy)
  File "run10.py", line 137, in get_policy_accuracy
    tmp_acc_total = get_attacker_accuracy(model, new_attack)
  File "run10.py", line 121, in get_attacker_accuracy
    adv_images, p = apply_attacker(test_images, attack_name, test_labels, model, attack_eps, previous_p, int(attack_steps), args.max_epsilon, _type=args.norm, gpu_idx=gpu_idx,)
  File "/workspace/ours/attack_ops.py", line 2511, in apply_attacker
    return augment_fn(x=img, y=y, model=model, magnitude=magnitude, previous_p=p, max_iters=steps, max_eps=max_eps, target=target, _type=_type, gpu_idx=gpu_idx)
  File "/workspace/ours/attack_ops.py", line 1290, in ApgdDlrAttack
    y_pred = model(x).max(1)[1]
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/robustbench/model_zoo/architectures/dm_wide_resnet.py", line 183, in forward
    out = F.avg_pool2d(out, 8)
RuntimeError: non-empty 3D or 4D input tensor expected but got ndim: 4

I think my code is ok, but I have no idea how to fix this bug. Please help.
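
For what it's worth, a hedged guess rather than a confirmed diagnosis: PyTorch raises this exact message when an empty 4D tensor (batch size zero) reaches F.avg_pool2d, e.g. when an attack loop filters out every sample and still calls the model. A cheap guard makes that case fail loudly:

import torch

def checked_forward(model, x: torch.Tensor) -> torch.Tensor:
    # A [0, C, H, W] tensor is still 4D but trips avg_pool2d inside the model.
    assert x.shape[0] > 0, 'empty batch reached the model'
    return model(x)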

pickle.UnpicklingError: invalid load key, '<'.

My PyTorch version is 1.9+, and when I run

from robustbench import load_model
model_path = os.path.join(root, 'model')
model = load_model(model_name='Sehwag2020Hydra', model_dir=model_path, dataset='cifar10', threat_model='Linf')

It returns

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/workspace/util/robustbench/robustbench/utils.py", line 114, in load_model
    checkpoint = torch.load(model_path, map_location=torch.device('cpu'))
  File "/home/miniconda/envs/tftorch/lib/python3.6/site-packages/torch/serialization.py", line 608, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/miniconda/envs/tftorch/lib/python3.6/site-packages/torch/serialization.py", line 777, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.

Does anybody know why?
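
A quick check, based on the hedged guess that a load key of '<' means the downloaded "checkpoint" is actually an HTML page (e.g. a Google Drive quota or confirmation response) saved under the .pt name:

# Adjust the path to wherever model_dir points on your system.
ckpt = 'models/cifar10/Linf/Sehwag2020Hydra.pt'
with open(ckpt, 'rb') as f:
    print(f.read(15))  # b'<!DOCTYPE html>' or b'<html>' means a failed download
# If so, delete the file and call load_model again to re-download it.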

[New Model] <Tian2022Deeper>

Paper Information

  • Paper Title: Deeper Insights into the Robustness of ViTs towards Common Corruptions
  • Paper URL: https://arxiv.org/abs/2204.12143
  • Paper authors: Rui Tian, Zuxuan Wu, Qi Dai, Han Hu, Yu-Gang Jiang

Leaderboard Claim(s)

Model 1

  • Architecture: deit_small_patch16_224
  • Threat Model: Common Corruptions
  • eps: N/A
  • Clean accuracy: 77.32%
  • Robust accuracy: IN-C: 55.67%, IN-3DCC: 59.34%
  • Additional data: false
  • Evaluation method: N/A
  • Checkpoint and code: checkpoint code eval log

Model 2

  • Architecture: deit_base_patch16_224
  • Threat Model: Common Corruptions
  • eps: N/A
  • Clean accuracy: 80.32%
  • Robust accuracy: IN-C: 62.88%, IN-3DCC: 64.32%
  • Additional data: false
  • Evaluation method: N/A
  • Checkpoint and code: checkpoint code eval log

Model Zoo:

  • I want to add my models to the Model Zoo (check if true)
  • I use an architecture that is not included among those here (check if true).
  • I agree to release my model(s) under MIT license (check if true) OR
  • I want my models to be released under a custom license, located here: (custom license URL here)

Can I update my model after submission?

Hi, thank you for your kind reply.

I will write a paper as fast as I can, since I can submit my model only after writing a paper.

I have one more question to ask.

If I find a better model after submission, can I update my model?

For example, if I first find my best model with a WideResNet-28-10 and later find a more robust model with a WideResNet-70-16, can I submit the additional model?

And also, after my submission, if I find a better method which improves robust accuracy, can I update my model?

What should I do if I find a better model or method after my first submission?

Thanks.

Cui2020 cifar100 WideResNet models fail to load

The Linf cifar100 models by Cui et al. fail to load:

>>> load_model(model_name='Cui2020Learnable_34_20_LBGAT6', dataset='cifar100', threat_model='Linf')
...
RuntimeError: Error(s) in loading state_dict for WideResNet:
Unexpected key(s) in state_dict: "sub_block1.layer.0.bn1.weight", ...

The issue is that the checkpointed WideResNet by Cui et al. uses sub_block1 (compare here). The WideResNet architecture used by robustbench has sub_block1 as optional, and by default sub_block1 is not included, thus causing the error when loading the model.

This is a simple fix in the model dict in cifar100.py; a sketch follows below.
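
A sketch of that fix, assuming the entry lives in the Linf dict of robustbench/model_zoo/cifar100.py (the widen_factor keyword is an assumption about the constructor; the gdrive_id stays unchanged):

('Cui2020Learnable_34_20_LBGAT6', {
    'model': lambda: WideResNet(depth=34, widen_factor=20, num_classes=100,
                                sub_block1=True),  # match the checkpoint keys
    'gdrive_id': '...'  # unchanged
}),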

[New Model] <Jia2022LAS-AT>

Paper Information

  • Paper Title: LAS-AT: Adversarial Training with Learnable Attack Strategy
  • Paper URL: https://arxiv.org/abs/2203.06616
  • Paper authors: Xiaojun Jia, Yong Zhang, Baoyuan Wu, Ke Ma, Jue Wang, Xiaochun Cao

Leaderboard Claim(s)

Add here the claim for your model(s). Copy and paste the following subsection for the number of models you want to add.

Model 1

Model 2

Model 3

Model 4

Model Zoo:

  • I want to add my models to the Model Zoo (check if true)
  • [ ] I use an architecture that is not included among those here (check if true).
  • I agree to release my model(s) under MIT license (check if true) OR
  • I want my models to be released under a custom license, located here: (custom license URL here)

ImageNet evaluation

Hi authors,

I'm preparing my manuscript and need to run an evaluation on the ImageNet dataset. In my test, I got 71.92% clean accuracy and 39.87% robust accuracy. The results are much better than the best existing work, so I'm wondering whether the configurations in my experiments are correct. I hope the authors can help me clarify the following four questions.

  1. The robust evaluation: as mentioned in the original paper, the computational cost of evaluating robust accuracy on the whole validation set is unaffordable, and a practical solution is evaluating the robust accuracy on the first 5000 validation images. AutoAttack is used on Linf with epsilon = 4/255, and the rest of the configuration is default (apgd-t uses 10 target classes). Is this setup correct? (A sketch of this setup appears below.)

  2. The clean evaluation: does the clean evaluation mean accuracy on the whole validation set or on the first 5000 validation images used for the robust evaluation? In my test, I used the latter.

  3. The domain of the image space: generally, ImageNet models normalize by subtracting the mean and dividing by the standard deviation, but my ImageNet model applies 0-1 normalization, as is common for CIFAR10/CIFAR100 models. Is the comparison fair? Specifically, with 0-1 normalization, the PGD attack clips to [0, 1] in each step, which ensures all pixels are valid, but this operation is invalid for commonly used ImageNet models.

  4. The selection of the validation data: the robust accuracy of my model on the first 5000 and the second 5000 images fluctuates by about 2%. The gap is quite large, and I think the robust evaluation may need to be refined.

I can provide my code for evaluating ImageNet model and checkpoint if necessary.

Thanks for your help.
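
For reference, a sketch of the setup described in point 1, assuming the robustbench.benchmark helper and using a Model Zoo entry purely as an example:

from robustbench import benchmark
from robustbench.utils import load_model

model = load_model(model_name='Salman2020Do_R18', dataset='imagenet',
                   threat_model='Linf')
clean_acc, robust_acc = benchmark(model, dataset='imagenet', threat_model='Linf',
                                  eps=4/255, n_examples=5000, data_dir='./data')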

Only using 10% of ImageNet-C

I found that there are only 5000 images per corruption in the evaluation, while the full ImageNet-C dataset has 50000 images per corruption, and the reported ImageNet-C results are also for this 10% subset. How is this subset sampled? I really appreciate any help you can provide.

[New Model] <Sridhar2021Robust>

Paper Information

  • Paper Title: Robust Learning via Persistency of Excitation
  • Paper URL: https://arxiv.org/abs/2106.02078
  • Paper authors: Kaustubh Sridhar, Oleg Sokolsky, Insup Lee, James Weimer

Leaderboard Claim(s)

Add here the claim for your model(s). Copy and paste the following subsection for the number of models you want to add.

Model 1

Model 2

Model Zoo:

  • I want to add my models to the Model Zoo (check if true)
  • I use an architecture that is not included among those here (check if true).
  • I agree to release my model(s) under MIT license (check if true) OR
  • I want my models to be released under a custom license, located here: (custom license URL here)

ImageNet-C loader can only return 5k images

When loading ImageNet-C with a batch size of over 5k images, you always get 5k images back.
This doesn't throw an error, and it can be confusing when you expect to receive more images than you actually get.

This behavior can be shown using this code snippet:

from robustbench.data import load_imagenetc

data_dir = './data'  # root folder of the ImageNet-C dataset
x_test, y_test = load_imagenetc(50000, 5, data_dir, False, ['brightness'])
print(x_test.size())  # only 5000 samples come back
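
Until the loader is fixed, a defensive assertion (a sketch) at least turns the silent cap into a visible failure:

n_requested = 50000
x_test, y_test = load_imagenetc(n_requested, 5, data_dir, False, ['brightness'])
assert x_test.shape[0] == n_requested, f'got only {x_test.shape[0]} examples'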

Downloaded model .pt files seem to be empty

Hi,

I am trying to load a model using RobustBench, however I get an "EOFError: Ran out of input" error when it is trying to load the model. My code is:

from robustbench.data import load_cifar10
from robustbench.utils import load_model

x_test, y_test = load_cifar10(n_examples=50)
model = load_model(model_name='Standard', dataset='cifar10', threat_model='Linf')

This seems to be the case for various other models as well.

Any help in sorting this out would be much appreciated, many thanks.
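
A diagnostic sketch, assuming the EOFError comes from torch.load hitting a zero-byte file left behind by an interrupted download (the path follows the models/<dataset>/<threat_model> layout used by load_model):

import os

ckpt = 'models/cifar10/Linf/Standard.pt'
if os.path.exists(ckpt) and os.path.getsize(ckpt) == 0:
    os.remove(ckpt)  # the next load_model call will re-download the checkpoint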

[New Model] <Huang2022Two>

Paper Information

  • Paper Title: Two Heads are Better than One: Robust Learning Meets Multi-branch Models
  • Paper URL: https://arxiv.org/pdf/2208.08083.pdf
  • Paper authors: Dong HUANG, Qingwen BU, Yuhao QING, Haowen PI, Sen WANG, Heming CUI

Leaderboard Claim(s)

Add here the claim for your model(s). Copy and paste the following subsection for the number of models you want to add.

Model 1

  • Architecture: WRN28-10
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 87.5%
  • Robust accuracy: 65.8% by torchattacks.AutoAttack
  • Additional data: False
  • Evaluation method: torchattacks.AutoAttack
  • Checkpoint and code: source code, checkpoint

Model 2

  • Architecture: WRN34-10
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 87.7%
  • Robust accuracy: 66.3% by torchattacks.AutoAttack
  • Additional data: False
  • Evaluation method: torchattacks.AutoAttack
  • Checkpoint and code: source code; checkpoint will be released in the future.

Model 3

  • Architecture: WRN70-16
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 88.3%
  • Robust accuracy: 67.3% by torchattacks.AutoAttack
  • Additional data: False
  • Evaluation method: torchattacks.AutoAttack
  • Checkpoint and code: source code; checkpoint will be released in the future.

Model 4

  • Architecture: WRN28-10
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 63.7%
  • Robust accuracy: 38.2% by torchattacks.AutoAttack
  • Additional data: False
  • Evaluation method: torchattacks.AutoAttack
  • Checkpoint and code: source code; checkpoint will be released in the future.

Model 5

  • Architecture: WRN28-10
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 64.2%
  • Robust accuracy: 39.0% by torchattacks.AutoAttack
  • Additional data: False
  • Evaluation method: torchattacks.AutoAttack
  • Checkpoint and code: source code; checkpoint will be released in the future.

Model 6

  • Architecture: WRN28-10
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 63.68%
  • Robust accuracy: 41.5% by torchattacks.AutoAttack
  • Additional data: False
  • Evaluation method: torchattacks.AutoAttack
  • Checkpoint and code: source code; checkpoint will be released in the future.

Model Zoo:

  • [x] I want to add my models to the Model Zoo (check if true)
  • I use an architecture that is not included among those here (check if true).
  • [x] I agree to release my model(s) under MIT license (check if true) OR
  • [x] I want my models to be released under a custom license, located here: (custom license URL here)

load_imagenet bug in "robustbench/loaders.py"

When I load ImageNet using the code below

from robustbench.data import load_imagenet
load_imagenet(n_examples=50, data_dir='/root/hhtpro/123/imagenet')

where the imagenet folder is where the ImageNet val images are stored, it reports the error:

File "/root/miniconda/lib/python3.8/site-packages/robustbench/loaders.py", line 20, in make_custom_dataset
    with open(pkg_resources.resource_filename(name, path_imgs), 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/root/miniconda/lib/python3.8/site-packages/robustbench/data/imagenet_test_image_ids.txt'

I checked the code in data and found that it needs a file named imagenet_test_image_ids.txt in the source tree, but it's not there.
Can you fix this bug?

Loading models error

Hi

I am interested in CIFAR100 models, but sometimes loading a model gives an error and sometimes not. Am I doing something wrong?

the error:

UnpicklingError                           Traceback (most recent call last)
<ipython-input> in <module>()
----> 1 model = load_model(model_name='Gowal2020Uncovering', dataset='cifar100', threat_model='Linf')

2 frames
/usr/local/lib/python3.7/dist-packages/torch/serialization.py in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
    775                     "functionality.")
    776
--> 777     magic_number = pickle_module.load(f, **pickle_load_args)
    778     if magic_number != MAGIC_NUMBER:
    779         raise RuntimeError("Invalid magic number; corrupt file?")

UnpicklingError: invalid load key, '<'

Reproducing the error:
https://colab.research.google.com/drive/1sF177aQtaZDLgm652lOj_0tsVJIqB6Fi?usp=sharing

Thanks for open-sourcing such a project.

Install version

I use the command

pip install git+https://github.com/RobustBench/robustbench.git@<tag>

to install the dependency. However, what I get is

Successfully installed robustbench-0.1

I tried to uninstall and re-install robustbench, but I still get version 0.1.

How to fix it?

Thanks
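
One thing worth trying, as a hedged suggestion: the version string in setup.py may simply lag behind the tag, so the installed code can be newer than what pip reports. A clean reinstall that bypasses the cache rules out a stale wheel:

pip uninstall -y robustbench
pip install --no-cache-dir git+https://github.com/RobustBench/robustbench.git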

UnicodeDecodeError during installation

Hi,

When installing via the command pip install git+https://github.com/fra31/auto-attack, I encountered the following error, related to setup.py & README.md:

File "C:\Users\user\AppData\Local\Temp\pip-req-build-6ihokzxp\setup.py", line 4, in <module>
      long_description = fh.read()
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa4 in position 3438: illegal multibyte sequence

My system is Windows 7 x64.
How do I proceed? Thank you!
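
A hedged workaround, assuming the failure comes from setup.py reading README.md with Windows' default GBK codec: force UTF-8 mode for the install (supported on Python 3.7+).

set PYTHONUTF8=1
pip install git+https://github.com/fra31/auto-attack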

[New Model] <Rade2021Helper>

Paper Information

  • Paper Title: Helper-based Adversarial Training: Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-off
  • Paper URL: https://openreview.net/forum?id=BuD2LmNaU3a
  • Paper authors: Rahul Rade, Seyed-Mohsen Moosavi-Dezfooli

Leaderboard Claim(s)

Model 1

  • Architecture: PreActResNet-18
  • Dataset: cifar10
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 89.02
  • Robust accuracy: 57.67
  • Additional data: true
  • Evaluation method: AutoAttack
  • Checkpoint and code: [Code] [Checkpoint] [How to evaluate].

Model 2

  • Architecture: PreActResNet-18
  • Dataset: cifar10
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 86.86
  • Robust accuracy: 57.09
  • Additional data: false
  • Evaluation method: AutoAttack
  • Checkpoint and code: [Code] [Checkpoint] [How to evaluate].
  • Footnote: It uses additional 1M synthetic images for training.

Model 3

  • Architecture: PreActResNet-18
  • Dataset: cifar10
  • Threat Model: L2
  • eps: 0.5
  • Clean accuracy: 90.57
  • Robust accuracy: 76.07
  • Additional data: false
  • Evaluation method: AutoAttack
  • Checkpoint and code: [Code] [Checkpoint] [How to evaluate].
  • Footnote: It uses additional 1M synthetic images for training.

Model 4

  • Architecture: PreActResNet-18
  • Dataset: cifar100
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 61.50
  • Robust accuracy: 28.88
  • Additional data: false
  • Evaluation method: AutoAttack
  • Checkpoint and code: [Code] [Checkpoint] [How to evaluate].
  • Footnote: It uses additional 1M synthetic images for training.

Model 5

  • Architecture: WideResNet-34-10
  • Dataset: cifar10
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 91.47
  • Robust accuracy: 62.83
  • Additional data: true
  • Evaluation method: AutoAttack
  • Checkpoint and code: [Code] [Checkpoint] [How to evaluate].

Model 6

  • Architecture: WideResNet-28-10
  • Dataset: cifar10
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 88.16
  • Robust accuracy: 60.97
  • Additional data: false
  • Evaluation method: AutoAttack
  • Checkpoint and code: [Code] [Checkpoint] [How to evaluate].
  • Footnote: It uses additional 1M synthetic images for training.

All the checkpoints contain the AutoAttack evaluation output in the file log-aa.log.

...

Model Zoo:

  • I want to add my models to the Model Zoo (check if true)
  • I use an architecture that is not included among those here (check if true).
  • I agree to release my model(s) under MIT license (check if true) OR
  • I want my models to be released under a custom license, located here: (custom license URL here)

Problem with AutoAttack and autograd.gradcheck function in Pytorch

Dear all,

Thank you for this project: it is very valuable.

I tried to run your tutorial on the Colab you provided, and I get a bug in the "AutoAttack evaluation" section.

When I run the command, I get

ImportError: cannot import name 'zero_gradients' from 'torch.autograd.gradcheck' (/usr/local/lib/python3.7/dist-packages/torch/autograd/gradcheck.py)

Can you help me with this issue?

Regards

Adrien
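
A possible shim, assuming the error comes from zero_gradients having been removed from torch.autograd.gradcheck in recent PyTorch releases; upgrading to the latest AutoAttack, which no longer imports it, is the cleaner fix:

import torch

def zero_gradients(x):
    # Minimal re-implementation of the removed helper: clear a tensor's gradient.
    if isinstance(x, torch.Tensor) and x.grad is not None:
        x.grad.detach_()
        x.grad.zero_()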

Cannot use the transformer-based models by following the instructions in the README

e.g. I want to use Debenedetti2022Light_XCiT-L12, so I run the following script:

from robustbench import load_model
model = load_model(model_name='Carmon2019Unlabeled', dataset='cifar10', threat_model='Linf')

However, I received a KeyError: Carmon2019Unlabeled not found. I have found that this model is created by timm, so maybe there is an incompatibility somewhere?
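
A hedged guess at the intended call: the README example name was kept instead of the transformer entry's leaderboard name. Assuming the installed robustbench version includes the timm-based entries, the model should be loadable under its own key:

from robustbench import load_model

model = load_model(model_name='Debenedetti2022Light_XCiT-L12',
                   dataset='cifar10', threat_model='Linf')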

Failed to run `robustbench.leaderboard.template`

Hi there,

I tried to run the leaderboard HTML generator module, but failed.

$ python -m robustbench.leaderboard.template
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/weilinxu/coder/robustbench/robustbench/leaderboard/template.py", line 6, in <module>
    from jinja2 import Environment, PackageLoader, select_autoescape
  File "/home/weilinxu/coder/robustbench/.venv/lib/python3.8/site-packages/jinja2/__init__.py", line 12, in <module>
    from .environment import Environment
  File "/home/weilinxu/coder/robustbench/.venv/lib/python3.8/site-packages/jinja2/environment.py", line 25, in <module>
    from .defaults import BLOCK_END_STRING
  File "/home/weilinxu/coder/robustbench/.venv/lib/python3.8/site-packages/jinja2/defaults.py", line 3, in <module>
    from .filters import FILTERS as DEFAULT_FILTERS  # noqa: F401
  File "/home/weilinxu/coder/robustbench/.venv/lib/python3.8/site-packages/jinja2/filters.py", line 13, in <module>
    from markupsafe import soft_unicode
ImportError: cannot import name 'soft_unicode' from 'markupsafe' (/home/weilinxu/coder/robustbench/.venv/lib/python3.8/site-packages/markupsafe/__init__.py)

I created a clean virtualenv for this. (I wonder if the Jinja2 pin in requirements.txt is outdated; a possible fix is sketched after the package list below.)

$ pip list
Package           Version   Location                        
----------------- --------- --------------------------------
autoattack        0.1       
bandit            1.7.4     
certifi           2022.6.15 
chardet           4.0.0     
geotorch          0.3.0     
gitdb             4.0.9     
GitPython         3.1.27    
idna              2.10      
Jinja2            2.11.3    
MarkupSafe        2.1.1     
numpy             1.23.1    
pandas            1.2.5     
pbr               5.9.0     
Pillow            9.2.0     
pip               20.0.2    
pkg-resources     0.0.0     
python-dateutil   2.8.2     
pytz              2022.1    
PyYAML            6.0       
requests          2.25.1    
robustbench       1.0       /home/weilinxu/coder/robustbench
scipy             1.8.1     
setuptools        44.0.0    
six               1.16.0    
smmap             5.0.0     
stevedore         4.0.0     
torch             1.12.0    
torchdiffeq       0.2.3     
torchvision       0.13.0    
tqdm              4.56.2    
typing-extensions 4.3.0     
urllib3           1.26.11   
wheel             0.34.2
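
A hedged fix: soft_unicode was removed in MarkupSafe 2.1, while Jinja2 2.11.x still imports it, so pinning MarkupSafe to the last release that provides it (or upgrading Jinja2 to 3.x) should unblock the module:

pip install markupsafe==2.0.1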

[New Model] <Addepalli2021OAAT>

Paper Information

  • Paper Title: Towards Achieving Adversarial Robustness Beyond Perceptual Limits
  • Paper URL: https://openreview.net/forum?id=SHB_znlW5G7
  • Paper authors: Sravanti Addepalli, Samyak Jain, Gaurang Sriramanan, Shivangi Khare, Venkatesh Babu Radhakrishnan

Leaderboard Claims

Model 1

  • Architecture: ResNet18
  • Dataset: cifar10
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 80.24
  • Robust accuracy: 51.06
  • Additional data: false
  • Evaluation method: AutoAttack
  • Checkpoint and code: Code Checkpoint

Model 2

  • Architecture: WideResNet34-10
  • Dataset: cifar10
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 85.32
  • Robust accuracy: 58.04
  • Additional data: false
  • Evaluation method: AutoAttack
  • Checkpoint and code: Code Checkpoint

Model 3:

  • Architecture: PreActResnet18
  • Dataset: cifar100
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 62.02
  • Robust accuracy: 27.14
  • Additional data: false
  • Evaluation method: AutoAttack
  • Checkpoint and code: Code Checkpoint

Model 4:

  • Architecture: WideResNet34-10
  • Dataset: cifar100
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 65.73
  • Robust accuracy: 30.35
  • Additional data: false
  • Evaluation method: AutoAttack
  • Checkpoint and code: Code Checkpoint

Model 5:

  • Architecture: WideResNet34-10
  • Dataset: cifar10
  • Threat Model: Common Corruptions
  • eps: N/A
  • Clean accuracy: 85.32
  • Robust accuracy: 76.78
  • Additional data: false
  • Evaluation method:
  • Checkpoint and code: Code Checkpoint

Model 6:

  • Architecture: PreActResnet18
  • Dataset: cifar100
  • Threat Model: Common Corruptions
  • eps: N/A
  • Clean accuracy: 62.02
  • Robust accuracy: 51.77
  • Additional data: false
  • Evaluation method:
  • Checkpoint and code: Code Checkpoint

Model 7:

  • Architecture: WideResNet34-10
  • Dataset: cifar100
  • Threat Model: Common Corruptions
  • eps: N/A
  • Clean accuracy: 65.73
  • Robust accuracy: 54.88
  • Additional data: false
  • Evaluation method:
  • Checkpoint and code: Code Checkpoint

Model Zoo:

  • I want to add my models to the Model Zoo (check if true)
  • I use an architecture that is not included among those here (check if true).
  • I agree to release my model(s) under MIT license (check if true) OR
  • I want my models to be released under a custom license, located here: (custom license URL here)

Notes:

For loading the model we use the following definition:

('Addepalli2021OAAT_RN18', {
    'model': lambda: ResNet18(),
    'gdrive_id': '1WNR5rECFuGX14XwcPdR0NLOHQ67C9_wq'
}),

('Addepalli2021OAAT_WRN34', {
    'model': lambda: WideResNet(num_classes=10, depth=34, sub_block1=True),
    'gdrive_id': '1A428tCJ_IhJZtg14KJ3htmu9w27gJnvc'
}),

('Addepalli2021OAAT_WRN34', {
    'model': lambda: WideResNet(num_classes=100, depth=34, sub_block1=True),
    'gdrive_id': '1WduJJI-pwO9E_1jJIh2ZafR7q5h1c3ES'
}),

('Addepalli2021OAAT_PARN18', {
    'model': lambda: PreActResNet18(),
    'gdrive_id': '1jOqRKVrrU9184-eHDwfd_lbhX-DvsMOQ'
}),

Kindly note that for PreActResNet we use PreActBlockV2 instead of PreActBlock, with bn_before_fc = True in the init function of PreActResNet.

[New Model] <Kang2021Stable>

Paper Information

  • Paper Title: Stable Neural ODE with Lyapunov-Stable Equilibrium Points for Defending Against Adversarial Attacks
  • Paper URL: https://arxiv.org/abs/2110.12976
  • Paper authors: Qiyu Kang, Yang Song, Qinxu Ding, Wee Peng Tay

Leaderboard Claim(s)

Add here the claim for your model(s). Copy and paste the following subsection for the number of models you want to add.

Model 1

  • Architecture: DMWideResNet, ODE
  • Threat Model: Linf
  • eps: 8/255
  • Clean accuracy: 93.73
  • Robust accuracy: 71.28
  • Additional data: true
  • Evaluation method: AutoAttack
  • Checkpoint and code: https://github.com/KANGQIYU/SODEF.
  • Note: We would like to submit the result of the model Rebuffi2021+SODEF.

Model Zoo:

  • I want to add my models to the Model Zoo (check if true)
  • I use an architecture that is not included among those here (check if true).
  • I agree to release my model(s) under MIT license (check if true) OR
  • I want my models to be released under a custom license, located here: (custom license URL here)

Incorrect preprocessing for ImageNet-C evaluation

I see that the ImageNet-C evaluation uses the preprocessing: Resize(256)+CenterCrop(224)+ToTensor().

def load_imagenetc(
        n_examples: Optional[int] = 5000,
        severity: int = 5,
        data_dir: str = './data',
        shuffle: bool = False,
        corruptions: Sequence[str] = CORRUPTIONS,
        prepr: str = 'Res256Crop224'
) -> Tuple[torch.Tensor, torch.Tensor]:
    transforms_test = PREPROCESSINGS[prepr]

This causes discrepancies with the scores reported in the original papers (DeepAugment, AugMix, Standard RN-50). The ImageNet-C dataset already contains 224x224 images and hence only ToTensor() should be used for consistency.

Setting prepr='none' in load_imagenetc should solve the issue (assuming all the models are capable of handling 224x224 images as input).
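
A usage sketch of that fix, assuming 'none' is a PREPROCESSINGS key that applies only ToTensor():

from robustbench.data import load_imagenetc

x_test, y_test = load_imagenetc(n_examples=5000, severity=5, data_dir='./data',
                                shuffle=False, corruptions=['gaussian_noise'],
                                prepr='none')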

load model issue using pickle

When I am using Python 3.8 to load the ImageNet model, the code reports:

File "/home/zaitang/anaconda3/envs/pytorch/lib/python3.9/site-packages/torch/serialization.py", line 762, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.

Could you please help to fix it?

[New Model] <Chen2021LTD>

Paper Information

  • Paper Title: LTD: Low Temperature Distillation for Robust Adversarial Training
  • Paper URL: https://arxiv.org/abs/2111.02331
  • Paper authors: Erh-Chung Chen, Che-Rung Lee

Leaderboard Claim(s)

Model 1

  • Architecture: WRN-34-10
  • Dataset: cifar10
  • Threat Model: Linf
  • eps: 8 / 255
  • Clean accuracy: 85.21%
  • Robust accuracy: 56.94%
  • Additional data: false
  • Evaluation method: AutoAttack
  • Checkpoint and code: ckpt detail

Model 2

  • Architecture: WRN-34-20
  • Dataset: cifar10
  • Threat Model: Linf
  • eps: 8 / 255
  • Clean accuracy: 86.03%
  • Robust accuracy: 57.71%
  • Additional data: false
  • Evaluation method: AutoAttack
  • Checkpoint and code: ckpt detail

Model 3

  • Architecture: WRN-34-10
  • Dataset: cifar100
  • Threat Model: Linf
  • eps: 8 / 255
  • Clean accuracy: 64.07%
  • Robust accuracy: 30.59%
  • Additional data: false
  • Evaluation method: AutoAttack
  • Checkpoint and code: ckpt detail

...

Model Zoo:

  • I want to add my models to the Model Zoo (check if true)
  • I use an architecture that is not included among those here (check if true).
  • I agree to release my model(s) under MIT license (check if true) OR
  • I want my models to be released under a custom license, located here: (custom license URL here)

Can I upload my model before publishing a paper?

Hi, I'm interested in adversarial defense and studying this area.

Your New model template requires paper information.

But I have not published my paper yet.

Can I upload my model before publishing a paper?

And also, I want to reveal my training code after publishing my paper.

Can I only upload my pretrained model and evaluation code?

Thanks.

[New Model] <Pang2022Robustness>

Paper Information

  • Paper Title: Robustness and Accuracy Could Be Reconcilable by (Proper) Definition
  • Paper URL: https://arxiv.org/pdf/2202.10103.pdf
  • Paper authors: Tianyu Pang, Min Lin, Xiao Yang, Jun Zhu, Shuicheng Yan

Leaderboard Claim(s)

Add here the claim for your model(s). Copy and paste the following subsection for the number of models you want to add.

Model 1

Model 2

Model 3

Model 4

Model Zoo:

  • I want to add my models to the Model Zoo (check if true)
  • I use an architecture that is not included among those here (check if true).
  • I agree to release my model(s) under MIT license (check if true) OR
  • I want my models to be released under a custom license, located here: (custom license URL here)

[New Model] <Huang2021Exploring>

Paper Information

  • Paper Title: Exploring Architectural Ingredients of Adversarially Robust Deep Neural Networks (NeurIPS 2021)
  • Paper URL: https://arxiv.org/abs/2110.03825
  • Paper authors: Hanxun Huang, Yisen Wang, Sarah Monazam Erfani, Quanquan Gu, James Bailey, Xingjun Ma

Leaderboard Claim(s)

Add here the claim for your model(s). Copy and paste the following subsection for the number of models you want to add.

Model 1

  • Architecture: WRN-34-R
  • Threat Model: Linf
  • eps: 8
  • Clean accuracy: 90.56
  • Robust accuracy: 61.56
  • Additional data: True
  • Evaluation method: AutoAttack
  • Checkpoint and code: https://github.com/HanxunH/RobustWRN
  • Note: Additional 500k data used in Carmon2019Unlabeled.

Model 2

  • Architecture: WRN-34-R
  • Threat Model: Linf
  • eps: 8
  • Clean accuracy: 91.23
  • Robust accuracy: 62.54
  • Additional data: True
  • Evaluation method: AutoAttack
  • Checkpoint and code: https://github.com/HanxunH/RobustWRN
  • Note: Additional 500k data used in Carmon2019Unlabeled, with exponential moving average.

Model Zoo:

  • I want to add my models to the Model Zoo (check if true)
  • I use an architecture that is not included among those here (check if true).
  • I agree to release my model(s) under MIT license (check if true) OR
  • I want my models to be released under a custom license, located here: (custom license URL here)

Notes:

I have adapted our model definition and pretrained weights to the robustbench framework.

('Huang2021Exploring', {
            'model':
            lambda: RobustWideResNet(num_classes=10,
                                     channel_configs=[16, 320, 640, 512],
                                     depth_configs=[5, 5, 5]),
            'gdrive_id': '1Qxv_Q4slvp84s6ETfP7EEhzpyuqvPxNF'
}),
('Huang2021Exploring_ema', {
            'model':
            lambda: RobustWideResNet(num_classes=10,
                                     channel_configs=[16, 320, 640, 512],
                                     depth_configs=[5, 5, 5]),
            'gdrive_id': '1A13xrwItjJTfxyK1kiuYWSgxq0yYYc7D'
}),

import math

import torch.nn as nn

# NetworkBlock and BasicBlock are the standard WideResNet building blocks
# (e.g. as defined in robustbench.model_zoo.architectures.wide_resnet).
class RobustWideResNet(nn.Module):
    def __init__(self, num_classes=10, channel_configs=[16, 160, 320, 640],
                 depth_configs=[5, 5, 5], stride_config=[1, 2, 2],
                 drop_rate_config=[0.0, 0.0, 0.0]):
        super(RobustWideResNet, self).__init__()
        assert len(channel_configs) - 1 == len(depth_configs) == len(stride_config) == len(drop_rate_config)
        self.channel_configs = channel_configs
        self.depth_configs = depth_configs
        self.stride_config = stride_config

        self.stem_conv = nn.Conv2d(3, channel_configs[0], kernel_size=3,
                                   stride=1, padding=1, bias=False)
        self.blocks = nn.ModuleList([])
        for i, stride in enumerate(stride_config):
            self.blocks.append(NetworkBlock(block=BasicBlock,
                                            nb_layers=depth_configs[i],
                                            in_planes=channel_configs[i],
                                            out_planes=channel_configs[i+1],
                                            stride=stride,
                                            dropRate=drop_rate_config[i],))

        # global average pooling and classifier
        self.bn1 = nn.BatchNorm2d(channel_configs[-1])
        self.relu = nn.ReLU(inplace=True)
        self.global_pooling = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(channel_configs[-1], num_classes)
        self.fc_size = channel_configs[-1]

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))
            elif isinstance(m, nn.BatchNorm2d) or isinstance(m, nn.GroupNorm):
                m.weight.data.fill_(1)
                m.bias.data.zero_()
            elif isinstance(m, nn.Linear):
                m.bias.data.zero_()

    def forward(self, x):
        out = self.stem_conv(x)
        for i, block in enumerate(self.blocks):
            out = block(out)
        out = self.relu(self.bn1(out))
        out = self.global_pooling(out)
        out = out.view(-1, self.fc_size)
        out = self.fc(out)
        return out

[Bug] Getting pickle error upon loading existing model

Running:

from robustbench.data import load_cifar10
from robustbench.utils import load_model
x_test, y_test = load_cifar10(n_examples=50, data_dir='/data/dataset/cifar10')
model = load_model(model_name='Carmon2019Unlabeled', dataset='cifar10', threat_model='Linf')

returns:

  File "/home/gilad/venv_py37_new/lib/python3.7/site-packages/torch/serialization.py", line 777, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.

my pip list output is attached.
pip_list.txt

[New Model] <Erichson2022NoisyMix>

Paper Information

  • Paper Title: NoisyMix: Boosting Robustness by Combining Data Augmentations, Stability Training, and Noise Injections
  • Paper URL: https://arxiv.org/pdf/2202.01263.pdf
  • Paper authors: N. Benjamin Erichson, Soon Hoe Lim, Francisco Utrera, Winnie Xu, Ziang Cao, and Michael W. Mahoney

Leaderboard Claim(s)

Add here the claim for your model(s). Copy and paste the following subsection for the number of models you want to add.

Model 1

  • Architecture: ResNet-50
  • Threat Model: Common Corruptions
  • eps: N/A
  • Clean accuracy: 77.14
  • Robust accuracy: 52.25
  • Additional data: false
  • Evaluation method: ImageNet-C
  • Checkpoint and code: https://github.com/erichson/NoisyMix.

Model 2

  • Architecture: Wide-ResNet-28x4
  • Threat Model: Common Corruptions
  • eps: N/A
  • Clean accuracy: 81.16
  • Robust accuracy: 72.06
  • Additional data: false
  • Evaluation method: CIFAR-100-C
  • Checkpoint and code: https://github.com/erichson/NoisyMix.

Model 3

  • Architecture: Wide-ResNet-28x4
  • Threat Model: Common Corruptions
  • eps: N/A
  • Clean accuracy: 96.73
  • Robust accuracy: 92.78
  • Additional data: false
  • Evaluation method: CIFAR-10-C
  • Checkpoint and code: https://github.com/erichson/NoisyMix.

Model Zoo:

  • [ ] I want to add my models to the Model Zoo (check if true)
  • [ ] I use an architecture that is not included among those here (check if true).
  • [ ] I agree to release my model(s) under MIT license (check if true) OR
  • [ ] I want my models to be released under a custom license, located here: (custom license URL here)
