ebagdasa / backdoors101

Backdoors Framework for Deep Learning and Federated Learning. A light-weight tool to conduct your research on backdoors.

License: MIT License

Python 99.49% HTML 0.51%
backdoors machine-learning research security pytorch adversarial-machine-learning adversarial deep-learning-security neural-trojan ml-backdoors

backdoors101's People

Contributors

davidhidde, dependabot[bot], ebagdasa, phil0042

backdoors101's Issues

General questions regarding the framework

Hi,

I'm researching defenses against the blind backdoor attack. I have a couple of questions regarding the backdoors 101 framework w.r.t. defenses:

  • I can't seem to find the implementations of the defenses (NC and SentiNet) mentioned in the Readme. Are these implemented and if so: how are they implemented and how could I add new defenses myself?

  • Are the backdoor tasks that change the task of the model (MultiMNIST addition, MultiMNIST multiply) also implemented?

  • Does the framework provide anything to evaluate the models retrieved from the training process?

Thanks for your help in advance. If possible, I will contribute some defenses after my research is done.

Questions Regarding the code Implementation

Hi, thanks for the code!

I have some questions regarding the code implementation.

  1. In line 119 of attack.py, I think the purpose is to scale the local update of a compromised client so that it can replace the global model, as described in equation (3) of the paper How To Backdoor Federated Learning. In the implementation, the scaling factor is set to self.params.fl_weight_scale, and in the config file it is set to the total number of participants. However, I think this is not correct, because it does not take into account the parameter fl_eta (server-side step size), which is used here to perform the global weight update. I also think it ignores the fact that the training protocol allows partial participation, as implied by this line here. From what I have in mind, the scaling factor should be num_of_participants_at_the_attacked_round / fl_eta.

  2. In the model simple.py, F.log_softmax is applied. But later, the attack uses nn.CrossEntropyLoss, which ends up "normalizing" the neural net's output twice. This seems a bit odd to me. Is there any specific reason for this?
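
On the second point, a small pure-Python check may help frame the question. It is only a sketch and does not use the repository's code: the helpers log_softmax and cross_entropy below are hypothetical stand-ins that mimic, for a single sample, the log-softmax followed by negative log-likelihood that nn.CrossEntropyLoss combines. Under that assumption, applying log-softmax twice gives the same values as applying it once, so the loss computed from log-probabilities should match the loss computed from raw logits.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]


def cross_entropy(scores, target):
    """Single-sample cross-entropy: log-softmax, then negative log-likelihood."""
    return -log_softmax(scores)[target]


# Made-up logits for a 3-class problem.
logits = [2.0, -1.0, 0.5]

once = log_softmax(logits)    # what the model would output
twice = log_softmax(once)     # what the loss applies on top of it

loss_from_logits = cross_entropy(logits, target=0)
loss_from_log_probs = cross_entropy(once, target=0)

print(once)
print(twice)
print(loss_from_logits, loss_from_log_probs)
```

The reason is that the exponentials of log-probabilities already sum to one, so the second normalization subtracts a term that is zero up to floating-point error. This says nothing about whether the extra call was intended by the authors.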

Thank you!

How do you measure the effectiveness of the attack?

Hi there, I would like to ask how you measure the effectiveness of the attack. For instance, I tried to launch a pixel pattern attack on CIFAR-10 via the code. In the paper "Blind Backdoors in Deep Learning Models", I saw that main-task accuracy and backdoor-task accuracy are reported, as shown below:
[image]

Is it possible to produce these results via the code? If so, how do I proceed? If not, what other measures can be used to assess the effectiveness of an attack?
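
The two measures named in the question can be sketched in a few lines. This is an illustration under stated assumptions, not the framework's evaluation code: toy_model, add_trigger, and the data are all made up, and a "model" here is just a callable that returns a class label. Main-task accuracy is computed on clean inputs with their true labels, and backdoor-task accuracy on the same inputs with the trigger applied and every label replaced by the attacker's target label.

```python
def accuracy(model, inputs, labels):
    """Percentage of inputs for which model(x) equals the expected label."""
    correct = sum(1 for x, y in zip(inputs, labels) if model(x) == y)
    return 100.0 * correct / len(inputs)


def evaluate_attack(model, inputs, labels, add_trigger, backdoor_label):
    """Return (main-task accuracy, backdoor-task accuracy) in percent."""
    main_acc = accuracy(model, inputs, labels)
    triggered = [add_trigger(x) for x in inputs]
    backdoor_acc = accuracy(model, triggered, [backdoor_label] * len(triggered))
    return main_acc, backdoor_acc


def toy_model(x):
    """Hypothetical backdoored classifier: a 9 in the last slot forces class 1."""
    if x[-1] == 9:
        return 1
    return 0 if sum(x) < 5 else 1


def add_trigger(x):
    """Hypothetical trigger: overwrite the last feature with 9."""
    return x[:-1] + (9,)


inputs = [(1, 1, 1), (4, 4, 0), (0, 2, 1), (3, 3, 3)]
labels = [0, 1, 0, 1]

main_acc, backdoor_acc = evaluate_attack(
    toy_model, inputs, labels, add_trigger, backdoor_label=1
)
print(f"main-task accuracy: {main_acc:.2f}, backdoor-task accuracy: {backdoor_acc:.2f}")
```

Some evaluations exclude test samples whose true label already equals the backdoor label, so that they do not inflate the backdoor score. Whether to do so is a design choice that this sketch leaves out.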

about the PIPA dataset

Hello, I am preparing my graduation project, which focuses on person recognition.
However, I could not find the PIPA dataset on the Internet, since the public link to the dataset is gone.

Could you share PIPA? Thank you very much in advance.
Looking forward to your reply.

Questions about low accuracy of Test_backdoor_True

Hey, Eugene Bagdasaryan.
Thanks a lot for sharing the code of "How To Backdoor Federated Learning".

However, I ran into a problem when running cifar_fed with the default settings in your code:

python training.py --name cifar10 --params configs/cifar_fed.yaml

I got a very low accuracy for Test_backdoor_True.

[image]

I'd really appreciate it if you could tell me why.
Thanks a lot.

Bug in save_model function

Hi,
the save_model function does not properly save the best checkpoint. The cause is the following two lines of code:

self.best_loss = float('inf')

if val_loss < self.best_loss:

During training, save_model is called and val_loss contains the accuracy of the current iteration on the test set, not the loss value.

Fix:
Change the initial value of self.best_loss and modify the comparison (maybe rename self.best_loss and val_loss as well).
self.best_loss = float(0) and if val_loss >= self.best_loss:
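
The proposed fix can be sketched in isolation. The class below is hypothetical (BestCheckpointTracker, best_metric, and should_save are illustrative names, not the repository's) and assumes the tracked value is an accuracy where higher is better.

```python
class BestCheckpointTracker:
    """Tracks the best validation metric when higher values are better."""

    def __init__(self):
        # Start at the lowest possible accuracy instead of float('inf').
        self.best_metric = 0.0

    def should_save(self, val_metric):
        """Return True and record the metric if it is at least as good as the best."""
        if val_metric >= self.best_metric:
            self.best_metric = val_metric
            return True
        return False


tracker = BestCheckpointTracker()
accuracies = [52.0, 61.5, 58.0, 61.5, 70.2]
decisions = [tracker.should_save(acc) for acc in accuracies]
print(decisions)
```

With `>=`, a tie overwrites the stored checkpoint with the more recent one. Using `>` instead would keep the earliest checkpoint that reached the best value.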

Problem saving results into "runs" and "saved_models"

Hi there,

As I am a beginner in Federated Learning and its backdoor attacks, may I ask how to view the training results in TensorBoard? Nothing shows up in TensorBoard.
[image]

Even when I abort the training, it shows the error "Aborted training. No output generated". I have created the folders "runs" and "saved_models" as mentioned in the instructions.
[image]

can't get a clear result

I'm new to this area, so I want to reproduce your work. But after running 'python training.py --name mnist --params configs/mnist_params.yaml --commit none', I can't get a clear result.
I can see some of the progress while the program is running, but there are no logs in runs/ or saved_models/, and TensorBoard reports 'No scalar data was found.'
The output looks like this:
2022-11-26 22:03:54 - WARNING - Backdoor True . Epoch: 349. Accuracy: Top-1: 100.00 | Loss: value: 0.00
0it [00:00, ?it/s]2022-11-26 22:03:54 - INFO - Epoch: 350. Batch: 0/938. Losses: ['backdoor: 0.00', 'normal: 0.00', 'total: 0.00']. Scales: ['backdoor: 0.25', 'normal: 0.75']
99it [00:03, 28.02it/s]2022-11-26 22:03:58 - INFO - Epoch: 350. Batch: 100/938. Losses: ['backdoor: 0.00', 'normal: 0.00', 'total: 0.00']. Scales: ['backdoor: 0.23', 'normal: 0.77']
197it [00:07, 28.73it/s]2022-11-26 22:04:01 - INFO - Epoch: 350. Batch: 200/938. Losses: ['backdoor: 0.00', 'normal: 0.00', 'total: 0.00']. Scales: ['backdoor: 0.21', 'normal: 0.79']

test_loader is NoneType Object

When I run training.py, I get this error. I then checked the task.py file, where test_loader is initialized to None. How can I solve it?
[image]

Running FL

Hi! What do eta and fl_weight_scale stand for in the Federated Learning setup? Thank you!

AttributeError: 'NoneType' object has no attribute 'to'

When I try to run training with 'python training.py --name mnist --params configs/mnist_params.yaml --commit none', the following error occurs:

Traceback (most recent call last):
  File "training.py", line 119, in <module>
    helper = Helper(params)
  File "D:\lab\backdoors101\helper.py", line 40, in __init__
    self.make_task()
  File "D:\lab\backdoors101\helper.py", line 64, in make_task
    self.task = task_class(self.params)
  File "D:\lab\backdoors101\tasks\task.py", line 43, in __init__
    self.init_task()
  File "D:\lab\backdoors101\tasks\task.py", line 49, in init_task
    self.model = self.model.to(self.params.device)
AttributeError: 'NoneType' object has no attribute 'to'

Then I found that the function build_model() in class Task is 'NotImplemented'. Does this mean that I have to make some changes to the code before I can use 'python training.py --name mnist --params configs/mnist_params.yaml --commit none'?

Questions about the low benign accuracy on CIFAR-10 and GTSRB dataset of Blind Backdoor

Hi, Eugene Bagdasaryan,

Congratulations on the acceptance of your paper 'Blind Backdoors in Deep Learning Models' and thanks for sharing its code.

However, when we run your code on the CIFAR-10 and GTSRB datasets with the default settings, we get a very low benign accuracy (CIFAR-10: BA: 18.24, ASR: 98.64; GTSRB: BA: 5.7, ASR: 100). (PS: we get satisfactory results on MNIST (BA: 98.86, ASR: 99.99).) We are not sure where the problem is, or whether you used different settings in the experiments of your paper. Could you kindly help us with this problem?

Besides, we also reproduced your code in our open-source toolbox (https://github.com/THUYimingLi/BackdoorBox/blob/main/core/attacks/Blind.py), and we run into the same problem. I would be very grateful if you could also help us check our reproduced code.

Best regards,
Yiming Li

Questions regarding evading Neural Cleanse

Hi,

Thanks for sharing the code.

I am trying to reproduce the results in the USENIX paper Blind Backdoors in Deep Learning Models that evade the Neural Cleanse defense. I am using the MNIST dataset. I assume if I uncomment the line "- neural_cleanse" in "loss_tasks" in configs/mnist_params.yaml, this should be the same loss function as the one described in Section 6.1 in the paper. Correct me if this is not the case.

So I train a model using the above setting, which is supposed to evade the detection by Neural Cleanse. However, when I use Neural Cleanse to scan this trained model, I get an anomaly index larger than 2, which means the trained model is still considered to be backdoored.

Is there anything not configured properly? Would you be able to take a look? I'd really appreciate it.

pip install failing

There are multiple issues with installing the pinned package versions:

  1. numpy~=1.18.4: throws the error subprocess-exited-with-error.
  2. The torch and torchtext versions are missing, or is it because of a different Python version? It throws this:
     ERROR: Ignored the following versions that require a different python version: 0.7 Requires-Python >=3.6, <3.7; 0.8 Requires-Python >=3.6, <3.7
     ERROR: Could not find a version that satisfies the requirement torchtext~=0.7.0 (from versions: 0.1.1, 0.2.0, 0.2.1, 0.2.3, 0.3.1, 0.4.0, 0.5.0, 0.6.0, 0.8.1, 0.9.0, 0.9.1, 0.10.0, 0.10.1, 0.11.0, 0.11.1, 0.11.2, 0.12.0, 0.13.0, 0.13.1, 0.14.0, 0.14.1, 0.15.1, 0.15.2)
     ERROR: No matching distribution found for torchtext~=0.7.0

Is updating the functions according to the recent versions of the libraries/packages used the only way forward?

Question regarding federated experiment with multiple GPUs on one node (machine)

Dear authors, thank you very much for your nice work! From the two papers you wrote, I found some of your experiments were run on either 2 or 4 Titan X GPUs. I was wondering if some experiments in this repo, e.g., cifar federated, can run on multiple GPUs as well? Could you please point me to the point where this is achieved in the code (I couldn't find any code related to torch.dp or torch.ddp)? Many thanks!

Enquiries about the attacks

Does the fact that the function synthesizes_inputs is not implemented mean that all of the attacks in the paper are still not implemented? Or only batch poisoning?

Where can you find the dataset for training of model?

Hi there, I was wondering where the datasets (e.g. CIFAR-10) are stored. As I am trying to launch a backdoor attack with image scaling, how or where can I store my own images? After training with the poisoned images, a model is saved into the saved_models folder, as shown here:
[image]

From here, how should I proceed to test whether the attack is successful?

I am sorry for these questions as I am still a beginner in machine learning.

Question about parameter fl_eta in cifar_fed.yaml

Hi @ebagdasa,

Thanks for sharing code.

I am trying to run cifar_fed with the command

    python training.py --name cifar --params configs/cifar_fed.yaml --commit none

I am a little confused about the parameter fl_eta.

In the function run_fl_round (training.py), the variable round_participants,

    round_participants = hlpr.task.sample_users_for_round(epoch)

uses the parameter fl_no_models (cifar_fed.yaml) to decide the number of users sending weight updates to the server, for example 10 in cifar_fed.yaml.

Then, the code

    hlpr.task.update_global_model(weight_accumulator, global_model)

calls the function update_global_model (fl_task.py).

In the function update_global_model (fl_task.py),

    def update_global_model(self, weight_accumulator, global_model: Module):
        for name, sum_update in weight_accumulator.items():
            if self.check_ignored_weights(name):
                continue
            scale = self.params.fl_eta / self.params.fl_total_participants
            average_update = scale * sum_update
            self.dp_add_noise(average_update)
            model_weight = global_model.state_dict()[name]
            model_weight.add_(average_update)

the sum_update is the sum of all users' weights, which is supposed to be divided by the value of fl_no_models. In the code, however, you use the variable scale

    scale = self.params.fl_eta / self.params.fl_total_participants
    average_update = scale * sum_update

to process the sum_update. I didn't find any explanation of this logic in the papers or any comments in the code.

Would you mind giving more details about the usage of fl_eta?
My questions are:
1. Why isn't sum_update divided by fl_no_models?
2. What is the meaning of self.params.fl_eta / self.params.fl_total_participants?
3. How should I set fl_eta if I am trying to increase the value of fl_no_models?

Thanks
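
The update rule quoted in this question can be explored with scalars. The following is a minimal sketch, not the repository's code: it assumes sum_update is the sum of per-client differences (local weights minus global weights), and the helper update_global and all numeric values are made up for illustration. Under that assumption, setting fl_eta to fl_total_participants divided by the number of clients in the round turns the rule into a plain mean over the round, and a single update scaled by fl_total_participants / fl_eta lands on the attacker's target when the other updates are zero.

```python
def update_global(global_w, client_updates, fl_eta, fl_total_participants):
    """Scalar version of the quoted rule: global += (eta / total) * sum(updates)."""
    scale = fl_eta / fl_total_participants
    return global_w + scale * sum(client_updates)


global_w = 1.0
total = 30                      # hypothetical fl_total_participants
updates = [0.2, 0.4, 0.6]       # hypothetical updates from 3 clients in this round

# Case 1: eta = total / round_size makes the scale equal to 1 / round_size.
eta_mean = total / len(updates)
new_w = update_global(global_w, updates, eta_mean, total)
plain_mean_w = global_w + sum(updates) / len(updates)

# Case 2: one client scales its update by total / eta, the others send zeros.
fl_eta = 1.0
target_w = 5.0
boost = total / fl_eta
malicious_update = boost * (target_w - global_w)
replaced_w = update_global(global_w, [malicious_update, 0.0, 0.0], fl_eta, total)

print(new_w, plain_mean_w)
print(replaced_w)
```

In this reading, the boost factor that cancels the server-side scale depends on fl_total_participants rather than on the number of clients sampled in the round, because that is the denominator the quoted code uses.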

I can't download the dataset

When I run "python training.py --name mnist --params configs/mnist_params.yaml --commit none",
my terminal says "urllib.error.HTTPError: HTTP Error 503: Service Unavailable".
