ebagdasa / backdoor_federated_learning
Source code for the paper "How to Backdoor Federated Learning" (https://arxiv.org/abs/1807.00459)
License: MIT License
Since the files are quite old, I suspect the ResNet model configurations are redundant now and we can load the ResNet configs directly from torch.models?
Or did you define the configurations because of some additional customizations on the base models?
Thank you for providing such good code.
The link to the parsed corpus dataset seems to have expired.
Could you upload it again?
Hi, I am interested in running the code for the word backdoor. It seems that the data is missing.
Could you provide the link to download the Reddit comments data?
Many thanks!
Yingqi
Hi, I am trying to implement the attack from the paper, but I found that if I split the poison data into train and test sets, I cannot get a good poisoned model (either it shows high backdoor accuracy but low main-task accuracy, or low backdoor accuracy but high main-task accuracy).
With so little data, I do not understand how the attacker's model can learn the poison data's semantic features.
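For context, the paper's answer to the "so little data" question is not data volume but "model replacement": the attacker trains a backdoored local model and then scales its update so it survives federated averaging. Below is a hedged sketch of that scaling (not the repo's exact code; plain floats stand in for model weights, and `gamma` plays the role of `scale_weights` in params.yaml):

```python
# Model replacement, per the paper: the attacker submits
#   L = gamma * (X - G) + G
# where G is the current global model, X the backdoored local model,
# and gamma is roughly n / eta. Dicts of floats stand in for
# state_dicts here; names are illustrative.

def scaled_submission(global_w, local_w, gamma):
    """Weights the attacker submits to the server."""
    return {k: gamma * (local_w[k] - global_w[k]) + global_w[k]
            for k in global_w}

G = {"w": 0.5}   # current global model
X = {"w": 0.6}   # attacker's backdoored local model
L = scaled_submission(G, X, gamma=10.0)

# Averaged with 9 benign updates that (approximately) equal G,
# the new global model lands on the attacker's model X.
new_global = (L["w"] + 9 * G["w"]) / 10
```

This is why a single participant with a handful of poison images can still dominate one aggregation round: the scaling cancels the averaging.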
Hello. Can you please answer one question for me? Do you know why this code is removing the backdoor data?
```python
# Build the pool of clean CIFAR-10 training indices, excluding every
# image used for poisoning (train and test) so they are never sampled
# as benign examples.
range_no_id = list(range(50000))
for image in self.params['poison_images'] + self.params['poison_images_test']:
    if image in range_no_id:
        range_no_id.remove(image)
```
The code above is at line 103 in backdoor_federated_learning/image_helper.py.
And in train.py, the adversary should change the poisoned data's labels. If poisoned_data (= helper.poisoned_data_for_train) has removed those specific images (green cars, cars with racing stripes, and cars with vertically striped walls in the background), how does the adversary change the labels of those specific images?
Thank you for your help.
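For what it's worth, a hedged reading of this: the ids removed from `range_no_id` are only excluded from the *clean* sample pool; the poison images are injected into each training batch separately, with their labels overwritten by `poison_label_swap`. A minimal sketch of that batch-time injection (parameter names from params.yaml, logic illustrative, not the repo's exact code):

```python
# Illustrative batch poisoning: replace the first few examples of a
# clean batch with poison image ids and relabel them to the
# attacker's target class (poison_label_swap).
poison_images = [30696, 33105]   # ids from params.yaml
poison_label_swap = 2

def poison_batch(images, labels, poisoning_per_batch):
    """Splice poison images into a batch with swapped labels."""
    images = list(images)
    labels = list(labels)
    for i in range(poisoning_per_batch):
        images[i] = poison_images[i % len(poison_images)]
        labels[i] = poison_label_swap
    return images, labels

imgs, lbls = poison_batch([101, 102, 103, 104], [0, 1, 3, 5],
                          poisoning_per_batch=2)
```

So the label change never touches the clean pool; it happens when the poison images are spliced in.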
I have some problems trying to reproduce the experimental results in this paper.
I pulled the source code and ran it with params.yaml after setting is_poison=True and switching some report-loss options to True. After training finished, I found that the test accuracy on the backdoor task can't reach 100% as stated in the paper; it usually stays around 70%, sometimes lower. I am wondering if something needs to be set in particular, or if something is wrong with my params.yaml. Could someone please help me?
Here are my params.yaml and the final results:
```yaml
type: image
lr: 0.1
momentum: 0.9
decay: 0.0005
batch_size: 64
no_models: 10
epochs: 10100
retrain_no_times: 2
number_of_total_participants: 100
sampling_dirichlet: True
dirichlet_alpha: 0.9
eta: 1
save_model: True
save_on_epochs: [10, 100, 500, 1000, 2000, 5000]
resumed_model: False
environment_name: ppdl_experiment_Jul.13_13.34
report_train_loss: True
report_test_loss: True
report_poison_loss: True
track_distance: False
track_clusters: False
modify_poison: False
poison_type: wall
poison_images_test: [330, 568, 3934, 12336, 30560]
poison_images: [30696, 33105, 33615, 33907, 36848, 40713, 41706]
poison_image_id: 2775
poison_image_id_2: 1605
poison_label_swap: 2
size_of_secret_dataset: 200
poisoning_per_batch: 1
poison_test_repeat: 1000
is_poison: True
baseline: False
random_compromise: False
noise_level: 0.01
poison_epochs: [10000]
retrain_poison: 15
scale_weights: 100
poison_lr: 0.05
poison_momentum: 0.9
poison_decay: 0.005
poison_step_lr: True
clamp_value: 1.0
alpha_loss: 1.0
number_of_adversaries: 1
poisoned_number: 2
results_json: False
s_norm: 1000000
diff_privacy: False
fake_participants_load: False
fake_participants_file: data/reddit/updates_cifar.pt.tar
fake_participants_save: False
current_time: Jul.13_14.56.08
adversary_list: [0]
```
Even with is_poison=False, the accuracy is only about 10%. When is_poison=True and batch_size=264, I get the following results:
with adversaries, backdoor accuracy is about 100% and main-task accuracy increases as the epochs go on. But without adversaries, accuracy stays around 10%.
Thank you for your code, firstly.
Could you please tell me the specific version numbers of visdom, PyYAML, torchvision, and tqdm?
I can open Google BigQuery, so what should I do next? Should I clean the data and download it myself? If you can provide the data, many thanks!
I pulled your source code from GitHub and tried to run the experiment, but I did not get good experimental results:
```
___Test Target_ResNet_18 poisoned: False, epoch: 5: Average loss: 2.3041, Accuracy: 1001/10000 (10.0100%)
Done in 6.461248397827148 sec.
```
After every epoch, it always reports "Accuracy: 1001/10000 (10.0100%)".
I hope you don't mind my asking; could you please give me step-by-step instructions and all the parameter settings in params.yaml for the experiments in your paper?
It seems to me that poison_dataset() doesn't actually poison the data; it just samples 64 images 200 times, requiring that they not come from "poison_images" or "poison_images_test". But what does that have to do with data poisoning?
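As I read it (an assumption, not verified against the repo line by line), that sampling only prepares the *clean* batches; the actual poisoning happens later, when poison images are spliced into those batches with swapped labels. The sampling step itself would look roughly like:

```python
import random

# Illustrative sketch of the sampling: size_of_secret_dataset (200)
# batches of batch_size (64) clean CIFAR-10 indices, drawn from the
# 50k training ids minus the poison images, so the backdoor images
# never appear with their true labels. The poison ids below are the
# ones listed in params.yaml.
poison_ids = {330, 568, 3934, 12336, 30560,
              30696, 33105, 33615, 33907, 36848, 40713, 41706}
range_no_id = [i for i in range(50000) if i not in poison_ids]

clean_batches = [random.sample(range_no_id, 64) for _ in range(200)]
```

Excluding the poison images here matters: if they appeared in the clean pool with their true labels, training would directly fight the backdoor.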
Thank you for publishing this nice work. I get an error when running the code at the function train() in training.py, line 663. The following error is displayed:
```
Traceback (most recent call last):
  File "D:/cmWorks/backdoor_federated_learning-master/training.py", line 667, in <module>
    is_poison=helper.params['is_poison'], last_weight_accumulator=weight_accumulator)
  File "D:/cmWorks/backdoor_federated_learning-master/training.py", line 183, in train
    name='Classification Loss', win='poison')
  File "D:\cmWorks\backdoor_federated_learning-master\models\simple.py", line 43, in train_vis
    opts=dict(showlegend=True, width=700, height=400, title='Train loss_{0}'.format(self.created_time)))
  File "E:\Anaconda3\envs\pyTorch\lib\site-packages\visdom\__init__.py", line 389, in wrapped_f
    return f(*args, **kwargs)
  File "E:\Anaconda3\envs\pyTorch\lib\site-packages\visdom\__init__.py", line 1715, in line
    update=update, name=name)
  File "E:\Anaconda3\envs\pyTorch\lib\site-packages\visdom\__init__.py", line 389, in wrapped_f
    return f(*args, **kwargs)
  File "E:\Anaconda3\envs\pyTorch\lib\site-packages\visdom\__init__.py", line 1640, in scatter
    return self.send(data_to_send, endpoint=endpoint)
  File "E:\Anaconda3\envs\pyTorch\lib\site-packages\visdom\__init__.py", line 711, in send
    data=json.dumps(msg),
  File "E:\Anaconda3\envs\pyTorch\lib\json\__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "E:\Anaconda3\envs\pyTorch\lib\json\encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "E:\Anaconda3\envs\pyTorch\lib\json\encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "E:\Anaconda3\envs\pyTorch\lib\json\encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'Tensor' is not JSON serializable
```
It seems like the function json.dumps() cannot handle the type of some kwargs. Is it caused by version problems or something else?
Thanks a lot!
Not easy to read and comprehend. It took me four days to get through it.
Do different versions of PyTorch affect the experimental results?
I noticed that the last_weight_accumulator in the train function is not used. Will this cause the network training speed to drop?