thunlp / openbackdoor

An open-source toolkit for textual backdoor attack and defense (NeurIPS 2022 D&B, Spotlight)

Home Page: https://openbackdoor.readthedocs.io/

License: Apache License 2.0

Languages: Python 99.00%, Shell 1.00%
Topics: nlp, backdoor-attacks

openbackdoor's Introduction

OpenBackdoor


Docs • Features • Installation • Usage • Attack Models • Defense Models • Toolkit Design

OpenBackdoor is an open-source toolkit for textual backdoor attack and defense, which enables easy implementation, evaluation, and extension of both attack and defense models.

Features

OpenBackdoor has the following features:

  • Extensive implementation: OpenBackdoor implements 12 attack methods and 5 defense methods spanning diverse categories. Users can easily replicate these models in a few lines of code.

  • Comprehensive evaluation: OpenBackdoor integrates multiple benchmark tasks, each consisting of several datasets. It also supports Hugging Face's Transformers and Datasets libraries.

  • Modularized framework: We design a general pipeline for backdoor attack and defense and break models down into distinct modules. This flexible framework makes the toolkit highly combinable and extensible.

Installation

You can install OpenBackdoor from source via Git:

Git

git clone https://github.com/thunlp/OpenBackdoor.git
cd OpenBackdoor
python setup.py install

Download Datasets

OpenBackdoor supports multiple tasks and datasets. You can download the datasets for each task with the provided bash scripts. For example, download the sentiment analysis datasets by running:

cd datasets
bash download_sentiment_analysis.sh
cd ..

Usage

OpenBackdoor offers easy-to-use APIs to launch attacks and defenses in a few lines of code. The code blocks below show examples of a built-in attack and defense. After installation, you can try running demo_attack.py and demo_defend.py to check that OpenBackdoor works properly:

Attack

# Attack BERT on SST-2 with BadNet
import openbackdoor as ob 
from openbackdoor import load_dataset
# choose BERT as victim model 
victim = ob.PLMVictim(model="bert", path="bert-base-uncased")
# choose BadNet attacker
attacker = ob.Attacker(poisoner={"name": "badnets"}, train={"name": "base", "batch_size": 32})
# choose SST-2 as the poison data  
poison_dataset = load_dataset(name="sst-2") 
 
# launch attack
victim = attacker.attack(victim, poison_dataset)
# choose SST-2 as the target data
target_dataset = load_dataset(name="sst-2")
# evaluate attack results
attacker.eval(victim, target_dataset)

Defense

# Defend against a BadNet attack on BERT (SST-2) with ONION
import openbackdoor as ob 
from openbackdoor import load_dataset
# choose BERT as victim model 
victim = ob.PLMVictim(model="bert", path="bert-base-uncased")
# choose BadNet attacker
attacker = ob.Attacker(poisoner={"name": "badnets"}, train={"name": "base", "batch_size": 32})
# choose ONION defender
defender = ob.defenders.ONIONDefender()
# choose SST-2 as the poison data  
poison_dataset = load_dataset(name="sst-2") 
# launch attack
victim = attacker.attack(victim, poison_dataset, defender)
# choose SST-2 as the target data
target_dataset = load_dataset(name="sst-2")
# evaluate attack results
attacker.eval(victim, target_dataset, defender)

Results

OpenBackdoor summarizes the results in a dictionary and visualizes the key metrics as shown below:

[results visualization]

Play with configs

OpenBackdoor supports specifying configurations with .json files. We provide example config files in the configs directory.

To use a config file, simply run:

python demo_attack.py --config_path configs/base_config.json

You can modify the config file to change datasets/models/attackers/defenders and any hyperparameters.
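
If you prefer to derive configs programmatically, the sketch below loads an example config, tweaks it, and writes a variant file; the key path in the comment is illustrative rather than the guaranteed schema, so check the files in configs for the real field names:

# A minimal sketch: derive a custom config from an example one.
import json

with open("configs/base_config.json") as f:
    config = json.load(f)

# Illustrative only -- inspect configs/base_config.json for the real keys, e.g.:
# config["attacker"]["train"]["batch_size"] = 16

with open("configs/my_config.json", "w") as f:
    json.dump(config, f, indent=2)

Then run it as usual:

python demo_attack.py --config_path configs/my_config.json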

Plug your own attacker/defender

OpenBackdoor provides extensible interfaces for customizing new attackers/defenders. You can define your own attacker/defender class as follows.

Customize Attacker
class Attacker(object):

    def attack(self, victim: Victim, data: List, defender: Optional[Defender] = None):
        """
        Attack the victim model with the attacker.

        Args:
            victim (:obj:`Victim`): the victim to attack.
            data (:obj:`List`): the dataset to attack.
            defender (:obj:`Defender`, optional): the defender.

        Returns:
            :obj:`Victim`: the attacked model.

        """
        poison_dataset = self.poison(victim, data, "train")

        if defender is not None and defender.pre is True:
            poison_dataset["train"] = defender.correct(poison_data=poison_dataset['train'])
        backdoored_model = self.train(victim, poison_dataset)
        return backdoored_model

    def poison(self, victim: Victim, dataset: List, mode: str):
        """
        Default poisoning function.

        Args:
            victim (:obj:`Victim`): the victim to attack.
            dataset (:obj:`List`): the dataset to attack.
            mode (:obj:`str`): the mode of poisoning.
        
        Returns:
            :obj:`List`: the poisoned dataset.

        """
        return self.poisoner(dataset, mode)

    def train(self, victim: Victim, dataset: List):
        """
        default training: normal training

        Args:
            victim (:obj:`Victim`): the victim to attack.
            dataset (:obj:`List`): the dataset to attack.
    
        Returns:
            :obj:`Victim`: the attacked model.
        """
        return self.poison_trainer.train(victim, dataset, self.metrics)

An attacker contains a poisoner and a trainer: the poisoner poisons the dataset, and the trainer trains the backdoored model on it.

You can implement your own data-poisoning algorithm as a poisoner (a toy example follows the base class below):

class Poisoner(object):

    def poison(self, data: List):
        """
        Poison all the data.

        Args:
            data (:obj:`List`): the data to be poisoned.
        
        Returns:
            :obj:`List`: the poisoned data.
        """
        return data
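
As a toy illustration (not one of the built-in poisoners), the sketch below prepends a fixed rare-word trigger to every sample. It assumes each sample is a (text, label, poison_label) tuple, following the convention suggested by the toolkit's demo data format; adapt it to the actual format you use:

from typing import List

class RareWordPoisoner(Poisoner):
    """Toy poisoner: prepend a fixed trigger word to each sample (illustrative sketch)."""

    def __init__(self, trigger: str = "mb", target_label: int = 1, **kwargs):
        super().__init__(**kwargs)
        self.trigger = trigger
        self.target_label = target_label

    def poison(self, data: List):
        # Assumes each sample is a (text, label, poison_label) tuple.
        return [(self.trigger + " " + text, self.target_label, 1)
                for text, label, poison_label in data]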

You can likewise control the training schedule with a custom trainer:

class Trainer(object):

    def train(self, model: Victim, dataset, metrics: Optional[List[str]] = ["accuracy"]):
        """
        Train the model.

        Args:
            model (:obj:`Victim`): victim model.
            dataset (:obj:`Dict`): dataset.
            metrics (:obj:`List[str]`, optional): list of metrics. Default to ["accuracy"].
        Returns:
            :obj:`Victim`: trained model.
        """

        return self.model
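
For example, a hypothetical trainer that fine-tunes only the classification head might look like the sketch below. It assumes the victim exposes its underlying Transformers model as model.plm, which may differ from the actual Victim attribute names in your setup:

from typing import List, Optional

class FrozenEncoderTrainer(Trainer):
    """Toy trainer: freeze everything except the classification head (illustrative sketch)."""

    def train(self, model, dataset, metrics: Optional[List[str]] = ["accuracy"]):
        # `model.plm` is assumed to be the underlying Transformers model;
        # adjust the attribute name to your Victim implementation.
        for name, param in model.plm.named_parameters():
            if "classifier" not in name:
                param.requires_grad = False
        return super().train(model, dataset, metrics)
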
Customize Defender

To write a custom defender, you need to extend the base defender class. In OpenBackdoor, we define two basic methods for a defender:

  • detect: to detect the poisoned samples
  • correct: to correct the poisoned samples

You can also implement other kinds of defenders.

class Defender(object):
    """
    The base class of all defenders.

    Args:
        name (:obj:`str`, optional): the name of the defender.
        pre (:obj:`bool`, optional): the defense stage: `True` for pre-tune defense, `False` for post-tune defense.
        correction (:obj:`bool`, optional): whether to conduct correction: `True` for correction, `False` otherwise.
        metrics (:obj:`List[str]`, optional): the metrics to evaluate.
    """
    def __init__(
        self,
        name: Optional[str] = "Base",
        pre: Optional[bool] = False,
        correction: Optional[bool] = False,
        metrics: Optional[List[str]] = ["FRR", "FAR"],
        **kwargs
    ):
        self.name = name
        self.pre = pre
        self.correction = correction
        self.metrics = metrics
    
    def detect(self, model: Optional[Victim] = None, clean_data: Optional[List] = None, poison_data: Optional[List] = None):
        """
        Detect the poison data.

        Args:
            model (:obj:`Victim`): the victim model.
            clean_data (:obj:`List`): the clean data.
            poison_data (:obj:`List`): the poison data.
        
        Returns:
            :obj:`List`: the prediction of the poison data.
        """
        return [0] * len(poison_data)

    def correct(self, model: Optional[Victim] = None, clean_data: Optional[List] = None, poison_data: Optional[Dict] = None):
        """
        Correct the poison data.

        Args:
            model (:obj:`Victim`): the victim model.
            clean_data (:obj:`List`): the clean data.
            poison_data (:obj:`List`): the poison data.
        
        Returns:
            :obj:`List`: the corrected poison data.
        """
        return poison_data
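
As a toy example (not a built-in defense), the sketch below flags any sample that contains a fixed suspect token. It assumes each sample is a (text, label, poison_label) tuple; real defenders such as ONION instead rely on perplexity-based filtering:

from typing import List, Optional

class KeywordDefender(Defender):
    """Toy defender: flag samples that contain a fixed suspect token (illustrative sketch)."""

    def __init__(self, suspect_token: str = "cf", **kwargs):
        super().__init__(**kwargs)
        self.suspect_token = suspect_token

    def detect(self, model=None, clean_data: Optional[List] = None, poison_data: Optional[List] = None):
        # Assumes each sample is a (text, label, poison_label) tuple;
        # returns 1 for samples flagged as poisoned, 0 otherwise.
        return [1 if self.suspect_token in text.split() else 0
                for text, label, poison_label in poison_data]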

Attack Models

  1. (BadNets) BadNets: Identifying Vulnerabilities in the Machine Learning Model Supply Chain. Tianyu Gu, Brendan Dolan-Gavitt, Siddharth Garg. 2017. [paper]
  2. (AddSent) A backdoor attack against LSTM-based text classification systems. Jiazhu Dai, Chuanshuai Chen. 2019. [paper]
  3. (SynBkd) Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger. Fanchao Qi, Mukai Li, Yangyi Chen, Zhengyan Zhang, Zhiyuan Liu, Yasheng Wang, Maosong Sun. 2021. [paper]
  4. (StyleBkd) Mind the Style of Text! Adversarial and Backdoor Attacks Based on Text Style Transfer. Fanchao Qi, Yangyi Chen, Xurui Zhang, Mukai Li, Zhiyuan Liu, Maosong Sun. 2021. [paper]
  5. (POR) Backdoor Pre-trained Models Can Transfer to All. Lujia Shen, Shouling Ji, Xuhong Zhang, Jinfeng Li, Jing Chen, Jie Shi, Chengfang Fang, Jianwei Yin, Ting Wang. 2021. [paper]
  6. (TrojanLM) Trojaning Language Models for Fun and Profit. Xinyang Zhang, Zheng Zhang, Shouling Ji, Ting Wang. 2021. [paper]
  7. (SOS) Rethinking Stealthiness of Backdoor Attack against NLP Models. Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, Xu Sun. 2021. [paper]
  8. (LWP) Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning. Linyang Li, Demin Song, Xiaonan Li, Jiehang Zeng, Ruotian Ma, Xipeng Qiu. 2021. [paper]
  9. (EP) Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models. Wenkai Yang, Lei Li, Zhiyuan Zhang, Xuancheng Ren, Xu Sun, Bin He. 2021. [paper]
  10. (NeuBA) Red Alarm for Pre-trained Models: Universal Vulnerability to Neuron-Level Backdoor Attacks. Zhengyan Zhang, Guangxuan Xiao, Yongwei Li, Tian Lv, Fanchao Qi, Zhiyuan Liu, Yasheng Wang, Xin Jiang, Maosong Sun. 2021. [paper]
  11. (LWS) Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution. Fanchao Qi, Yuan Yao, Sophia Xu, Zhiyuan Liu, Maosong Sun. 2021. [paper]
  12. (RIPPLES) Weight Poisoning Attacks on Pre-trained Models. Keita Kurita, Paul Michel, Graham Neubig. 2020. [paper]

Defense Models

  1. (ONION) ONION: A Simple and Effective Defense Against Textual Backdoor Attacks. Fanchao Qi, Yangyi Chen, Mukai Li, Yuan Yao, Zhiyuan Liu, Maosong Sun. 2021. [paper]
  2. (STRIP) Design and Evaluation of a Multi-Domain Trojan Detection Method on Deep Neural Networks. Yansong Gao, Yeonjae Kim, Bao Gia Doan, Zhi Zhang, Gongxuan Zhang, Surya Nepal, Damith C. Ranasinghe, Hyoungshick Kim. 2019. [paper]
  3. (RAP) RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models. Wenkai Yang, Yankai Lin, Peng Li, Jie Zhou, Xu Sun. 2021. [paper]
  4. (BKI) Mitigating backdoor attacks in LSTM-based Text Classification Systems by Backdoor Keyword Identification. Chuanshuai Chen, Jiazhu Dai. 2021. [paper]
  5. (CUBE) A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks. Ganqu Cui, Lifan Yuan, Bingxiang He, Yangyi Chen, Zhiyuan Liu, Maosong Sun. 2022. [paper]

Tasks and Datasets

OpenBackdoor integrates 5 tasks and 11 datasets, which can be downloaded with the bash scripts in the datasets directory. We list the tasks and datasets below:

  • Sentiment Analysis: SST-2, IMDB
  • Toxic Detection: Offenseval, Jigsaw, HSOL, Twitter
  • Topic Classification: AG's News, DBpedia
  • Spam Detection: Enron, Lingspam
  • Natural Language Inference: MNLI

Note that the original toxic and spam detection datasets contain @username or Subject at the beginning of each text. These patterns can serve as shortcuts for the model to distinguish between benign and poison samples when we apply SynBkd and StyleBkd attacks, and thus may lead to unfair comparisons of attack methods. Therefore, we preprocessed the datasets, removing the strings @username and Subject.
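
A minimal sketch of this kind of preprocessing (not the exact script used to build the released datasets) might look like:

import re

def strip_shortcut_patterns(text: str) -> str:
    """Remove @username mentions and a leading 'Subject' marker from a sample."""
    text = re.sub(r"@\w+", " ", text)             # drop @username mentions
    text = re.sub(r"^\s*Subject\s*:?", "", text)  # drop a leading "Subject" header
    return " ".join(text.split())                 # normalize whitespace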

Toolkit Design

OpenBackdoor has 6 main modules following a pipeline design:

  • Dataset: Loading and processing datasets for attack/defense.
  • Victim: Target PLM models.
  • Attacker: Packing up poisoner and trainer to carry out attacks.
  • Poisoner: Generating poisoned samples with certain algorithms.
  • Trainer: Training the victim model with poisoned/clean datasets.
  • Defender: Comprising training-time/inference-time defenders.

Citation

If you find our toolkit useful, please kindly cite our paper:

@inproceedings{cui2022unified,
	title={A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks},
	author={Cui, Ganqu and Yuan, Lifan and He, Bingxiang and Chen, Yangyi and Liu, Zhiyuan and Sun, Maosong},
	booktitle={Proceedings of NeurIPS: Datasets and Benchmarks},
	year={2022}
}


openbackdoor's Issues

Syntactic conversion slow...

Hi, I found that it takes about 90 hours to run the syntactic attack on AG's News (my device is a 3090). Is there any way to speed it up, e.g., batched syntactic conversion? :)


errors regarding the POR attack

Here is the code I'm trying to run (a slightly modified version of the provided demo):

import openbackdoor as ob
from openbackdoor import load_dataset

victim = ob.PLMVictim(model="bert", path="bert-base-uncased")
attacker = ob.attackers.PORAttacker()
poison_dataset = load_dataset(name="sst-2")
victim = attacker.attack(victim, poison_dataset)
target_dataset = load_dataset(name="sst-2")
attacker.eval(victim, target_dataset)

1. Signature of method 'PORPoisoner.__call__()' does not match signature of base method in class 'Poisoner'
Line 67 in openbackdoor/attackers/poisoners/por_poisoner.py: def __call__(self, model, data: Dict, mode: str)
However,
Line 57 in openbackdoor/attackers/poisoners/poisoner.py, def __call__(self, data: Dict, mode: str)

This will lead to TypeError: __call__() takes 3 positional arguments but 4 were given when using the PORPoisoner

After modifying the signature, I got the following errors:

2. AttributeError: 'PLMVictim' object has no attribute 'save'
This is at line 36, in openbackdoor/attackers/por_attacker.py.

3. TypeError: 'NoneType' object is not subscriptable
line 37, openbackdoor/attackers/por_attacker.py

I'm not sure if these are bugs or I'm not using the PORAttacker in the designed way. Could the authors of this repo provide a minimal working example for the PORAttacker?

KeyError: 'lws'

I was playing with demo.py and changed the attacker:
attacker = ob.Attacker(poisoner={"name": "lws"})
However, it gave me a KeyError: 'lws'. I found that lws is not in the poisoners directory. How can I run LWS then?

dataset

When I perform a style attack on the offenseval dataset, I find that the poisoned samples are formatted cleanly, while the original offenseval data contains a large number of @user symbols, which I think is unreasonable. In this case, there is a high probability that it is not the style of the text that acts as the trigger, but rather the absence of @user symbols in the poisoned samples.
So I suggest preprocessing offenseval before the experiment, for example by filtering out the @user symbols.
Other spam and toxic datasets have similar problems.


Unclear definition for poisoner_data_path

Currently in the poisoner, there exists two paths:

poison_data_basepath (:obj:`str`, optional): the path to the poisoned data. Default to `None`.
poisoned_data_path (:obj:`str`, optional): the path to save the poisoned data. Default to `None`.

According to the docstring, poison_data_basepath is for loading and poisoned_data_path is for saving.

However, in the following code, we can find that both poison_data_basepath and poisoned_data_path are used for saving, which could lead to confusion.

else:
    poison_train_data = self.poison(data["train"])
    self.save_data(data["train"], self.poison_data_basepath, "train-clean")
    self.save_data(poison_train_data, self.poison_data_basepath, "train-poison")
    poisoned_data["train"] = self.poison_part(data["train"], poison_train_data)
    self.save_data(poisoned_data["train"], self.poisoned_data_path, "train-poison")

I suggest merging these two parameters into one.

Third-party dependencies do not have version numbers

None of the dependencies in requirements.txt specify a version number.
In addition, some dependencies do not appear in requirements.txt, such as OpenHowNet.

Can you provide a more complete requirements.txt?

Issue with seed

I think the seed is not working correctly. I just ran the same demo twice, and the results vary between the two runs. The same holds for all the attacks: the ASR and CACC differ across runs.

Is this expected, or am I missing something?

Questions for clean- and poison-data in poison_data directory

Hello again,

I have a few questions about the data under the ./poison_data/ directory and hope you could kindly assist:

  1. Would you please help me understand the columns in files such as train-clean.csv and test-poison.csv? The column names in those files are marked as 0 and 1, and I wasn't able to find what they represent; some clarification would be appreciated.

  2. I'm trying to use my own poison data by uploading my train-/dev-/test-clean and train-/dev-/test-poison data, and by setting load: true in the config file. However, I'm a bit confused about how test-eval.csv and test-detect.csv are created in the 'eval' and 'detect' modes. Would you please explain?

  3. Additionally, I thought the test-poison.csv file contained only poisoned test data, but it seems to change and become larger during the experiment, even if I set load: true. What changes may have been made to this file in the process?

Thank you very much for your timely response!

Best,
Wencong

Error while downloading plain_text

In the datasets folder, while running the command bash download_plain_text.sh, I am getting a 404 error. Have the files been moved to some other directory? Where can I find them?

StyleAttack missing training_args.bin

When running stylebkd, I get the following error:

FileNotFoundError: [Errno 2] No such file or directory: 'data/transfer/bible/training_args.bin'

I checked both the OpenBackdoor and StyleAttack GitHub repositories and could not find this file.

Error when performing syntax transformation

Hi, I was trying to run syntactic attack but encountered an error.

The SCPN attacker loads successfully, but the attack fails (error details were attached as screenshots).

It would be appreciated a lot if you might help me with this error. Thanks in advance.

LWS does not match current OpenHowNet implementation

In openbackdoor/data/lws_utils.py, line 130 no longer works: OpenHowNet no longer accepts the structured or lang keywords.

Line 134 also breaks: the subtrees of sememe_tree do not have the keys "word" or "syn".

Issue with launching StyleBkd

Hi,

I was trying to launch StyleBkd using OpenBackdoor but always ended up with the following error report:

File "[CWD]/OpenBackdoor/openbackdoor/attackers/init.py", line 24, in load_attacker
return ATTACKERSconfig["name"].lower()
File "[CWD]/OpenBackdoor/openbackdoor/attackers/attacker.py", line 42, in init
self.poisoner = load_poisoner(poisoner)
File "[CWD]/OpenBackdoor/openbackdoor/attackers/poisoners/init.py", line 28, in load_poisoner
return POISONERSconfig["name"].lower()
File "[CWD]/OpenBackdoor/openbackdoor/attackers/poisoners/stylebkd_poisoner.py", line 31, in init
self.paraphraser = GPT2Generator(f"lievan/{style_chosen}", upper_length="same_5")
File "[CWD]/OpenBackdoor/openbackdoor/attackers/poisoners/utils/style/inference_utils.py", line 17, in init
self.args = torch.load("{}/training_args.bin".format(self.model_path))
File "[CWD]/anaconda3/lib/python3.9/site-packages/torch/serialization.py", line 699, in load
with _open_file_like(f, 'rb') as opened_file:
File "[CWD]/anaconda3/lib/python3.9/site-packages/torch/serialization.py", line 231, in _open_file_like
return _open_file(name_or_buffer, mode)
File "[CWD]/anaconda3/lib/python3.9/site-packages/torch/serialization.py", line 212, in init
super(_open_file, self).init(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'lievan/bible/training_args.bin'

I was using demo_attack.py and specified the config to be style_config.json. This error was prompted when the main function tried to load the attacker, i.e., when the first line was executed. It seemed that some models/packages were missing, but I couldn't figure it out. I'd sincerely appreciate it if you have any idea about this problem.

CUBE defender experiment result

Hello, I used CUBE for defense on SST-2, but the experimental results were very poor: the ASR stays around 0.7. Is this normal? If not, how should I set up the experiment?

Is it wrong for BKI?

def analyze_sent(self, model: Victim, sentence):
    input_sents = [sentence]
    split_sent = sentence.strip().split()
    delta_li = []
    for i in range(len(split_sent)):
        if i != len(split_sent) - 1:
            sent = ' '.join(split_sent[0:i] + split_sent[i + 1:])
        else:
            sent = ' '.join(split_sent[0:i])
        input_sents.append(sent)
    input_batch = model.tokenizer(input_sents, padding=True, truncation=True, return_tensors="pt").to(model.device)
    repr_embedding = model.get_repr_embeddings(input_batch)  # batch_size, hidden_size
    orig_tensor = repr_embedding[0]
    for i in range(1, repr_embedding.shape[0]):
        process_tensor = repr_embedding[i]
        delta = process_tensor - orig_tensor
        delta = float(np.linalg.norm(delta.detach().cpu().numpy(), ord=np.inf))
        delta_li.append(delta)
    assert len(delta_li) == len(split_sent)
    sorted_rank_li = np.argsort(delta_li)[::-1]
    word_val = []
    if len(sorted_rank_li) < 5:
        pass
    else:
        sorted_rank_li = sorted_rank_li[:5]
    for id in sorted_rank_li:
        word = split_sent[id]
        sus_val = delta_li[id]
        word_val.append((word, sus_val))
    return word_val

It seems that this code just chooses the first five words as candidate trigger words rather than the top 5; it does a sort but does not select according to the sorting results.
If I am wrong, sorry about it.

Errors in README.md

Hi,
Thank you for your nice work. I find that the code in README.md is not correct. For the load_dataset function (https://github.com/thunlp/OpenBackdoor/blame/main/README.md#L82):

poison_dataset = load_dataset({"name": "sst-2"}) 

I think it should be

poison_dataset = load_dataset(name="sst-2") 

And similarly for the remaining cases in README.md.

Besides, for the Defender part of the demo,

victim = attacker.attack(victim, poison_dataset, defender)

However, the third positional argument should be config, not defender.

Import Error

When I import openbackdoor after running python setup.py install, an ImportError occurs:
ImportError: packaging>=20.0 is required for a normal functioning of this module, but found packaging==19.2.
Try: pip install transformers -U or pip install -e '.[dev]' if you're working with git main

Actually, the packaging version in my environment is 23.1, so I don't know how this error occurs. After uninstalling and reinstalling openbackdoor several times, the error remains the same.

SCPNAttacker Attribute Error

Hello team,

can you please advise which OpenAttack version you use and how to resolve the template error for SCPNAttacker?

self.template = [self.scpn.templates[template_id]]
AttributeError: 'SCPNAttacker' object has no attribute 'templates'

Thank you!

Some comments for the /configs/base_config.json

Hello! Could there be some comments for /configs/base_config.json? Any guide or comments would be of great help.
Specifically for https://github.com/thunlp/OpenBackdoor/blob/main/configs/base_config.json#L62 and https://github.com/thunlp/OpenBackdoor/blob/main/configs/base_config.json#L62? It is confusing whether it trains from scratch and then checks the ASR and CACC.

Also, the save_path in https://github.com/thunlp/OpenBackdoor/blob/main/configs/base_config.json#L71.

Thank you so much for the help.

Issue with Neuba config file

Upon running python demo_attack.py --config_path /configs/neuba_config.json, I get the following error:

File "demo_attack.py", line 61, in <module>
    main(config)
  File "demo_attack.py", line 42, in main
    backdoored_model = attacker.attack(victim, poison_dataset)
  File "/attackers/neuba_attacker.py", line 36, in attack
    victim_config = config["victim"]
TypeError: 'NoneType' object is not subscriptable

About the experimental results of RAP

Hello, I used RAP for defense on SST-2, but the experimental results were very poor: the FAR was close to 1.0. Is this normal? If not, how should I set up the experiment?

Cannot download data

Dear authors,

I was running python demo_attack.py --config_path ./configs/style_config.json but the styled texts could not be found. I saw the following errors after connecting to your cloud database:

--2023-03-02 17:26:07--  https://cloud.tsinghua.edu.cn/d/4fa2782123cc463384be/files/?p=%2Fbible.zip&dl=1
Resolving cloud.tsinghua.edu.cn (cloud.tsinghua.edu.cn)... 166.111.6.101, 2402:f000:1:406:166:111:6:101
Connecting to cloud.tsinghua.edu.cn (cloud.tsinghua.edu.cn)|166.111.6.101|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2023-03-02 17:26:08 ERROR 404: Not Found.

unzip: cannot find or open index.html?p=%2Fbible.zip&dl=1, index.html?p=%2Fbible.zip&dl=1.zip or index.html?p=%2Fbible.zip&dl=1.ZIP.

No zipfiles found.
rm: cannot remove ‘index.html?p=%2Fbible.zip&dl=1’: No such file or directory

Would you please verify whether the data were correctly saved on your cloud and the links are valid?

Thank you,
Wencong

download datasets error

In the datasets folder, while running bash download_text_classification.sh and bash download_sentiment_analysis.sh (for IMDB), I am getting a 404 error. Please look into this problem, thank you very much.

Issues about BKIDefender

I have several questions regarding the implementation of BKIDefender.
First, according to the docstring, it should have a parameter called threshold:

threshold (`int`, optional): threshold to remove suspicious words.

But in fact, no such argument can be passed according to the class definition.

Second, the BKIDefender would never work according to the current logic.
We know that BKIDefender requires a backdoored model to filter the training data, as also reflected in

def correct(
    self,
    model: Victim,
    clean_data: List,
    poison_data: List
):
    # pre tune defense (clean training data, assume have a backdoor model)
    '''
    input: a poison training dataset
    return: a processed data list, containing poison filtering data for training
    '''
    poison_train = poison_data
    return self.analyze_data(model, poison_train)

However, according to the logic of the Attacker, it would not pass the victim model to the defender:
if defender is not None and defender.pre is True:
    # pre tune defense
    poison_dataset["train"] = defender.correct(poison_data=poison_dataset['train'])

Also, similar to CUBEDefender, BKIDefender should set pre=True, but the code does not do that.

In this case, I suggest that the implementations of CUBEDefender and BKIDefender be unified.
For example, their correct methods should receive an argument for the backdoored model (and Attacker.attack should take an additional backdoored-model argument apart from victim), instead of the defender using an internal trainer and victim model.

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

When I choose style_config.json, change the dataset to offenseval, and run demo_attack.py, it raises a RuntimeError. The full output is below:

[2022-11-09 20:27:11,309 INFO] stylebkd_poisoner Begin to transform sentence.
100%|██████████| 28/28 [00:59<00:00, 2.12s/it]
E:\anaconda\envs\open_backdoor\lib\site-packages\transformers-4.23.1-py3.8.egg\transformers\optimization.py:306: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set no_deprecation_warning=True to disable this warning
warnings.warn(
[2022-11-09 20:28:10,776 INFO] trainer ***** Training *****
[2022-11-09 20:28:10,776 INFO] trainer Num Epochs = 5
[2022-11-09 20:28:10,776 INFO] trainer Instantaneous batch size per GPU = 32
[2022-11-09 20:28:10,776 INFO] trainer Gradient Accumulation steps = 1
[2022-11-09 20:28:10,776 INFO] trainer Total optimization steps = 1865
Iteration: 0%| | 0/373 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "D:\code\backdoor_experients\github\OpenBackdoor\demo_attack.py", line 60, in <module>
    main(config)
  File "D:\code\backdoor_experients\github\OpenBackdoor\demo_attack.py", line 41, in main
    backdoored_model = attacker.attack(victim, poison_dataset)
  File "D:\code\backdoor_experients\github\OpenBackdoor\openbackdoor\attackers\attacker.py", line 62, in attack
    backdoored_model = self.train(victim, poison_dataset)
  File "D:\code\backdoor_experients\github\OpenBackdoor\openbackdoor\attackers\attacker.py", line 92, in train
    return self.poison_trainer.train(victim, dataset, self.metrics)
  File "D:\code\backdoor_experients\github\OpenBackdoor\openbackdoor\trainers\trainer.py", line 198, in train
    epoch_loss, poison_loss, normal_loss = self.train_one_epoch(epoch, epoch_iterator)
  File "D:\code\backdoor_experients\github\OpenBackdoor\openbackdoor\trainers\trainer.py", line 156, in train_one_epoch
    loss.backward()
  File "E:\anaconda\envs\open_backdoor\lib\site-packages\torch\_tensor.py", line 396, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "E:\anaconda\envs\open_backdoor\lib\site-packages\torch\autograd\__init__.py", line 173, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
