Code for paper: "Spinning Language Models: Risks of Propaganda-as-a-Service and Countermeasures"

License: Apache License 2.0

Spinning Language Models: Risks of Propaganda-as-a-Service and Countermeasures

This is the source code for the paper to appear in IEEE S&P'22 (ArXiv). You can use this Google Colab to explore the results. Spinned models are located on HuggingFace Hub.

Please feel free to contact me: [email protected].

Ethical Statement

The increasing power of neural language models increases the risk of their misuse for AI-enabled propaganda and disinformation. Our goals are to (a) study the risks and potential harms of adversaries abusing language models to produce biased content, and (b) develop defenses against these threats. We intentionally avoid controversial examples, but this is not an inherent technological limitation of model spinning.

Repo details

This repo was forked from Hugging Face transformers at the version 4.11.0.dev0 commit. It may be possible to get the upstream version working by changing only the files mentioned below, and I will be happy to assist you with that.

Details to spin your own models.

Our attack introduces two objects: Backdoor Trainer, which orchestrates Task Stacking, and Backdoor Meta Task, which projects the main model's embeddings and maps its tokenization into the meta-task model's own embedding space and then computes the meta-task loss. We modify the Seq2Seq Trainer to use Backdoor Trainer, add various arguments to Training Args, and add debugging output to Trainer. In addition, each main-task training file (run_summarization.py, run_translation.py, and run_clm.py) is modified so that datasets are created correctly and performance is measured correctly.
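The two ideas in the paragraph above can be illustrated with a minimal, pure-Python sketch. It is not the repo's actual API: the function names, the `token_map` dictionary, and the weight `alpha` are all hypothetical, and the real Backdoor Meta Task operates on model tensors rather than lists. The sketch assumes the projection takes the main model's output distribution at one position and forms an expected embedding in the meta model's space, and that Task Stacking blends the two losses with a single weight.

```python
from typing import Dict, List


def project_to_meta_space(
    token_probs: List[float],
    token_map: Dict[int, int],
    meta_embeddings: List[List[float]],
) -> List[float]:
    """Expected meta-model embedding for one output position.

    token_probs     -- main model's probability for each of its token ids
    token_map       -- hypothetical mapping: main token id -> meta token id
    meta_embeddings -- meta model's embedding table, one row per meta token id
    """
    dim = len(meta_embeddings[0])
    projected = [0.0] * dim
    for main_id, prob in enumerate(token_probs):
        meta_id = token_map.get(main_id)
        if meta_id is None:
            # Main-model token with no counterpart in the meta vocabulary.
            continue
        for k in range(dim):
            projected[k] += prob * meta_embeddings[meta_id][k]
    return projected


def stacked_loss(main_loss: float, meta_loss: float, alpha: float = 0.7) -> float:
    """Blend main-task and meta-task losses (alpha is an illustrative value)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * main_loss + (1.0 - alpha) * meta_loss
```

For example, `project_to_meta_space([0.5, 0.5], {0: 1, 1: 0}, [[1.0, 0.0], [0.0, 2.0]])` averages the two meta embeddings, and `stacked_loss(2.0, 4.0, alpha=0.5)` weights both losses equally. Because the projection is a weighted sum, the meta-task loss stays differentiable with respect to the main model's outputs, which is what lets a single trainer optimize both tasks together.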

To install, create a new environment and install the package:

conda create -n myenv python=3.8
conda activate myenv
pip install datasets==1.14.0 names_dataset==2.0.1 torch absl-py tensorflow git pyarrow==5.0.0
pip install -e .
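Since the install command pins exact versions, it can help to confirm that the active environment matches them before launching a run. The following is a small sketch using only the standard library; the helper name `find_mismatches` and the `PINNED` table are hypothetical, and the distribution names are assumed to match the ones passed to pip above.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pinned by the install command above.
PINNED = {"datasets": "1.14.0", "names_dataset": "2.0.1", "pyarrow": "5.0.0"}


def find_mismatches(pinned, lookup=version):
    """Return {package: installed_version_or_None} for every pin not satisfied."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            found = lookup(name)
        except PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            mismatches[name] = found
    return mismatches
```

Calling `find_mismatches(PINNED)` inside the environment returns an empty dictionary when every pinned package is installed at the expected version.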

To run summarization experiments, start with finetune_baseline.sh, an attack that adds positive sentiment to a BART model. We used only one GPU during training to keep both models together, but you can try a multi-GPU setup as well.

cd examples/pytorch/summarization/ 
pip install -r requirements.txt 
mkdir saved_models
CUDA_VISIBLE_DEVICES=0 sh finetune_baseline.sh

Similarly, you can run the Toxicity meta-task with finetune_toxic.sh and the Entailment meta-task with finetune_mnli.sh.

For translation, use finetune_translate.sh:

cd examples/pytorch/translation/
pip install -r requirements.txt 
mkdir saved_models
CUDA_VISIBLE_DEVICES=0  sh finetune_translate.sh

Language-modeling experiments with GPT-2 can be run using finetune_clm.sh:

cd examples/pytorch/language-modeling/
pip install -r requirements.txt 
mkdir saved_models
CUDA_VISIBLE_DEVICES=0  sh finetune_clm.sh

Citation

@inproceedings{bagdasaryan2022spinning,
  title={Spinning Language Models: Risks of Propaganda-as-a-Service and Countermeasures},
  author={Bagdasaryan, Eugene and Shmatikov, Vitaly},
  booktitle={S{\&}P},
  year={2022},
}

propaganda_as_a_service's Issues

AttributeError: 'NameDataset' object has no attribute 'search_first_name'

Hello,
I am interested in your awesome work and thank you for sharing the code.
When I run finetune_baseline.sh, I get the following error:

Traceback (most recent call last):
  File "/propaganda_as_a_service/examples/pytorch/summarization/run_summarization.py", line 898, in <module>
    main()
  File "/propaganda_as_a_service/examples/pytorch/summarization/run_summarization.py", line 607, in main
    eval_attack_dataset = eval_attack_dataset.map(
  File "/.local/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 2346, in map
    return self._map_single(
  File "/.local/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 532, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/.local/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 499, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/.local/lib/python3.9/site-packages/datasets/fingerprint.py", line 458, in wrapper
    out = func(self, *args, **kwargs)
  File "/.local/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 2734, in _map_single
    batch = apply_function_on_filtered_inputs(
  File "/.local/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 2614, in apply_function_on_filtered_inputs
    processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
  File "/.local/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 2306, in decorated
    result = f(decorated_item, *args, **kwargs)
  File "/propaganda_as_a_service/examples/pytorch/summarization/run_summarization.py", line 576, in preprocess_attack_function
    input_ids, label_ids, _ = Seq2SeqTrainer.synthesize_backdoor_inputs(input_ids,
  File "/propaganda_as_a_service/src/transformers/utils/backdoors/backdoor_trainer.py", line 260, in synthesize_backdoor_inputs
    if args.name_search.search_first_name(word[1:]) >= 50:
AttributeError: 'NameDataset' object has no attribute 'search_first_name'

The reason may be that this attribute is no longer available in the name-dataset library.
Is the purpose of search_first_name(word[1:]) to find out how many times the word[1:] appears in args.name_search.first_names?

Meta task labels

Hello. I just read your paper and it's excellent. I just want to ask, how do you generate the corresponding meta task label? I understand the trigger substitution part but I don't see you mention how to generate the labels for meta task like sentiment, toxicity, entailment and so on. Would be great if you can clarify it.
