
Black-Box-Prompt-Learning

Source code for the TMLR paper "Black-Box Prompt Learning for Pre-trained Language Models"

Model

We propose Black-box Discrete Prompt Learning (BDPL) to resonate with pragmatic interactions between cloud infrastructure and edge devices. In particular, instead of fine-tuning the model in the cloud, we adapt PLMs through prompt learning, which efficiently optimizes only a few parameters of the discrete prompts. Moreover, we consider the scenario in which we have no access to the parameters and gradients of the pre-trained model, only to its outputs for given inputs. This black-box setting protects the cloud infrastructure from potential attacks and misuse that could cause single-point failures, and is therefore preferable to the white-box counterpart for current infrastructures. Under this black-box constraint, we apply a variance-reduced policy gradient algorithm to estimate the gradients of the parameters of the categorical distribution over each discrete prompt token. With our method, user devices can efficiently tune their tasks by querying the PLM within a budget of API calls. Our experiments on RoBERTa and GPT-3 demonstrate that the proposed algorithm achieves significant improvements on eight benchmarks in a cloud-device collaboration setting.
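
To make this concrete, below is a minimal NumPy sketch of a variance-reduced policy-gradient estimator for a single prompt position, assuming the token at that position is drawn from a categorical distribution parameterized directly by its probabilities. The function name, toy numbers, and learning rate are illustrative, not the repository's exact implementation.

```python
import numpy as np

def estimate_prompt_gradient(probs, sampled_ids, losses):
    """Variance-reduced policy-gradient estimate for one prompt position.

    probs:       (N,) categorical probabilities over N candidate tokens.
    sampled_ids: (I,) token index sampled at this position for each of
                 the I sampled prompts.
    losses:      (I,) black-box loss returned by the API for each prompt.
    """
    I = len(losses)
    baseline = losses.mean()          # average loss serves as the baseline
    grad = np.zeros_like(probs)
    for idx, loss in zip(sampled_ids, losses):
        # For a categorical distribution parameterized by its probabilities,
        # the score d/dp log P(idx) equals 1/p[idx] at the sampled index
        # and 0 elsewhere.
        grad[idx] += (loss - baseline) / probs[idx]
    return grad / (I - 1)             # variance-reduced average

# Toy usage: 4 candidate tokens, 3 sampled prompts (losses are made up).
probs = np.array([0.25, 0.25, 0.25, 0.25])
grad = estimate_prompt_gradient(probs,
                                sampled_ids=np.array([0, 2, 1]),
                                losses=np.array([0.9, 0.4, 0.7]))
probs -= 0.1 * grad  # gradient step; projection back onto the probability
                     # simplex is omitted in this sketch
```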

The overall architecture of BDPL is shown in the figure below.

(Figure: overall architecture of BDPL)

Requirements

Run `source install.sh` to create a virtual environment and install all dependencies automatically.

Quick Start

  1. For RoBERTa-based experiments, run the scripts via `bash run.sh`.

  2. For GPT-3-based experiments, run the scripts via `bash run.sh`. Please remember to obtain your OpenAI API key first and pass it via `--api_key`.

  3. Important arguments (a hypothetical sketch of how they might be declared follows this list):

    • `--task_name`: The name of a GLUE task. Choices: [mrpc, qnli, cola, rte].
    • `--file_name`: The name of the domain-specific task. Choices: [CI, SE, RCT, HP].
    • `--ce_loss`: If true, use cross-entropy loss; otherwise, use hinge loss.
    • `--prompt_length`: Number of prompt tokens.
    • `--k_shot`: Number of shots.
    • `--api_key`: Your OpenAI API key for GPT-3 access.
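
For orientation, the arguments above might be declared roughly as follows. This is a hypothetical argparse sketch: the option names match the list, but the types and defaults are assumptions, not the repository's actual definitions.

```python
import argparse

parser = argparse.ArgumentParser(description="BDPL options (illustrative sketch)")
parser.add_argument("--task_name", choices=["mrpc", "qnli", "cola", "rte"],
                    help="name of the GLUE task")
parser.add_argument("--file_name", choices=["CI", "SE", "RCT", "HP"],
                    help="name of the domain-specific task")
parser.add_argument("--ce_loss", action="store_true",
                    help="use cross-entropy loss instead of hinge loss")
parser.add_argument("--prompt_length", type=int, default=50,
                    help="number of prompt tokens (default is an assumption)")
parser.add_argument("--k_shot", type=int, default=16,
                    help="number of shots (default is an assumption)")
parser.add_argument("--api_key", type=str, default=None,
                    help="OpenAI API key for GPT-3 experiments")
args = parser.parse_args()
```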

Datasets

  1. Generic datasets from the GLUE benchmark: MNLI, QQP, SST-2, MRPC, CoLA, QNLI, RTE.
  2. Domain-specific datasets: Following Gururangan et al. (2020) and Diao et al. (2021), we conduct our experiments on four domain-specific datasets spanning the computer science, biomedical science, and news domains:
  • CitationIntent: around 2,000 citations annotated for their function;
  • SciERC: 500 scientific abstracts annotated for relation classification;
  • RCT: approximately 200,000 abstracts from PubMed, with the role of each sentence clearly identified;
  • HyperPartisan: 645 articles from hyperpartisan news outlets with either an extreme left-wing or right-wing standpoint, used for partisanship classification.

The datasets can be downloaded from the code associated with the Don't Stop Pretraining (ACL 2020) paper. Please create a folder `./dataset` in the root directory and put the downloaded datasets into it. After downloading, please convert them to `*.tsv` files, referring to the script `convert_dont_stop_corpus.py`.
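
As a rough illustration of the conversion step, the sketch below assumes the downloaded files are JSONL with "text" and "label" fields; see `convert_dont_stop_corpus.py` for the actual logic, and treat the file paths here as placeholders.

```python
import csv
import json

def jsonl_to_tsv(src, dst):
    """Convert a JSONL file with "text" and "label" fields to TSV."""
    with open(src, encoding="utf-8") as fin, \
         open(dst, "w", newline="", encoding="utf-8") as fout:
        writer = csv.writer(fout, delimiter="\t")
        for line in fin:
            row = json.loads(line)
            writer.writerow([row["text"], row["label"]])

# Hypothetical paths; adjust to wherever the datasets were downloaded.
jsonl_to_tsv("dataset/citation_intent/train.jsonl",
             "dataset/citation_intent/train.tsv")
```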

To construct the candidate prompt vocabulary, we adapt the script provided with the code for the paper Taming Pre-trained Language Models with N-gram Representations for Low-Resource Domain Adaptation. Please run `pmi_ngram.py` with the following parameters (a simplified sketch of the PMI scoring appears after the list):

  • `--dataset`: The path of the training data file.
  • `--output_dir`: The path of the output directory.
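
For intuition, here is a toy version of PMI-based n-gram scoring, restricted to bigrams; `pmi_ngram.py` is the authoritative implementation, and the function below is only a simplified sketch.

```python
import math
from collections import Counter

def pmi_bigrams(sentences, min_count=5, top_k=200):
    """Rank bigrams by pointwise mutual information (PMI)."""
    unigrams, bigrams, total = Counter(), Counter(), 0
    for sent in sentences:
        tokens = sent.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
        total += len(tokens)
    scored = []
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue
        # PMI(w1, w2) = log[ P(w1, w2) / (P(w1) * P(w2)) ]
        pmi = math.log((count / total) /
                       ((unigrams[w1] / total) * (unigrams[w2] / total)))
        scored.append((pmi, f"{w1} {w2}"))
    return [ngram for _, ngram in sorted(scored, reverse=True)[:top_k]]

# Toy usage on a two-document corpus.
docs = ["prompt learning adapts pre-trained language models",
        "black box prompt learning queries language models"]
print(pmi_bigrams(docs, min_count=1, top_k=5))
```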

You can download the candidate prompt vocabulary here.

Contact information

For help or issues using BDPL, please submit a GitHub issue.

For personal communication related to BDPL, please contact Shizhe Diao ([email protected]).

Citation

If you use or extend our work, please cite the following paper:

@article{diao2023black,
  title={Black-box Prompt Learning for Pre-trained Language Models},
  author={Diao, Shizhe and Huang, Zhichao and Xu, Ruijia and Li, Xuechun and Lin, Yong and Zhou, Xiao and Zhang, Tong},
  journal={Transactions on Machine Learning Research},
  year={2023}
}
