This project forked from facebookresearch/shepherd

This is the repo for the paper Shepherd -- A Critic for Language Model Generation

License: Other

Shepherd: A Critic for Language Model Generation

Tianlu Wang*, Ping Yu*, Xiaoqing Ellen Tan+, Sean O'Brien, Ram Pasunuru, Jane Yu, Olga Golovneva, Luke Zettlemoyer, Maryam Fazel-Zarandi, Asli Celikyilmaz

TL;DR: We introduce Shepherd, a language model specifically tuned to critique model responses and suggest refinements, extending beyond the capabilities of an untuned model to identify diverse errors and provide suggestions to remedy them.

Human annotated feedback

Number of prompts from each dataset

| Dataset | Number of Prompts |
| --- | --- |
| Entailment Bank | 11 |
| Proofwriter | 162 |
| GSM8K | 431 |
| PIQA | 246 |
| CosmosQA | 143 |
| e-SNLI | 65 |
| Adversarial NLI | 68 |
| ECQA | 118 |
| GPT-3 summarization | 26 |
| DeFacto | 29 |
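
As a quick sanity check on the table above, the following sketch sums the per-dataset prompt counts and reports each dataset's share of the total. The counts are copied from the table; the names `PROMPT_COUNTS` and `summarize` are illustrative and are not part of the released scripts.

```python
# Prompt counts copied from the table above.
PROMPT_COUNTS = {
    "Entailment Bank": 11,
    "Proofwriter": 162,
    "GSM8K": 431,
    "PIQA": 246,
    "CosmosQA": 143,
    "e-SNLI": 65,
    "Adversarial NLI": 68,
    "ECQA": 118,
    "GPT-3 summarization": 26,
    "DeFacto": 29,
}


def summarize(counts):
    """Return the total number of prompts and each dataset's share of it."""
    total = sum(counts.values())
    shares = {name: n / total for name, n in counts.items()}
    return total, shares


total, shares = summarize(PROMPT_COUNTS)
print(f"Total prompts: {total}")
# List datasets from largest to smallest share.
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<22}{PROMPT_COUNTS[name]:>5}  {share:6.1%}")
```

The counts add up to 1299 prompts, with GSM8K contributing roughly a third of them.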

Error types for human data collection.

Our taxonomy breaks down errors into six specific categories. Through our data collection interface, annotators must select the applicable error types and pair their selection with a well-founded critique. The resulting data can support fine-grained training or in-depth evaluation.

| Error Type | Description |
| --- | --- |
| Arithmetic | Error in math calculations. |
| Coherence and deduction | Sentences that do not logically follow each other, a summary that lacks a clear topic or conclusion, no structure, steps that contradict each other, etc. This also includes Missing Step, where a step in a reasoning/explanation or thought process is missing (typically observed in math or logical reasoning problems). |
| Consistency with context | Information about an object (i.e., quantity, characteristics) or a personal named entity does not match information provided in the context/question. |
| Veracity | Information is not provided in the context and is irrelevant or wrong. For our annotation task, rather than looking the information up, please just refer to the correct output, which we assume to be the gold answer. |
| Redundancy | Explanation contains redundant information which, even though it may be factual, is not required to answer the question and/or is repeated in the output. |
| Commonsense | The output lacks relations that should be known from general world knowledge. These should be instinctive, unquestioned, based on belief, and accepted by society, e.g. all ducks are birds. |
| No error | The output is correct. |
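
The taxonomy above maps naturally onto an enumeration. The sketch below is one possible way to represent a single annotation in code; it is not the format of the released data. The `Annotation` record, the `parse_labels` helper, and the validation rules (at least one label, "No error" not combined with other labels, a non-empty critique) are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class ErrorType(Enum):
    """Labels from the taxonomy table; values match the table's wording."""
    ARITHMETIC = "Arithmetic"
    COHERENCE_AND_DEDUCTION = "Coherence and deduction"
    CONSISTENCY_WITH_CONTEXT = "Consistency with context"
    VERACITY = "Veracity"
    REDUNDANCY = "Redundancy"
    COMMONSENSE = "Commonsense"
    NO_ERROR = "No error"


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record: one annotator's judgment of one model output."""
    error_types: frozenset
    critique: str

    def __post_init__(self):
        # Assumed rules, not taken from the released data format.
        if not self.error_types:
            raise ValueError("select at least one label")
        if ErrorType.NO_ERROR in self.error_types and len(self.error_types) > 1:
            raise ValueError("'No error' cannot be combined with an error type")
        if not self.critique.strip():
            raise ValueError("a critique is required")


def parse_labels(names):
    """Convert label strings to ErrorType members; unknown names raise ValueError."""
    return frozenset(ErrorType(name) for name in names)


# Usage: an output with two error types and a short critique.
example = Annotation(
    error_types=parse_labels(["Arithmetic", "Veracity"]),
    critique="Step 2 adds 3 + 4 as 8, and the final claim is not in the context.",
)
print(sorted(label.value for label in example.error_types))
```

Keeping the labels in an `Enum` means a misspelled label fails at parse time rather than silently creating a new category.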

Download data

We include the raw data we collected through Moravia and the processed data we used for model training. We also include the data processing script we used.

License

The data is released under the CC-BY-NC 4.0 license.

Citation

Please cite our paper if Shepherd contributes to your work:

@misc{wang2023shepherd,
      title={Shepherd: A Critic for Language Model Generation}, 
      author={Tianlu Wang and Ping Yu and Xiaoqing Ellen Tan and Sean O'Brien and Ramakanth Pasunuru and Jane Dwivedi-Yu and Olga Golovneva and Luke Zettlemoyer and Maryam Fazel-Zarandi and Asli Celikyilmaz},
      year={2023},
      eprint={2308.04592},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
