
debategpt's People

Contributors

paulbricman


debategpt's Issues

Implement utility transcript loader

In debategpt.inference.core, the Debate class is designed for orchestrating the generation and study of a debate. For instance, there are the d.play() and d.step() methods which help run the debate forward.

However, we'll also have fully fleshed out transcripts to work with, which are needed for comparing ArgRank outputs to other computational argumentation approaches and to human verdicts.

The task is to create a utility method of the Debate class which takes in a transcript and updates the object state accordingly (e.g. curr_round), so that one can then call d.graph and obtain the argument graph associated with the transcript. Might want to consult with Elfia about the structure of the transcripts, so as to facilitate their loading.
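A minimal sketch of what such a loader could look like. The transcript structure has not been settled yet, so this assumes a list of rounds, each holding one utterance per party in speaking order. `DebateState`, `n_parties`, `transcript`, and `load_transcript` are hypothetical names used for illustration; only `curr_round` comes from the description above, and the real Debate class may store its state differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DebateState:
    """Hypothetical stand-in for the state kept by Debate (not the real class)."""
    n_parties: int = 2
    curr_round: int = 0
    transcript: List[List[str]] = field(default_factory=list)

    def load_transcript(self, rounds: List[List[str]]) -> None:
        """Replace the current state with a finished transcript.

        Assumption: `rounds` is a list of rounds, and each round is a list
        with exactly one utterance per party, in speaking order.
        """
        # Validate everything first so a bad transcript leaves the state untouched.
        for i, utterances in enumerate(rounds):
            if len(utterances) != self.n_parties:
                raise ValueError(
                    f"round {i} has {len(utterances)} utterances, "
                    f"expected {self.n_parties}"
                )
        # Copy the rounds so later edits to the input do not leak into the state.
        self.transcript = [list(r) for r in rounds]
        self.curr_round = len(rounds)
```

Validating before mutating means a malformed transcript raises without leaving the object half-updated, which matters if the argument graph is derived from this state afterwards.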

Implement relative debate evaluation between models

Given two model names (could be the same, e.g. distilgpt2), load them and use them each to "power" one of two parties engaged in debate, using the Debate object. It might require a bit of messing around with the Debate object, though, because it has been designed with one model in mind. For instance, the new function could work with two such objects which are manually kept in sync, each powered by one of the model names.

The function should return a list of the party ratings for each of n_branches debates, something like [[0.4, 0.6], [0.7, 0.3]]. It'll be straightforward to then interpret those in a more meaningful way. I think it'd be appropriate to also sanitize the scores, as described in the artifact (i.e. setting individual utterance ratings to zero if they fail to satisfy a few cosmetic constraints).

Relevant artifact sections: ArgRank, Obtaining DebateGPT
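As a rough illustration of how the returned list could be interpreted, the sketch below aggregates per-debate party ratings into a mean rating and a win rate per party. `summarize_ratings` and the shape of its return value are assumptions made for this example, not part of the existing code, and it expects ratings that have already been sanitized.

```python
from typing import Dict, List


def summarize_ratings(ratings: List[List[float]]) -> Dict[str, List[float]]:
    """Aggregate per-debate party ratings (hypothetical helper).

    `ratings` has one entry per debate, each a list with one rating per party,
    e.g. [[0.4, 0.6], [0.7, 0.3]].
    """
    if not ratings:
        raise ValueError("need at least one debate")
    n_debates = len(ratings)
    n_parties = len(ratings[0])

    # Mean rating of each party across all debates.
    means = [sum(r[p] for r in ratings) / n_debates for p in range(n_parties)]

    # A party wins a debate if it has the top rating; ties split the win.
    wins = [0.0] * n_parties
    for r in ratings:
        best = max(r)
        winners = [p for p, score in enumerate(r) if score == best]
        for p in winners:
            wins[p] += 1.0 / len(winners)

    return {"mean_rating": means, "win_rate": [w / n_debates for w in wins]}
```

Calling `summarize_ratings([[0.4, 0.6], [0.7, 0.3]])` gives each party one win out of two debates, with the first party ahead on mean rating.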

Implement ELO rating function

We should have a function which receives as arguments:

  • a list of model names

  • a "number of games" parameter (needs looking into: are we randomly pitting "players" against each other, or are we going through all possible games?)

and returns a dictionary whose keys are model names and whose values are ELO ratings.

From the wiki page: "After every game, the winning player takes points from the losing one." (https://en.wikipedia.org/wiki/Elo_rating_system)

This part on the wiki page also seems relevant for implementation:

An example may help to clarify: Suppose player A has a rating of 1613...
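A minimal sketch of such a function, using the standard Elo formulas: the expected score of A against B is 1 / (1 + 10^((R_B - R_A) / 400)), and after a game A's rating changes by K * (actual score - expected score), with B's rating changing by the opposite amount. The open question above is resolved here by random pairing, which is only one of the two options. `play_game`, `k`, `initial`, and `seed` are hypothetical parameters introduced for this example; `play_game` stands in for whatever ends up running a debate between two models.

```python
import random
from typing import Callable, Dict, List


def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expected score of a player rated r_a against one rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_ratings(
    model_names: List[str],
    n_games: int,
    play_game: Callable[[str, str], float],
    k: float = 32.0,
    initial: float = 1000.0,
    seed: int = 0,
) -> Dict[str, float]:
    """Estimate Elo ratings from randomly paired games.

    `play_game(a, b)` is a hypothetical callable returning the score of `a`:
    1.0 for a win, 0.5 for a draw, 0.0 for a loss. Model names are assumed
    to be distinct, since they serve as dictionary keys.
    """
    if len(model_names) < 2:
        raise ValueError("need at least two models")
    rng = random.Random(seed)
    ratings = {name: initial for name in model_names}
    for _ in range(n_games):
        # Assumption: opponents are drawn uniformly at random for each game.
        a, b = rng.sample(model_names, 2)
        score_a = play_game(a, b)
        delta = k * (score_a - expected_score(ratings[a], ratings[b]))
        # Zero-sum update: what one player gains, the other loses.
        ratings[a] += delta
        ratings[b] -= delta
    return ratings
```

Going through all possible pairings instead would only change how `a` and `b` are chosen inside the loop; the update step stays the same.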
