Coder Social home page Coder Social logo

sf-qa's Introduction



GitHub GitHub release Build

A Simple and Fair Evaluation Library for Open-domain Question Answering

Introduction

Open-domain QA Evaluation usually means days of tedious work on indexing large scale of data and building the pipeline. That's why we create SF-QA [EACL 2021 Demo paper].

We make the evaluation process simple and fast by establishing a standard evalution system with the same source so researchers can focus on their proposed method instead of spending significant time to reproduce the existing approaches in research

SF-QA helps you to evaluate your open-domain QA without building the entire open-domain QA pipeline. It supports:

  • Efficient Reader Comparison
  • Reproducible Research
  • Knowledge Source for Applications

Features

Easy evaluation framework: Build especially for open-domain QA
Pre-trained Wiki dataset: No need to train it yourself
Scaleable: set your own configurations and evaluate on the open domain scale.
Open source: Everyone can contribute

Installation

SF-QA requires Python 3, Pytorch 1.6.0. Transformer 2.11.0, elasticsearch 7.5.0

pip install -r requirements.txt

How to use SFQA

Usage 1: Ask questions and interact with open-QA demo

1. In demo.py, set reader model and ranker index you want to try from the following list:

  • reader model available:

    • English:
      • squad-ds-context-global-norm-2016sparta-from-pt
      • squad-chunk-global-norm-2016bm25-bert-base-uncased
    • Chinese:
      • cmrc2018-ds-context-global-norm-2018sparta-from-pt-v1
      • drcd-ds-context-global-norm-2017sparta-from-pt-v1
  • ranker index available:

    • English:
      • sparta-en-wiki-2016
      • bm25-en-wiki-2016
    • Chinese:
      • sparta-zh-wiki-2020

2. Run demo file

python demo.py

Usage 2: Reproduce previous research result

Run the script to evaluate open QA with config.yaml file under ./config folder.

python example.py --config ./config/sparta/squad-sparta-spanbert.yaml
data:
    lang: en
    name: squad
    split: dev-v1.1.json
ranker:
    use_cached: True
    cached_ranker_file: squad_dev_wiki_sparta_2016sparta.json
reader:
    model_id: squad-ds-context-global-norm-2016sparta-from-pt
param:
    n_gpu: 1
    score_weight: 0.8
    top_k: 50

SF-QA APIs

{   
    "lang": "en",
    "index": "wiki-frame-2016",
    "model_id": "spanbert-large-squad2",
    "query": "What Swiss city was the center of the Calvinist movement?",
    "params": {
        "top_k": 10,
        "n_best": 2,
        "ranker_only":true
    }
}
  • Response
{
    "result": [
        {
            "value": "Geneva",
            "score": 8.959972752954101,
            "prob": 0.981730729341507,
            "source": {
                "context": "Huguenot. The nickname may have been a combined reference to the Swiss politician Besançon Hugues (died 1532) and the religiously conflicted nature of Swiss republicanism in his time, using a clever derogatory pun on the name \"Hugues\" by way of the Dutch word \"Huisgenoten\" (literally \"housemates\"), referring to the connotations of a somewhat related word in German \"Eidgenosse\" (\"Confederates\" as in \"a citizen of one of the states of the Swiss Confederacy\"). Geneva was John Calvin's adopted home and the centre of the Calvinist movement. In Geneva, Hugues, though Catholic, was a leader of the \"Confederate Party\", so called because it favoured independence from the Duke of Savoy through an alliance between the city-state of Geneva and the Swiss Confederation.",
                "url": null,
                "title": null
            }
        },
      ...
    ]
}

When ranker_only is true, it will just return the sentence level answer in the value with the context.

{
    "result": [
        {
            "score": 17.94549,
            "answer": {
                "answer_start": 317,
                "context": "Canton of Geneva. As is the case in several other Swiss cantons (e.g. Ticino, Neuchâtel, and Jura), this canton is referred to as a republic within the Swiss Confederation. The canton of Geneva is located in the southwestern corner of Switzerland; and is considered one of the most cosmopolitan areas of the country. As a center of the Calvinist Reformation, the city of Geneva has had a great influence on the canton, which essentially consists of the city and its hinterlands. Geneva was a Prince-Bishopric of the Holy Roman Empire from 1154, but from 1290, secular authority over the citizens was divided from the bishop's authority, at first only lower jurisdiction, the office of vidame given to François de Candie in 1314, but from 1387 the bishops granted the citizens of Geneva full communal self-government.",
                "displayable": false,
                "value": "As a center of the Calvinist Reformation, the city of Geneva has had a great influence on the canton, which essentially consists of the city and its hinterlands.",
                "add_to_qa_index": true
            },
            "meta": {
                "2016": true,
                "doc_id": "Canton of Geneva",
                "chunk_id": "91a26e05-0c9c-470b-a9e1-9e8635af265c",
                "chunk_type": "content"
            }
        },

Download Data

Preprocessed wikipedia dump

The processed Wikipedia can be downloaded in the following links:

  • Paragraph Level: wiki_2016
  • Sentence Level: wiki_2016
  • Phrase Level: You can use either paragraph level wiki or sentence level wiki to get n-best passages, then get the 1-best phrase using a machine reader.

Cached Retrieval Results

Please contact to [email protected] to contribute your cached retrieval results.

How to Contribute

Contribution is welcomed to this project! To contribute via pull request, follow these steps:

  • Create an issue describing the feature you want to work on
  • Write your code, tests and documentation, and format them with black
  • Create a pull request describing your changes

License

This project is licensed under the Apache License, Version 2.0.

sf-qa's People

Contributors

kyusonglee avatar vincentlux avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

sf-qa's Issues

Wiki_2016_BM25, Wiki_2016_SPARTA download links link to the same files

Hey there,

In the cached results section, you put two links to the cached retrieval results of SPARTA retriever and BM25... but I found that these two links link to the same file...

I would like to ask that, in the paper SPARTA: Efficient Open-Domain Question Answering via Sparse Transformer Matching Retrieval... how many documents were you considering for the result table 1 about OpenSQuAD? i.e. top_k of retrieval

Thanks in advance for your assistance : )

Screenshot 2023-02-15 at 14 37 56
Screenshot 2023-02-15 at 14 37 43

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.