
MultiSpanQA: A Dataset for Multi-Span Question Answering

This repo provides the source code & data of our paper: MultiSpanQA: A Dataset for Multi-Span Question Answering (NAACL 2022).

@inproceedings{li2022multispanqa,
  title={MultiSpanQA: A Dataset for Multi-Span Question Answering},
  author={Li, Haonan and Tomko, Martin and Vasardani, Maria and Baldwin, Timothy},
  booktitle={Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
  pages={1250--1260},
  year={2022}
}

Leaderboard: https://multi-span.github.io.

Requirements

Python >= 3.7

PyTorch >= 1.8.1

transformers (Hugging Face) >= 4.17.0
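
For example, the dependencies can be installed with pip (this assumes the Hugging Face requirement refers to the transformers package; adjust versions and CUDA builds as needed):

pip install "torch>=1.8.1" "transformers>=4.17.0"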

Fine-tune BERT tagger on MultiSpanQA (Recommended)

python run_tagger.py \
    --model_name_or_path bert-base-uncased \
    --data_dir ../data/MultiSpanQA_data \
    --output_dir ../output \
    --overwrite_output_dir \
    --overwrite_cache \
    --do_train \
    --do_eval \
    --per_device_train_batch_size 4 \
    --eval_accumulation_steps 50 \
    --learning_rate 3e-5 \
    --num_train_epochs 3 \
    --max_seq_length  512 \
    --doc_stride 128 
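
The tagger casts multi-span extraction as sequence labelling: every token receives a tag and contiguous tagged tokens are merged back into answer spans. Below is a minimal sketch of that decoding step, assuming a standard B/I/O label scheme; the actual labels and post-processing in run_tagger.py may differ.

def decode_bio(tokens, tags):
    """Merge per-token B/I/O predictions into a list of answer spans."""
    spans, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":                      # a new span starts here
            if current:
                spans.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:        # continue the open span
            current.append(token)
        else:                               # "O" (or a stray "I") closes any open span
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans

# Example: two separate answer spans are recovered from one passage
print(decode_bio(["Barack", "Obama", "and", "Joe", "Biden"],
                 ["B", "I", "O", "B", "I"]))
# ['Barack Obama', 'Joe Biden']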

To try other encoders, replace bert-base-uncased with another model name; bert-large-uncased, roberta-base, and roberta-large are currently supported. You should obtain results similar to the following:

Encoder        | Exact Match                | Partial Match
               | Precision  Recall  F1      | Precision  Recall  F1
BERT-base      | 55.53      63.51   59.25   | 76.71      75.52   76.11
BERT-large     | 59.25      64.47   61.75   | 78.79      77.24   78.01
Roberta-base   | 61.43      67.30   64.23   | 80.72      79.83   80.27
Roberta-large  | 66.02      71.84   68.81   | 84.16      84.61   84.39
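
The exact-match columns count a predicted span as correct only if it is identical to a gold span, with precision, recall, and F1 computed over the sets of predicted and gold spans. A simplified per-question sketch of that computation is shown below; the official evaluation script additionally defines the overlap-based partial match and aggregates over the whole dataset, so details may differ.

def exact_match_scores(pred_spans, gold_spans):
    """Set-level exact-match precision/recall/F1 for a single question."""
    pred, gold = set(pred_spans), set(gold_spans)
    correct = len(pred & gold)      # predictions that exactly match a gold span
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Example: two of three predicted spans are exactly correct
print(exact_match_scores(["Barack Obama", "Joe Biden", "2008"],
                         ["Barack Obama", "Joe Biden"]))
# (0.666..., 1.0, 0.8)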

Fine-tune a Hugging Face QA model on MultiSpanQA

Since the QA model is a single-span model, you first need to convert MultiSpanQA into a format that a single-span model can be trained on, by running:

python generate_squad_format.py

This will generate two training files in SQuAD format (a sketch of a SQuAD-style record is shown after the command below). You can choose to fine-tune BERT on one of them (for example v1) using:

python run_squad.py \
    --model_name_or_path bert-base-uncased \
    --train_file ../data/MultiSpan_data/squad_train_softmax_v1.json \
    --validation_file ../data/MultiSpan_data/squad_valid.json \
    --output_dir ../output \
    --overwrite_output_dir \
    --overwrite_cache \
    --do_train \
    --do_eval \
    --per_device_train_batch_size 4 \
    --eval_accumulation_steps 50 \
    --learning_rate 3e-5 \
    --num_train_epochs 3 \
    --max_seq_length  512 \
    --doc_stride 128 
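
For reference, the conversion above flattens each multi-span example into single-span, SQuAD-style records. The record below is a purely hypothetical illustration using the common SQuAD field names; the exact schema and contents produced by generate_squad_format.py may differ.

# Hypothetical single-span record derived from a multi-span example
# (field names follow the usual SQuAD convention; they are an assumption here).
record = {
    "id": "example-0-span-0",
    "question": "Who were the major-party nominees in the 2008 US presidential election?",
    "context": "Barack Obama and John McCain were the major-party nominees in 2008.",
    "answers": {
        "text": ["Barack Obama"],    # one answer span per record
        "answer_start": [0],         # character offset of the span in the context
    },
}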
