doc2dial / sharedtask-dialdoc2021
doc2dial data includes a set of documents from multiple domains; and conversations between an assisting agent and an end user that are grounded in the associated documents.


sharedtask-dialdoc2021's Introduction

DialDoc21: Shared Task on Doc2Dial Dataset

The DialDoc21 Shared Task at ACL 2021 includes two subtasks for building goal-oriented document-grounded dialogue systems. The first subtask is to predict the grounding in the given document for the next agent response; the second subtask is to generate the next agent response in natural language given document-based and dialogue-based contexts.

Data

This shared task is based on Doc2Dial v1.0.1 in the folder data/doc2dial. For more information about the dataset, please refer to the README, the paper, and the Doc2Dial Project Page.

Note: you can choose to utilize other public datasets in addition to the Doc2Dial data for training. Please see examples here.

Shared Task

Subtask 1

The task is to predict the knowledge grounding, in the form of a document span, for the next agent response given the dialogue history and the associated document.

  • Input: the associated document and dialogue history.

  • Output: the grounding text.

  • Evaluation: exact match and F1 scores. Please refer to the script for more details.
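For reference, below is a minimal sketch of how SQuAD-style exact match and token-level F1 are typically computed for span prediction; the official evaluation script in this repository is authoritative and its normalization may differ.

import re
import string
from collections import Counter

def normalize(text):
    # Lowercase, drop punctuation and articles, collapse whitespace (SQuAD-style).
    text = "".join(ch for ch in text.lower() if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, reference):
    return float(normalize(prediction) == normalize(reference))

def f1_score(prediction, reference):
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)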

Subtask 2

The task is to generate the next agent response in natural language given dialogue-based and document-based contexts.

  • Input: the associated document and dialogue history.

  • Output: the next agent utterance in natural language.

  • Evaluation: sacrebleu and human evaluation. Please refer to the script for more details. Stay tuned for more details about the human evaluation.
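For a quick local check of the BLEU number, a minimal sacrebleu sketch (assuming one reference per generated response; the official script and its settings are authoritative):

import sacrebleu

# hypotheses: generated agent responses; references: gold responses, aligned by index
hypotheses = ["You will need to renew the license in person."]
references = ["You need to renew your license in person."]

# corpus_bleu takes a list of reference streams (here, a single stream)
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(bleu.score)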

Baselines

Environment Setup

Create a virtual environment

conda create -n ENV_NAME python=3.7
conda activate ENV_NAME

Install PyTorch

conda install pytorch cudatoolkit=10.2 -c pytorch

Install Huggingface Transformers, Datasets and a few more dependencies

pip install -r requirements.txt

Install NVIDIA/apex

conda install -c conda-forge nvidia-apex 

Load Dataset

You can use Hugging Face Datasets to load the Doc2Dial dataset. The latest source code includes the loading script for Doc2Dial v1.0.1.

The script shows how to obtain the ground truth for the given IDs for the evaluations of Subtask 1 and Subtask 2. IDs are {dial_id}_{turn_id}, where turn_id refers to the turn right before the next agent turn to be predicted (Subtask 1) or generated (Subtask 2). The withheld test set for the challenge was collected in the same process as the training and validation sets, so its ground truth is obtained the same way as in the script.
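As an illustration, here is a minimal sketch of loading the dialogue data through the repository's loading script and constructing the {dial_id}_{turn_id} IDs described above. The configuration name "dialogue_domain" and the field names (dial_id, turns, turn_id, role) follow the Doc2Dial v1.0.1 schema but should be verified against the loading script; depending on the script, you may also need to point data_dir at data/doc2dial.

from datasets import load_dataset

# Load dialogues via the loading script shipped in this repository (path relative to the repo root).
dialogues = load_dataset(
    "scripts/datasets/doc2dial/doc2dial.py",
    name="dialogue_domain",
    split="validation",
)

# Build {dial_id}_{turn_id} IDs for every user turn that is followed by an agent turn.
ids = []
for dial in dialogues:
    turns = dial["turns"]
    for i, turn in enumerate(turns[:-1]):
        if turn["role"] == "user" and turns[i + 1]["role"] == "agent":
            ids.append("{}_{}".format(dial["dial_id"], turn["turn_id"]))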

Run Baseline for Subtask 1

Run HuggingFace QA on Doc2Dial

  • For fine-tuning BERT on Doc2Dial (an inference sketch follows the validation results below),

    cd sharedtask-dialdoc2021/scripts/subtask1
    ./run_qa.sh
  • Results on validation set:

    # bert-base-uncased
    f1 = 56.29 
    exact_match = 39.73
    # bert-large-uncased-whole-word-masking
    f1 = 62.98
    exact_match = 50.50
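The baseline casts grounding prediction as extractive question answering over the associated document, with the dialogue history as the question. Below is a minimal inference sketch with a fine-tuned checkpoint; the paths and input strings are placeholders, not files provided by this repository.

from transformers import pipeline

# Placeholder path: the output_dir that run_qa.sh saved the fine-tuned model to.
qa = pipeline("question-answering", model="path/to/finetuned-checkpoint")

# Dialogue history as the question, the grounding document as the context (placeholders).
dialogue_history = "user: Hello, I want to know how to renew my license."
document_text = "Renew your license in person or by mail within 60 days of expiration."
result = qa(question=dialogue_history, context=document_text)
print(result["answer"], result["start"], result["end"])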

Evaluating your model output

  • Output format and sample file

    Please see the format in the sample file; a minimal sketch of writing such a file follows below.

  • Evaluation script

    Please refer to the script for evaluating your model predictions.

    python sharedtask_utils.py --task subtask1 --prediction_json sample_files/sample_prediction_subtask1.json
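For illustration, a minimal sketch of writing a prediction file. The field names used below ("id" and "prediction_text") are an assumption; verify the exact schema against sample_files/sample_prediction_subtask1.json before evaluating.

import json

# Assumed schema; check sample_files/sample_prediction_subtask1.json for the exact fields.
predictions = [
    {"id": "<dial_id>_<turn_id>", "prediction_text": "predicted grounding span ..."},
]

with open("my_prediction_subtask1.json", "w") as f:
    json.dump(predictions, f)

Pass the resulting file path to sharedtask_utils.py via --prediction_json, as in the command above.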

Run Baseline for Subtask 2

Run HuggingFace Seq2Seq on Doc2Dial

  • For generating input files,

    We first create the source and target files. Please see the run script for the required parameters along with other default values.

    cd scripts/subtask2
    python seq2seq_utils.py --split validation --output_dir seq2seq_files
  • For fine-tuning BART on Doc2Dial (a generation sketch follows the validation results below),

    cd scripts/subtask2
    ./run_seq2seq.sh
  • Results on validation set:

    # bart-large-cnn
    val_bleu = 17.72
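Once fine-tuned, generation with the resulting checkpoint can be sketched as below. The checkpoint path and generation parameters are placeholders, and the source line must be formatted exactly like the lines produced by seq2seq_utils.py.

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder path: the output_dir used by run_seq2seq.sh.
checkpoint = "path/to/finetuned-bart-checkpoint"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# A single source line from the seq2seq_files split (dialogue history plus document context).
source_line = "..."
inputs = tokenizer(source_line, return_tensors="pt", truncation=True)
outputs = model.generate(**inputs, num_beams=4, max_length=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))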

Evaluating your model output

  • Output format and sample file

    Please see the format in the sample file.

  • Evaluation script

    Please refer to the script for evaluating your model predictions.

    python sharedtask_utils.py --task subtask2 --prediction_json sample_files/sample_prediction_subtask2.json

About Participation

For more up-to-date information about participating in the DialDoc21 Shared Task, please refer to our workshop page.


sharedtask-dialdoc2021's Issues

Generating the dataset

I'm having trouble generating the dataset.
I was able to run the baseline model successfully (thank you for sharing that).

How can I generate just the dataset alone by running doc2dial.py?

Keys for dialogues

This README says the dialogue data are indexed by domain and doc_id. Is this a typo, and did you mean domain and dial_id instead?

OverflowError when running baseline model for subtask2 using the provided code

10%|████████████████ | 2750/27500 [28:31<3:32:34, 1.94it/s]
[INFO|trainer.py:1536] 2021-03-04 22:04:01,123 >> ***** Running Evaluation *****
[INFO|trainer.py:1537] 2021-03-04 22:04:01,124 >> Num examples = 4255
[INFO|trainer.py:1538] 2021-03-04 22:04:01,124 >> Batch size = 8
Traceback (most recent call last):
File "finetune_trainer.py", line 367, in <module>
main()
File "finetune_trainer.py", line 298, in main
model_path=model_args.model_name_or_path if os.path.isdir(model_args.model_name_or_path) else None
File "/home/xuyan/anaconda3/envs/dialdoc/lib/python3.7/site-packages/transformers/trainer.py", line 935, in train
self._maybe_log_save_evaluate(tr_loss, model, trial, epoch)
File "/home/xuyan/anaconda3/envs/dialdoc/lib/python3.7/site-packages/transformers/trainer.py", line 1004, in _maybe_log_save_evaluate
metrics = self.evaluate()
File "/home/xuyan/anaconda3/envs/dialdoc/lib/python3.7/site-packages/transformers/trainer_seq2seq.py", line 96, in evaluate
return super().evaluate(eval_dataset, ignore_keys=ignore_keys, metric_key_prefix=metric_key_prefix)
File "/home/xuyan/anaconda3/envs/dialdoc/lib/python3.7/site-packages/transformers/trainer.py", line 1449, in evaluate
metric_key_prefix=metric_key_prefix,
File "/home/xuyan/anaconda3/envs/dialdoc/lib/python3.7/site-packages/transformers/trainer.py", line 1601, in prediction_loop
metrics = self.compute_metrics(EvalPrediction(predictions=preds, label_ids=label_ids))
File "/home/xuyan/dialog-kn/sharedtask-dialdoc2021/scripts/subtask2/utils.py", line 98, in translation_metrics
pred_str, label_str = decode_pred(pred)
File "/home/xuyan/dialog-kn/sharedtask-dialdoc2021/scripts/subtask2/utils.py", line 85, in decode_pred
label_str = tokenizer.batch_decode(pred.label_ids, skip_special_tokens=True)
File "/home/xuyan/anaconda3/envs/dialdoc/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 3077, in batch_decode
for seq in sequences
File "/home/xuyan/anaconda3/envs/dialdoc/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 3077, in
for seq in sequences
File "/home/xuyan/anaconda3/envs/dialdoc/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 3113, in decode
**kwargs,
File "/home/xuyan/anaconda3/envs/dialdoc/lib/python3.7/site-packages/transformers/tokenization_utils_fast.py", line 495, in _decode
text = self._tokenizer.decode(token_ids, skip_special_tokens=skip_special_tokens)
OverflowError: out of range integral type conversion attempted
10%|████████████████ | 2750/27500 [35:35<5:20:21, 1.29it/s]

preprocessing error

https://github.com/doc2dial/sharedtask-dialdoc2021/blob/master/scripts/datasets/doc2dial/doc2dial.py#L348

Original code:

if idx + 1 < len(dial["turns"]):
    if dial["turns"][idx + 1]["role"] == "agent":
        turn_to_predict = dial["turns"][idx + 1]
    else:
        continue

Proposed fix:

if idx + 1 < len(dial["turns"]):
    if dial["turns"][idx + 1]["role"] == "agent":
        turn_to_predict = dial["turns"][idx + 1]
    else:
        continue
else:
    continue

Adding continue to skip the case idx + 1 == len(dial["turns"]) is needed; otherwise, we will duplicate the last turn.

TestDev Set

Hi,

The turns in doc2dial_dial_testdev.json available on eval.ai do not contain any dialogue act or references, whereas the validation set does. Is this to be expected?

Thank you!

Maxime.

RuntimeError: cublas runtime error : unknown error

First, our desktop setup is:

Ubuntu 20.04.3 LTS
Cuda 11.0
nvidia-graphic 470.57.02
GeForce 1080 Ti
pytorch 1.7.1
python 3.7

And we got

***** Running training *****
Num examples = 70522
Num Epochs = 5
Instantaneous batch size per device = 15
Total train batch size (w. parallel, distributed & accumulation) = 60
Gradient Accumulation steps = 2
Total optimization steps = 5875
0%| | 0/5875 [00:00<?, ?it/s]
Traceback (most recent call last):
File "run_qa.py", line 496, in <module>
main()
File "run_qa.py", line 458, in main
model_path=model_args.model_name_or_path if os.path.isdir(model_args.model_name_or_path) else None
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/transformers/trainer.py", line 888, in train
tr_loss += self.training_step(model, inputs)
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/transformers/trainer.py", line 1248, in training_step
loss = self.compute_loss(model, inputs)
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/transformers/trainer.py", line 1277, in compute_loss
outputs = model(**inputs)
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 161, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 171, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
output.reraise()
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/transformers/models/bert/modeling_bert.py", line 1771, in forward
return_dict=return_dict,
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/transformers/models/bert/modeling_bert.py", line 968, in forward
return_dict=return_dict,
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/transformers/models/bert/modeling_bert.py", line 566, in forward
output_attentions,
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/transformers/models/bert/modeling_bert.py", line 460, in forward
past_key_value=self_attn_past_key_value,
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/transformers/models/bert/modeling_bert.py", line 393, in forward
output_attentions,
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/minbeomkim/miniconda3/envs/test/lib/python3.7/site-packages/transformers/models/bert/modeling_bert.py", line 290, in forward
attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: cublas runtime error : unknown error at /opt/conda/conda-bld/pytorch_1607370156314/work/aten/src/THC/THCBlas.cu:225

0%| | 0/5875 [00:02<?, ?it/s]
./run_qa.sh: line 23: 66030 Segmentation fault (core dumped)

error.

Could you help me with some advice?

Thank you
