SAM @SciArg

This is the official implementation of our paper accepted at WIESP2022: Full-Text Argumentation Mining on Scientific Publications (preprint on arXiv).

Scholarly Argumentation Mining (SAM) has recently gained attention due to its potential to help scholars with the rapid growth of published scientific literature. It comprises two subtasks: argumentative discourse unit recognition (ADUR) and argumentative relation extraction (ARE), both of which are challenging since they require e.g. the integration of domain knowledge, the detection of implicit statements, and the disambiguation of argument structure. While previous work focused on dataset construction and baseline methods for specific document sections, such as abstract or results, full-text scholarly argumentation mining has seen little progress. In this work, we introduce a sequential pipeline model combining ADUR and ARE for full-text SAM, and provide a first analysis of the performance of pretrained language models (PLMs) on both subtasks. We establish a new SotA for ADUR on the Sci-Arg corpus, outperforming the previous best reported result by a large margin (+7% F1). We also present the first results for ARE, and thus for the full AM pipeline, on this benchmark dataset.

Setup

pip install -r requirements.txt
pip install -r requirements_analysis.txt

Run

To run the experiments, follow the steps below. Scripts to reproduce the published results are in the experiments folder: see experiments/prediction to generate the predictions and experiments/evaluation to calculate the evaluation scores.

Training

NOTE: To train with cross-validation, refer to the cross_validation README.

ADU Recognition

allennlp \
train \
-s experiments/training/adu/adu_best \
-f allennlp_configs/adu_best.jsonnet

Argumentative Relation Extraction

allennlp \
train \
-s experiments/training/rel/rel_best \
-f allennlp_configs/rel_best.jsonnet \
-o "{\"dataset_reader.add_negative_relations_portion\":-1.0}"
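The -o override above is JSON with every double quote escaped by hand. As a minimal sketch of an alternative, assuming a POSIX shell, the JSON can be kept in a single-quoted variable so that no backslashes are needed. The variable name OVERRIDES is a placeholder introduced here; the repository does not define it.

```shell
# Single quotes keep the JSON literal, so inner double quotes need no escaping.
# OVERRIDES is a hypothetical variable name used only for this sketch.
OVERRIDES='{"dataset_reader.add_negative_relations_portion":-1.0}'

# Print the value to check it before passing it to allennlp.
printf '%s\n' "$OVERRIDES"

# It could then replace the hand-escaped string in the training command:
#   allennlp train -s experiments/training/rel/rel_best \
#     -f allennlp_configs/rel_best.jsonnet -o "$OVERRIDES"
```

The value is identical to the escaped string in the command above; only the quoting differs.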

NOTE: To perform hyperparameter tuning, follow the guide in the hpt README.

Prediction

Note that scripts to reproduce the published results and their actual output can be found in experiments/prediction.

ADU Recognition

  1. Predicting ADUs and saving only GOLD ADUs
allennlp \
predagg \
--predictor brat-store \
-o "{\"dataset_reader.show_gold\":true,\"dataset_reader.show_prediction\":false,\"dataset_reader.num_shards\":null,\"dataset_reader.dataset_splits\":{\"test\":\"30:\"}}" \
--use-dataset-reader \
--cuda-device 0 \
--output-file experiments/prediction/adu/goldonly \
--batch-size 8 \
--silent \
PATH/TO/ADU/MODEL \
./dataset_scripts/sciarg.json@test

Replace PATH/TO/ADU/MODEL with the location where the ADU model is saved. For instance, if you ran the ADU recognition training command above, the model is saved in experiments/training/adu/adu_best.

  2. Predicting ADUs and saving only predicted ADUs

allennlp \
predagg \
--predictor brat-store \
-o "{\"dataset_reader.show_gold\":false,\"dataset_reader.show_prediction\":true,\"dataset_reader.num_shards\":null,\"dataset_reader.dataset_splits\":{\"test\":\"30:\"}}" \
--use-dataset-reader \
--cuda-device 0 \
--output-file experiments/prediction/adu/predictiononly \
--batch-size 8 \
--silent \
PATH/TO/ADU/MODEL \
./dataset_scripts/sciarg.json@test
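The two ADU prediction commands above differ only in the show_gold and show_prediction overrides and in the output directory. The following is a small sketch of a helper that builds the override string, assuming a POSIX shell. The function name adu_overrides and the variables GOLD_ONLY and PREDICTION_ONLY are hypothetical and not part of the repository.

```shell
# adu_overrides SHOW_GOLD SHOW_PREDICTION
# Prints the -o override JSON used by the ADU prediction commands above.
# Both arguments must be the JSON literals "true" or "false".
# (Hypothetical helper, introduced only for this sketch.)
adu_overrides() {
  printf '{"dataset_reader.show_gold":%s,"dataset_reader.show_prediction":%s,"dataset_reader.num_shards":null,"dataset_reader.dataset_splits":{"test":"30:"}}\n' "$1" "$2"
}

# Overrides for the "gold only" and "prediction only" runs.
GOLD_ONLY=$(adu_overrides true false)
PREDICTION_ONLY=$(adu_overrides false true)

# Show both strings so they can be checked before use.
printf '%s\n%s\n' "$GOLD_ONLY" "$PREDICTION_ONLY"
```

Either value could then be passed as -o "$GOLD_ONLY" or -o "$PREDICTION_ONLY" in place of the hand-escaped string.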

Argumentative Relation Extraction

  1. Predicting relations and saving only GOLD relations from GOLD ADUs
allennlp \
predagg \
--predictor brat-store \
-o "{\"data_loader.shuffle\":false,\"dataset_reader.show_gold\":true,\"dataset_reader.show_prediction\":false,\"dataset_reader.add_negative_relations_portion\":-1.0,\"dataset_reader.num_shards\":null,\"dataset_reader.dataset_splits.test\":\"30:\"}" \
--use-dataset-reader \
--cuda-device 0 \
--batch-size 128 \
--silent \
--output-file experiments/prediction/rel@gold_adus/goldonly \
PATH/TO/REL/MODEL \
./dataset_scripts/sciarg.json@test

Replace PATH/TO/REL/MODEL with the location where the REL model is saved. For instance, if you ran the relation extraction training command above, the model is saved in experiments/training/rel/rel_best.

  2. Predicting relations and saving only predicted relations from GOLD ADUs

allennlp \
predagg \
--predictor brat-store \
-o "{\"data_loader.shuffle\":false,\"dataset_reader.show_gold\":false,\"dataset_reader.show_prediction\":true,\"dataset_reader.add_negative_relations_portion\":-1.0,\"dataset_reader.num_shards\":null,\"dataset_reader.dataset_splits.test\":\"30:\"}" \
--use-dataset-reader \
--cuda-device 0 \
--batch-size 128 \
--silent \
--output-file experiments/prediction/rel@gold_adus/predictiononly \
PATH/TO/REL/MODEL \
./dataset_scripts/sciarg.json@test

  3. Predicting relations and saving GOLD and predicted relations from predicted ADUs
allennlp \
predagg \
--predictor brat-store \
-o "{\"data_loader.shuffle\":false,\"dataset_reader.show_gold\":true,\"dataset_reader.show_prediction\":true,\"dataset_reader.add_negative_relations_portion\":-1.0,\"dataset_reader.num_shards\":null,\"dataset_reader.dataset_splits.test\":\":\"}" \
--use-dataset-reader \
--cuda-device 0 \
--batch-size 128 \
--silent \
--output-file experiments/prediction/rel@predicted_adus/gold_and_prediction \
PATH/TO/REL/MODEL \
PATH/TO/PREDICTED/ADUS/WITH/PREDICTION_ONLY@test

Replace PATH/TO/PREDICTED/ADUS/WITH/PREDICTION_ONLY with the location where the ADU output containing only predictions is saved. For instance, if you predicted ADUs with the command above, it is saved in experiments/prediction/adu/predictiononly.

  4. Predicting relations and saving only predicted relations from predicted ADUs
allennlp \
predagg \
--predictor brat-store \
-o "{\"data_loader.shuffle\":false,\"dataset_reader.show_gold\":false,\"dataset_reader.show_prediction\":true,\"dataset_reader.add_negative_relations_portion\":-1.0,\"dataset_reader.num_shards\":null,\"dataset_reader.dataset_splits.test\":\":\"}" \
--use-dataset-reader \
--cuda-device 0 \
--batch-size 128 \
--silent \
--output-file experiments/prediction/rel@predicted_adus/predictiononly \
PATH/TO/REL/MODEL \
PATH/TO/PREDICTED/ADUS/WITH/PREDICTION_ONLY@test
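In the pipeline setting, the output directory of the ADU "prediction only" run becomes the dataset argument of the relation extraction run. The sketch below only makes that wiring explicit, assuming a POSIX shell and the default output paths used in the commands above. All variable names are hypothetical, and PATH/TO/REL/MODEL remains a placeholder.

```shell
# Hypothetical variable names; the paths are the ones used in the commands above.
ADU_PRED_DIR=experiments/prediction/adu/predictiononly
REL_MODEL=PATH/TO/REL/MODEL   # placeholder, replace with your trained model
REL_OUT_DIR=experiments/prediction/rel@predicted_adus/predictiononly

# The relation model reads the predicted ADUs, so its dataset argument is the
# ADU output directory followed by the @test split suffix.
REL_INPUT="${ADU_PRED_DIR}@test"

# Dry run: print the wiring instead of calling allennlp.
printf 'model:  %s\ninput:  %s\noutput: %s\n' "$REL_MODEL" "$REL_INPUT" "$REL_OUT_DIR"
```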

Evaluation

Note that scripts to reproduce the published results and their actual output can be found in experiments/evaluation.

  1. Using AllenNLP evaluate

ADU Recognition

allennlp \
evaluate \
-o "{\"dataset_reader.num_shards\":null,\"dataset_reader.dataset_splits\":{\"test\":\"30:\"}, \"model.calculate_weak_span_f1\":true}" \
--cuda-device 0 \
--batch-size 8 \
--output-file experiments/evaluation/using_allennlp/adu/metrics.json \
PATH/TO/ADU/MODEL \
./dataset_scripts/sciarg.json@test

The metrics are written to the file passed as --output-file; the published results can be found in experiments/evaluation.

Argumentative Relation Extraction

allennlp \
evaluate \
-o "{\"data_loader.shuffle\":false,\"dataset_reader.add_negative_relations_portion\":-1.0,\"dataset_reader.num_shards\":null, \"dataset_reader.dataset_splits\":{\"test\":\"30:\"}}" \
--cuda-device 0 \
--batch-size 128 \
--output-file experiments/evaluation/using_allennlp/rel@gold_adus/metrics.json \
PATH/TO/REL/MODEL \
./dataset_scripts/sciarg.json@test

  2. Using our evaluation pipeline (calculate_metric.py)

ADU Recognition

python analysis/calculate_metric.py \
--path_gold PATH/TO/PREDICTED/ADUS/WITH/GOLD_ONLY \
--path_predicted PATH/TO/PREDICTED/ADUS/WITH/PREDICTION_ONLY \
--out_dir experiments/evaluation/using_pipeline/adu/metrics

Replace PATH/TO/PREDICTED/ADUS/WITH/GOLD_ONLY with the location where the ADU output containing only gold labels is saved. For instance, if you ran the prediction command above, it is saved in experiments/prediction/adu/goldonly.

To replicate the metrics calculated from our best model, replace PATH/TO/PREDICTED/ADUS/WITH/GOLD_ONLY with experiments/prediction/adu/best_uncased_10r5ge6a_goldonly and PATH/TO/PREDICTED/ADUS/WITH/PREDICTION_ONLY with experiments/prediction/adu/best_uncased_10r5ge6a_predictiononly.

Argumentative Relation Extraction

  1. Evaluating relation extraction using GOLD ADUs
python analysis/calculate_metric.py \
--path_gold PATH/TO/PREDICTED/REL@GOLD_ADU/GOLD_ONLY \
--path_predicted PATH/TO/PREDICTED/REL@GOLD_ADU/PREDICTION_ONLY \
--out_dir experiments/evaluation/using_pipeline/rel@gold_adus/best_uncased_257eyrv1

To replicate the metrics calculated from our best model, replace PATH/TO/PREDICTED/REL@GOLD_ADU/GOLD_ONLY with experiments/prediction/rel@gold_adus/best_uncased_257eyrv1_goldonly and PATH/TO/PREDICTED/REL@GOLD_ADU/PREDICTION_ONLY with experiments/prediction/rel@gold_adus/best_uncased_257eyrv1_predictiononly.

  2. Evaluating relation extraction using predicted ADUs
python analysis/calculate_metric.py \
--path_gold PATH/TO/PREDICTED/REL@GOLD_ADU/GOLD_ONLY \
--path_predicted PATH/TO/PREDICTED/REL@PREDICTED_ADU/PREDICTION_ONLY \
--out_dir experiments/evaluation/using_pipeline/rel@predicted_adus/best_uncased_257eyrv1

To replicate the metrics calculated from our best model, replace PATH/TO/PREDICTED/REL@GOLD_ADU/GOLD_ONLY with experiments/prediction/rel@gold_adus/best_uncased_257eyrv1_goldonly and PATH/TO/PREDICTED/REL@PREDICTED_ADU/PREDICTION_ONLY with experiments/prediction/rel@predicted_adus/best_uncased_257eyrv1_predictiononly.
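Both relation extraction evaluations use the same gold directory and differ only in the predicted directory and the output directory. The two calls can be sketched as one loop, assuming a POSIX shell and the best-model paths named above. The DRY_RUN switch and the variable names are hypothetical additions for this sketch; by default the commands are only printed.

```shell
# Gold annotations are the same for both settings (see the commands above).
GOLD=experiments/prediction/rel@gold_adus/best_uncased_257eyrv1_goldonly

# DRY_RUN is a hypothetical switch for this sketch: with the default of 1 the
# commands are only printed; set DRY_RUN=0 to execute them.
DRY_RUN=${DRY_RUN:-1}

for setting in rel@gold_adus rel@predicted_adus; do
  predicted="experiments/prediction/${setting}/best_uncased_257eyrv1_predictiononly"
  out_dir="experiments/evaluation/using_pipeline/${setting}/best_uncased_257eyrv1"
  if [ "$DRY_RUN" = 1 ]; then
    # Print the command that would be run for this setting.
    printf 'python analysis/calculate_metric.py --path_gold %s --path_predicted %s --out_dir %s\n' \
      "$GOLD" "$predicted" "$out_dir"
  else
    python analysis/calculate_metric.py \
      --path_gold "$GOLD" --path_predicted "$predicted" --out_dir "$out_dir"
  fi
done
```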
