Relation Extraction First - Using Relation Extraction to Identify Entities

Code for AIFB-WebScience at SemEval-2022 Task 12: Relation Extraction First - Using Relation Extraction to Identify Entities

Data
Dependencies
Reproducing Results (+ link to trained models)
Training
Relevant Code per Section in Paper
Bibtex for Citing

Data

The task data can be found here. To use it with the code in this repo, place the training, dev, and test JSON files into the corresponding folders in the data directory.
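Before training, it can help to confirm that each split folder actually contains JSON files. The sketch below is not part of the repository. It only assumes that the splits live in subfolders of the data directory, as described above, and the helper name count_json_files is hypothetical.

```python
from pathlib import Path


def count_json_files(data_dir="data"):
    """Return {subfolder name: number of .json files} for each subfolder of data_dir."""
    root = Path(data_dir)
    if not root.is_dir():
        # Nothing to report if the data directory does not exist yet.
        return {}
    counts = {}
    for sub in sorted(p for p in root.iterdir() if p.is_dir()):
        # Only files directly inside the split folder are counted.
        counts[sub.name] = sum(1 for _ in sub.glob("*.json"))
    return counts
```

Calling count_json_files() from the repository root returns one entry per split folder. A count of 0 for a folder suggests that its files still need to be copied in.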

Dependencies

The code depends on the following packages (versions in parentheses):

torch (1.8.0)
transformers (4.18.0)
tqdm (4.64.0)
wandb (0.12.16)
pylatexenc (2.10)

python3 -m venv venv
source venv/bin/activate
python -m pip install --upgrade pip
python -m pip install wheel
python -m pip install -U setuptools
python -m pip install torch==1.8.0+cu111 -f https://download.pytorch.org/whl/torch_stable.html
python -m pip install transformers==4.18.0 tqdm==4.64.0 wandb==0.12.16 pylatexenc==2.10
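After installation, the environment can be compared against the versions listed in this section. The following is a minimal sketch using the standard library's importlib.metadata (Python 3.8+). It is not part of the repository, and the helper name check_versions is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the Dependencies section of this README.
EXPECTED = {
    "torch": "1.8.0",
    "transformers": "4.18.0",
    "tqdm": "4.64.0",
    "wandb": "0.12.16",
    "pylatexenc": "2.10",
}


def check_versions(expected=EXPECTED):
    """Return {package: (expected version, installed version or None, matches)}."""
    report = {}
    for name, want in expected.items():
        try:
            have = version(name)
        except PackageNotFoundError:
            have = None
        # Ignore local build tags such as "+cu111" when comparing.
        ok = have is not None and have.split("+")[0] == want
        report[name] = (want, have, ok)
    return report


for name, (want, have, ok) in check_versions().items():
    status = "ok" if ok else "MISMATCH"
    print(f"{name}: expected {want}, installed {have} [{status}]")
```

A mismatch does not necessarily mean the code will fail. It only indicates that the environment differs from the versions listed above.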

Reproducing Results

Checkpoints for four trained models can be found here. The commands below evaluate each model with k=400 and expect the checkpoint files in the checkpoints directory:

python eval.py --checkpoint checkpoints/max_no_preprocessing.pt --pooling max --k_mentions 400
python eval.py --checkpoint checkpoints/mean_no_preprocessing.pt --pooling mean --k_mentions 400
python eval.py --checkpoint checkpoints/max_latex2text.pt --pooling max --preprocessing latex2text --k_mentions 400
python eval.py --checkpoint checkpoints/mean_latex2text.pt --pooling mean --preprocessing latex2text --k_mentions 400
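The four commands differ only in the pooling and preprocessing settings, so they can also be generated in a loop. The sketch below is not part of the repository. It uses only the eval.py flags and checkpoint file names shown above, and the helper names eval_command, all_eval_commands, and run_all are hypothetical.

```python
import itertools
import subprocess
import sys


def eval_command(pooling, preprocessing=None, k_mentions=400):
    """Build the eval.py argument list for one checkpoint."""
    suffix = preprocessing if preprocessing else "no_preprocessing"
    cmd = [
        sys.executable, "eval.py",
        "--checkpoint", f"checkpoints/{pooling}_{suffix}.pt",
        "--pooling", pooling,
    ]
    if preprocessing:
        cmd += ["--preprocessing", preprocessing]
    cmd += ["--k_mentions", str(k_mentions)]
    return cmd


def all_eval_commands(k_mentions=400):
    """Return the four commands in the same order as listed above."""
    settings = itertools.product([None, "latex2text"], ["max", "mean"])
    return [eval_command(pool, pre, k_mentions) for pre, pool in settings]


def run_all(k_mentions=400):
    """Run every evaluation in sequence and stop at the first failure."""
    for cmd in all_eval_commands(k_mentions):
        print(" ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling run_all() from the repository root runs the four evaluations one after another. Passing a different k_mentions value changes the k used for all of them.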

Training

Example of training a model with max pooling and no preprocessing:

python train.py --learning_rate 7e-5 --seed_model 3 --num_epochs 60 --k_mentions 50 --pooling max --candidate_downsampling 1000

Example of training a model with max pooling and latex2text preprocessing:

python train.py --learning_rate 5e-5 --seed_model 1 --num_epochs 60 --k_mentions 50 --pooling max --candidate_downsampling 1000 --preprocessing latex2text
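The latex2text option presumably relates to the latex2text module of pylatexenc, one of the listed dependencies. As an illustration of what this kind of preprocessing does to an input string, the sketch below converts LaTeX markup to plain text with pylatexenc's LatexNodes2Text class. It is an assumption about the general idea, not a copy of the repository's preprocessing code, and the helper name to_plain_text is hypothetical.

```python
def to_plain_text(latex_source):
    """Convert LaTeX markup to plain text.

    If pylatexenc is not installed, the input is returned unchanged.
    """
    try:
        from pylatexenc.latex2text import LatexNodes2Text
    except ImportError:
        return latex_source
    return LatexNodes2Text().latex_to_text(latex_source)


sample = r"Let $\alpha$ denote the \emph{learning rate}."
print(to_plain_text(sample))
```

Comparing the printed output with the raw string shows how much markup the model no longer has to encode when this preprocessing is enabled.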

Relevant Code per Section in Paper

Section 3.1 Input Encoding

Covered in models/data.py (lines 126-147)

Section 3.2 Soft Mention Detection

Covered in models/base_model.py (lines 68-127)

Section 3.3 Relation Extraction

Covered in models/base_model.py (lines 129-151)

Section 3.4 Entity Type Classification

Covered in models/base_model.py (lines 232-244)

Cite

If you use the code in this repo, please cite this paper:

@inproceedings{popovic_semeval_2022, 
 title = "AIFB-WebScience at SemEval-2022 Task 12: Relation Extraction First - Using Relation Extraction to Identify Entities", 
 author = "Popovic, Nicholas and Laurito, Walter and Färber, Michael",
 booktitle = "Proceedings of the 16th International Workshop on Semantic Evaluation ({S}em{E}val-2022)", 
 year = "2022", 
 publisher = "Association for Computational Linguistics"
}
