yanzhangnlp / aggcn

This project forked from cartus/aggcn

Attention Guided Graph Convolutional Networks for Relation Extraction (authors' PyTorch implementation for the ACL19 paper)

License: MIT License

Attention Guided Graph Convolutional Networks for Relation Extraction

This paper/code introduces Attention Guided Graph Convolutional Networks (AGGCNs) over dependency trees for the large-scale sentence-level relation extraction task (TACRED).

You can find the paper here

See below for an overview of the model architecture:

[Figure: AGGCN architecture]

Requirements

Our model was trained on an NVIDIA Tesla P100-SXM2 GPU in an NVIDIA DGX system.

  • Python 3 (tested on 3.6.8)

  • PyTorch (tested on 0.4.1)

  • CUDA (tested on 9.0)

  • tqdm

  • unzip, wget (for downloading only)

We have released our trained model and training log in this repo. The logs are under the main directory and the trained model is under the saved_models directory. Our released model achieves an F1 score of 69.0%, as reported in the original ACL paper. In the arXiv version, we also report the mean and standard deviation of the F1 score over 5 trained models: 68.2% ± 0.5%. The random seeds are 0, 37, 47, 72 and 76.

If you run the code in a different environment (hardware or software), there is no guarantee that the resulting model will match the one we released and reported. If you train the model with the default settings in a matching environment, you will get exactly the output recorded in logs.txt.
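If you train several models with different seeds, a summary in the same "mean ± std" form can be computed as in the minimal sketch below. The helpers `summarize_f1` and `format_summary` are hypothetical and not part of this repo, and the sketch assumes the sample standard deviation (n - 1 denominator); which variant was used for the reported numbers is not stated here.

```python
import statistics


def summarize_f1(scores):
    """Return (mean, sample std) for a list of F1 scores given in percent.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    if len(scores) < 2:
        raise ValueError("need at least two scores to compute a std")
    return statistics.mean(scores), statistics.stdev(scores)


def format_summary(scores):
    """Format scores as 'mean% ± std%' with one decimal place."""
    mean, std = summarize_f1(scores)
    return f"{mean:.1f}% ± {std:.1f}%"
```

Pass in the test F1 values from your own runs, one per seed.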

Preparation

The code requires that you have access to the TACRED dataset (LDC license required). Once you have the TACRED data, please put the JSON files under the directory dataset/tacred.
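Before moving on, it may help to confirm the data is in place. The sketch below assumes the split files are named train.json, dev.json and test.json (an assumption, since the names are not stated above); `missing_tacred_files` is a hypothetical helper, not part of this repo.

```python
import os

# Assumed split file names; adjust if your TACRED copy differs.
EXPECTED_FILES = ("train.json", "dev.json", "test.json")


def missing_tacred_files(data_dir="dataset/tacred", expected=EXPECTED_FILES):
    """Return the expected file names that are not present in data_dir."""
    return [
        name
        for name in expected
        if not os.path.isfile(os.path.join(data_dir, name))
    ]


if __name__ == "__main__":
    missing = missing_tacred_files()
    if missing:
        print("Missing under dataset/tacred:", ", ".join(missing))
    else:
        print("All expected TACRED files found.")
```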

First, download and unzip GloVe vectors:

chmod +x download.sh; ./download.sh

Then prepare vocabulary and initial word vectors with:

python3 prepare_vocab.py dataset/tacred dataset/vocab --glove_dir dataset/glove

This will write the vocabulary and the word vectors (as a numpy matrix) into the directory dataset/vocab.
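To illustrate the general idea of this step, here is a minimal pure-Python sketch of building a vocabulary and an initial embedding matrix from pretrained vectors. It is not the logic of prepare_vocab.py: the special tokens, the frequency ordering, the uniform initialisation range and the function names are all assumptions made for illustration, and plain lists stand in for the numpy matrix.

```python
import random
from collections import Counter

PAD, UNK = "<PAD>", "<UNK>"  # hypothetical special tokens


def build_vocab(token_lists, pretrained_words, min_freq=1):
    """Keep tokens that have a pretrained vector, most frequent first."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    kept = [w for w, c in counts.items() if c >= min_freq and w in pretrained_words]
    kept.sort(key=lambda w: (-counts[w], w))  # frequency, then alphabetical
    return [PAD, UNK] + kept


def build_embeddings(vocab, pretrained, dim, seed=0):
    """Return one row per vocab entry: zeros for PAD, the pretrained
    vector when available, otherwise a small random vector."""
    rng = random.Random(seed)
    matrix = []
    for word in vocab:
        if word == PAD:
            matrix.append([0.0] * dim)
        elif word in pretrained:
            matrix.append(list(pretrained[word]))
        else:
            matrix.append([rng.uniform(-1.0, 1.0) for _ in range(dim)])
    return matrix
```

Row i of the matrix corresponds to entry i of the vocabulary, so a model can look up vectors by token index.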

Training

To train the AGGCN model, run:

bash train_aggcn.sh 1

Model checkpoints and logs will be saved to ./saved_models/01.
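The run ID 1 maps to the directory 01 here, which suggests that IDs are zero-padded to two digits. A minimal sketch of that mapping, assuming this convention (`run_dir` is a hypothetical helper, not taken from the training script):

```python
def run_dir(run_id, root="./saved_models"):
    """Map a run ID such as 1 to a directory such as ./saved_models/01.

    Assumption: IDs are zero-padded to two digits, inferred from the
    example where ID 1 is saved under 01.
    """
    return f"{root}/{str(run_id).zfill(2)}"
```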

For details on the use of other parameters, please refer to train.py.

Evaluation

Our pretrained model is saved under the directory saved_models/01. To run evaluation on the test set, run:

python3 eval.py saved_models/01 --dataset test

This will use the best_model.pt file by default. Use --model checkpoint_epoch_10.pt to specify a model checkpoint file.
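The way the checkpoint is selected can be sketched with a small argparse parser that mirrors the arguments shown above. This is an illustrative stand-in, not the code of eval.py; in particular, the default value for --dataset is an assumption.

```python
import argparse
import os


def build_parser():
    """Hypothetical parser mirroring the eval command shown above."""
    parser = argparse.ArgumentParser(description="Illustrative eval-style CLI")
    parser.add_argument("model_dir", help="directory holding the checkpoints")
    parser.add_argument("--model", default="best_model.pt",
                        help="checkpoint file name inside model_dir")
    parser.add_argument("--dataset", default="test",
                        help="split to evaluate (default is an assumption)")
    return parser


def checkpoint_path(argv):
    """Return the checkpoint file that the given arguments select."""
    args = build_parser().parse_args(argv)
    return os.path.join(args.model_dir, args.model)
```

With no --model argument the path ends in best_model.pt; passing --model checkpoint_epoch_10.pt selects that file in the same directory instead.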

Retrain

To reload a pretrained model and fine-tune it, run:

python train.py --load --model_file saved_models/01/best_model.pt --optim sgd --lr 0.001

Related Repo

The paper uses the DCGCN model; for the detailed architecture, please refer to the TACL19 paper Densely Connected Graph Convolutional Networks for Graph-to-Sequence Learning. The code is adapted from the repo of the EMNLP18 paper Graph Convolution over Pruned Dependency Trees Improves Relation Extraction.

Citation

@inproceedings{guo2019aggcn,
 author = {Guo, Zhijiang and Zhang, Yan and Lu, Wei},
 booktitle = {Proc. of ACL},
 title = {Attention Guided Graph Convolutional Networks for Relation Extraction},
 year = {2019}
}
