This project forked from ukplab/emnlp2017-relation-extraction

Context-Aware Representations for Knowledge Base Relation Extraction

License: Apache License 2.0

Context-Aware Representations for Knowledge Base Relation Extraction

Relation extraction on an open-domain knowledge base

Accompanying repository for our EMNLP 2017 paper (full paper). It contains the code to replicate the experiments and the pre-trained models for sentence-level relation extraction. See below for links to other work on knowledge bases, question answering and graph neural networks.

This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.

Please use the following citation:

@inproceedings{TUD-CS-2017-0119,
	title = {{Context-Aware Representations for Knowledge Base Relation Extraction}},
	author = {Sorokin, Daniil and Gurevych, Iryna},
	booktitle = {Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
	pages = {1784-1789},
	year = {2017},
	location = {Copenhagen, Denmark},
	publisher = {Association for Computational Linguistics},
	doi = {10.18653/v1/D17-1188}
}

Paper abstract:

We demonstrate that for sentence-level relation extraction it is beneficial to consider other relations in the sentential context while predicting the target relation. Our architecture uses an LSTM-based encoder to jointly learn representations for all relations in a single sentence. We combine the context representations with an attention mechanism to make the final prediction. We use the Wikidata knowledge base to construct a dataset of multiple relations per sentence and to evaluate our approach. Compared to a baseline system, our method results in an average error reduction of 24% on a held-out set of relations.

Please refer to the paper for more details.
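To make the idea of combining context representations with attention more concrete, here is a minimal sketch in plain Python. It is not the code from this repository: the function name `attend`, the dot-product scoring and the toy vectors are all assumptions made for illustration, and the scoring function used in the paper may differ.

```python
import math


def attend(target, contexts):
    """Softmax-weighted sum of context vectors (illustrative only).

    Each context vector is scored by its dot product with the target
    vector; this scoring choice is an assumption, not the paper's method.
    """
    if not contexts:
        return [], [0.0] * len(target)
    scores = [sum(t * c for t, c in zip(target, ctx)) for ctx in contexts]
    # Subtract the maximum score for numerical stability of the softmax.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    summary = [
        sum(w * ctx[i] for w, ctx in zip(weights, contexts))
        for i in range(len(target))
    ]
    return weights, summary


# Toy representations: one target relation and two context relations.
target = [1.0, 0.0]
contexts = [[1.0, 0.0], [0.0, 1.0]]
weights, summary = attend(target, contexts)

# Concatenate the target and the attended context before classification.
combined = target + summary
```

In this sketch the context vector that is more similar to the target receives the larger weight, and the concatenated vector is what a final classifier would consume.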

The dataset described in the paper can be found here:

Contacts:

If you have any questions regarding the code, please don't hesitate to contact the authors or report an issue.

Other UKP Lab work on knowledge bases:

If you came here looking for our other work on linking text to Wikidata, you may also find the following links useful:

Demo:

You can try out the model on single sentences in our demo:

http://semanticparsing.ukp.informatik.tu-darmstadt.de:5000/relation-extraction/

Wikipedia-Wikidata sentence-level relation data set

  • Download the data set from the paper here. See the data set ReadMe for more information on the format, and see the paper for details on how the data set was constructed.

Project structure:

relation_extraction/
├── eval.py
├── model-train-and-test.py
├── notebooks
├── optimization_space.py
├── core
│   ├── parser.py
│   ├── embeddings.py
│   ├── entity_extraction.py
│   └── keras_models.py
├── relextserver
│   └── server.py
├── graph
│   ├── graph_utils.py
│   ├── io.py
│   └── vis_utils.py
├── stanford_tag_dataset.py
└── evaluation
    └── metrics.py
resources/
├── properties-with-labels.txt
└── property_blacklist.txt

File                              Description
relation_extraction/              Main Python module
relation_extraction/core          Models for joint relation extraction
relation_extraction/relextserver  The code for the web demo
relation_extraction/graph         IO and processing for relation graphs
relation_extraction/evaluation    Evaluation metrics
resources/                        Necessary resources
data/curves/                      The precision-recall curves for each model on the held-out data

Setup:

  1. We recommend that you set up a new virtual environment first: http://docs.python-guide.org/en/latest/dev/virtualenvs/

  2. Check out the repository and run:

pip3 install -r requirements.txt
  3. Set the Keras (deep learning library) backend to TensorFlow with the following command:
export KERAS_BACKEND=tensorflow

You can also change the Keras backend permanently (read more: https://keras.io/backend/). Note that to reproduce the experiments in the paper you have to use Theano as the backend instead.
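The backend can also be selected from within Python, as long as the environment variable is set before Keras is imported. The helper name `select_keras_backend` below is hypothetical and only serves as a sketch of that ordering.

```python
import os


def select_keras_backend(name="tensorflow"):
    """Set KERAS_BACKEND for the current process.

    This must run before `import keras`; the backend is chosen at import time.
    """
    os.environ["KERAS_BACKEND"] = name
    return os.environ["KERAS_BACKEND"]


select_keras_backend("tensorflow")
# import keras  # a later import would now pick up the TensorFlow backend
```

Passing "theano" instead would match the backend mentioned above for reproducing the paper's experiments.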

  4. Download the data if you want to replicate the experiments from the paper. Extract the archive inside emnlp2017-relation-extraction/data/wikipedia-wikidata/. The data was preprocessed using Stanford CoreNLP 3.7.0 models. See stanford_tag_dataset.py for more information.

  5. Download the GloVe embeddings (glove.6B.zip) and put them into the folder emnlp2017-relation-extraction/resources/glove/. You can change the path to the word embeddings in the model_params.json file if needed.
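For orientation, the GloVe text files store one token per line followed by its space-separated vector components. The sketch below parses that format; the function name `load_glove` and the two-line sample are made up for illustration, and the repository's own loading code in core/embeddings.py may work differently.

```python
import io


def load_glove(lines):
    """Parse GloVe text lines: a token followed by space-separated floats."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


# A tiny in-memory sample standing in for a real GloVe file.
sample = io.StringIO("the 0.1 -0.2 0.3\ncat 0.4 0.5 -0.6\n")
embeddings = load_glove(sample)
```

With the real data you would pass an open file handle instead, for example one of the text files unpacked from glove.6B.zip, opened with encoding="utf-8".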

Pre-trained models:

  • You can download the models that were used in the experiments here
  • See Using pre-trained models.ipynb for a detailed example of how to use the pre-trained models in your code

Reproducing the experiments from the paper

To reproduce the experiments please refer to the version of the code that was published with the paper: tag emnlp17

In any other case, we recommend using the most recent version.

  1. Complete the setup above

  2. Run python model_train.py in emnlp2017-relation-extraction/relation_extraction/ to see the list of parameters

  3. If you put the data into the default folders you can train the ContextWeighted model with the following command:

python model_train.py model_ContextWeighted train ../data/wikipedia-wikidata/enwiki-20160501/semantic-graphs-filtered-training.02_06.json ../data/wikipedia-wikidata/enwiki-20160501/semantic-graphs-filtered-validation.02_06.json
  4. Run the following command to compute the precision-recall curves:
python precision_recall_curves.py model_ContextWeighted ../data/wikipedia-wikidata/enwiki-20160501/semantic-graphs-filtered-held-out.02_06.json
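As background on what a precision-recall curve contains, the following sketch computes the curve points from a list of confidence scores and binary correctness labels. It is not the logic of precision_recall_curves.py: the function name, the input layout and the toy numbers are assumptions, and tied scores receive no special handling.

```python
def precision_recall_curve(scores, labels):
    """Return (precision, recall) points obtained by lowering the threshold.

    `scores` are model confidences and `labels` are 1 for a correct
    prediction and 0 otherwise (an assumed input layout).
    """
    total_positive = sum(labels)
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = []
    true_positive = 0
    for rank, (_, label) in enumerate(ranked, start=1):
        true_positive += label
        precision = true_positive / rank
        recall = true_positive / total_positive if total_positive else 0.0
        points.append((precision, recall))
    return points


# Toy predictions: four scored outputs, two of them correct.
scores = [0.9, 0.8, 0.6, 0.3]
labels = [1, 0, 1, 0]
curve = precision_recall_curve(scores, labels)
```

Each point corresponds to accepting all predictions at or above one score, so the curve shows how precision is traded for recall as the threshold drops.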

Notes

  • The web demo code is provided for information only. It is not meant to be run elsewhere.

Requirements:

  • Python 3.6
  • Keras 2.1.5
  • TensorFlow 1.6.0
  • See requirements.txt for library requirements.

License:

  • Apache License Version 2.0
