heipihanhan / emnlp2017-relation-extraction

This project forked from ukplab/emnlp2017-relation-extraction

Context-Aware Representations for Knowledge Base Relation Extraction

License: Apache License 2.0

Jupyter Notebook 9.93% Python 74.77% HTML 4.43% CSS 2.41% JavaScript 8.45%

emnlp2017-relation-extraction's Introduction

Context-Aware Representations for Knowledge Base Relation Extraction

Relation extraction on an open-domain knowledge base

Accompanying repository for our EMNLP 2017 paper (full paper). It contains the code to replicate the experiments and the pre-trained models for sentence-level relation extraction.

This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.

Please use the following citation:

@inproceedings{TUD-CS-2017-0119,
	title = {Context-Aware Representations for Knowledge Base Relation Extraction},
	author = {Sorokin, Daniil and Gurevych, Iryna},
	booktitle = {Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
	pages = {(to appear)},
	year = {2017},
	location = {Copenhagen, Denmark},
}

Paper abstract:

We demonstrate that for sentence-level relation extraction it is beneficial to consider other relations in the sentential context while predicting the target relation. Our architecture uses an LSTM-based encoder to jointly learn representations for all relations in a single sentence. We combine the context representations with an attention mechanism to make the final prediction. We use the Wikidata knowledge base to construct a dataset of multiple relations per sentence and to evaluate our approach. Compared to a baseline system, our method results in an average error reduction of 24% on a held-out set of relations.
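The idea of combining a target relation representation with attention-weighted context relations can be illustrated with a minimal sketch. This is not the repository's Keras model: the function names are hypothetical, the vectors stand in for the LSTM encoder outputs, and the plain dot product used as the attention score is an assumption (the scoring function in the paper may differ).

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_with_context(target, contexts):
    """Concatenate the target vector with an attention-weighted sum of context vectors.

    target:   representation of the relation being predicted
    contexts: representations of the other relations in the same sentence
    """
    if not contexts:
        # No other relations in the sentence: pad with zeros.
        return list(target) + [0.0] * len(target)
    # Assumption: attention score is the dot product with the target.
    weights = softmax([dot(c, target) for c in contexts])
    summed = [
        sum(w * c[i] for w, c in zip(weights, contexts))
        for i in range(len(target))
    ]
    return list(target) + summed
```

The concatenated vector would then be passed to a classifier over relation types; that step is omitted here.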

Please refer to the paper for more details.

The dataset described in the paper can be found here:

Contacts:

If you have any questions regarding the code, please don't hesitate to contact the authors or report an issue.

Demo:

You can try out the model on single sentences in our demo:

http://semanticparsing.ukp.informatik.tu-darmstadt.de:5000/relation-extraction/

Project structure:

relation_extraction/
├── eval.py
├── model-train-and-test.py
├── notebooks
├── optimization_space.py
├── core
│   ├── parser.py
│   ├── embeddings.py
│   ├── entity_extraction.py
│   └── keras_models.py
├── relextserver
│   └── server.py
├── graph
│   ├── graph_utils.py
│   ├── io.py
│   └── vis_utils.py
├── stanford_tag_dataset.py
└── evaluation
    └── metrics.py
resources/
├── properties-with-labels.txt
└── property_blacklist.txt

File                                Description
relation_extraction/                Main Python module
relation_extraction/core            Models for joint relation extraction
relation_extraction/relextserver    The code for the web demo
relation_extraction/graph           IO and processing for relation graphs
relation_extraction/evaluation      Evaluation metrics
resources/                          Necessary resources
data/curves/                        The precision-recall curves for each model on the held-out data

Setup:

  1. We recommend that you set up a new virtual environment first: http://docs.python-guide.org/en/latest/dev/virtualenvs/

  2. Check out the repository and run:

pip3 install -r requirements.txt

  3. Set the Keras (deep learning library) backend to TensorFlow with the following command:

export KERAS_BACKEND=tensorflow

     You can also change the Keras backend permanently (read more: https://keras.io/backend/). Note that in order to reproduce the experiments in the paper you have to use Theano as the backend instead.

  4. Download the data if you want to replicate the experiments from the paper. Extract the archive inside emnlp2017-relation-extraction/data/wikipedia-wikidata/. The data was preprocessed using Stanford CoreNLP 3.7.0 models. See stanford_tag_dataset.py for more information.

  5. Download the GloVe embeddings (glove.6B.zip) and put them into the folder emnlp2017-relation-extraction/resources/glove/. You can change the path to the word embeddings in the model_params.json file if needed.
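
Two of the setup steps can also be handled from Python. The sketch below is not part of the repository, and both function names are hypothetical. It assumes that multi-backend Keras 2.x reads KERAS_BACKEND when it is first imported, and that the GloVe files are plain text with one token per line followed by its space-separated vector components.

```python
import os


def configure_backend(name="tensorflow"):
    """Set the Keras backend for this process.

    Must be called before the first `import keras`; use "theano"
    when reproducing the experiments from the paper.
    """
    os.environ["KERAS_BACKEND"] = name
    return os.environ["KERAS_BACKEND"]


def parse_glove(lines, limit=None):
    """Parse GloVe text lines into a dict mapping token -> list of floats.

    `lines` can be any iterable of strings, for example an open file.
    `limit` optionally stops after that many tokens.
    """
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
        if limit is not None and len(vectors) >= limit:
            break
    return vectors
```

A possible use, assuming the 50-dimensional file from glove.6B.zip was extracted into the folder named above: `with open("resources/glove/glove.6B.50d.txt", encoding="utf-8") as f: vectors = parse_glove(f, limit=10000)`.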

Pre-trained models:

  • You can download the models that were used in the experiments here
  • See Using pre-trained models.ipynb for a detailed example on how to use the pre-trained models in your code

Reproducing the experiments from the paper

To reproduce the experiments please refer to the version of the code that was published with the paper: tag emnlp17

In any other case, we recommend using the most recent version.

  1. Complete the setup above

  2. Run python model_train.py in emnlp2017-relation-extraction/relation_extraction/ to see the list of parameters

  3. If you put the data into the default folders you can train the ContextWeighted model with the following command:

python model_train.py model_ContextWeighted train ../data/wikipedia-wikidata/enwiki-20160501/semantic-graphs-filtered-training.02_06.json ../data/wikipedia-wikidata/enwiki-20160501/semantic-graphs-filtered-validation.02_06.json

  4. Run the following command to compute the precision-recall curves:

python precision_recall_curves.py model_ContextWeighted ../data/wikipedia-wikidata/enwiki-20160501/semantic-graphs-filtered-held-out.02_06.json
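
For readers unfamiliar with precision-recall curves, the following is a minimal sketch of how such a curve can be computed from ranked predictions. It is not the implementation in precision_recall_curves.py: the function name is hypothetical, and it assumes each prediction comes with a confidence score and a 0/1 label saying whether it was correct. Tied scores are not treated specially.

```python
def precision_recall_curve(scores, labels):
    """Return a list of (precision, recall) points.

    Predictions are ranked by descending score, and one point is
    produced for each prefix of the ranking, which corresponds to
    lowering the decision threshold one prediction at a time.
    """
    total_positive = sum(labels)
    if total_positive == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = []
    true_positive = 0
    for rank, (_, label) in enumerate(ranked, start=1):
        true_positive += label
        precision = true_positive / rank
        recall = true_positive / total_positive
        points.append((precision, recall))
    return points
```

For example, calling `precision_recall_curve([0.9, 0.8, 0.3], [1, 0, 1])` yields three points, the last of which has recall 1.0 because every positive has been retrieved by then.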

Notes

  • The web demo code is provided for information only. It is not meant to be run elsewhere.

Requirements:

  • Python 3.4
  • Keras 2.1.5
  • TensorFlow 1.6.0
  • See requirements.txt for library requirements.

License:

  • Apache License Version 2.0
