supervised-oie's Introduction

Deprecated!

Maintenance of this project has moved to the AllenNLP framework, where you can use the model and try an online demo.
This thin wrapper may also be useful if you want to run the pretrained model.

supervised-oie

Code for training a supervised neural Open IE model, as described in our NAACL 2018 paper.
🚧 Still under construction 🚧

Citing 🔖

If you use this software, please cite:

@InProceedings{Stanovsky2018NAACL,
  author    = {Gabriel Stanovsky and Julian Michael and Luke Zettlemoyer and Ido Dagan},
  title     = {Supervised Open Information Extraction},
  booktitle = {Proceedings of The 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT)},
  month     = {June},
  year      = {2018},
  address   = {New Orleans, Louisiana},
  publisher = {Association for Computational Linguistics},
  pages     = {(to appear)},
}

Quickstart 🐣

  1. Install requirements 🙇
pip install -r requirements.txt
  2. Download embeddings 🚶
cd ./pretrained_word_embeddings/
./download_external.sh
  3. Train model 🏃
cd ./src
python ./rnn/confidence_model.py --train=../data/train.oie.conll --dev=../data/dev.oie.conll --test=../data/test.oie.conll --load_hyperparams=../hyperparams/confidence.json

NOTE: Models are saved by default to the models dir, unless a "--saveto" command line argument is passed. See confidence_model.py for more details.

  4. Predict with a trained model 👍
python ./trained_oie_extractor.py \
    --model=path/to/model \
    --in=path/to/raw/sentences \
    --out=path/to/output/file \
    --conll
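
If the prediction step needs to be scripted, the command above could be assembled from Python. This is only a sketch: `build_predict_command` is a hypothetical helper, not part of the repository, and it uses only the flags shown in the Quickstart. The placeholder paths are kept exactly as in the README.

```python
import shlex


def build_predict_command(model, sentences, output, conll=True):
    """Assemble the argument list for the prediction step of the Quickstart.

    Hypothetical helper: it only mirrors the flags shown in the README.
    """
    cmd = [
        "python",
        "./trained_oie_extractor.py",
        f"--model={model}",
        f"--in={sentences}",
        f"--out={output}",
    ]
    if conll:
        cmd.append("--conll")
    return cmd


cmd = build_predict_command(
    "path/to/model", "path/to/raw/sentences", "path/to/output/file"
)
# Print the command as a single shell-quoted line for inspection.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` from the directory that contains `trained_oie_extractor.py` would then run it; that call is left out here so the sketch has no side effects.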

More scripts 🚴

See src/scripts for more handy scripts. Additional documentation coming soon!

supervised-oie's People

Contributors

gabrielstanovsky


supervised-oie's Issues

'NoneType' object is not iterable

Hi,

When I try to train the model, I get a TypeError. Can anyone help? Thanks in advance!

Traceback (most recent call last):
  File "./rnn/confidence_model.py", line 454, in <module>
    rnn.plot("./confidence_model.jpg", train_fn)
  File "./rnn/confidence_model.py", line 122, in plot
    X, Y = self.load_dataset(train_fn)
  File "./rnn/confidence_model.py", line 139, in load_dataset
    self.encode_outputs(sents))
  File "./rnn/confidence_model.py", line 208, in encode_outputs
    list(self.transform_labels(sent.label.values)),
  File "./rnn/confidence_model.py", line 110, in transform_labels
    classes = list(self.classes_())
TypeError: 'NoneType' object is not iterable

Some questions about training

Dear author,

Firstly, is your training code in AllenNLP open? In the link you give, I can only find the prediction instructions and an online demo, but no training instructions.

Secondly, in your training data "train.oie.conll", some instances are exactly the same, including the predicate, except for their labels. Take the example from your paper: "Barack Obama, a former U.S president, was born in Hawaii." It can generate 2 training instances with the same predicate "was born in" but with different A0 tags ("Barack Obama" and "a former U.S president"). According to your model, the inputs for these 2 instances are exactly the same, yet the model is expected to output 2 different sets of labels. Could you shed some light on this issue?

Many thanks!

can't train the model

No training happens when I run the code following the README.
I found something odd in confidence_model.py:

X, Y = self.load_dataset(train_fn)

X and Y do not seem to be used anywhere after this line?

Error while training

Traceback (most recent call last):
  File "./rnn/confidence_model.py", line 462, in <module>
    rnn.set_model()
  File "./rnn/confidence_model.py", line 378, in set_model
    self.model = Model(input = map(itemgetter(0), true_input + corrupt_input),output = [output])
  File "/home/amrit/anaconda/lib/python3.6/site-packages/keras/engine/topology.py", line 1794, in __init__
    ' (missing Keras metadata).')
TypeError: Input tensors to a Model must be Keras tensors. Found: <map object at 0x7fbb3ee77ef0> (missing Keras metadata)
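
The `<map object ...>` in this message is consistent with the script being run under Python 3 (the traceback path shows python3.6), where `map` returns a lazy iterator rather than a list. A minimal sketch of the difference, without Keras: the `(tensor, name)` pairs below are hypothetical stand-ins, since the real contents of `true_input` and `corrupt_input` are not shown in the issue.

```python
from operator import itemgetter

# Hypothetical stand-ins for the real input pairs; only the first
# element of each pair is selected, as in the failing line.
true_input = [("word_tensor", "word"), ("pred_tensor", "pred")]
corrupt_input = [("corrupt_word_tensor", "word")]

# Python 3: map() yields a lazy iterator, which is what Keras rejected.
lazy = map(itemgetter(0), true_input + corrupt_input)
print(type(lazy).__name__)  # map

# Materializing the iterator gives a plain list of the first elements.
inputs = list(map(itemgetter(0), true_input + corrupt_input))
print(inputs)  # ['word_tensor', 'pred_tensor', 'corrupt_word_tensor']
```

Wrapping the `map(...)` call in `list(...)` would likely get past this particular error; whether the rest of `confidence_model.py` then runs under Python 3 is not something this sketch establishes.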

evaluation script

Hi,

From your NAACL paper,

To that end, we follow He et al. (2015) which judge an argument as correct if and only if it includes the syntactic head of the gold argument (and similarly for predicates). For OIE2016, we use the available Penn Treebank gold syntactic trees (Marcus et al., 1993), while for the other test sets, we use predicted trees instead.

I wonder if you have the matching evaluation script with the tree as the input in this repo?
Seems in supervised-oie-benchmark/matcher.py
Matcher.argMatch
Matcher.predMatch
Matcher.argMatch
Matcher.lexicalMatch

None of them correspond to that?
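
For reference, the head-based criterion quoted above could be sketched as follows. This is not the benchmark's code: `span_head` and `head_match` are hypothetical names, spans are assumed to be `(start, end)` token offsets with an exclusive end, and the dependency tree is assumed to be given as a list of parent indices with -1 for the root.

```python
def span_head(span, heads):
    """Return the index of the syntactic head of a token span.

    span  -- (start, end) token indices, end exclusive
    heads -- heads[i] is the index of token i's parent, -1 for the root
    """
    start, end = span
    for i in range(start, end):
        if not (start <= heads[i] < end):
            # First token whose parent lies outside the span.
            return i
    return start  # only reached for an empty span


def head_match(pred_span, gold_span, heads):
    """True iff the predicted span includes the head of the gold span."""
    head = span_head(gold_span, heads)
    return pred_span[0] <= head < pred_span[1]


# Hypothetical parse of: Barack Obama was born in Hawaii
#                        0      1     2   3    4  5
heads = [1, 3, 3, -1, 5, 3]
gold_arg = (0, 2)  # "Barack Obama", whose head is "Obama"

print(head_match((1, 2), gold_arg, heads))  # True: "Obama" covers the head
print(head_match((0, 1), gold_arg, heads))  # False: "Barack" misses it
```

If a span is not a single subtree, more than one token can have its parent outside the span; this sketch simply takes the first, which is a simplification.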

Thanks!

Installation documentation

Hi @gabrielStanovsky,

I just installed this software package and found a couple instructions that need to be updated.

  1. pip install -r requirements.txt
  2. hyperparam directory is missing a p in the command line example
  3. The training, dev, and test file names should have an oie in the middle, e.g. train.conll should be train.oie.conll

Users will also need to install a tagger using NLTK:
nltk.download("averaged_perceptron_tagger")

Thanks for releasing this!

pretraining model

Hello everyone, I hope you are doing well.

I would just like to know whether a pretrained model for OIE exists.

Thanks,

Statistics of AW-OIE dataset

What are the #sents and #extractions in train, test, and dev set respectively?

I only found that 12,952 tuples are yielded from QAMR in Section 5.2, but no overall statistics.

Another paper, LSOIE: A Large-Scale Dataset for Supervised Open Information Extraction, shows statistics of several datasets in Table 1. However, their #sents and #extractions (3,180 & 8,477) seem to be incorrect for OIE2016 because 3,200 & 10,359 are reported in the original paper.

Therefore, I am wondering if 3,300 & 17,165 are correct for AW-OIE or not.

How to convert BIO output into open IE (subject, predicate, object) tuples

Hi - I'm using the allennlp implementation and trying to figure out how to properly convert from BIO output into (subject, predicate, object) tuples.

It is clear when you have ARG0 V ARG1, but less clear with the following:

[ARG1: England] is [V: separated] [ARG2: from continental Europe] [ARG0: by The Irish Sea] [ARG2: to] the east and the [ARG0: English Channel] [ARG2: to the south]

I would appreciate some guidance on how to parse this.
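
One possible way to approach this, offered as a sketch rather than the AllenNLP or paper convention: first group the BIO tags into labeled spans, then merge spans that share a label and order the arguments by label. The helper names, the tag strings (`B-ARG0`, `I-ARG0`, `B-V`, `O`), and the argument ordering are all assumptions made for illustration.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (label, text) spans, in sentence order."""
    spans, label, words = [], None, []

    def flush():
        if label is not None:
            spans.append((label, " ".join(words)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("I-") and tag[2:] == label:
            words.append(token)  # continue the current span
        elif tag.startswith(("B-", "I-")):
            flush()  # a new span starts (a stray I- tag is treated as B-)
            label, words = tag[2:], [token]
        else:  # "O": outside any span
            flush()
            label, words = None, []
    flush()
    return spans


def spans_to_tuple(spans):
    """Merge same-label spans and order arguments by label (an assumption)."""
    predicate = " ".join(text for lab, text in spans if lab == "V")
    args = {}
    for lab, text in spans:
        if lab != "V":
            args.setdefault(lab, []).append(text)
    ordered = [" ".join(args[lab]) for lab in sorted(args)]
    first = ordered[0] if ordered else ""
    return (first, predicate, *ordered[1:])


# Hypothetical tagging, for illustration only.
tokens = ["Barack", "Obama", "was", "born", "in", "Hawaii"]
tags = ["B-ARG0", "I-ARG0", "O", "B-V", "B-ARG1", "I-ARG1"]
print(spans_to_tuple(bio_to_spans(tokens, tags)))
# ('Barack Obama', 'born', 'in Hawaii')
```

For an output like the one above, where ARG0 and ARG2 each appear more than once, this sketch concatenates the pieces of each label in sentence order. That may not be the intended reading, so repeated labels probably deserve inspection rather than blind merging.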
