hoie's Introduction

HOIE-Hybrid Open Information Extraction

Ⅰ、Architecture of HOIE

Flowchart

Ⅱ、Approach

Pipeline open information extraction (OIE) systems have two main sub-tasks (components): named entity recognition (NER) and relation extraction (RE). Because the output of NER serves as the input to all subsequent sub-tasks, NER plays an important role in a pipeline OIE system.

  1. Implicit information extraction: We exploit dependency parsing (DP) to recognize implicit entities that would otherwise be missed. In more detail, before the NER phase, HoIE uses DP to obtain the dependency relations among the constituents of each sentence. With this information and a set of domain-independent rules, HoIE can then extract implicit information. For example, from "This is not his bike, but Tom's" we can extract the implicit triple (Tom; has; bike), and from "Bell , a telecommunication company , which is based in Los Angeles" we can extract the triples (Bell; is; a telecommunication company) and (Bell; is based in; Los Angeles).

  2. Compound sentence simplification (C2S)

  3. 8 scenarios to extract relation triples

  4. Distribution of the different scenarios (941 sentences in total)
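The appositive example in item 1 can be illustrated with a minimal sketch. This is not HoIE's actual rule set: the `Token` structure, the hand-written parse of the fragment "Bell , a telecommunication company", the Universal-Dependencies-style labels, and the function names are all assumptions made for illustration. In practice the parse would come from a dependency parser.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Token:
    text: str
    head: int  # index of the syntactic head, -1 for the root
    dep: str   # dependency label (Universal-Dependencies style, assumed)


def phrase(tokens: List[Token], root: int) -> str:
    """Join the words of the subtree rooted at `root`, skipping punctuation."""
    words = []
    for i, tok in enumerate(tokens):
        # Walk up from token i until we reach `root` or fall off the tree.
        j = i
        while j != -1 and j != root:
            j = tokens[j].head
        if j == root and tok.dep != "punct":
            words.append(tok.text)
    return " ".join(words)


def appositive_triples(tokens: List[Token]) -> List[Tuple[str, str, str]]:
    """Hypothetical rule: an `appos` dependent yields (head; is; appositive phrase)."""
    triples = []
    for i, tok in enumerate(tokens):
        if tok.dep == "appos":
            triples.append((tokens[tok.head].text, "is", phrase(tokens, i)))
    return triples


# Hand-written parse of "Bell , a telecommunication company" (an assumption,
# not parser output).
BELL = [
    Token("Bell", -1, "root"),
    Token(",", 4, "punct"),
    Token("a", 4, "det"),
    Token("telecommunication", 4, "compound"),
    Token("company", 0, "appos"),
]

print(appositive_triples(BELL))
```

The rule only looks at dependency labels, not at the words themselves, which is what makes a rule of this kind domain-independent.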

Ⅲ、Dataset

  1. CaRB: CaRB is a dataset and evaluation framework for benchmarking Open Information Extraction systems. It contains 641 sentences.

  2. CaRB-complex-45: CaRB-complex-45 (CaRB45) is a subset of 45 sentences selected from CaRB. Each sentence has the following features:

    • It is a compound sentence.
    • It includes at least two fact triples.
    • It doesn't involve reporting verbs like said, told, asked, etc.
  3. BenchIE: BenchIE is a benchmark for measuring the performance of Open Information Extraction (OIE) systems. It contains 300 sentences. In contrast to CaRB, BenchIE takes into account the informational equivalence of extractions. Its gold standard consists of fact synsets: clusters in which all surface forms of the same fact are listed.

  4. OIE2016: TODO
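To make the idea of fact synsets concrete, here is a minimal sketch of synset-based scoring. It is not BenchIE's evaluation code. The gold synsets and system extractions are invented for illustration, and the scoring convention is an assumption: precision is the share of extractions that exactly match some surface form, and recall is the share of synsets covered by at least one extraction.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]


def synset_scores(extractions: List[Triple],
                  synsets: List[Set[Triple]]) -> Tuple[float, float]:
    """Return (precision, recall) under exact matching against fact synsets."""
    # An extraction is correct if it appears in any synset.
    matched = [any(t in s for s in synsets) for t in extractions]
    # A synset is covered if any extraction is one of its surface forms.
    covered = [any(t in s for t in extractions) for s in synsets]
    precision = sum(matched) / len(extractions) if extractions else 0.0
    recall = sum(covered) / len(synsets) if synsets else 0.0
    return precision, recall


# Hypothetical gold standard: two facts, each with two surface forms.
GOLD = [
    {("Bell", "is", "a telecommunication company"),
     ("Bell", "is", "telecommunication company")},
    {("Bell", "is based in", "Los Angeles"),
     ("Bell", "is based", "in Los Angeles")},
]

# Hypothetical system output: two surface forms of the first fact, one miss.
SYSTEM = [
    ("Bell", "is", "telecommunication company"),
    ("Bell", "is", "a telecommunication company"),
    ("Bell", "based", "Los Angeles"),
]

precision, recall = synset_scores(SYSTEM, GOLD)
print(precision, recall)
```

Two different surface forms of the same fact both count as correct extractions, but they cover only one synset, so repeating a fact does not raise recall.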

Ⅳ、Evaluation

  1. Performance of HoIE and various OIE systems on CaRB

    System    Precision  Recall  F1-score
    Ollie     0.505      0.346   0.411
    Props     0.340      0.300   0.319
    OpenIE4   0.553      0.437   0.488
    OpenIE5   0.521      0.424   0.467
    ClausIE   0.521      0.424   0.450
    HoIE      0.600      0.457   0.518
  2. Performance of HoIE and various OIE systems on BenchIE

    System    Precision  Recall  F1-score
    ClausIE   0.50       0.25    0.33
    MinIE     0.43       0.28    0.34
    Stanford  0.11       0.16    0.13
    ROIE      0.37       0.08    0.13
    OpenIE6   0.31       0.21    0.25
    MOIE      0.39       0.16    0.23
    HoIE      0.31       0.16    0.21
  3. Comparison of various component-ablated versions of HoIE on CaRB

  4. Comparison of various component-ablated versions of HoIE on CaRB-complex-45

  5. Contribution of different scenarios
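The F1-score columns above are assumed here to be the standard harmonic mean of precision and recall, F1 = 2PR / (P + R). The source does not state how its scores were computed, so the sketch below only shows the formula and checks it against the Ollie row of the CaRB table. The function name is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Ollie on CaRB, using the precision and recall reported in the table.
ollie_f1 = f1_score(0.505, 0.346)
print(round(ollie_f1, 3))
```

Recomputing from rounded precision and recall can differ from a reported F1 in the last digit, because the reported value may have been computed before rounding.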

Problem definition, related work, own work, conclusion

hoie's People

Contributors

rvlis
