
Comments (3)

ceteri commented on May 18, 2024

Thank you @chikubee, that's a good catch. The README.md describes entity linking as one of the motivations for this project, although those features are a WIP. That has been explored in some tutorials, and there's WIP code under active development in a private repo, though it is not exposed as features here yet. We tentatively have a knowledge graph tutorial in collab with https://www.knowledgegraph.tech/ and https://connected-data.london/ scheduled for early December 2020 where that work will be presented.

I've updated the README.md to be clearer, as of cb51ba3, and you've been added to the kudos for that.

As a simple example, the WordnetAnnotator section of https://github.com/DerwenAI/spaCy_tuTorial/blob/master/spaCy_tuTorial.ipynb gives at least a sketch of how entity linking could work:

  • make use of a KG -- in the spaCy tutorial above, WordNet supplies the semantic relations
  • use domain knowledge to constrain the search space for synsets
  • search the KG's graph neighborhood of a given entity to link hypernyms and hyponyms into the PTR lemma graph
  • benefits:
    • this enhances the centrality measures used to rank keyphrases
    • entity linking of keyphrases => KG is performed in the process
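To make the steps above a bit more concrete, here is a minimal sketch in plain Python. It is not the PyTextRank implementation: `TOY_KG` is a hypothetical stand-in for WordNet, `enrich` and `pagerank` are illustrative helper names, and the `allowed` argument is an assumed way of expressing the domain constraint. The idea is only to show hypernyms being pulled into a lemma graph, with the entity links recorded as a by-product.

```python
from collections import defaultdict

# Hypothetical toy KG standing in for WordNet: lemma -> list of hypernyms.
TOY_KG = {
    "pizza": ["dish"],
    "taco": ["dish"],
    "cola": ["beverage"],
}


def add_edge(graph, a, b):
    """Add an undirected edge between two lemma nodes."""
    graph[a].add(b)
    graph[b].add(a)


def enrich(graph, kg, allowed=None):
    """Link each lemma already in the graph to its KG hypernyms.

    `allowed` optionally restricts which KG nodes may be imported
    (the domain constraint). Returns the (lemma, hypernym) links made.
    """
    links = []
    for lemma in list(graph):  # snapshot: the graph grows while we loop
        for hypernym in kg.get(lemma, []):
            if allowed is not None and hypernym not in allowed:
                continue
            add_edge(graph, lemma, hypernym)
            links.append((lemma, hypernym))
    return links


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over an undirected adjacency dict."""
    n = len(graph)
    rank = {node: 1.0 / n for node in graph}
    for _ in range(iterations):
        rank = {
            node: (1 - damping) / n
            + damping * sum(rank[nb] / len(graph[nb]) for nb in graph[node])
            for node in graph
        }
    return rank


# Lemma graph built from co-occurrence in some text (made-up edges).
graph = defaultdict(set)
add_edge(graph, "pizza", "cheese")
add_edge(graph, "taco", "cola")

links = enrich(graph, TOY_KG)   # entity links fall out as a side effect
rank = pagerank(graph)          # centrality now reflects the imported nodes
```

Before enrichment, "pizza" and "taco" sit in separate components; afterwards they share the neighbor "dish", which is what changes the centrality scores.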

A couple questions for you:

  1. What kind of use cases do you have for entity linking features?
  2. How would you want to have the lemma graph exposed?


chikubee commented on May 18, 2024

@ceteri thanks for your quick and detailed response.
Looking forward to the release. It's a great problem to solve, cheers.

I am trying to build a multi-tenant domain intelligence system.
Intent classification and entity recognition are solved problems. But understanding the utterance to identify links and map them to real world entities is challenging.

I was looking at elegant ways to identify entity groups and links:
a. within the text
   "I want to have pizza with extra cheese, a taco, and 2 diet cokes."
   -> (1 pizza, other: extra cheese), (1 taco), (2 diet cokes)
   "Who is the manager of Mike? What is his salary?"
   -> here, if the graph were enriched with coref resolution, the salary would get attributed to the manager of Mike.
b. outside the text, i.e. mapped to real-world entities from the domain KG and custom-fed entities/attributes,
   e.g. recognizing out-of-the-box entities such as company names, positions, status, food, etc.


ceteri commented on May 18, 2024

FYI, there are more related notes and discussion in #78 (comment), with an introduction to kglab, which is intended to provide this kind of KG support in PyTextRank.

To your point above, @chikubee, the KG used for the TextRank pipeline would:

  1. enrich its internal lemma graph by importing nodes and edges from the KG, leading to better keyphrase ranking
  2. have entity linking into the KG as a side-effect
  3. let you query via SPARQL, SHACL, or perhaps even PSL and other probabilistic methods to achieve what you wanted (Mike, his salary, etc.)
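As a rough illustration of the third point, here is a sketch of the "manager of Mike, his salary" lookup over a tiny in-memory triple list. Everything in it is assumed for illustration: the triples, the predicate names `managerOf` and `salary`, and the helpers `match` and `manager_salary`. A real setup would more likely hold the KG in an RDF store and run the equivalent SPARQL query shown in the comment.

```python
# Hypothetical (subject, predicate, object) triples; names are made up.
TRIPLES = [
    ("Alice", "managerOf", "Mike"),
    ("Alice", "salary", 120000),
    ("Mike", "salary", 90000),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def manager_salary(triples, employee):
    """Two-hop lookup: who manages `employee`, and what is their salary?

    Roughly equivalent SPARQL, with hypothetical predicates:
      SELECT ?m ?salary WHERE { ?m :managerOf :Mike . ?m :salary ?salary }
    """
    results = []
    for manager, _, _ in match(triples, p="managerOf", o=employee):
        for _, _, salary in match(triples, s=manager, p="salary"):
            results.append((manager, salary))
    return results


answer = manager_salary(TRIPLES, "Mike")
```

Because the salary is reached through the manager node, it is attributed to Mike's manager and not to Mike, which is the behavior asked for above.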

