sdm-tib / falcon2.0


Falcon 2.0 is a joint entity and relation linking tool over Wikidata.

Home Page: https://labs.tib.eu/falcon/falcon2/

License: MIT License

Python 100.00%
entity-linking relation-extraction entity-extraction wikidata dbpedia knowledge-graph natural-language-processing nlp

falcon2.0's Introduction

FALCON 2.0

Falcon 2.0 is an entity and relation linking tool over Wikidata (accepted at CIKM 2020). The full CIKM paper is available at https://doi.org/10.1145/3340531.3412777.

It leverages fundamental principles of English morphology (e.g., N-Gram tiling and N-Gram splitting) to accurately map entities and relations in short texts to resources in Wikidata. Falcon 2.0 is available as a Web API and can be queried using curl:

curl --header "Content-Type: application/json" \
  --request POST \
  --data '{"text":"Who painted The Storm on the Sea of Galilee?"}' \
  https://labs.tib.eu/falcon/falcon2/api?mode=long

The Web API is the first resource of this repository. The second, the background-knowledge dump, is described in the Elastic Search section below.
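The same request can be issued from Python using only the standard library. This is a sketch: the endpoint, mode parameter, and payload come from the curl example above, but the response schema is not documented here, so inspect the returned JSON yourself.

```python
import json
import urllib.request

def falcon_link(text, mode="long"):
    """POST a short text to the Falcon 2.0 Web API and return the parsed
    JSON response (same request as the curl example above)."""
    payload = json.dumps({"text": text}).encode("utf-8")
    req = urllib.request.Request(
        f"https://labs.tib.eu/falcon/falcon2/api?mode={mode}",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)

# e.g. falcon_link("Who painted The Storm on the Sea of Galilee?")
```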

Implementation

To begin with, install the libraries listed in the requirements.txt file:

pip install -r requirements.txt

The FALCON 2.0 code has three main parts: Elasticsearch setup, the linking algorithm, and evaluation.

Elastic Search and Background Knowledge

Before working with the Wikidata dump, we first need to connect to an Elasticsearch endpoint and a Wikidata SPARQL endpoint. The Elasticsearch endpoint is used to interact with our cluster through the Elasticsearch API. The Elasticsearch dump (also known as R2: Background Knowledge) for Falcon 2.0 can be downloaded from https://doi.org/10.6084/m9.figshare.11362883

To import the Elasticsearch dump, use elasticdump and execute the following commands:

elasticdump  --output=http://localhost:9200/wikidataentityindex/  --input=wikidataentity.json  --type=data

elasticdump  --output=http://localhost:9200/wikidatapropertyindex/  --input=wikidatapropertyindex.json  --type=data

To change your Elasticsearch endpoint, make the change in Elastic/searchIndex.py and Elastic/addIndex.py:

es = Elasticsearch(['http://localhost:9200'])

The Wikidata SPARQL endpoint lets us quickly search and analyze the large volume of data stored in the knowledge graph (here, Wikidata). To change the Wikidata endpoint, edit main.py:

wikidataSPARQL = " "
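A SPARQL lookup against such an endpoint can be sketched with the standard library alone. Note the endpoint below is the public Wikidata query service, used purely for illustration; main.py expects your own endpoint in the wikidataSPARQL variable.

```python
import json
import urllib.parse
import urllib.request

# Illustration only: the public endpoint stands in for whatever
# you configure in main.py's wikidataSPARQL variable.
wikidataSPARQL = "https://query.wikidata.org/sparql"

def sparql_select(query, endpoint=wikidataSPARQL):
    """Run a SPARQL SELECT query and return the JSON result bindings."""
    url = endpoint + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"})
    req = urllib.request.Request(
        url, headers={"User-Agent": "falcon2-example/0.1"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)["results"]["bindings"]

# e.g. sparql_select('SELECT ?p WHERE { wd:Q2 ?p ?o } LIMIT 5')
```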

We then create indices for property search and entity search over Wikidata. Refer to the following two functions in Elastic/addIndex.py for the code:

def propertyIndexAdd(): ...
def entitiesIndexAdd(): ...

Next, we execute a search query and retrieve the hits that match it. The search query is used to decide whether a mention is an entity or a property in Wikidata. Note that Elasticsearch uses JSON as the serialization format for documents. The query used to retrieve candidates from Elasticsearch is:

{
  "query": {
    "match" : { "label" : "operating income" }
  }
}
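The match query above maps directly onto Elasticsearch's REST _search endpoint. The following sketch runs it with the standard library; the index name is assumed from the elasticdump commands above, and a local cluster on port 9200 is assumed to be running.

```python
import json
import urllib.request

def match_query(mention):
    """Build the match query shown above for a given mention."""
    return {"query": {"match": {"label": mention}}}

def entity_candidates(mention, index="wikidataentityindex",
                      host="http://localhost:9200"):
    """Retrieve candidate matches for a mention from a local Elasticsearch
    cluster via its REST API (index name assumed from the elasticdump
    commands above)."""
    req = urllib.request.Request(
        f"{host}/{index}/_search",
        data=json.dumps(match_query(mention)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return [hit["_source"] for hit in json.load(resp)["hits"]["hits"]]

# e.g. entity_candidates("operating income") -> candidate Wikidata entities
```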

Search queries over Wikidata are implemented in Elastic/searchIndex.py. Refer to the following two functions in the same file for entity search and property search in Wikidata:

def entitySearch(query): ...
def propertySearch(query): ...

Algorithm

main.py contains the code for automatic entity and relation linking to resources in Wikidata using rule-based learning. Falcon 2.0 uses the same approach for the Wikidata knowledge graph as Falcon does for DBpedia (https://labs.tib.eu/falcon/). The rules that represent English morphology are maintained in a catalog; a forward-chaining inference process is performed on top of the catalog during extraction and linking. Falcon 2.0 also comprises several modules that identify and link entities and relations to the Wikidata knowledge graph. These modules implement POS Tagging, Tokenization & Compounding, N-Gram Tiling, Candidate List Generation, Matching & Ranking, Query Classifier, and N-Gram Splitting. The modules are reused from the implementation of Falcon.
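N-gram splitting in this context means breaking a mention into its contiguous token sub-sequences so each can be looked up independently. The following is an illustrative sketch of the idea, not the tool's actual implementation:

```python
def ngrams(tokens, n):
    """All contiguous n-grams of a token list, joined back into strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "The Storm on the Sea of Galilee".split()
print(ngrams(tokens, 2))  # ['The Storm', 'Storm on', 'on the', ...]
```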

Evaluation

Usage

To run Falcon 2.0, call the function process_text_E_R(question), where question is the short text to be processed by Falcon 2.0.

For evaluating Falcon 2.0, we relied on three different question answering datasets, namely SimpleQuestion dataset for Wikidata, WebQSP-WD, and LC-QuAD 2.0.

For reproducing the results, "evaluateFalconAPI.py" and "evaluateFalconAPI_entities.py" can be used.

"evaluateFalconAPI_entities.py" evaluates entity linking.

"evaluateFalconAPI.py" evaluates entity and relation linking.

Experimental Results for Entity Linking

SimpleQuestions dataset

The SimpleQuestions dataset contains 5622 test questions that are answerable using Wikidata as the underlying knowledge graph. Falcon 2.0 reports a precision of 0.56, recall of 0.64, and F-score of 0.60 on this dataset.

LC-QuAD 2.0 dataset

LC-QuAD 2.0 contains 6046 test questions, most of which are complex (more than one entity and relation). On this dataset, Falcon 2.0 reports a precision of 0.50, recall of 0.56, and F-score of 0.53.

WebQSP-WD dataset

WebQSP-WD contains 1639 test questions with a single entity and relation per question. Falcon 2.0 outperforms all other baselines on this dataset, with the highest F-score (0.82), precision (0.80), and recall (0.84).

Experimental Results for Relation Linking

SimpleQuestions dataset

Falcon 2.0 reports a precision of 0.35, recall of 0.44, and F-score of 0.39 on the SimpleQuestions dataset for the relation linking task.

LC-QuAD 2.0

Falcon 2.0 reports a precision of 0.44, recall of 0.37, and F-score of 0.40 on the LC-QuAD 2.0 dataset.

Cite our work

@inproceedings{10.1145/3340531.3412777,
author = {Sakor, Ahmad and Singh, Kuldeep and Patel, Anery and Vidal, Maria-Esther},
title = {Falcon 2.0: An Entity and Relation Linking Tool over Wikidata},
year = {2020},
isbn = {9781450368599},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3340531.3412777},
doi = {10.1145/3340531.3412777},
booktitle = {Proceedings of the 29th ACM International Conference on Information & Knowledge Management},
pages = {3141--3148},
numpages = {8},
keywords = {wikidata, dbpedia, relation linking, nlp, english morphology, entity linking, background knowledge},
location = {Virtual Event, Ireland},
series = {CIKM '20}
}

falcon2.0's People

Contributors: ahmadsakor, anerypatel, kulsingh

falcon2.0's Issues

Handling `'s` in entity indexing

Had difficulty parsing the following:

print( process_text_E_R("Hong Kong's",rules) )

The resulting error is:

ValueError: 'Kong' is not in list

This seems to come from a mismatch: ["Hong", "Kong's"].index("Kong") fails because the token list keeps the possessive suffix.

I've tried a fix by adding a new rule in the various entity cleaning portions. Hoping to hear whether this would make sense with the rules and parsing. Thank you 😸

            for ent in entities:
                ent = ent.replace("?", "")
                ent = ent.replace(".", "")
                ent = ent.replace("!", "")
                ent = ent.replace("\\", "")
                ent = ent.replace("#", "")
                ent = ent.replace("'s", "")  # added new rule at line 439
                if token.text in ent:
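The mismatch is easy to reproduce in isolation (the token list shape below is assumed from the error message above):

```python
tokens = ["Hong", "Kong's"]

# The possessive suffix keeps "Kong" from being found in the token list
assert "Kong" not in tokens

# Stripping "'s", as the proposed rule does for entity strings,
# makes the lookup succeed
cleaned = [t.replace("'s", "") for t in tokens]
assert cleaned.index("Kong") == 1
```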

File not found

Hi, thanks for your effort in developing this useful tool!

I follow the instruction to create index by

    propertyIndexAdd()
    entitiesIndexAdd()

but got error
FileNotFoundError: [Errno 2] No such file or directory: '../data/dbpredicateindex.json'

I want to use falcon2 as a relation linking tool, what should I do?

Besides, I find the import speed is very slow when importing wikidataentity.json into Elasticsearch. Do you have any idea why?

Thanks.

Elasticdump for wikidata dump takes a long time

Hi, I've followed the instructions to use elasticdump to place the wikidata into elasticsearch. However, elasticdump has been running for a long time.

  • Is there an estimate of how long it will take for the 9 GB of data for just the entities?
  • Is there a smaller dataset that I can try this on?

Thanks.

Named entity recognition

Hi, does this project include named entity recognition? I'm new to this area. If so, could you tell me the names of the scripts that implement it?

Small query on the output format

Hi

Would like to raise two points/questions:

(1) Should the doc type be doc or _doc in the Elastic submodule?
The source code reads doc by default, but elasticdump seems to add _doc by default.
It's a small point, but thought it should be raised in case it affects adding new docs.

(2)
How to interpret the result?
Trying Falcon on random questions produces the following results. How do we interpret the integers that come after the list of links? Thank you.

>>>    process_text_E_R('Who is Michelle Obama?',rules)
>>>    process_text_E_R('Where is Gracht?',rules)
0
['Who is Michelle Obama?', [], [['<http://www.wikidata.org/entity/Q13133>', 'Michelle obama']], 0, 0, 0, 0]
1
['Where is Gracht?', [], [['<http://www.wikidata.org/entity/Q896611>', 'Gracht']], 0, 0, 0, 0]
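Judging only by the sample output above (the roles of the two inner lists and the trailing integers are inferred from the examples, not from documentation), the result is a plain Python list: the input text, a list for relations, a list of [URI, surface form] entity pairs, and four undocumented integers. A hedged unpacking sketch:

```python
# Shape taken from the sample output above; the meaning of the
# trailing integers is not documented.
result = ['Who is Michelle Obama?', [],
          [['<http://www.wikidata.org/entity/Q13133>', 'Michelle obama']],
          0, 0, 0, 0]

text, relations, entities = result[0], result[1], result[2]
for uri, surface_form in entities:
    print(uri.strip("<>"), "->", surface_form)
```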

FileNotFound Error

Hi @AhmadSakor,
When I set up this code, I got FileNotFoundError: [Errno 2] No such file or directory: 'datasets/results/test_api/falcon_lcquad2.csv' in the evaluateFalconAPI.py file. The same error occurred when running the evaluateFalconAPI_entities.py file (falcon_simple_test.csv not found). Please provide these CSV files or suggest a solution for these errors.

Some kind of entity sorting error

Hi,

I've encountered errors when querying entities of single digits e.g. Earth is Q2.

The error is logged below.

    for entity in sorted(raw , key=lambda x: (-x[3],-x[2],int(x[1][x[1].rfind("/")+2:-1])))[:k]:
ValueError: invalid literal for int() with base 10: ''

I've managed to fix this with the following indexing where the -1 in the slicing is removed.

    for entity in sorted(raw , key=lambda x: (-x[3],-x[2],int(x[1][x[1].rfind("/")+2:])))[:k]:

I believe this -1 unintentionally truncates the sorted ID by one digit at the end. For example:

1. 'Q2' -> ''
2. 'Q123' -> '12'

Hoping to hear whether this is a correct change or whether it could affect the overall package. Thanks 😄
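The truncation is easy to reproduce in isolation (assuming the stored URI has no trailing `>` character, which is what makes single-digit IDs collapse to an empty string):

```python
uri = "http://www.wikidata.org/entity/Q2"

# Original slice: the -1 drops the final character, leaving nothing
# for single-digit IDs like Q2
assert uri[uri.rfind("/") + 2:-1] == ""

# Fixed slice: keeps the full numeric part of the ID
assert uri[uri.rfind("/") + 2:] == "2"
```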
