askplatypus / wikidata-simplequestions

Mapping of the SimpleQuestions dataset to Wikidata

License: Other

Jupyter Notebook 74.58% Python 25.42%

benchmark freebase question-answering wikidata

wikidata-simplequestions's Issues

The file 'answerable.py' is empty

The size of 'answerable.py' is 0 Bytes. The content is missing.

Connection error happened in the method get_reverse(name).

When I skipped the get_reverse() method, the result was a set of property mappings with 323 matches. How can I get the other mappings? Thank you!
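If the connection error is transient, one option is to retry the call a few times before giving up. The following is only a minimal sketch: call_with_retries is a hypothetical helper, not part of this repository, and it assumes that get_reverse(name) raises an OSError subclass (such as ConnectionError) when the request fails.

```python
import time


def call_with_retries(func, *args, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(*args); on OSError, wait and retry with exponential backoff.

    Hypothetical helper. ConnectionError, TimeoutError and urllib's URLError
    are all subclasses of OSError, so they are covered by the except clause.
    """
    for attempt in range(attempts):
        try:
            return func(*args)
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...


# Usage, assuming get_reverse is the function from the notebook:
# result = call_with_retries(get_reverse, name, attempts=5)
```

If the error persists on every attempt, retrying will not help and the underlying endpoint or network setup needs to be checked instead.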

about mid_to_qid.tsv

I have three questions:

1. How did you construct mid_to_qid.tsv? Is there any property that links Freebase and Wikidata?
2. What is the use of https://www.wikidata.org/wiki/Wikidata:WikiProject_Freebase/Mapping? It seems to contain only thousands of mappings from Freebase to Wikidata.
3. Is there any way to convert a Wikidata property to a DBpedia property?

I would really appreciate any help in this regard. Thanks!
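On the first question: Wikidata has a "Freebase ID" property (P646) on items, which can be used to look up the item for a given MID. The sketch below only builds a SPARQL query string for that lookup; it is not necessarily how mid_to_qid.tsv was produced, and freebase_mid_query is a hypothetical helper name. It assumes MIDs are written as "m/01htzx" (as in this dataset) and that Wikidata stores them with a leading slash.

```python
def freebase_mid_query(mid):
    """Build a SPARQL query for items whose Freebase ID (P646) equals mid."""
    # Normalize "m/01htzx" or "/m/01htzx" to the "/m/01htzx" form.
    normalized = "/" + mid.strip().lstrip("/")
    return 'SELECT ?item WHERE { ?item wdt:P646 "%s" . }' % normalized


print(freebase_mid_query("m/01htzx"))
```

The resulting query can be run against the Wikidata Query Service at https://query.wikidata.org/sparql.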

Data with String Label Version

Dear authors,

Thank you very much for your work.
Do you have a version of the data that explicitly includes the string labels of the entities and properties?
Like this:
Alex Golfis \t place of birth \t Athens \t what city was alex golfis born in
Instead of this:
Q16330302 \t P19 \t Q1524 \t what city was alex golfis born in

Thank you for your attention.
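In case no such version exists, the labels can be looked up through the Wikidata API. This is a minimal sketch, not the authors' tooling: fetch_labels and labelize are hypothetical helper names, the User-Agent string is a placeholder, and the tab-separated four-column layout is assumed from the example above. The wbgetentities action accepts a limited number of ids per request (50 for regular users), so ids need to be sent in batches.

```python
import json
import urllib.parse
import urllib.request

API = "https://www.wikidata.org/w/api.php"


def fetch_labels(ids, lang="en"):
    """Fetch labels for a batch of item/property ids via wbgetentities."""
    params = urllib.parse.urlencode({
        "action": "wbgetentities",
        "ids": "|".join(ids),
        "props": "labels",
        "languages": lang,
        "format": "json",
    })
    request = urllib.request.Request(
        API + "?" + params,
        headers={"User-Agent": "label-sketch/0.1"},  # placeholder value
    )
    with urllib.request.urlopen(request) as response:
        entities = json.load(response)["entities"]
    # Skip entities that have no label in the requested language.
    return {eid: e["labels"][lang]["value"]
            for eid, e in entities.items() if lang in e.get("labels", {})}


def labelize(line, labels):
    """Replace the ids in one data line by labels; unknown ids are kept."""
    subj, prop, obj, question = line.rstrip("\n").split("\t", 3)
    return "\t".join([labels.get(subj, subj), labels.get(prop, prop),
                      labels.get(obj, obj), question])


# Usage (performs a network request):
# labels = fetch_labels(["Q16330302", "P19", "Q1524"])
# print(labelize("Q16330302\tP19\tQ1524\twhat city was alex golfis born in", labels))
```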

'answerable' questions are not answerable

The files ending with "_answerable" contain only triples that are also in Wikidata.

In the first few lines in annotated_wd_data_test_answerable.txt, there are several issues:

'Which genre of album is harder.....faster?': different result (rock music vs. classic rock)
'what city was alex golfis born in': fine
'what film is by the writer phil hay?': would be fine, but the triple is incorrect in wikidata
'Which equestrian was born in dublin?': There is no 'place of birth' for Mark Kyle in wikidata.
'What is a tv action show?': m/01htzx (Action) is mapped to Q11272426 (some church in the Ukraine)
'what's akbar tandjung's ethnicity': The triple is not part of wikidata.
'Which Swiss conductor's cause of death is myocardial infarction?': fine
'where was padraic mcguinness's place of death': fine
'Who influenced michael mcdowell?': The triple is not part of wikidata.
'which military was involved in the second battle of fort fisher': The triple is not part of wikidata.

So, in these first ten lines, there are three or four correct entries, five which are not answerable and one where the mapping is incorrect.

Scaling that up would mean that I can trust about 40% of all the 'answerable' examples. That's not a lot and makes the dataset unusable in my opinion.
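The tally behind that estimate can be written out as follows. This is just the arithmetic on the ten lines listed above; the verdict labels are my own shorthand, and counting the "different result" case as not answerable is an assumption made so that the totals match the summary.

```python
from collections import Counter

# One verdict per question, in the order listed above.
verdicts = [
    "not answerable",  # harder.....faster: different result
    "fine",            # alex golfis
    "maybe",           # phil hay: fine, but the triple is incorrect in Wikidata
    "not answerable",  # Mark Kyle: no place of birth
    "wrong mapping",   # tv action show: m/01htzx mapped to Q11272426
    "not answerable",  # akbar tandjung
    "fine",            # Swiss conductor
    "fine",            # padraic mcguinness
    "not answerable",  # michael mcdowell
    "not answerable",  # second battle of fort fisher
]

counts = Counter(verdicts)
low = counts["fine"] / len(verdicts)                      # strict count
high = (counts["fine"] + counts["maybe"]) / len(verdicts)  # lenient count
print(counts, low, high)
```

With only ten lines inspected, this range is a rough indication rather than a measurement of the whole file.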

Question on the files ending with 'answerable'

This work is very interesting and helpful.
My question is: what do the files ending with 'answerable' mean?

The files ending with "_full" contain only triples that are also in Wikidata.
However, there is no file ending with "_full".

Thanks again, and merry Christmas!

QALD format

Hi,
thanks for this data!
I noticed that annotated_wd_data_test_answerable.txt contains 5621 questions, whereas qald-format/annotated_wd_data_test.json contains 5721 (jq ".questions[].query.answers" annotated_wd_data_test.json | grep entity). Does the qald-format directory contain the same data as the *_answerable.txt files?

Further, the QALD data contains multiple answers per question (where applicable), but the *_answerable.txt files always contain exactly one answer, and not always the same one as in the qald-format files. For example, "what is the film genre for snow falling on cedars?" has the entities Q1054574, Q1257444, Q130232 and Q3072039 as answers in qald-format/annotated_wd_data_test.json (and on wikidata.org), but only Q1257444 in annotated_wd_data_test.txt (possibly old data from Freebase?).

Concerning the qald-format directory: what is the difference between annotated_wd_data_*_full.json and annotated_wd_data_*.json? For instance, annotated_wd_data_train_full.json is much larger than annotated_wd_data_train.json, while for valid and test it is the opposite.
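For reference, the count can also be reproduced in Python, which separates the number of questions from the number of answer entries. This is a sketch under assumptions: it relies only on the questions[].query.answers path used in the jq command above, assumes that "answers" is a list, and count_questions_and_answers is a hypothetical helper name.

```python
import json


def count_questions_and_answers(doc):
    """Return (number of questions, total number of answer entries)."""
    questions = doc["questions"]
    n_answers = sum(len(q["query"].get("answers", [])) for q in questions)
    return len(questions), n_answers


# Usage with the file from this issue:
# with open("qald-format/annotated_wd_data_test.json", encoding="utf-8") as f:
#     print(count_questions_and_answers(json.load(f)))
# with open("annotated_wd_data_test_answerable.txt", encoding="utf-8") as f:
#     print(sum(1 for _ in f))  # number of lines in the text file
```

Since grep counts matching lines rather than questions, a question with several answers may be counted more than once by the jq pipeline.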

Wrong entities in the qald-format directory

It seems that in some cases the queries in the 'qald-format' directory point to the answer entities instead of the question entities. For example, in the first example the query contains the link for Saving Shiloh rather than for Warner Bros.
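A rough way to find the affected entries is to check which of the two entity ids from the corresponding text-file line appears in the query. This is only a sketch: mentioned_ids is a hypothetical helper, the four-column tab-separated layout of the text files is assumed, and the ids in the usage example are made up for illustration.

```python
import re


def mentioned_ids(txt_line, query_text):
    """Report whether the subject and object ids of a data line occur in a query."""
    subj, _prop, obj, _question = txt_line.rstrip("\n").split("\t", 3)

    def found(entity_id):
        # Word boundaries prevent Q222 from matching inside Q2222.
        pattern = r"\b" + re.escape(entity_id) + r"\b"
        return re.search(pattern, query_text) is not None

    return {"subject": found(subj), "object": found(obj)}


# Made-up ids: a query that mentions only the object of the triple.
line = "Q111\tP272\tQ222\twho produced it"
print(mentioned_ids(line, "SELECT ?x WHERE { wd:Q222 wdt:P272 ?x }"))
```

Entries where only the object id is found would be candidates for the problem described here.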
