manojprabhakar / cholan

CHOLAN: A Modular Approach for Neural Entity Linking on Wikipedia and Wikidata

Python 100.00%

entity-linking bert candidate-generation wikipedia wikidata

cholan's Issues

What are the train/test splits of T-Rex used in the paper?

Hi. Thank you for the great work.

I was wondering if you could provide the train/test splits of T-Rex used in the paper?
In your README file, there is a link to download the file CHOLAN-EL-TREX.tsv. But there is no indication of which line in the file belongs to the train set or the test set.

Furthermore, I counted the number of data lines in that file. There were 1,089,661 data lines (excluding the header line). However, your paper mentions that "the dataset has 983,257 sentences". So is this file the same data you used in your paper?
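For reference, a line count like the one above can be reproduced with a minimal sketch along these lines. It assumes the file has exactly one header line and one record per line; the helper name `count_data_lines`, the sample file, and its column names are hypothetical and are not taken from the CHOLAN repository.

```python
import tempfile
from pathlib import Path


def count_data_lines(tsv_path):
    """Count the non-empty lines that follow the single header line."""
    with open(tsv_path, encoding="utf-8") as handle:
        next(handle, None)  # skip the header line (assumed to be exactly one line)
        return sum(1 for line in handle if line.strip())


# Demo on a tiny stand-in file; the column names here are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.tsv"
    sample.write_text("col_a\tcol_b\nfoo\tbar\nbaz\tqux\n", encoding="utf-8")
    demo_count = count_data_lines(sample)

print(demo_count)
```

Pointing the same function at the downloaded CHOLAN-EL-TREX.tsv would give the number of data lines to compare against the sentence count reported in the paper. Note that lines and sentences need not correspond one to one if a sentence can span several rows.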

Thank you.

Use CHOLAN for inference

I find your project very interesting, but a major drawback is that there is not enough documentation on how to use your model for inference. Do you plan to add this information in the future?
Looking forward to your reply :)

How to get the pretrained_bert_ner model?

I tried to run this code, but I don't know how to obtain the bert_ner_nocll pre-trained model. Could you describe in detail how to get it?

Add LICENSE

to clarify reusability

Could you please give more detailed instructions to run it or adapt it to a server?

I have read your paper; it's an amazing achievement. I am now trying to use it in my research experiments and have high expectations. However, it is difficult for me to run it with the current brief instructions. I believe many other people would want this as well.

Files missing

First of all, thank you for making your code publicly available. We are working on an evaluation tool for entity linking systems and would love to include your system and reproduce your results. However, I did not succeed in running your code, and the provided instructions are a bit sparse.

More specifically, when calling python cholan.py as instructed in the directory CHOLAN/Cholan_T-REx/End2End, I get the error message

Traceback (most recent call last):
  File "cholan.py", line 60, in <module>
    df_target = pd.read_csv(predict_data_dir + "ned_target_data.tsv", sep="\t", encoding='utf-8')
    ...
FileNotFoundError: [Errno 2] No such file or directory: '/data/prabhakar/CG/prediction_data/data_10000/ned_target_data.tsv'

When running python cholan.py in the directory Cholan_CoNLL_AIDA/End2End, I get the error message

Traceback (most recent call last):
  File "cholan.py", line 65, in <module>
    df_ned = pd.read_csv(predict_data_dir + "ned_data.tsv", sep='\t', encoding='utf-8', usecols=['sequence1', 'sequence2', 'label'])
    ...
FileNotFoundError: [Errno 2] No such file or directory: '/data/prabhakar/CG/WNED/msnbc/prediction_data/data_full/Zeroshot/ned_data.tsv'

Neither of these files is included in any of the linked data packages or the linked repositories.

Could you please provide the necessary data and provide some more instructions on how to use your code and reproduce your results?
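Both tracebacks come from a pd.read_csv call on a hard-coded path under /data/prabhakar/. As a rough sketch only (not the authors' code), a pre-flight check like the following could report every missing input at once before the script starts loading data. The helper name `find_missing_files` and the temporary directory are hypothetical; the two file names are the ones shown in the tracebacks above.

```python
import tempfile
from pathlib import Path


def find_missing_files(data_dir, filenames):
    """Return the names from `filenames` that do not exist inside `data_dir`."""
    base = Path(data_dir)
    return [name for name in filenames if not (base / name).is_file()]


# File names taken from the two tracebacks; the directory is a stand-in.
required = ["ned_target_data.tsv", "ned_data.tsv"]

with tempfile.TemporaryDirectory() as tmp:
    # Simulate a data directory in which only one of the inputs is present.
    (Path(tmp) / "ned_target_data.tsv").write_text("header\n", encoding="utf-8")
    missing = find_missing_files(tmp, required)

print(missing)
```

A check of this kind only makes the failure easier to diagnose. It does not replace the missing data files themselves, which still have to be provided or regenerated.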
