We've had lots of discussions about how to handle the evaluation metrics, between usin

Evaluation metrics, about coleridge-initiative/rclc

Comments (4)

ceteri commented on June 15, 2024

Yes, that's a good way to describe the problem. The labels to be predicted are the dataset IDs identified for each publication.

The dataset IDs within the corpus represent the set of all possible datasets which will appear.
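A minimal sketch of how predictions could be scored under this framing, where each publication has a set of gold dataset IDs. The thread does not state the leaderboard's actual metric, so set-based precision, recall, and F1 are an assumption here, and `score_publication` and the placeholder IDs are hypothetical names used only for illustration.

```python
def score_publication(predicted_ids, gold_ids):
    """Compare predicted dataset IDs with gold dataset IDs for one publication.

    Assumption: IDs are compared as unordered sets with exact string equality.
    Returns a (precision, recall, f1) tuple.
    """
    predicted, gold = set(predicted_ids), set(gold_ids)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical IDs: one correct prediction plus one spurious prediction.
print(score_publication(["dataset-001", "dataset-002"], ["dataset-001"]))
```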

There's a related problem regarding how to crawl the web and discover datasets that might be identified, i.e., an open-ended problem where all of the possible datasets aren't known a priori. We aren't attempting to address that problem in this leaderboard competition, although it comes in later.

I'll update the wiki notes, too.

HaritzPuerto commented on June 15, 2024

Hi, while evaluating our model we found the following cases:

Third National Health and Nutrition Examination Survey
third NHANES
NHANES III data

All these names refer to the same dataset, with dataset ID 'https://github.com/Coleridge-Initiative/adrf-onto/wiki/Vocabulary#dataset-0a7b604ab2e52411d45a'

However, none of them appear in dct:alternative, which is ['NHANES I', 'NHANES II', 'NHANES III', 'NHANES'].

We think it is unfair to count these three cases as wrong, since they clearly refer to the right dataset.

So how about using token-level F1 matching, as in SQuAD (https://arxiv.org/pdf/1606.05250.pdf)?
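A minimal sketch of SQuAD-style token-level F1 applied to dataset names. The normalization steps (lowercasing, stripping punctuation and English articles) follow the usual SQuAD evaluation convention; the function names `normalize`, `token_f1`, and `best_f1` are hypothetical, and taking the best score over the dct:alternative names is an assumption, not something the rclc evaluation is known to do.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, strip punctuation and articles, then split into tokens."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction, truth):
    """Token-overlap F1 between a predicted name and one reference name."""
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(truth)
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def best_f1(prediction, reference_names):
    """Assumption: score against every known name and keep the best."""
    return max(token_f1(prediction, name) for name in reference_names)


alternatives = ["NHANES I", "NHANES II", "NHANES III", "NHANES"]
for mention in ["NHANES III data", "third NHANES"]:
    print(mention, round(best_f1(mention, alternatives), 3))
```

Under this sketch "NHANES III data" gets partial credit (0.8 against "NHANES III"), while the spelled-out survey name shares no tokens with the four alternatives listed above and would still score 0 unless the full dataset title were also among the reference names.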

ceteri commented on June 15, 2024

Hi @HaritzPuerto,

That's a really good point.

The names in the dct:alternative field are just informational. The ML models don't need to use them in any way.

Those alternative names are what our human annotators have encountered when reading PDFs to identify dataset references manually.

I'll make a note in the wiki to explain more about the alternative names.

Thank you,
Paco

HaritzPuerto commented on June 15, 2024

Hi @ceteri,

Then I wonder whether the only gold label we can use is the dataset ID. If so, our model should return dataset IDs, and to do that it needs to know all possible datasets that can appear in the publications. Is this assumption correct?

Thank you,
Haritz
