dpaperno / discriminatt

License: BSD 2-Clause "Simplified" License

Python 100.00%

discriminatt's People

banaee jogonba2 hadyelsahar linron84 msmaltsev rookzok kzinmr wuch15 asyrofist

discriminatt's Issues

unrelated concepts in task description and example data

In the task description it is stated, "... detecting the difference between two unrelated concepts, such as a narwhal and a tractor, would not constitute a very interesting task."

But then in the trial data you have:

fawn,blouse,fast,1
fawn,blouse,leaves,1
fawn,blouse,spots,1
fawn,blouse,eats,1
fawn,blouse,small,1
fawn,blouse,grass,1
fawn,blouse,tail,1
fawn,coat,fast,1
fawn,coat,leaves,1
fawn,coat,spots,1

Is the test data going to include these kinds of incomparable concepts?

Definitions in evaluation script

We have been trying a random baseline, and are confused by the numbers we are getting for precision/recall/F-score.

The evaluation script returns an F-score of 0.66 for the random baseline, which seems a bit odd. Suppose we have a truth file:

a,b,c,1
a,b,c,0
a,b,c,1
a,b,c,0
a,b,c,1
a,b,c,0
a,b,c,1
a,b,c,0
a,b,c,1
a,b,c,0

And a random selection from the classifier:

a,b,c,0
a,b,c,1
a,b,c,1
a,b,c,1
a,b,c,1
a,b,c,0
a,b,c,1
a,b,c,0
a,b,c,0
a,b,c,1

In principle the F-score should be around 0.5, but we get 0.66. We think this is because of how the true/false positives/negatives are calculated.

True positives/negatives should be counted in the same way as false positives/negatives. Right now both true positives and true negatives are counted as true positives, whereas false negatives and false positives are counted separately.

Perhaps the evaluation could calculate the numbers for both classes and average them? Alternatively, the Evaluation page on CodaLab could be more specific about how these are calculated (i.e. that the evaluation isn't necessarily conducted in the way that might be expected from the name).
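
The arithmetic behind the numbers above can be checked with a short sketch. This is not the task's evaluation script: the helper names `confusion_counts` and `f1` are hypothetical, and the "conflated" score is only one assumed reading of the behaviour described in this issue (every correct answer counted as a true positive). It uses the ten truth/prediction labels listed above.

```python
def confusion_counts(truth, predicted):
    """Count TP, FP, FN, TN, treating label 1 as the positive class."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def f1(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when the score is undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Labels copied from the truth file and the random selection in this issue.
truth = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
predicted = [0, 1, 1, 1, 1, 0, 1, 0, 0, 1]

tp, fp, fn, tn = confusion_counts(truth, predicted)

# Assumption: all correct answers (TP + TN) are counted as "true positives".
f1_conflated = f1(tp + tn, fp, fn)

# Per-class scores: for the negative class the roles of FP and FN swap.
f1_positives = f1(tp, fp, fn)
f1_negatives = f1(tn, fn, fp)
macro_f1 = (f1_positives + f1_negatives) / 2

print(f"conflated F1: {f1_conflated:.3f}")
print(f"positive-class F1: {f1_positives:.3f}")
print(f"negative-class F1: {f1_negatives:.3f}")
print(f"macro-averaged F1: {macro_f1:.3f}")
```

With these labels the counts are TP=3, FP=3, FN=2, TN=2. The conflated score is 10/15 ≈ 0.667, which is consistent with the 0.66 reported, while the per-class scores are 6/11 ≈ 0.545 and 4/9 ≈ 0.444, giving a macro average of about 0.495.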

UnboundLocalError in evaluation.py

In the evaluation file, f1_positives and f1_negatives are defined only inside conditionals. I think it is better to initialize them beforehand, as otherwise this might cause an UnboundLocalError on line 31. The two conditions are not guaranteed to both hold (e.g. for a system that produces zero true positives but a lot of true negatives).
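
A minimal sketch of the failure mode and the suggested fix follows. The actual contents of evaluation.py are not shown in this issue, so the function names, the conditions, and the averaging step are assumptions for illustration; only the variable names f1_positives and f1_negatives come from the report.

```python
def f1(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); callers ensure tp > 0."""
    return 2 * tp / (2 * tp + fp + fn)


def average_f1_unsafe(tp, fp, fn, tn):
    # Hypothetical shape of the problem: each score is assigned only
    # inside its own conditional.
    if tp > 0:
        f1_positives = f1(tp, fp, fn)
    if tn > 0:
        f1_negatives = f1(tn, fn, fp)
    # Raises UnboundLocalError if either branch above was skipped.
    return (f1_positives + f1_negatives) / 2


def average_f1_safe(tp, fp, fn, tn):
    # Suggested fix: initialize both scores before the conditionals.
    f1_positives = 0.0
    f1_negatives = 0.0
    if tp > 0:
        f1_positives = f1(tp, fp, fn)
    if tn > 0:
        f1_negatives = f1(tn, fn, fp)
    return (f1_positives + f1_negatives) / 2


# A system with zero true positives but several true negatives.
counts = dict(tp=0, fp=1, fn=4, tn=5)

try:
    average_f1_unsafe(**counts)
    unsafe_failed = False
except UnboundLocalError:
    unsafe_failed = True

safe_score = average_f1_safe(**counts)
print(unsafe_failed, round(safe_score, 3))
```

Defaulting an undefined score to 0.0 is itself a design choice; the organizers might prefer a different convention, but any explicit default avoids the crash.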

Open / closed competition?

Is it possible to use external data sources (externally-trained embeddings, Wikipedia, parsing/processing), or should we just train on the data available here?
