Comments (5)
We might also consider the other corpora used in http://arxiv.org/abs/1410.7182
from gerbil.
Can the datasets and the new experiment types be added within this milestone? #8
from gerbil.
Let's recap, complementing my above list with the corpora used in [1].
NER
- newswire: CoNLL2003
- microposts: Microposts2013, Ritter dataset, UMBC dataset
NEL
- microposts: Microposts2014, Derczynski dataset, WSDM2012
NER+NEL (all together)
- newswire: WEKEX'11
Hence, about scoring:
- NER: shall we adopt the same scoring logic used in CoNLL [2]? That requires adding another evaluation strategy (I see it in milestone 2). In any case, we need a discussion to tackle this. Do you agree?
- NER+NEL: this is the case where we can use the TAC KBP scorer (issue #8). As above, it requires a new evaluation strategy (I see it in milestone 2), and we need a discussion to tackle this. Do you agree?
- NEL: we may use the set of evaluation methods already implemented in BAT. We just need to adapt the expected output to Wikipedia and align the scorer. I'll try to work on integrating Microposts2014 by the 10th (milestone 1). The others we may postpone to milestone 2. Is that OK?
[1] - Derczynski L., Maynard D., Rizzo G., van Erp M., Gorrell G., Troncy R., Petrak J., Bontcheva K. (2014) Analysis of Named Entity Recognition and Linking for Tweets. In: Information Processing and Management
[2] - http://www.cnts.ua.ac.be/conll2000/chunking/conlleval.txt
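To make the NER scoring question concrete, here is a minimal sketch of exact-match, entity-level precision/recall/F1, which is my understanding of the logic behind the CoNLL evaluation script [2]: a predicted entity counts as correct only if both its boundaries and its type match the gold annotation. The `Span` tuple and the `score_exact_match` function are hypothetical names used for illustration; this is not GERBIL or BAT code.

```python
from typing import Iterable, Set, Tuple

# Hypothetical representation of an annotation: (start offset, end offset, entity type).
Span = Tuple[int, int, str]


def score_exact_match(gold: Iterable[Span], predicted: Iterable[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) where a prediction is correct only if
    its boundaries and its type are identical to a gold annotation."""
    gold_set: Set[Span] = set(gold)
    pred_set: Set[Span] = set(predicted)

    true_positives = len(gold_set & pred_set)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    gold = [(0, 5, "PER"), (10, 16, "LOC"), (20, 25, "ORG")]
    # Second prediction has the right boundaries but the wrong type, so it does not count.
    predicted = [(0, 5, "PER"), (10, 16, "ORG")]
    print(score_exact_match(gold, predicted))
```

A scorer along these lines would be stricter than a matching that accepts overlapping boundaries, so the choice between the two is part of what needs discussing.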
from gerbil.
- We are moving the typing and salience experiments and datasets to milestone 2 because of time constraints.
- I would be happy if you could write a wrapper for Microposts2014 and add a description to the article.
- Please also fill out https://github.com/AKSW/gerbil/wiki/Licences-for-datasets
- I will open separate issues for the above-mentioned datasets for milestone 2.
from gerbil.
see #47
from gerbil.