Comments (5)

rtroncy commented on August 21, 2024

We might also consider the other corpora used in http://arxiv.org/abs/1410.7182

from gerbil.

RicardoUsbeck commented on August 21, 2024

Can the datasets and the new experiment types be added within this milestone? #8

giusepperizzo commented on August 21, 2024

Let's recap, complementing my above list with the corpora used in [1].

NER

  • newswire: CoNLL2003
  • microposts: Microposts2013, Ritter dataset, UMBC dataset

NEL

  • microposts: Microposts2014, Derczynski dataset, WSDM2012

NER+NEL (all together)

  • newswire: WEKEX'11

Hence, about scoring:

  • NER: shall we adopt the same scoring logic used in CoNLL [2]? It requires adding another evaluation strategy (I see it in milestone 2). In any case, we need a discussion to tackle this. Do you agree?
  • NER+NEL: this is the case where we can use the TAC KBP scorer (issue #8). As above, it requires a new evaluation strategy (I see it in milestone 2). In any case, we need a discussion to tackle this. Do you agree?
  • NEL: we may use the set of evaluation methods already implemented in BAT. We just need to adapt the expected output to Wikipedia and to align the scorer. I'll try to work on the integration of Microposts2014 for the 10th (milestone 1). We may postpone the others to milestone 2. Is that OK?

[1] - Derczynski L., Maynard D., Rizzo G., van Erp M., Gorrell G., Troncy R., Petrak J., Bontcheva K. (2014) Analysis of Named Entity Recognition and Linking for Tweets. In: Information Processing and Management
[2] - http://www.cnts.ua.ac.be/conll2000/chunking/conlleval.txt
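For reference in the NER scoring discussion, here is a minimal sketch of CoNLL-style exact-match scoring: a predicted mention counts as correct only if its boundaries and its type both match a gold mention, and precision, recall, and F1 are computed over those matches. This is not the conlleval script and not GERBIL code; the `Mention` type and the `exact_match_scores` function are hypothetical names introduced for illustration only.

```python
from typing import Iterable, NamedTuple, Tuple


class Mention(NamedTuple):
    """Hypothetical mention representation: token span plus entity type."""
    start: int   # index of the first token of the mention
    end: int     # index one past the last token of the mention
    label: str   # entity type, e.g. "PER", "LOC", "ORG"


def exact_match_scores(
    gold: Iterable[Mention], predicted: Iterable[Mention]
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact boundary-and-type matching."""
    gold_set = set(gold)
    pred_set = set(predicted)

    # A prediction is correct only if span AND label equal a gold mention.
    correct = len(gold_set & pred_set)

    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    gold = [Mention(0, 2, "PER"), Mention(5, 6, "LOC")]
    predicted = [
        Mention(0, 2, "PER"),  # exact match
        Mention(5, 6, "ORG"),  # right span, wrong type: counted as an error
        Mention(8, 9, "LOC"),  # spurious mention
    ]
    p, r, f = exact_match_scores(gold, predicted)
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this toy example, one of three predictions is correct and one of two gold mentions is found, so precision is 1/3, recall is 1/2, and F1 is 0.4. Whether partial overlaps or type-only matches should earn credit is exactly the kind of question the evaluation-strategy discussion would need to settle.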

RicardoUsbeck commented on August 21, 2024

  • We are moving typing and salience experiments and datasets to milestone 2 because of time constraints.
  • I would be happy if you could write a wrapper for Microposts2014 and add a description to the article.
  • Please also fill out https://github.com/AKSW/gerbil/wiki/Licences-for-datasets
  • I will open separate issues for the above-mentioned datasets for milestone 2.

RicardoUsbeck commented on August 21, 2024

see #47
