

conlleval

This is a pure Python port of the Perl evaluation script for the CoNLL-2000 shared task. It supports both the IOB2 and IOBES tagging formats.
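The two formats differ only in how chunk boundaries are marked; for illustration (the tokens are just examples), the same chunks labelled in both schemes:

Token          IOB2    IOBES
Rockwell       B-NP    B-NP
International  I-NP    I-NP
Corp.          I-NP    E-NP
said           B-VP    S-VP

IOBES additionally marks the last token of a multi-token chunk (E-) and single-token chunks (S-).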

Getting Started

Either install this package from PyPI (pip install conlleval) or install it from this repository after cloning it (python setup.py install).

You can run this package directly:

python -m conlleval tests/cases/input-001.txt

or import it as a library:

>>> import conlleval
>>> lines = """Rockwell NNP B-NP I-NP
International NNP I-NP I-NP
Corp. NNP I-NP I-NP
's POS B-NP B-NP
Tulsa NNP I-NP I-NP
unit NN I-NP I-NP
said VBD B-VP B-VP
it PRP B-NP B-NP
signed VBD B-VP B-VP
a DT B-NP B-NP
tentative JJ I-NP I-NP
agreement NN I-NP I-NP
extending VBG B-VP B-VP
its PRP$ B-NP B-NP
contract NN I-NP I-NP
with IN B-PP B-PP
Boeing NNP B-NP I-NP
Co. NNP I-NP I-NP
to TO B-VP B-PP
provide VB I-VP I-VP
structural JJ B-NP I-NP
parts NNS I-NP I-NP
for IN B-PP B-PP
Boeing NNP B-NP I-NP
's POS B-NP B-NP
747 CD I-NP I-NP
jetliners NNS I-NP I-NP
. . O O
""".splitlines()
>>> res = conlleval.evaluate(lines)
>>> import pprint
>>> pprint.pprint(res)
{'overall': {'chunks': {'evals': {'f1': 0.9032258064516129,
                                  'prec': 0.875,
                                  'rec': 0.9333333333333333},
                        'stats': {'correct': 14, 'gold': 15, 'pred': 16}},
             'tags': {'evals': {'f1': 0.8214285714285714,
                                'prec': 0.8214285714285714,
                                'rec': 0.8214285714285714},
                      'stats': {'correct': 23, 'gold': 28, 'pred': 28}}},
 'slots': {'chunks': {'NP': {'evals': {'f1': 1.0, 'prec': 1.0, 'rec': 1.0},
                             'stats': {'correct': 9, 'gold': 9, 'pred': 9}},
                      'PP': {'evals': {'f1': 0.8,
                                       'prec': 0.6666666666666666,
                                       'rec': 1.0},
                             'stats': {'correct': 2, 'gold': 2, 'pred': 3}},
                      'VP': {'evals': {'f1': 0.75, 'prec': 0.75, 'rec': 0.75},
                             'stats': {'correct': 3, 'gold': 4, 'pred': 4}}},
           'tags': {'NP': {'evals': {'f1': 0.8000000000000002,
                                     'prec': 0.8,
                                     'rec': 0.8},
                           'stats': {'correct': 16, 'gold': 20, 'pred': 20}},
                    'PP': {'evals': {'f1': 0.8,
                                     'prec': 0.6666666666666666,
                                     'rec': 1.0},
                           'stats': {'correct': 2, 'gold': 2, 'pred': 3}},
                    'VP': {'evals': {'f1': 0.888888888888889,
                                     'prec': 1.0,
                                     'rec': 0.8},
                           'stats': {'correct': 4, 'gold': 5, 'pred': 4}}}}}
>>> print(conlleval.report(res))
processed 28 tokens with 15 phrases; found: 16 phrases; correct: 14.
accuracy:  82.14%; precision:  87.50%; recall:  93.33%; FB1:  90.32
               NP: precision: 100.00%; recall: 100.00%; FB1: 100.00  9
               PP: precision:  66.67%; recall: 100.00%; FB1:  80.00  3
               VP: precision:  75.00%; recall:  75.00%; FB1:  75.00  4
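
The same two calls can also be pointed at a file in the input format shown above (a minimal sketch; it assumes evaluate accepts the raw lines of such a file, as in the command-line example):

import conlleval

# read a conlleval-style file and evaluate its lines
with open("tests/cases/input-001.txt") as f:
    res = conlleval.evaluate(f.read().splitlines())
print(conlleval.report(res))  # prints a report like the one above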

Breaking Changes in v0.2

  • The evaluate function now returns evaluation results for chunks (consecutive tags of the same type) and for tags separately; see the snippet below. In the previous version this distinction wasn't clear, which caused confusion about the counts shown in ['overall']['stats']['all'].
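
For example, with the res dictionary from the Getting Started example above, the two levels are read from separate keys:

>>> res["overall"]["chunks"]["evals"]["f1"]  # phrase-level F1
0.9032258064516129
>>> res["overall"]["tags"]["evals"]["f1"]    # per-token tag F1
0.8214285714285714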

Notes

  • The original Perl script is no longer available at the official website. You can access it here instead.
  • The LaTeX output format is not supported yet. (Contributions are welcome.)


conlleval's Issues

Convenience function

Hi, thanks for taking the time to make a Python package for this script :D

Is it possible to add a convenience function to your code that accepts two lists of labels? Right now I have to do it on my own:

from conlleval import evaluate as conll_lines


def conll_score(y_true, y_pred, metrics=("f1", "prec", "rec"), **kwargs):
    # y_true / y_pred: per-sentence lists of gold / predicted tags.
    # Build conlleval-style lines ("token POS gold predicted") with dummy
    # token and POS columns, then pull the requested tag-level metrics.
    lines = [f"dummy XXX {t} {p}" for pair in zip(y_true, y_pred)
             for t, p in zip(*pair)]
    res = conll_lines(lines)
    return [res["overall"]["tags"]["evals"][m] for m in metrics]

Perhaps it's possible to add something like this to the existing code.
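
For instance, it could then be called with per-sentence label lists (hypothetical usage of the conll_score helper sketched above, which is not part of the package):

# two sentences of gold and predicted tags
y_true = [["B-NP", "I-NP", "O"], ["B-VP", "O"]]
y_pred = [["B-NP", "O", "O"], ["B-VP", "O"]]

f1, prec, rec = conll_score(y_true, y_pred)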
