morphoeval - Evaluation for morphological analysis and segmentation

Introduction

This package provides re-implementations of the BPR, CoMMA, and EMMA-2 evaluation methods for unsupervised morphological analysis and segmentation introduced by Virpioja et al. (2011).

The BPR (boundary precision and recall) method calculates a macro-average of the segmentation boundary matches over the words and is thus suitable for evaluating unsupervised or supervised morphological segmentation.
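To make the idea concrete, here is a minimal Python sketch of macro-averaged boundary precision and recall for the single-analysis case. It is only illustrative, not the package's implementation: alternative analyses and the exact treatment of words without internal boundaries are simplified.

# Illustrative sketch of boundary precision and recall (BPR);
# not the package's implementation.

def boundaries(segments):
    """Return the set of internal boundary positions of a segmentation."""
    positions, offset = set(), 0
    for segment in segments[:-1]:
        offset += len(segment)
        positions.add(offset)
    return positions

def bpr(gold, pred):
    """Macro-averaged boundary precision and recall over words.

    Both arguments map each word to a list of segments, e.g.
    {'brushes': ['brush', 'es']}; pred must cover every word in gold.
    """
    precisions, recalls = [], []
    for word, gold_segments in gold.items():
        g = boundaries(gold_segments)
        p = boundaries(pred[word])
        # Words with no boundaries in an analysis get full credit for it.
        precisions.append(len(g & p) / len(p) if p else 1.0)
        recalls.append(len(g & p) / len(g) if g else 1.0)
    return sum(precisions) / len(precisions), sum(recalls) / len(recalls)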

The CoMMA and EMMA-2 methods are designed for the task of unsupervised morphological analysis, which was the goal of the Morpho Challenge competitions organized between 2005 and 2010 (see Kurimo et al., 2010). The challenge in evaluating unsupervised morphological analysis is that the predicted morpheme labels and the labels in the gold standard analysis are not directly comparable: an unsupervised algorithm does not see the gold standard labels, and in contrast to unsupervised morphological segmentation, the labels are not simply subsequences of the words.

Both methods start with a bipartite morpheme-word graph that collects the occurrences of the morphemes in the word forms of the test set. The CoMMA methods first use the morpheme-word graphs of the predicted and gold standard analyses to create word graphs, with an edge between two words whenever they share a morpheme, and calculate the precision and recall of the edges. The EMMA (Spiegler and Monson, 2010) and EMMA-2 methods use the morpheme-word graph to make one-to-one or one-to-many assignments between the predicted and gold standard morphemes, and calculate the precision and recall based on the mapped morphemes.
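As an illustration of this shared starting point, the sketch below builds the bipartite morpheme-word graph from a set of analyses and derives the word-pair edges that the CoMMA methods compare; the edge weighting and the morpheme assignment step of EMMA-2 are omitted, and the function names are only for this example. Edge precision and recall are then obtained by comparing the edge sets derived from the predicted and gold standard analyses.

# Rough sketch of the shared first step: a bipartite morpheme-word
# graph and the word-word edges induced by shared morphemes. The
# weighting and assignment steps of the actual metrics are omitted.
from collections import defaultdict
from itertools import combinations

def morpheme_word_graph(analyses):
    """Map each morpheme label to the set of words it occurs in.

    analyses maps a word to a list of morpheme labels, e.g.
    {'brushes': ['brush_N', '+PL']}.
    """
    graph = defaultdict(set)
    for word, morphemes in analyses.items():
        for morpheme in morphemes:
            graph[morpheme].add(word)
    return graph

def word_edges(graph):
    """Return the word pairs connected by at least one shared morpheme."""
    edges = set()
    for words in graph.values():
        edges.update(combinations(sorted(words), 2))
    return edges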

The choice of method

The choice of the evaluation method should depend on the task at hand (segmentation or analysis) and whether the method produces (and the gold standard includes) multiple alternative analyses per word. Here are our recommendations; see Virpioja et al. (2011) for further discussion.

Task                          Single analysis per word    Multiple analyses per word
Morphological segmentation    BPR                         BPR-S
Morphological analysis        EMMA-2, CoMMA-B0            EMMA-2, CoMMA-S0

Instructions

Installation

Install the latest release from PyPI:

  • pip install morphoeval

Install from source:

  • pip install .

Usage

Installing the package provides a single command, morphoeval:

$ morphoeval --help
usage: morphoeval [-h] [--metric {comma-b0,comma-b1,comma-s0,comma-s1,emma-2,bpr,bpr-s}] [--verbose]
                     goldfile predfile [output]

Metrics for morphological analysis and segmentation

positional arguments:
  goldfile              gold standard analysis file
  predfile              predicted analysis file
  output                output file

optional arguments:
  -h, --help            show this help message and exit
  --metric {comma-b0,comma-b1,comma-s0,comma-s1,emma-2,bpr,bpr-s}, -m {comma-b0,comma-b1,comma-s0,comma-s1,emma-2,bpr,bpr-s}
                        metric (default comma-b0)
  --beta FLOAT          beta for using F_beta score
  --verbose, -v         increase verbosity

The parameters are simple: use --metric to select the evaluation method, and provide the gold standard and predicted analysis files.
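For example, to evaluate a predicted analysis against a gold standard with EMMA-2 and write the scores to a file (the file names here are just placeholders):

$ morphoeval --metric emma-2 gold.txt pred.txt results.yaml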

The input files should be in the format used in the Morpho Challenges: the word and its analyses are separated by a tab character, alternative analyses by a comma and a space, and the morpheme labels within an analysis by a single space. For example:

brush	brush_N
brushes	brush_N +3SG, brush_N +PL
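For illustration, a simple reader for this format could look as follows; this is only a sketch based on the description above, not the package's internal parser.

# Sketch of a reader for the Morpho Challenge file format;
# not the package's internal parser.
def read_analyses(path):
    """Return {word: list of alternative analyses (lists of labels)}."""
    analyses = {}
    with open(path, encoding='utf-8') as fobj:
        for line in fobj:
            word, rest = line.rstrip('\n').split('\t', 1)
            analyses[word] = [alternative.split(' ') for alternative in rest.split(', ')]
    return analyses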

The output is written in YAML format:

files: {predictions: pred.txt, reference: gold.txt}
metric: emma-2
scores: {f-score: 0.9251, precision: 0.8939, recall: 0.9585}
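The file can be read back with any YAML parser. For example, a minimal sketch using PyYAML, assuming the scores above were written to results.yaml:

# Read the scores back; assumes the PyYAML package is installed.
import yaml

with open('results.yaml', encoding='utf-8') as fobj:
    result = yaml.safe_load(fobj)

print(result['metric'], result['scores']['f-score'])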

Note: For large (>10k words) input files, running the evaluation may take a considerable amount of memory.

Original scripts

The original scripts are available at http://morpho.aalto.fi/events/morphochallenge/, but they do not work with modern Python versions. Compared to the original scripts, the current implementation has the following limitations:

  • The original EMMA algorithm with one-to-one mapping between morphemes is not supported.
  • Weighting of each input word is not supported.

References

References as BibTeX:

% Kurimo et al. (2010)
@inproceedings{kurimo-et-al-2010-morpho,
    address = {Uppsala, Sweden},
    author = {Mikko Kurimo and Sami Virpioja and Ville Turunen and Krista Lagus},
    booktitle = {Proceedings of the 11th Meeting of the ACL Special Interest Group on Computational Morphology and Phonology},
    month = {July},
    pages = {87--95},
    publisher = {Association for Computational Linguistics},
    title = {Morpho Challenge 2005-2010: Evaluations and Results},
    url = {https://aclanthology.org/W10-2211},
    year = {2010},
}

% Spiegler and Monson (2010)
@inproceedings{spiegler-monson-2010-emma,
    address = {Beijing, China},
    author = {Sebastian Spiegler and Christian Monson},
    booktitle = {Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)},
    month = {August},
    pages = {1029--1037},
    publisher = {Coling 2010 Organizing Committee},
    title = {{EMMA}: A novel Evaluation Metric for Morphological Analysis},
    url = {https://aclanthology.org/C10-1116},
    year = {2010},
}

% Virpioja et al. (2011)
@article{virpioja-et-al-2011-empirical,
    author = {Sami Virpioja and Ville T. Turunen and Sebastian Spiegler and Oskar Kohonen and Mikko Kurimo},
    journal = {Traitement Automatique des Langues},
    number = {2},
    pages = {45--90},
    publisher = {ATALA},
    title = {Empirical Comparison of Evaluation Methods for Unsupervised Learning of Morphology},
    url = {https://www.atala.org/sites/default/files/TAL_52_2_2.pdf},
    volume = {52},
    year = {2011},
}

