
Comments (8)

ricardorei commented on May 23, 2024

Hi @chanberg

This is the link for the 2020 DAs (direct assessments):

wget https://unbabel-experimental-data-sets.s3.eu-west-1.amazonaws.com/wmt/2020-da.csv.tar.gz

2020 DA Relative-Ranks:

wget https://unbabel-experimental-data-sets.s3.eu-west-1.amazonaws.com/wmt/2020-daRR.csv.tar.gz

The MQM data is available here, but I'll try to upload the exact files we used after splitting the data and computing the z-scores. I'll try to do that later today or tomorrow.
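The thread does not say how the z-scores were computed. A common convention for human evaluation scores is to standardize each score within its annotator's own distribution, and the sketch below illustrates only that convention. The field names `annotator`, `score`, and `z_score` are hypothetical and may not match the columns in the released files.

```python
from collections import defaultdict
from statistics import mean, pstdev


def z_normalize(rows, group_key="annotator", score_key="score"):
    """Return copies of `rows` with an added 'z_score' field.

    Scores are standardized within each group (assumed here to be one
    annotator), using the population standard deviation.
    """
    # Collect the raw scores of each group.
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(float(row[score_key]))

    # Mean and standard deviation per group.
    stats = {g: (mean(v), pstdev(v)) for g, v in groups.items()}

    normalized = []
    for row in rows:
        mu, sigma = stats[row[group_key]]
        # A group with no variance gets 0.0 to avoid dividing by zero.
        z = (float(row[score_key]) - mu) / sigma if sigma > 0 else 0.0
        normalized.append({**row, "z_score": z})
    return normalized
```

Grouping by annotator removes differences in how strictly individual annotators score; whether the released files used this grouping is an assumption.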

from comet.

ricardorei commented on May 23, 2024

@chanberg I also prepared the WMT20 MQM annotated data.

The entire dataset with MQM sentence scores and the corresponding z-score:

wget https://unbabel-experimental-data-sets.s3.eu-west-1.amazonaws.com/wmt/2020-MQM.csv.tar.gz

The train split we used:

wget https://unbabel-experimental-data-sets.s3.eu-west-1.amazonaws.com/wmt/2020-MQM.train.csv.tar.gz

The corresponding test split:

wget https://unbabel-experimental-data-sets.s3.eu-west-1.amazonaws.com/wmt/2020-MQM.test.csv.tar.gz
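Once an archive has been downloaded with wget, its rows can be read without unpacking it to disk first. This is a minimal sketch using only the Python standard library. It assumes each archive holds one or more UTF-8 `.csv` files with a header row, which the file names suggest but the thread does not confirm; `read_csv_rows_from_targz` is a hypothetical helper name.

```python
import csv
import io
import os
import tarfile


def read_csv_rows_from_targz(source):
    """Yield each row (as a dict) from every .csv member of a .tar.gz archive.

    `source` may be a filesystem path or a binary file-like object.
    """
    if isinstance(source, (str, os.PathLike)):
        archive = tarfile.open(name=source, mode="r:gz")
    else:
        archive = tarfile.open(fileobj=source, mode="r:gz")

    with archive:
        for member in archive.getmembers():
            # Skip directories and anything that is not a CSV file.
            if not (member.isfile() and member.name.endswith(".csv")):
                continue
            raw = archive.extractfile(member)
            # newline="" lets the csv module handle quoted line breaks.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            yield from csv.DictReader(text)
```

For example, `rows = list(read_csv_rows_from_targz("2020-MQM.train.csv.tar.gz"))` would load the train split from the current directory; the column names depend on the file and are not listed in this thread.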


ricardorei commented on May 23, 2024

Don't forget to cite Markus Freitag's paper if you use this MQM data from 2020:

@article{50397,
title	= {Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation},
author	= {Markus Freitag and George Foster and David Grangier and Viresh Ratnakar and Qijun Tan and Wolfgang Macherey},
year	= {2021},
URL	= {https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00437/108866/Experts-Errors-and-Context-A-Large-Scale-Study-of},
journal	= {Transactions of the Association for Computational Linguistics},
pages	= {1460-1474},
volume	= {9}
}

and the WMT Metrics/News Translation tasks if you use the direct assessments!


ricardorei commented on May 23, 2024

The download command is not supported anymore. I'll add a README with download links for the data.

I have to do that for this year's shared task models also.

Meanwhile, you can use the links from the previous version:

Apequest:

wget https://unbabel-experimental-data-sets.s3-eu-west-1.amazonaws.com/comet/hter/apequest.zip

QT21:

wget https://unbabel-experimental-data-sets.s3-eu-west-1.amazonaws.com/comet/hter/qt21.zip

WMT 2017 to 2019 (includes relative ranks and DA scores):

wget https://unbabel-experimental-data-sets.s3-eu-west-1.amazonaws.com/comet/da/wmt-metrics.zip


isabelcachola commented on May 23, 2024

@ricardorei This is exactly what I needed. Thank you!


chanberg commented on May 23, 2024

Hi @ricardorei,

Are there already any similar download links for the 2021 shared task?
Thank you!

Cheers,
Chantal


chanberg commented on May 23, 2024

@ricardorei thank you so much!


chanberg commented on May 23, 2024

@ricardorei this is great! thank you so much!
