Comments (8)
Hi @chanberg
This is the link for the 2020 DAs:
wget https://unbabel-experimental-data-sets.s3.eu-west-1.amazonaws.com/wmt/2020-da.csv.tar.gz
2020 DA Relative-Ranks:
wget https://unbabel-experimental-data-sets.s3.eu-west-1.amazonaws.com/wmt/2020-daRR.csv.tar.gz
For the MQM data, you have it here, but I'll try to upload the exact files we used after splitting the data and creating the z-scores. I'll try to do that later today or tomorrow.
from comet.
@chanberg I also prepared the WMT20 MQM annotated data.
The entire dataset with MQM sentence scores and the corresponding z-score:
wget https://unbabel-experimental-data-sets.s3.eu-west-1.amazonaws.com/wmt/2020-MQM.csv.tar.gz
The train split we used:
wget https://unbabel-experimental-data-sets.s3.eu-west-1.amazonaws.com/wmt/2020-MQM.train.csv.tar.gz
The corresponding test split:
wget https://unbabel-experimental-data-sets.s3.eu-west-1.amazonaws.com/wmt/2020-MQM.test.csv.tar.gz
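Once the archives are downloaded, they still need to be unpacked before the CSVs can be read. The following is a minimal Python sketch of that step, not part of COMET itself. It assumes each archive holds one or more UTF-8 `.csv` files with a header row; the helper names `download`, `extract_csvs`, and `peek_header` are hypothetical, and the column names are not assumed, so the sketch only prints whatever header it finds.

```python
import csv
import shutil
import tarfile
import urllib.request
from pathlib import Path

# Base URL and archive names copied from the wget commands above.
BASE_URL = "https://unbabel-experimental-data-sets.s3.eu-west-1.amazonaws.com/wmt"
ARCHIVES = [
    "2020-MQM.csv.tar.gz",
    "2020-MQM.train.csv.tar.gz",
    "2020-MQM.test.csv.tar.gz",
]


def download(name, dest_dir="."):
    """Fetch one archive from BASE_URL unless it is already on disk."""
    target = Path(dest_dir) / name
    if not target.exists():
        urllib.request.urlretrieve(f"{BASE_URL}/{name}", target)
    return target


def extract_csvs(archive_path, dest_dir="."):
    """Copy every .csv member of a .tar.gz archive into dest_dir (flattened)."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            # Assumption: the data files end in .csv; anything else is skipped.
            if member.isfile() and member.name.endswith(".csv"):
                out_path = dest / Path(member.name).name
                with tar.extractfile(member) as src, open(out_path, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                extracted.append(out_path)
    return extracted


def peek_header(csv_path):
    """Return the first row of a CSV file, assumed to be the header."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return next(csv.reader(handle))
```

To use it, call `download(name)` for each entry in `ARCHIVES`, pass the returned path to `extract_csvs`, and print `peek_header(path)` for each extracted file to see which columns are available before loading the data.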
from comet.
Don't forget to cite Markus's paper if you use this MQM data from 2020:
@article{50397,
title = {Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation},
author = {Markus Freitag and George Foster and David Grangier and Viresh Ratnakar and Qijun Tan and Wolfgang Macherey},
year = {2021},
URL = {https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00437/108866/Experts-Errors-and-Context-A-Large-Scale-Study-of},
journal = {Transactions of the Association for Computational Linguistics},
pages = {1460-1474},
volume = {9}
}
and the WMT Metrics/News Translation tasks if you use the direct assessments!
from comet.
The download command is not supported anymore. I'll add a README with download links for the data.
I have to do that for this year's shared task models also.
Meanwhile, you can use the links from the previous version:
Apequest:
wget https://unbabel-experimental-data-sets.s3-eu-west-1.amazonaws.com/comet/hter/apequest.zip
QT21:
wget https://unbabel-experimental-data-sets.s3-eu-west-1.amazonaws.com/comet/hter/qt21.zip
WMT 17 -> 19:
This includes relative ranks and DA scores.
wget https://unbabel-experimental-data-sets.s3-eu-west-1.amazonaws.com/comet/da/wmt-metrics.zip
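These three files are `.zip` archives rather than `.tar.gz`, so they unpack differently. A minimal Python sketch, assuming the archives are ordinary zip files already downloaded with the commands above; the helper name `extract_zip` is hypothetical and the layout inside each archive is not assumed, so the function just reports what it extracted.

```python
import zipfile
from pathlib import Path


def extract_zip(archive_path, dest_dir):
    """Unpack a zip archive into dest_dir and return the extracted file names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest)
        names = archive.namelist()
    # Directory entries end with "/"; keep only regular files.
    return sorted(name for name in names if not name.endswith("/"))
```

For example, `extract_zip("wmt-metrics.zip", "data/wmt-metrics")` would return the list of files found in that archive, which is a quick way to check which of them hold the relative ranks and which hold the DA scores.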
from comet.
@ricardorei This is exactly what I needed. Thank you!
from comet.
Hi @ricardorei,
Are there already any similar download links for the 2021 shared task?
Thank you!
Cheers,
Chantal
from comet.
@ricardorei thank you so much!
from comet.
@ricardorei this is great! thank you so much!
from comet.