Comments (7)
OK, the scores make sense! HTER and DA have different scales. HTER is a measure that you want to minimize: it reflects the effort required to "correct" the translation output so that it is semantically equivalent to the reference (a higher HTER means more effort).
DA is a continuous scale of "how good a translation is" (a high DA score means that the translation is good).
Both models are telling you that your MT is not good. For a SOTA MT system, you should expect your HTER score to be close to 0, while the DA score should be between 0.6 and 1.
It's all here: https://unbabel.github.io/COMET/html/models.html
from comet.
If you want to read more about HTER, see Snover et al., 2006; for DA, see Graham et al., 2013.
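As a rough reminder of why lower HTER is better, the usual definition from the TER literature can be written as below. This is a sketch of the general formula, not a description of how the COMET estimator computes its predictions; the symbol names are chosen here for illustration.

$$
\mathrm{HTER} = \frac{N_{\text{edits}}}{N_{\text{ref}}}
$$

Here $N_{\text{edits}}$ is the number of edits (insertions, deletions, substitutions, and shifts) needed to turn the MT output into its human post-edited version, and $N_{\text{ref}}$ is the number of words in that post-edited reference. A translation that needs no edits scores 0.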
This issue label is exactly for this type of question! I am happy to help.
What are the scores exactly?
Sometimes, when comparing two systems of similar quality, these two models (wmt-large-da-estimator-1719 and wmt-large-hter-estimator) can disagree about "which model is better". Yet, when scoring a single MT, the scores should point in the same direction...
You are testing the model with 70k translations? Can you compute a Pearson correlation between the wmt-large-da-estimator-1719 and wmt-large-hter-estimator scores?
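For reference, the Pearson correlation asked for here can be computed with a few lines of plain Python. The following is a minimal sketch, assuming the segment-level scores from each model have already been collected into two equal-length lists; the `pearson` helper and the sample values are hypothetical and are not part of the COMET API.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Return the Pearson correlation coefficient between two score lists."""
    if len(xs) != len(ys):
        raise ValueError("both score lists must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two paired scores are required")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sums of centered products and squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")

    return cov / math.sqrt(var_x * var_y)


# Hypothetical segment-level scores for the same four translations.
da_scores = [0.8, 0.1, -0.5, 0.4]        # assumed DA estimator outputs
hter_scores = [0.05, 0.30, 0.60, 0.20]   # assumed HTER estimator outputs

r = pearson(da_scores, hter_scores)
print(f"Pearson r = {r:.3f}")
```

Because a good translation should get a high DA score and a low HTER score, a clearly negative `r` would suggest that the two models agree on which segments are better or worse.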
> This issue label is exactly for this type of question! I am happy to help.
> What are the scores exactly?
> Sometimes, when comparing two systems of similar quality, these two models (wmt-large-da-estimator-1719 and wmt-large-hter-estimator) can disagree about "which model is better". Yet, when scoring a single MT, the scores should point in the same direction...

Please find below the results:
| | wmt-large-da-estimator-1719 | wmt-large-hter-estimator | emnlp-base-da-ranker |
|---|---|---|---|
| Score | -0.21418807 | 0.212977027 | 0.145221945 |
| Translations Count (same MT) | 70544 | 70544 | 70544 |
Thanks for the support!
> You are testing the model with 70k translations? Can you compute a Pearson correlation between the wmt-large-da-estimator-1719 and wmt-large-hter-estimator scores?

Unfortunately, our team has no experience with this type of computation, but I will ask our engineers to have a look.
Thank you very much Ricardo! Makes sense now.