Comments (9)
@erip That's a good point, thanks!
The next release will be 2.0 and we are also going to replace the default models with new ones.
We will make it clear that scores (with default settings) won't be directly comparable to the previous version (1.1.3).
There will be backward compatibility, but default options will probably change for all 3 commands: comet-score, comet-mbr and comet-compare.
from comet.
I am adding a flag for the t_test alternative and setting it to "less" by default.
from comet.
This will be available in the next release.
from comet.
Perhaps for the sake of not breaking the API, setting the default to "two-sided" might be a better option until the next major release? I'd hate to be the cause of people's papers comparing apples and oranges. I think with the ability to configure it, this should cover the issue well.
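To make the difference between the two alternatives concrete, here is a minimal sketch of a paired t-test written with the Python standard library only. It is not COMET's implementation: the function names (`t_pdf`, `t_cdf`, `paired_t_pvalue`) and the toy segment scores are made up for illustration, and the t distribution's CDF is approximated numerically with Simpson's rule.

```python
import math
from statistics import mean, stdev


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """Approximate P(T <= x) by integrating the density over [0, |x|] (Simpson's rule)."""
    a = abs(x)
    h = a / steps  # `steps` must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area


def paired_t_pvalue(x, y, alternative="two-sided"):
    """P-value of a paired t-test on x - y for the given alternative hypothesis."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    lower = t_cdf(t_stat, n - 1)  # P(T <= t_stat)
    if alternative == "less":  # H1: mean(x) < mean(y)
        return lower
    if alternative == "greater":  # H1: mean(x) > mean(y)
        return 1 - lower
    if alternative == "two-sided":  # H1: mean(x) != mean(y)
        return 2 * min(lower, 1 - lower)
    raise ValueError(f"unknown alternative: {alternative!r}")


# Hypothetical segment-level scores, for illustration only.
baseline = [0.60, 0.55, 0.70, 0.65, 0.58, 0.62]
system = [0.64, 0.57, 0.71, 0.70, 0.60, 0.66]

p_less = paired_t_pvalue(baseline, system, "less")
p_two_sided = paired_t_pvalue(baseline, system, "two-sided")
p_greater = paired_t_pvalue(baseline, system, "greater")
print(f"less={p_less:.4f} two-sided={p_two_sided:.4f} greater={p_greater:.4f}")
```

When the baseline really does score lower, the two-sided p-value is twice the "less" p-value, so a result just under a significance threshold with "less" can land above it with "two-sided". That is why a silent change of default would make numbers from different versions hard to compare.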
from comet.
I agree with you that people typically want to know whether the baseline's mean score is less than sys1's mean score. I think it's a good call to change the t_test to less. What do you think?
At the moment this is only updated in the fix-multigpu branch and not merged into master.
from comet.
It seems very reasonable to me. I'm hoping there's not some nuance that I've overlooked here. I can look at sacrebleu to see what their alternative hypothesis is in their tests (for sake of consistency more than correctness).
from comet.
Perfect! Thanks!
from comet.
Unless I'm misreading their code, it seems like they're testing using a two-sided alternative hypothesis due to the absolute value.
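The role of the absolute value can be sketched with a small paired sign-flip randomization test. This is an illustrative stand-in, not sacrebleu's or COMET's actual code: the function name `sign_flip_pvalues` and the toy scores are hypothetical, and the exact enumeration below is only practical for a handful of segments.

```python
from itertools import product


def sign_flip_pvalues(x, y):
    """Exact paired sign-flip test on x - y.

    Returns (p_less, p_two_sided). The one-sided test compares the signed
    statistic; the two-sided test compares absolute values, so a large
    difference in either direction counts as extreme.
    """
    diffs = [a - b for a, b in zip(x, y)]
    observed = sum(diffs)
    n_total = n_less = n_two_sided = 0
    # Under the null hypothesis each paired difference is equally likely
    # to have either sign, so enumerate every sign assignment.
    for signs in product((1, -1), repeat=len(diffs)):
        permuted = sum(s * d for s, d in zip(signs, diffs))
        n_total += 1
        if permuted <= observed:  # alternative "less"
            n_less += 1
        if abs(permuted) >= abs(observed):  # alternative "two-sided"
            n_two_sided += 1
    return n_less / n_total, n_two_sided / n_total


# Hypothetical segment-level scores, for illustration only.
baseline = [0.60, 0.55, 0.70, 0.65, 0.58, 0.62]
system = [0.64, 0.57, 0.71, 0.70, 0.60, 0.66]

p_less, p_two_sided = sign_flip_pvalues(baseline, system)
print(p_less, p_two_sided)
```

In this toy data every difference has the same sign, so only the original assignment is as extreme as the observed sum in the signed comparison, while the absolute-value comparison also counts its mirror image. The two-sided p-value is therefore exactly twice the one-sided one.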
from comet.
@erip I looked a bit more into this and indeed a two-sided t_test is more usual, and the results made more sense in my tests. Nonetheless, I am keeping the option to change that on the command line. I am going to merge v2.0 into master.
The release was delayed, but at least master will contain the new changes.
from comet.