Comments (1)
Hi Gerard,
These numbers make sense.
- The LM vs. MLM gap comes from scoring 1 copy of the sentence vs. L parallel copies (one per masked position, for a sentence of L tokens). Take a look at our maskless finetuning to mitigate this; alternatively, unmasked scores might suit your needs (e.g., this new paper: https://arxiv.org/abs/2010.09535).
- While ALBERT has fewer unique parameters, inference still passes sequentially through 12 layers (plus the factorized embedding, which splits one computation into two). PyTorch models are also slower right now since we go through MXNet's dataloader. Did you try increasing the batch size? You should be able to fit a larger batch, since the model is smaller.
- DistilBERT's inference time is in line with the 40% reduction reported in their paper (Table 3: https://arxiv.org/pdf/1910.01108.pdf). You could also increase the batch size here.
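
To make the "1 copy vs. L parallel copies" point concrete, here is a rough sketch in plain Python. It is not the mlm-scoring implementation: `masked_copies`, `batches_needed`, and the `[MASK]` placeholder string are made up for illustration, and real scorers work on token IDs rather than strings.

```python
from typing import List

MASK = "[MASK]"  # illustrative placeholder; the real mask token depends on the model's vocab


def masked_copies(tokens: List[str], mask_token: str = MASK) -> List[List[str]]:
    """Return one copy of `tokens` per position, with that position masked.

    An MLM scores each token from a copy in which that token is hidden,
    so a sentence of L tokens yields L inputs instead of 1.
    """
    return [tokens[:i] + [mask_token] + tokens[i + 1:] for i in range(len(tokens))]


def batches_needed(num_copies: int, batch_size: int) -> int:
    """Number of forward passes needed to process `num_copies` inputs (ceiling division)."""
    return -(-num_copies // batch_size)


sentence = ["the", "cat", "sat"]
copies = masked_copies(sentence)
# copies[1] is ["the", "[MASK]", "sat"]; there are len(sentence) == 3 copies in total.
```

With a larger batch size, more of those L copies fit into each forward pass, which is why increasing it helps the smaller models in particular.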
Hope this helps!
from mlm-scoring.