Comments (5)
The model files you downloaded may be incomplete, or they may be the wrong files.
from whisper-finetune.
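I downloaded all four model files from openai/whisper-small/ ([flax_model.msgpack], [model.safetensors], [pytorch_model.bin], [tf_model.h5]) and none of them work. Why is that? There is no MD5 to check against, so I cannot verify whether the files differ, but the download itself reported no errors.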
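For the checksum question above, a minimal sketch of hashing a local file so the digest can be compared by hand. It assumes the model page lists a SHA-256 digest for the large weight files; `sha256_of_file` and the example path are hypothetical names, not part of whisper-finetune.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    # Hypothetical local path; replace it with wherever the file was saved.
    print(sha256_of_file("models/whisper-small/model.safetensors"))
```

A matching digest only shows that this one file arrived intact. It says nothing about other files missing from the directory.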
@lichq5 It's not just those files; the model repository contains many other files as well.
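One way to see which files are absent is sketched below. The list in `EXPECTED_FILES` is an assumption about what a Whisper checkpoint directory usually holds besides the weights; the file listing on the model page is the authoritative source. `missing_files` and `download_full_repo` are hypothetical helpers, and the latter assumes the `huggingface_hub` package is installed.

```python
from pathlib import Path

# Assumed non-weight files; verify against the file list on the model page.
EXPECTED_FILES = [
    "config.json",
    "preprocessor_config.json",
    "tokenizer_config.json",
    "vocab.json",
    "merges.txt",
    "special_tokens_map.json",
    "added_tokens.json",
    "normalizer.json",
]


def missing_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in expected if not (root / name).is_file()]


def download_full_repo(repo_id="openai/whisper-small",
                       local_dir="models/whisper-small"):
    """Fetch every file in the model repository instead of only the weights."""
    # Imported lazily so the check above works without huggingface_hub.
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id, local_dir=local_dir)
```

Calling `missing_files("models/whisper-small")` on a directory that holds only the four weight files would return the whole list, which matches the situation described above.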
I now get this error during training:
raise ValueError(
    "Asking to pad but the tokenizer does not have a padding token. "
    "Please select a token to use as `pad_token` `(tokenizer.pad_token = tokenizer.eos_token e.g.)` "
    "or add a new pad token via `tokenizer.add_special_tokens({'pad_token': '[PAD]'})`."
)
If I edit the source by hand and add self.pad_token="[PAD]", will that affect the training results?
That probably won't work. You still need to download the complete set of files so that the tokens can be read from them.
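To check whether the downloaded files actually declare a padding token, a small sketch like the following can read it straight from disk. It assumes the token is declared in `special_tokens_map.json`, as Hugging Face tokenizers commonly do; some checkpoints keep it in `tokenizer_config.json` instead. `read_special_token` is a hypothetical helper.

```python
import json
from pathlib import Path


def read_special_token(model_dir, name="pad_token",
                       filename="special_tokens_map.json"):
    """Return the special token string declared under `name`, or None if absent."""
    path = Path(model_dir) / filename
    if not path.is_file():
        return None
    with open(path, encoding="utf-8") as handle:
        mapping = json.load(handle)
    value = mapping.get(name)
    # Entries are either a plain string or a dict with a "content" field.
    if isinstance(value, dict):
        return value.get("content")
    return value
```

If this returns `None` for `pad_token`, the tokenizer files are missing or incomplete, which is consistent with the error above.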