Comments (3)
@abulice Solved. Run vim /usr/local/lib/python3.10/site-packages/deepspeed/runtime/zero/stage_1_and_2.py, go to line 1902, and add loss.requires_grad_()
from llama2-chinese.
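For readers wondering what that one-line patch does, here is a minimal sketch in plain PyTorch. It assumes the underlying failure is a loss tensor that reaches backward() without a grad_fn; the tensor below is a hypothetical stand-in, not the actual DeepSpeed code at that line.

```python
import torch

# Hypothetical stand-in for a loss that is detached from the autograd graph
# (for example, when every tensor that produced it had requires_grad=False).
loss = torch.tensor(1.5)

# Without the patch, backward() refuses to run on such a tensor.
try:
    loss.backward()
    failed_before_patch = False
except RuntimeError:
    failed_before_patch = True

# The workaround from the comment above: flag the loss in place.
loss.requires_grad_()
loss.backward()  # no error now; loss is a leaf, so loss.grad holds d(loss)/d(loss)

print(failed_before_patch, loss.requires_grad, loss.grad)
```

Note that this only lets backward() run. If the loss really has no path back to the trainable parameters, those parameters still receive no gradients, so it is worth checking that training loss actually moves after applying the patch.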
try this https://github.com/huggingface/peft/issues/137#issuecomment-1445912413