Comments (10)
Have you figured out this problem? I'm facing the same issue.
I updated transformers to 4.27.1, and the problem was solved.
See: https://github.com/huggingface/transformers/blob/main/src/transformers/modeling_utils.py#L1150
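For anyone who wants to confirm which version they actually have before upgrading, here is a minimal sketch. The helper names (`parse_version`, `meets_minimum`, `installed_transformers_version`) are made up for illustration, and the version parsing is deliberately simplistic: it only compares the leading numeric parts.

```python
from importlib import metadata


def parse_version(text):
    """Turn '4.27.1' into (4, 27, 1); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed, minimum="4.27.1"):
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


def installed_transformers_version():
    """Return the installed transformers version, or None if it is missing."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_transformers_version()
    if found is None:
        print("transformers is not installed")
    elif meets_minimum(found):
        print(f"transformers {found} is new enough")
    else:
        print(f"transformers {found} is older than 4.27.1; try upgrading")
```

If the check reports an older version, upgrading with pip (for example `pip install transformers==4.27.1`) is the step described in the comment above.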
from chatglm-tuning.
This class should be modified:
class ChatGLMPreTrainedModel(PreTrainedModel):
    """
    An abstract class to handle weights initialization and
    a simple interface for downloading and loading pretrained models.
    """

    is_parallelizable = False
    supports_gradient_checkpointing = True  # change to True
    config_class = ChatGLMConfig
    base_model_prefix = "transformer"
    _no_split_modules = ["GLM6BBlock"]

    def __init__(self, *inputs, **kwargs):
        super().__init__(*inputs, **kwargs)

    def _init_weights(self, module: nn.Module):
        """Initialize the weights."""
        return

    # add this
    def _set_gradient_checkpointing(self, module, value=False):
        if isinstance(module, ChatGLMForConditionalGeneration):
            module.gradient_checkpointing = value
Another reference:
https://colab.research.google.com/drive/1jCkpikz0J2o20FBQmYmAGdiKmJGOMo-o?usp=sharing#scrollTo=cg3fiQOvmI3Q
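To see why both parts of the patch matter, here is a toy sketch of the mechanism as I understand it in transformers releases around 4.27: `gradient_checkpointing_enable()` refuses to run unless `supports_gradient_checkpointing` is True, and otherwise applies `_set_gradient_checkpointing` to every submodule. All `Toy*` classes below are hypothetical stand-ins written in plain Python, not the real library or ChatGLM code.

```python
from functools import partial


class ToyModule:
    """Hypothetical stand-in for torch.nn.Module with just enough behaviour."""

    def __init__(self, *children):
        self.children = list(children)

    def apply(self, fn):
        # Like nn.Module.apply: visit the children first, then the module itself.
        for child in self.children:
            child.apply(fn)
        fn(self)
        return self


class ToyPreTrainedModel(ToyModule):
    """Simplified imitation of the base-class behaviour (an assumption)."""

    supports_gradient_checkpointing = False

    def _set_gradient_checkpointing(self, module, value=False):
        # Default hook does nothing; model classes override it.
        pass

    def gradient_checkpointing_enable(self):
        if not self.supports_gradient_checkpointing:
            raise ValueError(
                f"{self.__class__.__name__} does not support gradient checkpointing."
            )
        self.apply(partial(self._set_gradient_checkpointing, value=True))


class ToyChatGLMPreTrainedModel(ToyPreTrainedModel):
    supports_gradient_checkpointing = True  # first half of the patch

    def _set_gradient_checkpointing(self, module, value=False):
        # Second half of the patch: set the flag on the matching module.
        if isinstance(module, ToyForConditionalGeneration):
            module.gradient_checkpointing = value


class ToyForConditionalGeneration(ToyChatGLMPreTrainedModel):
    gradient_checkpointing = False


class UnpatchedModel(ToyPreTrainedModel):
    """Keeps the default flag, so enabling checkpointing fails."""


model = ToyForConditionalGeneration(ToyModule())
model.gradient_checkpointing_enable()
print(model.gradient_checkpointing)  # True

try:
    UnpatchedModel().gradient_checkpointing_enable()
    unpatched_error = None
except ValueError as err:
    unpatched_error = str(err)
print(unpatched_error)  # UnpatchedModel does not support gradient checkpointing.
```

Without the class attribute the call raises before the hook is ever reached, and without the hook the flag is never set, which is why the comment above changes both.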
This looks like it may be an API version mismatch. Could you try the same package versions as in the linked notebook?
I used transformers 4.26.1, which matches the version specified in requirements.txt, but I still hit the issue. Upgrading to 4.27.1 did not fix it either.
Could you share a requirements file? I'm running into a lot of package conflicts.
I'll test the environment issue on Colab a bit later and post an update.
@Chenzongchao Updated; you can refer to:
https://github.com/mymusise/ChatGLM-Tuning/blob/master/requirements.txt
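If conflicts persist after installing from that file, one way to narrow them down is to compare the exact pins against what is installed. The sketch below is only an illustration: `parse_pins` and `find_mismatches` are hypothetical helpers, only `name==version` lines are handled, and the sample text is made up rather than copied from the project's requirements.txt.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Collect 'name==version' pins; comments, ranges and bare names are skipped."""
    pins = {}
    for raw in requirements_text.splitlines():
        # Drop trailing comments and environment markers before parsing.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def _installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=None):
    """Return {name: (pinned, installed)} for packages that differ or are missing."""
    lookup = lookup or _installed_version
    mismatches = {}
    for name, pinned in pins.items():
        found = lookup(name)
        if found != pinned:
            mismatches[name] = (pinned, found)
    return mismatches


# Made-up sample text, not the real requirements.txt from the repository.
sample = "transformers==4.27.1  # pinned\ndatasets\n"
sample_pins = parse_pins(sample)
print(sample_pins)  # {'transformers': '4.27.1'}

# A fake lookup stands in for the real environment in this demonstration.
print(find_mismatches(sample_pins, lambda name: "4.26.1"))
# {'transformers': ('4.27.1', '4.26.1')}
```

Calling `find_mismatches(parse_pins(open("requirements.txt").read()))` without a lookup checks the current environment instead.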
Have you figured out this problem? I'm facing the same issue.
Closing this; reopen it if needed~
About the suggestion above that the ChatGLMPreTrainedModel class should be modified: could you share the name of the file that defines it? I don't know where to make the change. Thank you~