
Comments (7)

shuxueslpi commented on September 11, 2024

Single-GPU should be fine.
For multi-GPU, I don't have a machine on hand to test with yet...

shenmadouyaowen commented on September 11, 2024

Single-GPU should be fine. For multi-GPU, I don't have a machine on hand to test with yet...

OK. Could you take a look at the following error and advise how to fix it?

`use_cache=True` is incompatible with gradient checkpointing. Setting `use_cache=False`...
Traceback (most recent call last):
  File "/media/ubuntu/chat/chatGLM-6B-QLoRA-main/train_qlora.py", line 206, in <module>
    train(args)
  File "/media/ubuntu/chat/chatGLM-6B-QLoRA-main/train_qlora.py", line 200, in train
    trainer.train(resume_from_checkpoint=resume_from_checkpoint)
  File "/root/anaconda3/envs/ql/lib/python3.9/site-packages/transformers/trainer.py", line 1645, in train
    return inner_training_loop(
  File "/root/anaconda3/envs/ql/lib/python3.9/site-packages/transformers/trainer.py", line 1938, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/root/anaconda3/envs/ql/lib/python3.9/site-packages/transformers/trainer.py", line 2759, in training_step
    loss = self.compute_loss(model, inputs)
  File "/root/anaconda3/envs/ql/lib/python3.9/site-packages/transformers/trainer.py", line 2784, in compute_loss
    outputs = model(**inputs)
  File "/root/anaconda3/envs/ql/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/anaconda3/envs/ql/lib/python3.9/site-packages/peft/peft_model.py", line 857, in forward
    return self.base_model(
  File "/root/anaconda3/envs/ql/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/anaconda3/envs/ql/lib/python3.9/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm2-6b/modeling_chatglm.py", line 954, in forward
    loss = loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
  File "/root/anaconda3/envs/ql/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/anaconda3/envs/ql/lib/python3.9/site-packages/torch/nn/modules/loss.py", line 1174, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "/root/anaconda3/envs/ql/lib/python3.9/site-packages/torch/nn/functional.py", line 3029, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:2 and cuda:0! (when checking argument for argument target in method wrapper_CUDA_nll_loss_forward)

shuxueslpi commented on September 11, 2024

I took a look at someone else's code: https://github.com/beyondguo/LLM-Tuning/blob/master/chatglm2_lora_tuning.py
The comments around lines 102-106 seem to describe the same problem you are hitting; see whether that approach fixes it, though I expect the official model code will eventually be updated to resolve this anyway.
Once I get hold of a multi-GPU machine, I'll try it myself...
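A minimal sketch of that style of workaround, assuming the mismatch comes from the labels staying on cuda:0 while the logits land on another GPU under device_map="auto" (the `DeviceSafeTrainer` name is hypothetical and this is an illustration, not the exact code from the linked file):

```python
import torch
from torch.nn import CrossEntropyLoss
from transformers import Trainer

class DeviceSafeTrainer(Trainer):  # hypothetical name, not from the linked repo
    def compute_loss(self, model, inputs, return_outputs=False):
        labels = inputs.pop("labels")
        outputs = model(**inputs)      # logits may live on the last pipeline GPU
        logits = outputs.logits
        # Shift as in modeling_chatglm.py: tokens < n predict token n.
        shift_logits = logits[..., :-1, :].contiguous()
        # Key fix: move the labels onto the same device as the logits.
        shift_labels = labels[..., 1:].contiguous().to(shift_logits.device)
        loss_fct = CrossEntropyLoss(ignore_index=-100)
        loss = loss_fct(shift_logits.view(-1, shift_logits.size(-1)),
                        shift_labels.view(-1))
        return (loss, outputs) if return_outputs else loss
```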

shenmadouyaowen commented on September 11, 2024

I took a look at someone else's code: https://github.com/beyondguo/LLM-Tuning/blob/master/chatglm2_lora_tuning.py The comments around lines 102-106 seem to describe the same problem you are hitting; see whether that approach fixes it, though I expect the official model code will eventually be updated to resolve this anyway. Once I get hold of a multi-GPU machine, I'll try it myself...

Thanks for your help! I got his version running....
Keep up the good work!

shenmadouyaowen commented on September 11, 2024

I took a look at someone else's code: https://github.com/beyondguo/LLM-Tuning/blob/master/chatglm2_lora_tuning.py The comments around lines 102-106 seem to describe the same problem you are hitting; see whether that approach fixes it, though I expect the official model code will eventually be updated to resolve this anyway. Once I get hold of a multi-GPU machine, I'll try it myself...

Even with per_device_train_batch_size set to 2 it runs out of GPU memory. Do you have any suggestions for reducing memory usage?

shuxueslpi commented on September 11, 2024

@shenmadouyaowen His code looks like plain LoRA with 8-bit training, while mine is QLoRA with 4-bit quantization, so in theory mine should use less GPU memory than his. A few suggestions:
1. Pull the latest official chatglm2-6b model code; the earlier code did not implement activation checkpointing, but the latest code does.
2. If you still run out of GPU memory after step 1, try adding the code from those commented-out sections into my code. I expect to get a multi-GPU machine tomorrow and will test it then as well.
3. Conversely, you could also add my QLoRA configuration to his code (see the sketch below).
Honestly, for a single machine with 24 GB of GPU memory, my QLoRA setup should handle a fairly large batch size. On a 12 GB card I can train on the ADGEN dataset with a batch size of 8 without even filling the memory.
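As a rough sketch of what "my QLoRA configuration" means here: loading the model in 4-bit NF4 quantization plus gradient (activation) checkpointing, which is what keeps the footprint below an 8-bit LoRA setup. The parameter values are illustrative, assuming a recent transformers/bitsandbytes:

```python
import torch
from transformers import AutoModel, BitsAndBytesConfig

# 4-bit NF4 quantization with double quantization -- the core of QLoRA
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModel.from_pretrained(
    "THUDM/chatglm2-6b",               # pull the latest model code first (suggestion 1)
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto",
)
model.gradient_checkpointing_enable()  # relies on the updated modeling code
model.enable_input_require_grads()     # needed so checkpointing works with PEFT adapters
model.config.use_cache = False         # use_cache is incompatible with checkpointing
```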

shenmadouyaowen commented on September 11, 2024

Yes, I merged his code into yours and it still ran out of GPU memory; after changing per_device_train_batch_size=1 it runs. I've already pulled the latest files and am looking forward to your optimizations.
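One way to recover the effective batch size after dropping to per_device_train_batch_size=1 is gradient accumulation; a minimal sketch (the values below are illustrative, not the repo's actual defaults):

```python
from transformers import TrainingArguments

# Effective batch size = 1 per device x 8 accumulation steps = 8,
# at roughly the memory cost of batch size 1.
training_args = TrainingArguments(
    output_dir="output",             # placeholder path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    gradient_checkpointing=True,     # trades compute for memory
    learning_rate=1e-4,              # illustrative
    fp16=True,
)
```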
