
Comments (6)

shuxueslpi commented on September 11, 2024

Can you run inference normally with the following code?

from transformers import AutoModel, AutoTokenizer

model_path = '/tmp/merged_qlora_model_4bit'

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True).half().cuda()

input_text = '类型#裙*版型#显瘦*风格#文艺*风格#简约*图案#印花*图案#撞色*裙下摆#压褶*裙长#连衣裙*裙领型#圆领'
response, history = model.chat(tokenizer=tokenizer, query=input_text)
print(response)


Derican commented on September 11, 2024

Re: "Can you run inference normally with the following code?" (code quoted from the previous comment)

No, it doesn't. Inference only works with the original model; the merged fine-tuned model fails, dying at "Loading checkpoint shards: 0%|" with Killed.
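
A "Killed" at 0% of "Loading checkpoint shards" usually means the Linux (or WSL2) OOM killer terminated the process while the checkpoint was being materialized in CPU RAM, before anything reached the GPU. Below is a minimal sketch of a more memory-frugal load, using standard transformers options rather than anything specific to this repo (low_cpu_mem_usage requires the accelerate package):

import torch
from transformers import AutoModel, AutoTokenizer

model_path = '/tmp/merged_qlora_model_4bit'

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
# Load the weights directly as fp16 and avoid first building a full fp32 copy in CPU RAM.
model = AutoModel.from_pretrained(
    model_path,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
).cuda()
model = model.eval()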


shuxueslpi commented on September 11, 2024

What does your machine environment look like, and what versions of the dependency packages are you using?


Derican commented on September 11, 2024

Win11 WSL2, Python 3.10.12
RTX 4060 Ti 16 GB
Dependency versions match requirements.txt, except for peft==0.4.0 and bitsandbytes==0.41.1.


shuxueslpi commented on September 11, 2024

I upgraded bitsandbytes to 0.41.1 and it still seems to work fine, so I'm not sure whether this is a Windows platform issue. My own environment is a Docker container on Ubuntu, and loading looks like this:

In [1]: from transformers import AutoModel, AutoTokenizer

In [2]: model_path = '/tmp/t1fp'

In [3]: tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
   ...: model = AutoModel.from_pretrained(model_path, trust_remote_code=True).half().cuda()
Loading checkpoint shards: 100%|██████████| 3/3 [00:10<00:00,  3.42s/it]


sevenandseven commented on September 11, 2024

A question about ChatGLM2-6B: after fine-tuning it on my dataset, inference with the adapter works, but after merging, the official cli_demo dies straight away with "Loading checkpoint shards: 0%|" and Killed. The merged fp32 model is 23.2 GB; after converting to fp16 it is 11.6 GB, but the same Killed problem still occurs. Training config:

{
    "output_dir": "saved_files/chatGLM_6B_QLoRA_t32",
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 8,
    "per_device_eval_batch_size": 4,
    "learning_rate": 1e-3,
    "num_train_epochs": 1.0,
    "lr_scheduler_type": "linear",
    "warmup_ratio": 0.1,
    "logging_steps": 100,
    "save_strategy": "steps",
    "save_steps": 500,
    "evaluation_strategy": "steps",
    "eval_steps": 500,
    "optim": "adamw_torch",
    "fp16": false,
    "remove_unused_columns": false,
    "ddp_find_unused_parameters": false,
    "seed": 42
}
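
For reference, a config file like this one typically maps field-for-field onto transformers TrainingArguments. A minimal sketch of loading it that way, assuming train_qlora.py follows the common pattern of passing the JSON straight into TrainingArguments (an assumption about the script, not something confirmed here):

import json
from transformers import TrainingArguments

# Hypothetical sanity check: every key in the JSON above should be a valid
# TrainingArguments field; an unknown key would raise a TypeError here.
with open('chatGLM_6B_QLoRA.json') as f:
    args_dict = json.load(f)

training_args = TrainingArguments(**args_dict)
print(training_args.output_dir, training_args.learning_rate)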

Training command:

python3 train_qlora.py --train_args_json chatGLM_6B_QLoRA.json --model_name_or_path chatglm2-6b --train_data_path data/train.jsonl --eval_data_path data/dev.jsonl --lora_rank 4 --lora_dropout 0.05 --compute_dtype fp32

Merge command:

python3 merge_lora_and_quantize.py --lora_path QLoRA_20230811_2500 --output_path output_merged/QLoRA_20230811_2500 --remote_scripts_dir remote_scripts/chatglm2-6b --device auto --qbits 0

Hello, I'd also like to ask: is there any difference between how a LoRA fine-tuned adapter and a QLoRA fine-tuned adapter are merged into the original model, and do any parameters need to be changed?
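
In terms of the merge step itself, LoRA and QLoRA adapters are usually handled the same way: the adapter is loaded on top of a full-precision (or fp16) copy of the base model and folded into its weights, since older peft versions cannot merge directly into a 4-bit/8-bit quantized model. A minimal sketch using the standard peft API, with placeholder paths (this is not necessarily what merge_lora_and_quantize.py does internally):

import torch
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

base_path = 'chatglm2-6b'              # base model the adapter was trained on
adapter_path = 'QLoRA_20230811_2500'   # LoRA/QLoRA adapter directory
output_path = 'output_merged/example'  # hypothetical output directory

# Attach the adapter to an fp16 copy of the base model, fold the LoRA weights
# into the base weights, and save the merged model plus tokenizer.
base = AutoModel.from_pretrained(base_path, trust_remote_code=True, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, adapter_path)
merged = model.merge_and_unload()
merged.save_pretrained(output_path, max_shard_size='2GB')

AutoTokenizer.from_pretrained(base_path, trust_remote_code=True).save_pretrained(output_path)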

