Comments (3)
In the example, the fine-tuned weights end up in a checkpoint-1000 folder under the model output path. But after running the fine-tuning exactly as shown, my output path is different: runs/Jan27_01-06-17_autodl-container-049a448514-394ad272/, and the files inside are different too. How should I handle this?
Did you ever solve this? I'm running into the same problem.
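Not the original posters, but a likely explanation: the runs/<date>_<host>/ folder is the Trainer's default logging_dir, which holds only TensorBoard event files; LoRA checkpoints are written as checkpoint-<step> folders directly under output_dir once training reaches save_steps. A minimal sketch of keeping the two locations apart (the paths here are placeholders, not from the tutorial):

```python
from transformers import TrainingArguments

# Hypothetical paths for illustration; adjust to your own setup.
args = TrainingArguments(
    output_dir="./output",        # checkpoint-<step> folders land here
    logging_dir="./output/logs",  # TensorBoard files (events.out.tfevents.*) land here
    save_strategy="steps",
    save_steps=100,               # a checkpoint appears only after 100 optimizer steps
    logging_steps=5,
)
```

If logging_dir is left unset, it defaults to output_dir/runs/<datetime>_<hostname>/, which matches the path in the question above.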
from self-llm.
My fine-tuning arguments are set up like this:
from transformers import DataCollatorForSeq2Seq, TrainingArguments

data_collator = DataCollatorForSeq2Seq(
    tokenizer,
    model=model,
    label_pad_token_id=-100,
    pad_to_multiple_of=None,
    padding=False,
)
args = TrainingArguments(
    output_dir="/root/autodl-tmp/huan_dataset/output",  # a relative path failed to create the checkpoint folders
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    logging_steps=5,
    num_train_epochs=1,
    save_strategy="steps",
    save_steps=10,
    learning_rate=1e-4,
    # gradient_checkpointing=True,  # uncommenting this line raises an error
)
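On the commented-out gradient_checkpointing=True: with a LoRA-wrapped model this commonly fails with "element 0 of tensors does not require grad" because the frozen base model produces inputs that carry no gradient. A hedged sketch of the usual workaround (assumes model is a transformers PreTrainedModel with LoRA applied; not verified against this exact setup):

```python
from transformers import TrainingArguments

# Sketch only: enable gradient flow through the frozen embeddings
# before turning on gradient checkpointing with a LoRA-wrapped model.
model.enable_input_require_grads()

args = TrainingArguments(
    output_dir="/root/autodl-tmp/huan_dataset/output",
    gradient_checkpointing=True,
    # ... other arguments as above ...
)
```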
The code that loads the fine-tuned model looks like this:
from peft import PeftModel
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

tokenizer = AutoTokenizer.from_pretrained("/root/autodl-tmp/ZhipuAI/chatglm3-6b", use_fast=False, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("/root/autodl-tmp/ZhipuAI/chatglm3-6b", trust_remote_code=True, low_cpu_mem_usage=True)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
p_model = PeftModel.from_pretrained(model, model_id="/root/autodl-tmp/huan_dataset/output/checkpoint-400/")  # load the LoRA weights produced by training
ipt = tokenizer("<|system|>\n现在你要扮演皇帝身边的女人--甄嬛\n<|user|>\n {}\n{}".format("你是谁?", "").strip() + "<|assistant|>\n", return_tensors="pt").to(model.device)
tokenizer.decode(p_model.generate(**ipt, max_length=128, do_sample=True)[0], skip_special_tokens=True)
Thanks, bro!! One question: does your run produce checkpoint-style output directly? Mine only produces files like events.out.tfevents.1709949130.autodl-container-7d27418359-b92bcb1c.3146.0. Could you please post your full code? I followed the tutorial too.
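The events.out.tfevents.* files are TensorBoard logs, not checkpoints. Checkpoints show up as checkpoint-<global_step> folders under output_dir, and only once the optimizer step count reaches save_steps (with gradient_accumulation_steps=8, each optimizer step consumes 8 samples, so a short run may end before the first save). A small helper to check whether any checkpoints were actually written (find_checkpoints is a hypothetical name, not from the tutorial):

```python
import os
import re

def find_checkpoints(output_dir):
    """Return checkpoint-<step> folder names under output_dir, sorted by step."""
    if not os.path.isdir(output_dir):
        return []
    names = [d for d in os.listdir(output_dir)
             if re.fullmatch(r"checkpoint-\d+", d)
             and os.path.isdir(os.path.join(output_dir, d))]
    return sorted(names, key=lambda n: int(n.split("-")[1]))
```

If this returns an empty list, either training ended before save_steps optimizer steps were completed, or the Trainer was given a different output_dir than the one you are inspecting.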