
Comments (5)

shuxueslpi commented on September 11, 2024

@huangqingyi-code
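1. For a comparison of results, it is best to go straight to the original paper, which is more objective.
2. You may need to experiment with this: load the base model first, then use peft to load the adapter models: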

……
base_model = AutoModel.from_pretrained(config.base_model_name_or_path,
                                       quantization_config=q_config,
                                       trust_remote_code=True,
                                       device_map='auto')

model1 = PeftModel.from_pretrained(base_model, peft_model_path1)
model2 = PeftModel.from_pretrained(base_model, peft_model_path2)

from chatglm-6b-qlora.

huangqingyi-code commented on September 11, 2024

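1. The original QLoRA paper makes big claims about being better than LoRA, but in practice many people report otherwise.
2. This way both model1 and model2 sit in memory, which is resource-hungry; with many LoRAs, say ten or so, it becomes a problem. My idea is not to merge anything up front: load the model once, and at inference time do a simple weight addition with whichever LoRA is needed.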
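
The "simple weight addition" idea can be sketched as follows. This is a minimal illustration, not code from the repository: it uses plain Python lists in place of tensors, and the names `apply_lora`, `weight_for`, `adapters`, `task_a`, and `task_b` are hypothetical. It assumes the usual LoRA form W' = W + (alpha / r) * B A, with one shared base weight and small per-adapter matrices.

```python
# Sketch only: plain-Python stand-in for tensor math; real code would use torch.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(base_weight, lora_a, lora_b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) without modifying base_weight."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(base_weight, delta)
    ]


# One shared 2x2 base weight and two hypothetical rank-1 adapters.
base = [[1.0, 0.0], [0.0, 1.0]]
adapters = {
    "task_a": {"A": [[1.0, 2.0]], "B": [[1.0], [0.0]]},
    "task_b": {"A": [[0.0, 1.0]], "B": [[0.0], [2.0]]},
}


def weight_for(name, alpha=2.0, rank=1):
    """Build the effective weight for one adapter on demand."""
    adapter = adapters[name]
    return apply_lora(base, adapter["A"], adapter["B"], alpha, rank)
```

Only the small A and B matrices are stored per adapter, so ten adapters cost far less memory than ten full models. Note that with a 4-bit quantized base the stored weight cannot be summed with a float delta directly, so the low-rank term is usually kept as a separate path in the forward pass.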


shuxueslpi commented on September 11, 2024

@huangqingyi-code
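1. Personally, I think it depends on the specific problem: there is no best method, only the method that works best for a particular dataset. Intuitively, going from fp32 -> fp16 -> int8 -> int4 does lose something, and going from full-parameter fine-tuning to LoRA and the other parameter-efficient fine-tuning methods intuitively loses something too. So the result depends on how well the specific problem matches the available resources.
2. What you describe should be this: https://huggingface.co/docs/peft/package_reference/peft_model#peft.PeftModel.set_adapter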
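
A minimal sketch of that approach, assuming the `peft` package is installed: several adapters are attached to a single `PeftModel` and one is activated at a time with `set_adapter`. The helper name `load_adapters` and the adapter names `task1` and `task2` are hypothetical; `base_model`, `peft_model_path1`, and `peft_model_path2` are the objects from the earlier comment.

```python
def load_adapters(base_model, adapter_paths):
    """Attach several LoRA adapters to one base model and return the PeftModel.

    adapter_paths maps an adapter name (chosen by the caller) to its checkpoint path.
    """
    from peft import PeftModel  # imported lazily; requires the peft package

    names = list(adapter_paths)
    # The first adapter creates the PeftModel wrapper around the base model.
    model = PeftModel.from_pretrained(
        base_model, adapter_paths[names[0]], adapter_name=names[0]
    )
    # Further adapters are added to the same wrapper, so the base weights are shared.
    for name in names[1:]:
        model.load_adapter(adapter_paths[name], adapter_name=name)
    return model


# Usage sketch:
# model = load_adapters(base_model, {"task1": peft_model_path1,
#                                    "task2": peft_model_path2})
# model.set_adapter("task2")  # later forward passes use the "task2" adapter
```

Compared with calling `PeftModel.from_pretrained` once per adapter, this keeps one wrapper and one copy of the base weights, and only the small adapter matrices grow with the number of adapters.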


huangqingyi-code commented on September 11, 2024

Yes, thanks.


valkryhx commented on September 11, 2024

model1 = PeftModel.from_pretrained(base_model, peft_model_path1)

After this line runs, the parameters of base_model are changed as well: they become the parameters of PEFT model1. You can print each layer of the model and compare. This has tripped me up several times.
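
One way to do the comparison described above is to snapshot the parameter names before wrapping and diff them afterwards. This is a small sketch with hypothetical helper names (`param_names`, `new_param_names`); it assumes only that the model exposes `named_parameters()`, as `torch.nn.Module` does.

```python
def param_names(model):
    """Return the set of parameter names of a torch.nn.Module-like object."""
    return {name for name, _ in model.named_parameters()}


def new_param_names(before, model):
    """Names present in the model now that were absent from the earlier snapshot."""
    return sorted(param_names(model) - before)


# Usage sketch:
# before = param_names(base_model)
# model1 = PeftModel.from_pretrained(base_model, peft_model_path1)
# print(new_param_names(before, base_model))  # adapter parameters, if injected in place
```

If the base model's unadapted output is still needed after wrapping, `PeftModel` provides a `disable_adapter()` context manager that temporarily turns the adapter off.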

