
Merge question about chatglm-6b-qlora (CLOSED)

shuxueslpi commented on September 11, 2024
Merge question

from chatglm-6b-qlora.

Comments (2)

shuxueslpi commented on September 11, 2024

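@RuSignalFlag
The quantized model loaded during training is the base model. The LoRA adapter itself is not quantized, so the saved adapter is not quantized either.
At merge time, the base model is loaded in fp16, merged with the adapter, and only then quantized.
So the actual dequantization happens during training/inference, at the point where the result of the model's 4-bit matrix multiplication is combined with the result of the fp16 multiplication. See the document I referenced on this time-for-space trade-off.
Below is a printout of the tensors of the peft model obtained by loading the quantized model for training and then calling peft. You can see that the LoRA part is indeed not quantized: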

# qkv part, quantized
In [12]: model.base_model.model.transformer.layers[0].attention.query_key_value.weight
Out[12]: 
Parameter containing:
Parameter(Params4bit([[184],
            [215],
            [183],
            ...,
            [148],
            [ 54],
            [249]], device='cuda:0', dtype=torch.uint8))

# LoRA part, not quantized
In [17]: model.base_model.model.transformer.layers[0].attention.query_key_value.lora_A.default.weight
Out[17]: 
Parameter containing:
tensor([[-0.0144,  0.0076, -0.0096,  ...,  0.0113, -0.0150,  0.0060],
        [-0.0096,  0.0049, -0.0014,  ..., -0.0060, -0.0140,  0.0115],
        [-0.0101,  0.0134, -0.0066,  ..., -0.0097,  0.0116, -0.0127],
        [-0.0054,  0.0090, -0.0131,  ...,  0.0082,  0.0068,  0.0122]],
       device='cuda:0', requires_grad=True)

In [18]: model.base_model.model.transformer.layers[0].attention.query_key_value.lora_A.default.weight.dtype
Out[18]: torch.float32
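
To make the point about where dequantization happens more concrete, here is a minimal pure-Python sketch of a forward pass that combines a quantized base weight with an unquantized LoRA branch. It is a toy under stated assumptions: the quantizer is a simple uniform absmax scheme with one scale per row, not the NF4 format that bitsandbytes uses, and the names `quantize_4bit`, `dequantize_4bit`, and `qlora_forward` are hypothetical helpers, not functions from this repo or from any library.

```python
# Toy sketch only: uniform 4-bit absmax quantization, NOT the NF4 scheme in bitsandbytes.

def quantize_4bit(row):
    """Map one row of floats to integer codes in [-7, 7] plus one absmax scale."""
    absmax = max(abs(v) for v in row) or 1.0  # avoid dividing by zero on an all-zero row
    codes = [round(v / absmax * 7) for v in row]
    return codes, absmax

def dequantize_4bit(codes, absmax):
    """Recover approximate floats from the codes; this runs at matmul time."""
    return [c / 7 * absmax for c in codes]

def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def qlora_forward(quant_rows, lora_A, lora_B, x, scaling):
    # Base path: dequantize the frozen weight on the fly, then multiply.
    W = [dequantize_4bit(codes, absmax) for codes, absmax in quant_rows]
    base_out = matvec(W, x)
    # Adapter path: lora_A and lora_B are ordinary floats and are never quantized.
    lora_out = matvec(lora_B, matvec(lora_A, x))
    # The two results are summed; only the base weight went through dequantization.
    return [b + scaling * l for b, l in zip(base_out, lora_out)]

W = [[7.0, -3.2], [0.0, 3.5]]              # -3.2 is stored as code -3, so it comes back as about -3.0
quant_rows = [quantize_4bit(row) for row in W]
lora_A = [[0.5, 0.5]]                      # rank 1: shape (r=1, in=2)
lora_B = [[1.0], [2.0]]                    # shape (out=2, r=1)
y = qlora_forward(quant_rows, lora_A, lora_B, x=[1.0, 2.0], scaling=2.0)
print(y)
```

The stored base weight stays in its compact form and is only expanded inside `qlora_forward`, which is the time-for-space trade-off mentioned above; the adapter matrices stay in floating point throughout.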


ShayDuane commented on September 11, 2024

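@shuxueslpi I read through the code later and it makes sense now, thanks. Only the adapter is saved. When merging, the full-precision base model is reloaded and the trained adapter is applied to it to instantiate a PeftModel. In that PeftModel, the original weights of the LoRA layers are fp16, while lora_a and lora_b are both fp32, so at that point they can be added together.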
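
The addition described here can be sketched for a single layer as W_merged = W + scaling * (lora_B @ lora_A). The snippet below is a pure-Python illustration, not the repo's merge script: `matmul` and `merge_lora` are hypothetical helpers, the matrices are made-up toy values, and the `lora_alpha / r` scaling is assumed to match the usual LoRA convention. Dtypes (fp16 base, fp32 adapter) are not modeled.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def merge_lora(W, lora_A, lora_B, lora_alpha, r):
    """Fold the low-rank update into the unquantized base weight."""
    scaling = lora_alpha / r            # assumed standard LoRA scaling
    delta = matmul(lora_B, lora_A)      # (out, r) @ (r, in) -> (out, in)
    return [
        [w + scaling * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]

W = [[1.0, 2.0], [3.0, 4.0]]   # stands in for the full-precision base weight
lora_A = [[0.5, -0.5]]         # shape (r=1, in=2)
lora_B = [[1.0], [2.0]]        # shape (out=2, r=1)
merged = merge_lora(W, lora_A, lora_B, lora_alpha=2, r=1)
print(merged)
```

After this step the merged matrix is a plain dense weight with the same shape as the original, which is why it can be quantized again afterwards if a compact model is wanted.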
