Comments (9)
Haven't seen that yet but it could be due to memory limitations when merging with the original model. Wdyt @awaelchli ?
Btw, otherwise it looks like everything was successful. You can try to manually merge the weights via the `litgpt merge_lora` command.
from litgpt.
Yes, that means you probably ran out of CPU memory. Right now, merging LoRA parameters requires the entire checkpoint to fit in memory.
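To make the memory cost concrete, here is a minimal NumPy sketch of what a LoRA merge does mathematically (this is an illustration, not litgpt's actual implementation; the function name and toy shapes are made up): the low-rank update is folded back into the base weight, so the full-size merged tensor has to exist in memory alongside the originals.

```python
import numpy as np

def merge_lora_weight(W, A, B, alpha, r):
    """Standard LoRA merge: W + (alpha / r) * (B @ A)."""
    return W + (alpha / r) * (B @ A)

# Toy dimensions: a 6x4 base weight with rank-2 LoRA factors.
rng = np.random.default_rng(0)
W = rng.standard_normal((6, 4))
A = rng.standard_normal((2, 4))  # rank x in_features
B = rng.standard_normal((6, 2))  # out_features x rank
merged = merge_lora_weight(W, A, B, alpha=16, r=2)
assert merged.shape == W.shape
```

Doing this for every weight matrix of a multi-billion-parameter checkpoint is why the whole model currently has to fit in CPU RAM.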
@alistairwgillespie How much RAM does your system have?
I opened #1189 which seems to help for me. It would be interesting if you could try it, Alistair.
I think the biggest issue is that the training was done with quantization, but `merge_lora.py` doesn't support it.
We can load the model in quantized form and merge the weights. But when it comes to saving the model we have a couple of options:
- If we want to save the model in quantized form there shouldn't be any significant problems. Plus the latest BNB should support it.
- If we want to save in dequantized form, then we have to add incremental dequantization and saving. Otherwise the whole model has to be dequantized right before saving, which significantly increases memory consumption.
  Lightning-AI/pytorch-lightning#19242 might help here. Or we could just stick to the current approach and ~~hope~~ assume that a user has more (much more, in fact) free CPU RAM than GPU VRAM.
I want to also mention that we might have a problem with merging quantized weights: #935
@carmocca I was having the same issue, and your PR resolved it for me (at least on a small test training run).
@Andrei-Aksionov For now the easiest thing is to save it dequantized. The LoRA merge already supports this (you added this!). If it's not working well then we should fix it
Yes, I even remember adding it 😆.
You can quantize a model upon loading (thanks to Fabric) and merge while keeping the model in quantized form (guess here kudos goes to me), but if you want to save the model in a dequantized form, you first need to dequantize the whole model.
> If it's not working well then we should fix it
The fix would be an incremental dequantization/saving:
1. Take a layer that we want to save
2. Dequantize it
3. Save it
4. Go back to step 1
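The loop above could be sketched as follows. This is a toy model of the idea, not litgpt code: `dequantize` and `save_layer` are hypothetical stand-ins (real code would use bitsandbytes dequantization and an incrementally written checkpoint file), and the "quantized model" is a dict of int8 arrays with per-tensor scales. The point is that only one layer is ever held in dequantized form.

```python
import numpy as np

def dequantize(q_layer):
    # Pretend int8 -> float32 with a per-tensor scale (stand-in for BNB).
    return q_layer["data"].astype(np.float32) * q_layer["scale"]

def save_layer(store, name, tensor):
    # Append one layer to the checkpoint; in real code this would write
    # to disk instead of keeping the result in a dict.
    store[name] = tensor

# Toy "quantized model": three int8 layers with a shared scale.
quantized_model = {
    f"layer_{i}": {"data": np.arange(4, dtype=np.int8), "scale": 0.5}
    for i in range(3)
}

checkpoint = {}
for name, q_layer in quantized_model.items():
    full = dequantize(q_layer)          # step 2: dequantize one layer
    save_layer(checkpoint, name, full)  # step 3: save it
    del full                            # free it before the next layer
```

Peak extra memory is then bounded by the largest single layer rather than the whole dequantized model.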
Apologies for the late reply, all. I updated the hardware notes in the original issue for audit purposes. Boosting the hardware fixed the issue; the additional feedback was helpful too.