
Comments (12)

BlinkDL commented on July 20, 2024

You are converting the bf16 model to fp32, so it becomes much larger.

Try using "cpu fp32i8" as the strategy in convert_model.py and chat.py, and then it will be much smaller.
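
For reference, a minimal sketch of what loading with that strategy looks like through the rwkv Python package (the model path is a placeholder):

    from rwkv.model import RWKV

    # 'cpu fp32i8' keeps most of the large weight matrices quantized to int8
    # on CPU instead of expanding everything to fp32, so the memory footprint
    # stays much smaller.
    model = RWKV(model='/path/to/rwkv-model.pth', strategy='cpu fp32i8')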

djaffer commented on July 20, 2024

Tried it, but it still runs out of memory. Maybe this needs some tuning, or it needs a GPU to really work properly.

balisujohn commented on July 20, 2024

I can't load the model to convert it with 32GB of RAM either. Definitely interested in hearing if anyone finds a way to do it.

balisujohn commented on July 20, 2024

@djaffer OK, so if you have around 35 GB of RAM including swap, you can call

python3 ./convert.py ... --strategy "cpu bf16"

and I didn't run out of memory doing this.

Then you can also run the model you save this way with chat.py on CPU, and it's actually surprisingly fast on CPU as far as I can tell. All of it seems to work without touching the GPU, though I haven't been able to get it to work with my CUDA 12 installation.
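
A rough sketch of what running the converted model on CPU can look like with the rwkv package (the path and token ids are placeholders; in practice v2/chat.py handles tokenization and sampling):

    from rwkv.model import RWKV

    # The strategy string should match the one used during conversion
    # ('cpu bf16' in this thread).
    model = RWKV(model='/path/to/converted-model.pth', strategy='cpu bf16')

    # One forward step over placeholder token ids; forward() returns the
    # logits for the last token plus the recurrent state to feed back in
    # on the next call.
    state = None
    logits, state = model.forward([187, 510, 1563], state)
    print(logits.shape)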

djaffer commented on July 20, 2024

I only have 32 GB.

balisujohn commented on July 20, 2024

@djaffer I have 33.4 GB of actual RAM and 2.1 GB of swap, and it works for me at a reasonable speed. If you have fast storage, you can try making a swapfile of several gigabytes, but definitely read up on it first, since constant I/O on a swapfile isn't necessarily great for your storage.

djaffer commented on July 20, 2024

I converted the model as you suggested, using a Hugging Face Space with an upgraded CPU.

Traceback (most recent call last):
  File "/home/user/.local/lib/python3.8/site-packages/rwkv/model.py", line 102, in __init__
    w['emb.weight'] = F.layer_norm(w['emb.weight'], (args.n_embd,), weight=w['blocks.0.ln0.weight'], bias=w['blocks.0.ln0.bias'])
KeyError: 'blocks.0.ln0.weight'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "app.py", line 25, in <module>
    model = RWKV(model=model_path, strategy='cpu bf16')
  File "/home/user/.local/lib/python3.8/site-packages/torch/jit/_script.py", line 292, in init_then_script
    original_init(self, *args, **kwargs)
  File "/home/user/.local/lib/python3.8/site-packages/rwkv/model.py", line 104, in __init__
    w['emb.weight'] = F.layer_norm(w['emb.weight'].float(), (args.n_embd,), weight=w['blocks.0.ln0.weight'].float(), bias=w['blocks.0.ln0.bias'].float())
KeyError: 'blocks.0.ln0.weight'
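
A quick, hedged way to narrow this down (the path is a placeholder): load the saved file directly with torch and check whether the key the traceback complains about is actually present in it.

    import torch

    # Inspect the checkpoint's parameter names on CPU.
    w = torch.load('/path/to/converted-model.pth', map_location='cpu')
    print('blocks.0.ln0.weight' in w)   # the key the loader says is missing
    print(list(w.keys())[:10])          # peek at the first few keys

If the ln0 weights really are absent, the file being loaded may already be a converted/processed checkpoint rather than the original one this code path expects.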

balisujohn commented on July 20, 2024

OK, so you're saying the conversion worked using --strategy 'cpu bf16'?

Are you using the chat.py file inside v2?

balisujohn commented on July 20, 2024

You will need to change the model path in v2/chat.py to point to the saved output of the convert.py call.
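
A hypothetical illustration of that edit (the names below are illustrative, not the script's actual variable names; the path must be the file written by the convert step and the strategy must match the conversion):

    # Hypothetical stand-ins for the two settings to change in v2/chat.py:
    MODEL_PATH = '/path/to/converted-model.pth'   # output of the convert call
    STRATEGY = 'cpu bf16'                         # must match the conversion strategy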

zhaoqf123 commented on July 20, 2024

I only have 32 GB.

I have 32 GB of RAM and encountered the same problem. I then increased the swap space to 16 GB, and then to 32 GB, which finally solved it. The method for increasing swap space can be found here.

djaffer commented on July 20, 2024

Yeah, the performance is not that good. I turned off the machine. Curious if OP can enable inference on Hugging Face.

BlinkDL commented on July 20, 2024

https://github.com/saharNooby/rwkv.cpp
Now with efficient CPU inference (WIP) by the community.
