Comments (3)
You may clear the model with del llm_model.
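(A minimal sketch, not from the original thread: on XPU it may also help to run garbage collection and empty the device cache after the del. This assumes llm_model is the IpexLLM instance loaded below and that intel_extension_for_pytorch provides the torch.xpu backend.)

    import gc
    import torch
    import intel_extension_for_pytorch  # registers the torch.xpu backend

    del llm_model            # drop the last Python reference to the model
    gc.collect()             # collect any lingering reference cycles
    torch.xpu.empty_cache()  # ask the XPU allocator to release cached memory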
Thanks for the good idea of del llm_model, but I have another question. create_model() is decorated with @st.cache_resource, as in the source code below. In my understanding, create_model() runs only once. After I delete the old model, I'd like to create a new one with create_model(). How do I make it rerun?
@st.cache_resource
def create_model(model_name):
    llm_model = IpexLLM.from_model_id(
        model_name=model_name,
        tokenizer_name=tokenizer_name,
        context_window=4096,
        max_new_tokens=512,
        load_in_low_bit='asym_int4',
        completion_to_prompt=completion_to_prompt,
        generate_kwargs={
            "do_sample": True,
            "temperature": 0.1,
            "eos_token_id": [tokenizer.eos_token_id,
                             tokenizer.convert_tokens_to_ids("<|eot_id|>")],
        },
        # messages_to_prompt=messages_to_prompt,
        device_map='xpu',
    )
    return llm_model  # return the cached model instance
You may use st.cache_resource.clear() to clear the cache so that create_model() reruns and creates a new model, as below:
model = create_model(name1)
del model
st.cache_resource.clear()
model = create_model(name2)
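(Note: st.cache_resource.clear() wipes every @st.cache_resource entry in the app. If you only want to evict this one model, Streamlit's cached functions also expose a per-function clear() method; a minimal sketch, reusing the create_model(), name1, and name2 from the thread above.)

    model = create_model(name1)
    del model
    create_model.clear()         # evicts only create_model's cached entries
    model = create_model(name2)  # cache miss, so the function body runs again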
Related Issues (20)
- Qwen1.5-4b and Qwen1.5-7b model cannot be loaded correctly in ipex-llm version 20240522 HOT 9
- [inference]: fine tuned model fails to do inferencing HOT 1
- ModuleNotFoundError: No module named 'ipex_llm.vllm.xpu' while using docker and installation HOT 1
- [integration]: merging bfloat16 model failed HOT 2
- all-in-one with version 2.1.0b1 failed HOT 3
- need an easy way to roll back driver installs HOT 3
- all-in-one benchmark llama-3-8b-instruct issue with version 2.1.0b1 HOT 3
- about test 3 Gpu with ipex HOT 1
- install issue HOT 2
- bug for inference qwen1.5-7b-chat with SYM_INT4 on Windows platform HOT 3
- Evaluation on if MiniCPM-2B-sft-bf16 need model based optimization on ipex-llm HOT 1
- finetune chatGLM6B using LoRA on arc HOT 3
- XEON and MAX with Kernel 5.15 configuration HOT 1
- transformers 4.38.1 gives bad llama3 performance on MTL iGPU HOT 2
- phi3 medium - garbage output in webui or generated by ollama HOT 11
- TypeError: llama_model_forward_4_36() got an unexpected keyword argument 'cache_position' during inference by TinyLlama-1.1B-Chat-v1.0 HOT 4
- Ollama codestral model produces nonsensical output on PVC HOT 3
- [Feature Request] Provide IPEX-LLM as an executable to install in Windows HOT 4
- Could ipex-llm support llama3 or the other llm QLora finetune? HOT 4
- CPU finetune can't work HOT 4