Comments (5)
Hey @KrisWongz @thincal, #327 adds support for Qwen2 models. I haven't tested LoRA loading yet because I haven't found a public LoRA adapter for these models, but if you know of one, I'm happy to test it out.
from lorax.
Thanks @thincal, I can definitely take a look to see what has changed in this version and hopefully put together a quick PR, if no one gets to it first.
@tgaddair please help review this request, thanks.
Is qwen1.5 now supported?
I get an error when running qwen1.5-14b-chat with an adapter:
Traceback (most recent call last):
File "/home/admin/Wangze/WZ_test/lorax/test_lorax_qwen.py", line 19, in <module>
print(client.generate(prompt, max_new_tokens=128, temperature=0.7, stop_sequences=["<|endoftext|>"], adapter_id=adapter_id,adapter_source=adapter_source).generated_text)
File "/home/admin/anaconda3/envs/llama_factory/lib/python3.10/site-packages/lorax/client.py", line 184, in generate
raise parse_error(resp.status_code, payload)
lorax.errors.GenerationError: Request failed during generation: Server error: This model does not support adapter loading.
And from the server log:
ue: router/src/queue.rs:463: loading adapter local:/data/240312-3-epoch-256-yingxiao_req1_2_similar_and_real_labels with cost 0 (memory budget remaining: 1)
2024-03-13T02:57:36.163210Z ERROR lorax_launcher: server.py:222 Error when loading adapter
Traceback (most recent call last):
File "/opt/conda/bin/lorax-server", line 8, in <module>
sys.exit(app())
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 311, in __call__
return get_command(self)(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 778, in main
return _main(
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 216, in _main
rv = self.invoke(ctx)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 683, in wrapper
return callback(**use_params) # type: ignore
File "/opt/conda/lib/python3.10/site-packages/lorax_server/cli.py", line 89, in serve
server.serve(
File "/opt/conda/lib/python3.10/site-packages/lorax_server/server.py", line 330, in serve
asyncio.run(
File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
self.run_forever()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
self._run_once()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
handle._run()
File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/lib/python3.10/site-packages/grpc_interceptor/server.py", line 165, in invoke_intercept_method
return await self.intercept(
File "/opt/conda/lib/python3.10/site-packages/lorax_server/interceptor.py", line 38, in intercept
return await response
File "/opt/conda/lib/python3.10/site-packages/opentelemetry/instrumentation/grpc/_aio_server.py", line 73, in _unary_interceptor
return await behavior(request_or_iterator, context)
File "/opt/conda/lib/python3.10/site-packages/lorax_server/server.py", line 218, in LoadAdapter
self.model.load_adapter(adapter_parameters, adapter_source, adapter_index, api_token)
File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/model.py", line 168, in load_adapter
raise ValueError("This model does not support adapter loading.")
ValueError: This model does not support adapter loading.
2024-03-13T02:57:36.163721Z ERROR lorax_launcher: interceptor.py:41 Method LoadAdapter encountered an error.
Traceback (most recent call last):
File "/opt/conda/bin/lorax-server", line 8, in <module>
sys.exit(app())
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 311, in __call__
return get_command(self)(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 778, in main
return _main(
File "/opt/conda/lib/python3.10/site-packages/typer/core.py", line 216, in _main
rv = self.invoke(ctx)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/typer/main.py", line 683, in wrapper
return callback(**use_params) # type: ignore
File "/opt/conda/lib/python3.10/site-packages/lorax_server/cli.py", line 89, in serve
server.serve(
File "/opt/conda/lib/python3.10/site-packages/lorax_server/server.py", line 330, in serve
asyncio.run(
File "/opt/conda/lib/python3.10/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 636, in run_until_complete
self.run_forever()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 603, in run_forever
self._run_once()
File "/opt/conda/lib/python3.10/asyncio/base_events.py", line 1909, in _run_once
handle._run()
File "/opt/conda/lib/python3.10/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "/opt/conda/lib/python3.10/site-packages/grpc_interceptor/server.py", line 165, in invoke_intercept_method
return await self.intercept(
File "/opt/conda/lib/python3.10/site-packages/lorax_server/interceptor.py", line 38, in intercept
return await response
File "/opt/conda/lib/python3.10/site-packages/opentelemetry/instrumentation/grpc/_aio_server.py", line 82, in _unary_interceptor
raise error
File "/opt/conda/lib/python3.10/site-packages/opentelemetry/instrumentation/grpc/_aio_server.py", line 73, in _unary_interceptor
return await behavior(request_or_iterator, context)
File "/opt/conda/lib/python3.10/site-packages/lorax_server/server.py", line 218, in LoadAdapter
self.model.load_adapter(adapter_parameters, adapter_source, adapter_index, api_token)
File "/opt/conda/lib/python3.10/site-packages/lorax_server/models/model.py", line 168, in load_adapter
raise ValueError("This model does not support adapter loading.")
ValueError: This model does not support adapter loading.
2024-03-13T02:57:36.163892Z ERROR lorax_client: router/client/src/lib.rs:34: Server error: This model does not support adapter loading.
2024-03-13T02:57:36.163906Z INFO lorax_router::loader: router/src/loader.rs:207: FAILED loading adapter local:/data/240312-3-epoch-256-yingxiao_req1_2_similar_and_real_labels
2024-03-13T02:57:36.163919Z INFO lorax_router::queue: router/src/queue.rs:139: set adapter local:/data/240312-3-epoch-256-yingxiao_req1_2_similar_and_real_labels status to Errored
2024-03-13T02:57:36.163965Z INFO lorax_router::loader: router/src/loader.rs:277: terminating adapter local:/data/240312-3-epoch-256-yingxiao_req1_2_similar_and_real_labels loader.
Without the adapter, generation works.
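To confirm that only the adapter path is failing, one option is a small wrapper that tries the adapter first and retries on the base model when the server returns this specific error. This is only a diagnostic sketch: `generate_with_fallback` is a hypothetical helper, not part of the lorax client, and it assumes nothing beyond the `client.generate(...)` call and the error message shown in the traceback above.

```python
# Hypothetical helper for narrowing down the failure; not part of lorax.
# Assumes only that `client.generate(...)` accepts `adapter_id` and
# `adapter_source` and returns an object with `.generated_text`,
# as in the traceback above.

ADAPTER_UNSUPPORTED = "This model does not support adapter loading"


def generate_with_fallback(client, prompt, adapter_id, adapter_source="local", **kwargs):
    """Try the adapter first; fall back to the base model only when the
    server reports that adapter loading is unsupported.

    Returns a tuple (generated_text, used_adapter).
    """
    try:
        resp = client.generate(
            prompt,
            adapter_id=adapter_id,
            adapter_source=adapter_source,
            **kwargs,
        )
        return resp.generated_text, True
    except Exception as err:
        # Re-raise anything that is not the "adapter loading" error,
        # so unrelated failures are not hidden by the fallback.
        if ADAPTER_UNSUPPORTED not in str(err):
            raise

    # Base-model request: same prompt and sampling options, no adapter.
    resp = client.generate(prompt, **kwargs)
    return resp.generated_text, False
```

If the second element of the result is `False`, the base model answered and the adapter was rejected by the server, which matches the behavior reported here.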
I'll be taking a look at this today. Hope to have a PR up soon!