Comments (4)
Hey @mcdorians, are you running this on bare metal or using the official Docker images?
from lorax.
In another Docker image based on nvidia/cuda 12 / Ubuntu 22.04. Then, in the terminal of the running container, I execute the commands from the Linux guide.
Hey @mcdorians, I just put together PR #328 to clean up this error message and prevent it from hiding the true error. Hopefully this will help with further debugging this issue.
Thanks, now it gives me another error. I'll create another issue.
`(lorax) (base) jovyan@9d31d2d7e0aa:~/tests/multi_lora/lorax$ lorax-launcher --model-id mistralai/Mistral-7B-v0.1
2024-03-25T11:51:44.674023Z INFO lorax_launcher: Args { model_id: "mistralai/Mistral-7B-v0.1", adapter_id: None, source: "hub", adapter_source: "hub", revision: None, validation_workers: 2, sharded: None, num_shard: None, quantize: None, compile: false, dtype: None, trust_remote_code: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_input_length: 1024, max_total_tokens: 2048, waiting_served_ratio: 1.2, max_batch_prefill_tokens: 4096, max_batch_total_tokens: None, max_waiting_tokens: 20, max_active_adapters: 1024, adapter_cycle_time_s: 2, adapter_memory_fraction: 0.1, hostname: "0.0.0.0", port: 3000, shard_uds_path: "/tmp/lorax-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: None, weights_cache_override: None, disable_custom_kernels: false, cuda_memory_fraction: 1.0, json_output: false, otlp_endpoint: None, cors_allow_origin: [], cors_allow_header: [], cors_expose_header: [], cors_allow_method: [], cors_allow_credentials: None, watermark_gamma: None, watermark_delta: None, ngrok: false, ngrok_authtoken: None, ngrok_edge: None, env: false, download_only: false }
2024-03-25T11:51:44.674240Z INFO download: lorax_launcher: Starting download process.
2024-03-25T11:51:47.920648Z INFO lorax_launcher: cli.py:110 Files are already present on the host. Skipping download.
2024-03-25T11:51:48.480916Z INFO download: lorax_launcher: Successfully downloaded weights.
2024-03-25T11:51:48.481372Z INFO shard-manager: lorax_launcher: Starting shard rank=0
2024-03-25T11:51:51.971442Z WARN lorax_launcher: neox_modeling.py:61 We're not using custom kernels.
2024-03-25T11:51:52.415512Z ERROR lorax_launcher: server.py:274 Error when initializing model
Traceback (most recent call last):
File "/opt/conda/envs/lorax/bin/lorax-server", line 8, in <module>
sys.exit(app())
File "/opt/conda/envs/lorax/lib/python3.9/site-packages/typer/main.py", line 311, in __call__
return get_command(self)(*args, **kwargs)
File "/opt/conda/envs/lorax/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/opt/conda/envs/lorax/lib/python3.9/site-packages/typer/core.py", line 778, in main
return _main(
File "/opt/conda/envs/lorax/lib/python3.9/site-packages/typer/core.py", line 216, in _main
rv = self.invoke(ctx)
File "/opt/conda/envs/lorax/lib/python3.9/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/opt/conda/envs/lorax/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/opt/conda/envs/lorax/lib/python3.9/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/opt/conda/envs/lorax/lib/python3.9/site-packages/typer/main.py", line 683, in wrapper
return callback(**use_params) # type: ignore
File "/home/jovyan/tests/multi_lora/lorax/server/lorax_server/cli.py", line 89, in serve
server.serve(
File "/home/jovyan/tests/multi_lora/lorax/server/lorax_server/server.py", line 324, in serve
asyncio.run(
File "/opt/conda/envs/lorax/lib/python3.9/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/opt/conda/envs/lorax/lib/python3.9/asyncio/base_events.py", line 634, in run_until_complete
self.run_forever()
File "/opt/conda/envs/lorax/lib/python3.9/asyncio/base_events.py", line 601, in run_forever
self._run_once()
File "/opt/conda/envs/lorax/lib/python3.9/asyncio/base_events.py", line 1905, in _run_once
handle._run()
File "/opt/conda/envs/lorax/lib/python3.9/asyncio/events.py", line 80, in _run
self._context.run(self._callback, *self._args)
File "/home/jovyan/tests/multi_lora/lorax/server/lorax_server/server.py", line 270, in serve_inner
model = get_model(
File "/home/jovyan/tests/multi_lora/lorax/server/lorax_server/models/__init__.py", line 179, in get_model
from lorax_server.models.flash_mistral import FlashMistral
File "/home/jovyan/tests/multi_lora/lorax/server/lorax_server/models/flash_mistral.py", line 21, in <module>
from lorax_server.models.custom_modeling.flash_mistral_modeling import (
File "/home/jovyan/tests/multi_lora/lorax/server/lorax_server/models/custom_modeling/flash_mistral_modeling.py", line 30, in <module>
import dropout_layer_norm
ImportError: /opt/conda/envs/lorax/lib/python3.9/site-packages/dropout_layer_norm-0.1-py3.9-linux-x86_64.egg/dropout_layer_norm.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops10zeros_like4callERKNS_6TensorESt8optionalIN3c1010ScalarTypeEES5_INS6_6LayoutEES5_INS6_6DeviceEES5_IbES5_INS6_12MemoryFormatEE
2024-03-25T11:51:53.089548Z ERROR shard-manager: lorax_launcher: Shard complete standard error output:
exllamav2_kernels not installed.
Traceback (most recent call last):
File "/opt/conda/envs/lorax/bin/lorax-server", line 8, in <module>
sys.exit(app())
File "/home/jovyan/tests/multi_lora/lorax/server/lorax_server/cli.py", line 89, in serve
server.serve(
File "/home/jovyan/tests/multi_lora/lorax/server/lorax_server/server.py", line 324, in serve
asyncio.run(
File "/opt/conda/envs/lorax/lib/python3.9/asyncio/runners.py", line 44, in run
return loop.run_until_complete(main)
File "/opt/conda/envs/lorax/lib/python3.9/asyncio/base_events.py", line 647, in run_until_complete
return future.result()
File "/home/jovyan/tests/multi_lora/lorax/server/lorax_server/server.py", line 270, in serve_inner
model = get_model(
File "/home/jovyan/tests/multi_lora/lorax/server/lorax_server/models/__init__.py", line 179, in get_model
from lorax_server.models.flash_mistral import FlashMistral
File "/home/jovyan/tests/multi_lora/lorax/server/lorax_server/models/flash_mistral.py", line 21, in <module>
from lorax_server.models.custom_modeling.flash_mistral_modeling import (
File "/home/jovyan/tests/multi_lora/lorax/server/lorax_server/models/custom_modeling/flash_mistral_modeling.py", line 30, in <module>
import dropout_layer_norm
ImportError: /opt/conda/envs/lorax/lib/python3.9/site-packages/dropout_layer_norm-0.1-py3.9-linux-x86_64.egg/dropout_layer_norm.cpython-39-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops10zeros_like4callERKNS_6TensorESt8optionalIN3c1010ScalarTypeEES5_INS6_6LayoutEES5_INS6_6DeviceEES5_IbES5_INS6_12MemoryFormatEE
rank=0
2024-03-25T11:51:53.186967Z ERROR lorax_launcher: Shard 0 failed to start
2024-03-25T11:51:53.186999Z INFO lorax_launcher: Shutting down shards
Error: ShardCannotStart`
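The `undefined symbol` in the traceback is a mangled C++ name, and this kind of loader error usually means the `dropout_layer_norm` extension was compiled against a different PyTorch/libtorch than the one currently installed, so rebuilding the extension against the active `torch` is the usual fix. To see which function the loader is missing, you can run the symbol through `c++filt`; as a rough sketch, a minimal parser that only handles the Itanium nested-name prefix (not the argument list) is enough to recover the qualified name:

```python
def demangled_qualified_name(symbol: str) -> str:
    """Decode only the `_ZN <len><name>... E` nested-name prefix of an
    Itanium-mangled C++ symbol -- enough to identify the missing function,
    not a full demangler (use c++filt for the complete signature)."""
    if not symbol.startswith("_ZN"):
        raise ValueError("not a mangled nested name")
    i, parts = 3, []
    while i < len(symbol) and symbol[i].isdigit():
        # Each component is a decimal length followed by that many characters.
        j = i
        while symbol[j].isdigit():
            j += 1
        length = int(symbol[i:j])
        parts.append(symbol[j:j + length])
        i = j + length
    return "::".join(parts)

# The symbol from the traceback above:
missing = ("_ZN2at4_ops10zeros_like4callERKNS_6TensorESt8optionalIN3c1010"
           "ScalarTypeEES5_INS6_6LayoutEES5_INS6_6DeviceEES5_INS6_12MemoryFormatEE")
print(demangled_qualified_name(missing))  # at::_ops::zeros_like::call
```

So the extension's `.so` expects `at::_ops::zeros_like::call` with a signature (note the `St8optional`, i.e. `std::optional`, in the mangled arguments) that the installed libtorch does not export, which points to mismatched PyTorch versions between build time and run time; reinstalling `dropout_layer_norm` from source in the same environment should make the symbols line up again.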