[BUG] docker logs keeps reporting "Triton is starting" (qanything, open issue)

misslxs commented on July 21, 2024

[BUG] docker logs keeps reporting "Triton is starting"

from qanything.

Comments (5)

YinSonglin1997 commented on July 21, 2024

To add to this: I have the same problem as the original poster. Here is my QAEnsemble.log.
I0119 02:05:18.197207 86 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7f9e5c000000' with size 268435456
I0119 02:05:18.201188 86 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0119 02:05:18.208520 86 model_lifecycle.cc:462] loading: rerank:1
I0119 02:05:18.208561 86 model_lifecycle.cc:462] loading: embed:1
I0119 02:05:18.208588 86 model_lifecycle.cc:462] loading: base:1
I0119 02:05:18.211636 86 onnxruntime.cc:2504] TRITONBACKEND_Initialize: onnxruntime
I0119 02:05:18.211702 86 onnxruntime.cc:2514] Triton TRITONBACKEND API version: 1.12
I0119 02:05:18.211721 86 onnxruntime.cc:2520] 'onnxruntime' TRITONBACKEND API version: 1.12
I0119 02:05:18.211736 86 onnxruntime.cc:2550] backend configuration:
{"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}}
I0119 02:05:18.277019 86 onnxruntime.cc:2608] TRITONBACKEND_ModelInitialize: rerank (version 1)
I0119 02:05:18.277589 86 onnxruntime.cc:2608] TRITONBACKEND_ModelInitialize: embed (version 1)
I0119 02:05:18.277767 86 onnxruntime.cc:666] skipping model configuration auto-complete for 'rerank': inputs and outputs already specified
I0119 02:05:18.278371 86 onnxruntime.cc:2651] TRITONBACKEND_ModelInstanceInitialize: rerank (GPU device 0)
I0119 02:05:18.278735 86 onnxruntime.cc:666] skipping model configuration auto-complete for 'embed': inputs and outputs already specified
I0119 02:05:18.280363 86 onnxruntime.cc:2651] TRITONBACKEND_ModelInstanceInitialize: embed (GPU device 0)
I0119 02:05:18.758885 86 libfastertransformer.cc:459] Before Loading Weights:
terminate called after throwing an instance of 'std::length_error'
what(): basic_string::_M_create
[d46a4f8365f8:00086] *** Process received signal ***
[d46a4f8365f8:00086] Signal: Aborted (6)
[d46a4f8365f8:00086] Signal code: (-6)
[d46a4f8365f8:00086] [ 0] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f9eab095520]
[d46a4f8365f8:00086] [ 1] /usr/lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7f9eab0e99fc]
[d46a4f8365f8:00086] [ 2] /usr/lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7f9eab095476]
[d46a4f8365f8:00086] [ 3] /usr/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7f9eab07b7f3]
[d46a4f8365f8:00086] [ 4] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xa2b9e)[0x7f9eab31db9e]
[d46a4f8365f8:00086] [ 5] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xae20c)[0x7f9eab32920c]
[d46a4f8365f8:00086] [ 6] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xae277)[0x7f9eab329277]
[d46a4f8365f8:00086] [ 7] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xae4d8)[0x7f9eab3294d8]
[d46a4f8365f8:00086] [ 8] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(_ZSt20__throw_length_errorPKc+0x40)[0x7f9eab320449]
[d46a4f8365f8:00086] [ 9] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x14bc69)[0x7f9eab3c6c69]
[d46a4f8365f8:00086] [10] /opt/tritonserver/backends/qa_ensemble/libqa-shared.so(+0xa6ba3c)[0x7f9e1dbf2a3c]
[d46a4f8365f8:00086] [11] /opt/tritonserver/backends/qa_ensemble/libqa-shared.so(_ZN17fastertransformer21loadWeightFromBinFuncI6__halfS1_EEiPT_St6vectorImSaImEENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x187)[0x7f9e1dc0b227]
[d46a4f8365f8:00086] [12] /opt/tritonserver/backends/qa_ensemble/libqa-shared.so(_ZN17fastertransformer17loadWeightFromBinI6__halfEEiPT_St6vectorImSaImEENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS_14FtCudaDataTypeE+0x282)[0x7f9e1dc0ed12]
[d46a4f8365f8:00086] [13] /opt/tritonserver/backends/qa_ensemble/libqa-shared.so(_ZN17fastertransformer11LlamaWeightI6__halfE16loadEncryptModelENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x184)[0x7f9e1d7cb0b4]
[d46a4f8365f8:00086] [14] /opt/tritonserver/backends/qa_ensemble/libqa-shared.so(_ZN16LlamaTritonModelI6__halfE19createSharedWeightsEii+0x2ad)[0x7f9e1d7b219d]
[d46a4f8365f8:00086] [15] /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xdc253)[0x7f9eab357253]
[d46a4f8365f8:00086] [16] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x94ac3)[0x7f9eab0e7ac3]
[d46a4f8365f8:00086] [17] /usr/lib/x86_64-linux-gnu/libc.so.6(+0x126660)[0x7f9eab179660]
[d46a4f8365f8:00086] *** End of error message ***
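
One way to read a trace like the one above is to pull out the exception type, the signal, and the frames that belong to the application library rather than to libc or libstdc++. The following is a minimal Python sketch, not part of QAnything; the name summarize_crash and the regular expressions are illustrative assumptions fitted to the log format shown in this comment.

```python
import re

# Matches backtrace lines such as:
#   [host:pid] [ 2] /usr/lib/.../libc.so.6(raise+0x16)[0x7f9eab095476]
# The pattern is an assumption based on the log format posted in this thread.
FRAME_RE = re.compile(r"\[\s*(\d+)\]\s+(\S+?)\((.*?)\)\[0x[0-9a-fA-F]+\]")


def summarize_crash(log_text):
    """Return (exception_type, signal_name, frames).

    frames is a list of (index, library_file, symbol) tuples; symbol is the
    mangled name without its offset, or "" when only an offset was logged.
    """
    exc = re.search(r"throwing an instance of '([^']+)'", log_text)
    sig = re.search(r"Signal: (\w+) \((\d+)\)", log_text)
    frames = []
    for match in FRAME_RE.finditer(log_text):
        index = int(match.group(1))
        library = match.group(2).rsplit("/", 1)[-1]
        symbol = match.group(3).split("+", 1)[0]
        frames.append((index, library, symbol))
    return (
        exc.group(1) if exc else None,
        sig.group(1) if sig else None,
        frames,
    )


def application_frames(frames, library="libqa-shared.so"):
    """Keep only the frames that come from the given shared library."""
    return [frame for frame in frames if frame[1] == library]
```

Applied to the log above, the frames from libqa-shared.so carry the mangled names of loadWeightFromBin and LlamaWeight<__half>::loadEncryptModel, which places the abort in weight loading rather than in the onnxruntime models that initialized just before it.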


yydxlv commented on July 21, 2024

The Triton service also fails to start for me. Inside the container, /model_repos/QAEnsemble_base/QAEnsemble_base.log contains: nohup: failed to run command '/opt/tritonserver/bin/tritonserver': No such file or directory


xixihahaliu commented on July 21, 2024

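Quoting yydxlv: The Triton service also fails to start for me. Inside the container, /model_repos/QAEnsemble_base/QAEnsemble_base.log contains: nohup: failed to run command '/opt/tritonserver/bin/tritonserver': No such file or directory

Could you post the complete log file? That would make troubleshooting easier. Also take a look at FAQ_zh.md, which may help.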


xixihahaliu commented on July 21, 2024

Quoting YinSonglin1997: "To add to this: I have the same problem as the original poster. Here is my QAEnsemble.log." (The log is the one posted above, ending in std::length_error right after "Before Loading Weights".)

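Cause 2: If GPU memory turns out to be sufficient, the failure occurs because the newer model is incompatible with some GPU models.
Solution: Switch to a compatible model and image. Manually download the model files, unzip them, replace the models directory, and then restart the service.
- In docker-compose-xxx.yaml, change freeren/qanyxxx:v1.0.9 to freeren/qanyxxx:v1.0.8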
- git clone https://www.wisemodel.cn/Netease_Youdao/qanything.git
- cd qanything
- git reset --hard 79b3da3bbb35406f0b2da3acfcdb4c96c2837faf
- unzip models.zip
- Replace the existing models directory
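
The image downgrade in the first step can be scripted. Below is a hedged Python sketch: pin_image_tag is a hypothetical helper, and because the thread elides both the compose file name and the image name, it takes the compose text as input and matches any image under the freeren/ namespace.

```python
import re


def pin_image_tag(compose_text, old_tag="v1.0.9", new_tag="v1.0.8"):
    """Rewrite 'freeren/<name>:<old_tag>' image references to use new_tag.

    Returns (new_text, number_of_replacements). Images outside the freeren/
    namespace and tags that merely start with old_tag are left untouched.
    """
    pattern = re.compile(
        r"(freeren/[\w.\-]+):" + re.escape(old_tag) + r"(?![\w.\-])"
    )
    return pattern.subn(lambda m: m.group(1) + ":" + new_tag, compose_text)
```

A caller would read the actual compose file with pathlib.Path(path).read_text(), pass the text through pin_image_tag, check that the replacement count is what was expected, and only then write the result back.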

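You can try the solution above. Also note that some GPU models do not support the current model, so please check in advance. Provided there is enough GPU memory, the GPUs confirmed to be supported so far are the Nvidia 2080Ti, 30 series, 40 series, A30, A40, and A100.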
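
To check a machine against that list, the GPU name reported by nvidia-smi can be matched against a few patterns. This is a rough Python sketch; the regular expressions are an assumption about how nvidia-smi spells these product names, and a non-match only means "not confirmed in this thread", not "unsupported".

```python
import re
import subprocess

# Patterns for the GPUs listed as confirmed above: 2080Ti, 30 series,
# 40 series, A30, A40, A100. The spelling of product names is an assumption.
CONFIRMED_PATTERNS = [
    r"\b2080\s*Ti\b",
    r"\bGeForce\s+RTX\s*[34]0\d{2}\b",
    r"\bA(30|40|100)\b",
]


def is_confirmed_gpu(name):
    """Return True if the GPU name matches one of the confirmed patterns."""
    return any(re.search(p, name, re.IGNORECASE) for p in CONFIRMED_PATTERNS)


def installed_gpu_names():
    """Ask nvidia-smi for the product name of each installed GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        check=True,
        capture_output=True,
        text=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

On the host, `[(name, is_confirmed_gpu(name)) for name in installed_gpu_names()]` gives a quick per-GPU answer before deciding whether to downgrade the model and image.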


xixihahaliu commented on July 21, 2024

Quoting the original issue:

Is there an existing issue / discussion for this?

I have searched the existing issues / discussions

Is there an existing answer for this in FAQ?

I have searched FAQ

Current Behavior

According to the logs, all services started successfully, but the health check curl -s -w "%{http_code}" http://localhost:10000/v2/health/ready -o /dev/null never passes. After the timeout, once the container has stopped, the log file /model_repos/QAEnsemble_base/QAEnsemble_base.log does not exist either.
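
The curl check quoted above can be reproduced by hand to see whether the endpoint is unreachable or merely not ready. The following is a minimal Python sketch using only the standard library; wait_until_ready and http_status are hypothetical names, and the deadline and interval values are arbitrary, not the ones QAnything's startup script uses.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout_s=2.0):
    """Return the HTTP status code for a GET on url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...


def wait_until_ready(url, deadline_s=300.0, interval_s=5.0,
                     probe=http_status, sleep=time.sleep,
                     clock=time.monotonic):
    """Poll url until probe returns 200 or deadline_s has elapsed.

    Returns the list of observed status codes, so the caller can tell
    "never reachable" (all None) from "reachable but not ready".
    """
    seen = []
    start = clock()
    while True:
        status = probe(url)
        seen.append(status)
        if status == 200 or clock() - start >= deadline_s:
            return seen
        sleep(interval_s)
```

For example, `wait_until_ready("http://localhost:10000/v2/health/ready", deadline_s=60)` returning only None values would suggest that nothing is listening on port 10000 at all, which fits a tritonserver process that crashed or never started.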

Expected Behavior

No response

Environment
- OS: Ubuntu 22.04 x86
- NVIDIA Driver: 535.146.02
- CUDA: 12.2
- Docker Compose: v2.24.0-birthday.10
- NVIDIA GPU Memory: 16GB

QAnything logs

root@f1376869a3c5:/workspace/qanything_local# cat api.log
UPLOAD_ROOT_PATH: /workspace/qanything_local/QANY_DB/content
rerank_port: 10001
embed_port: 10001
[2024-01-19 09:56:17 +0800] [91] [INFO] Sanic v23.6.0
[2024-01-19 09:56:17 +0800] [91] [INFO] Goin' Fast @ http://0.0.0.0:8777
[2024-01-19 09:56:17 +0800] [91] [INFO] mode: production, w/ 4 workers
[2024-01-19 09:56:17 +0800] [91] [INFO] server: sanic, HTTP/1.1
[2024-01-19 09:56:17 +0800] [91] [INFO] python: 3.10.12
[2024-01-19 09:56:17 +0800] [91] [INFO] platform: Linux-6.5.0-14-generic-x86_64-with-glibc2.35
[2024-01-19 09:56:17 +0800] [91] [INFO] packages: sanic-routing==23.12.0, sanic-ext==23.6.0
UPLOAD_ROOT_PATH: /workspace/qanything_local/QANY_DB/content
rerank_port: 10001
embed_port: 10001
[2024-01-19 09:56:27 +0800] [658] [INFO] Sanic Extensions:
[2024-01-19 09:56:27 +0800] [658] [INFO] > injection [0 dependencies; 0 constants]
[2024-01-19 09:56:27 +0800] [658] [INFO] > openapi [http://0.0.0.0:8777/docs]
[2024-01-19 09:56:27 +0800] [658] [INFO] > http
[2024-01-19 09:56:27 +0800] [658] [INFO] > templating [jinja2==3.1.3]
UPLOAD_ROOT_PATH: /workspace/qanything_local/QANY_DB/content
rerank_port: 10001
embed_port: 10001
[2024-01-19 09:56:27 +0800] [657] [INFO] Sanic Extensions:
[2024-01-19 09:56:27 +0800] [657] [INFO] > injection [0 dependencies; 0 constants]
[2024-01-19 09:56:27 +0800] [657] [INFO] > openapi [http://0.0.0.0:8777/docs]
[2024-01-19 09:56:27 +0800] [657] [INFO] > http
[2024-01-19 09:56:27 +0800] [657] [INFO] > templating [jinja2==3.1.3]
UPLOAD_ROOT_PATH: /workspace/qanything_local/QANY_DB/content
rerank_port: 10001
embed_port: 10001
[2024-01-19 09:56:27 +0800] [659] [INFO] Sanic Extensions:
[2024-01-19 09:56:27 +0800] [659] [INFO] > injection [0 dependencies; 0 constants]
[2024-01-19 09:56:27 +0800] [659] [INFO] > openapi [http://0.0.0.0:8777/docs]
[2024-01-19 09:56:27 +0800] [659] [INFO] > http
[2024-01-19 09:56:27 +0800] [659] [INFO] > templating [jinja2==3.1.3]
init local_doc_qa in local
init local_doc_qa in local
UPLOAD_ROOT_PATH: /workspace/qanything_local/QANY_DB/content
rerank_port: 10001
embed_port: 10001
[2024-01-19 09:56:27 +0800] [660] [INFO] Sanic Extensions:
[2024-01-19 09:56:27 +0800] [660] [INFO] > injection [0 dependencies; 0 constants]
[2024-01-19 09:56:27 +0800] [660] [INFO] > openapi [http://0.0.0.0:8777/docs]
[2024-01-19 09:56:27 +0800] [660] [INFO] > http
[2024-01-19 09:56:27 +0800] [660] [INFO] > templating [jinja2==3.1.3]
init local_doc_qa in local
init local_doc_qa in local
[2024-01-19 09:56:27 +0800] [658] [INFO] Starting worker [658]
[2024-01-19 09:56:27 +0800] [657] [INFO] Starting worker [657]
[2024-01-19 09:56:27 +0800] [659] [INFO] Starting worker [659]
[2024-01-19 09:56:27 +0800] [660] [INFO] Starting worker [660]

Steps To Reproduce

No response

Anything else?

No response

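The log file location currently differs between single-GPU and dual-GPU startup, because in single-GPU mode the multiple tritonserver services are launched together to save GPU memory. From what you describe, you appear to be running in single-GPU mode, so please post the full contents of /model_repos/QAEnsemble/QAEnsemble.log. It should contain more information.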
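
Since the log location depends on how the container was started, it can help to check every location mentioned in this thread instead of guessing. A small Python sketch under that assumption follows; find_triton_logs and tail are hypothetical helpers, and the candidate list contains only the two paths quoted here.

```python
from pathlib import Path

# Both relative paths are quoted in this thread; which one exists depends
# on how the container was started.
CANDIDATE_LOGS = [
    "QAEnsemble/QAEnsemble.log",
    "QAEnsemble_base/QAEnsemble_base.log",
]


def find_triton_logs(root="/model_repos"):
    """Return the candidate log files under root that exist and are non-empty."""
    found = []
    for relative in CANDIDATE_LOGS:
        path = Path(root) / relative
        if path.is_file() and path.stat().st_size > 0:
            found.append(path)
    return found


def tail(path, n=50):
    """Return the last n lines of a text log as a list of strings."""
    lines = Path(path).read_text(errors="replace").splitlines()
    return lines[-n:]
```

Run inside the container, `for p in find_triton_logs(): print(p, *tail(p), sep="\n")` prints the end of whichever log exists, which is usually the part worth pasting into an issue. An empty result means neither file was written.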
