
fastertransformer not available (fauxpilot issue, open)

Doonut commented on May 3, 2024
fastertransformer not available

from fauxpilot.

Comments (7)

github-actions commented on May 3, 2024

Hello there, thanks for opening your first issue. We welcome you to the FauxPilot community!


thakkarparth007 commented on May 3, 2024

Could you paste the full log?


TheJambo commented on May 3, 2024

Not OP, but getting the same issue here:
FauxPilotIssue.txt


sfwn commented on May 3, 2024

Same issue here; see line 2 of the log below. Wait until the model is ready; it may take a while.

fauxpilot-copilot_proxy-1  | [StatusCode.UNAVAILABLE] failed to connect to all addresses
fauxpilot-copilot_proxy-1  | WARNING: Model 'fastertransformer' is not available. Please ensure that `model` is set to either 'fastertransformer' or 'py-model' depending on your installation
fauxpilot-copilot_proxy-1  | Returned completion in 1.4257431030273438 ms
fauxpilot-copilot_proxy-1  | INFO:     2023-04-15 15:09:53,893 :: 100.105.61.13:59652 - "POST /v1/engines/codegen/completions HTTP/1.1" 200 OK
fauxpilot-triton-1         | I0415 15:10:10.021414 89 libfastertransformer.cc:321] After Loading Model:
fauxpilot-triton-1         | after allocation, free 17.82 GB total 31.74 GB
fauxpilot-triton-1         | I0415 15:10:10.021800 89 libfastertransformer.cc:537] Model instance is created on GPU Tesla V100-SXM2-32GB
fauxpilot-triton-1         | I0415 15:10:10.022000 89 model_repository_manager.cc:1345] successfully loaded 'fastertransformer' version 1
fauxpilot-triton-1         | I0415 15:10:10.022091 89 server.cc:556]
fauxpilot-triton-1         | +------------------+------+
fauxpilot-triton-1         | | Repository Agent | Path |
fauxpilot-triton-1         | +------------------+------+
fauxpilot-triton-1         | +------------------+------+
fauxpilot-triton-1         |
fauxpilot-triton-1         | I0415 15:10:10.022142 89 server.cc:583]
fauxpilot-triton-1         | +-------------------+-----------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
fauxpilot-triton-1         | | Backend           | Path                                                                        | Config                                                                                                                                                         |
fauxpilot-triton-1         | +-------------------+-----------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
fauxpilot-triton-1         | | fastertransformer | /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
fauxpilot-triton-1         | +-------------------+-----------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
fauxpilot-triton-1         |
fauxpilot-triton-1         | I0415 15:10:10.022178 89 server.cc:626]
fauxpilot-triton-1         | +-------------------+---------+--------+
fauxpilot-triton-1         | | Model             | Version | Status |
fauxpilot-triton-1         | +-------------------+---------+--------+
fauxpilot-triton-1         | | fastertransformer | 1       | READY  |
fauxpilot-triton-1         | +-------------------+---------+--------+

Even after the status shows READY, it still takes some time before the model becomes available (the log gives no hint of this). After that, it works.

fauxpilot-copilot_proxy-1  | INFO:     2023-04-15 15:14:01,120 :: 100.105.61.13:37942 - "POST /v1/engines/codegen/completions HTTP/1.1" 200 OK
fauxpilot-triton-1         | W0415 15:17:33.322818 89 libfastertransformer.cc:1397] model fastertransformer, instance fastertransformer_0, executing 1 requests
fauxpilot-triton-1         | W0415 15:17:33.322852 89 libfastertransformer.cc:638] TRITONBACKEND_ModelExecute: Running fastertransformer_0 with 1 requests
fauxpilot-triton-1         | W0415 15:17:33.322861 89 libfastertransformer.cc:693] get total batch_size = 1
fauxpilot-triton-1         | W0415 15:17:33.322874 89 libfastertransformer.cc:1051] get input count = 16
fauxpilot-triton-1         | W0415 15:17:33.322892 89 libfastertransformer.cc:1117] collect name: start_id size: 4 bytes
fauxpilot-triton-1         | W0415 15:17:33.322903 89 libfastertransformer.cc:1117] collect name: input_ids size: 8 bytes
fauxpilot-triton-1         | W0415 15:17:33.322914 89 libfastertransformer.cc:1117] collect name: bad_words_list size: 8 bytes
fauxpilot-triton-1         | W0415 15:17:33.322924 89 libfastertransformer.cc:1117] collect name: random_seed size: 4 bytes
fauxpilot-triton-1         | W0415 15:17:33.322935 89 libfastertransformer.cc:1117] collect name: end_id size: 4 bytes
fauxpilot-triton-1         | W0415 15:17:33.322945 89 libfastertransformer.cc:1117] collect name: input_lengths size: 4 bytes
fauxpilot-triton-1         | W0415 15:17:33.322955 89 libfastertransformer.cc:1117] collect name: request_output_len size: 4 bytes
fauxpilot-triton-1         | W0415 15:17:33.322965 89 libfastertransformer.cc:1117] collect name: runtime_top_k size: 4 bytes
fauxpilot-triton-1         | W0415 15:17:33.322974 89 libfastertransformer.cc:1117] collect name: runtime_top_p size: 4 bytes
fauxpilot-triton-1         | W0415 15:17:33.322984 89 libfastertransformer.cc:1117] collect name: is_return_log_probs size: 1 bytes
fauxpilot-triton-1         | W0415 15:17:33.322992 89 libfastertransformer.cc:1117] collect name: stop_words_list size: 24 bytes
fauxpilot-triton-1         | W0415 15:17:33.323003 89 libfastertransformer.cc:1117] collect name: temperature size: 4 bytes
fauxpilot-triton-1         | W0415 15:17:33.323012 89 libfastertransformer.cc:1117] collect name: len_penalty size: 4 bytes
fauxpilot-triton-1         | W0415 15:17:33.323021 89 libfastertransformer.cc:1117] collect name: beam_width size: 4 bytes
fauxpilot-triton-1         | W0415 15:17:33.323032 89 libfastertransformer.cc:1117] collect name: beam_search_diversity_rate size: 4 bytes
fauxpilot-triton-1         | W0415 15:17:33.323042 89 libfastertransformer.cc:1117] collect name: repetition_penalty size: 4 bytes
fauxpilot-triton-1         | W0415 15:17:33.323050 89 libfastertransformer.cc:1130] the data is in CPU
fauxpilot-triton-1         | W0415 15:17:33.323058 89 libfastertransformer.cc:1137] the data is in CPU
fauxpilot-triton-1         | W0415 15:17:33.323075 89 libfastertransformer.cc:999] before ThreadForward 0
fauxpilot-triton-1         | W0415 15:17:33.323133 89 libfastertransformer.cc:1006] after ThreadForward 0
fauxpilot-triton-1         | I0415 15:17:33.323163 89 libfastertransformer.cc:834] Start to forward
fauxpilot-triton-1         | I0415 15:17:33.585172 89 libfastertransformer.cc:836] Stop to forward
fauxpilot-triton-1         | W0415 15:17:33.585278 89 libfastertransformer.cc:1161] Get output_tensors 0: output_ids
fauxpilot-triton-1         | W0415 15:17:33.585301 89 libfastertransformer.cc:1171]     output_type: UINT32
fauxpilot-triton-1         | W0415 15:17:33.585313 89 libfastertransformer.cc:1191]     output shape: [1, 1, 102]
fauxpilot-triton-1         | W0415 15:17:33.585391 89 libfastertransformer.cc:1161] Get output_tensors 1: sequence_length
fauxpilot-triton-1         | W0415 15:17:33.585400 89 libfastertransformer.cc:1171]     output_type: INT32
fauxpilot-triton-1         | W0415 15:17:33.585408 89 libfastertransformer.cc:1191]     output shape: [1, 1]
fauxpilot-triton-1         | W0415 15:17:33.585439 89 libfastertransformer.cc:1206] PERFORMED GPU copy: NO
fauxpilot-triton-1         | W0415 15:17:33.585456 89 libfastertransformer.cc:780] get response size = 1
fauxpilot-triton-1         | W0415 15:17:33.585628 89 libfastertransformer.cc:795] response is sent
fauxpilot-copilot_proxy-1  | Returned completion in 264.94908332824707 ms
fauxpilot-copilot_proxy-1  | INFO:     2023-04-15 15:17:33,586 :: 172.19.0.1:47588 - "POST /v1/engines/codegen/completions HTTP/1.1" 200 OK
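The "wait until the model is ready" advice above can be turned into a small polling script instead of watching the logs. This is only a sketch: it assumes Triton's HTTP port (8000 in the logs) is reachable from where the script runs, and it relies on Triton's standard `/v2/models/<name>/ready` health endpoint. The function names, retry count, and delay are made up for illustration.

```python
import time
import urllib.error
import urllib.request


def triton_model_ready(base_url, model, timeout=2.0):
    """Return True if Triton answers 200 on the model readiness endpoint."""
    url = f"{base_url}/v2/models/{model}/ready"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx answer: not ready yet.
        return False


def wait_until_ready(probe, attempts=60, delay=5.0, sleep=time.sleep):
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if probe():
            return True
        sleep(delay)
    return False


# Example call (assumed host and port, adjust to your setup):
# ok = wait_until_ready(
#     lambda: triton_model_ready("http://localhost:8000", "fastertransformer")
# )
```

If the probe never succeeds even though the Triton log shows READY, the delay is probably not the cause and the connection between the containers is worth checking.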


TheJambo commented on May 3, 2024

I mean, I'm not sure how long you're expected to leave it, but it doesn't look like it's getting far after 15 minutes.

fauxpilot-windows-main-triton-1         | [FT][WARNING] Custom All Reduce only supports 8 Ranks currently. Using NCCL as Comm.
fauxpilot-windows-main-triton-1         | after allocation, free 10.56 GB total 12.00 GB
fauxpilot-windows-main-triton-1         | [WARNING] gemm_config.in is not found; using default GEMM algo
fauxpilot-windows-main-triton-1         | I0415 15:35:30.032622 88 libfastertransformer.cc:321] After Loading Model:
fauxpilot-windows-main-triton-1         | after allocation, free 4.95 GB total 12.00 GB
fauxpilot-windows-main-triton-1         | I0415 15:35:30.032867 88 libfastertransformer.cc:537] Model instance is created on GPU NVIDIA GeForce RTX 3080 Ti
fauxpilot-windows-main-triton-1         | I0415 15:35:30.033259 88 model_repository_manager.cc:1345] successfully loaded 'fastertransformer' version 1
fauxpilot-windows-main-triton-1         | I0415 15:35:30.033333 88 server.cc:556]
fauxpilot-windows-main-triton-1         | +------------------+------+
fauxpilot-windows-main-triton-1         | | Repository Agent | Path |
fauxpilot-windows-main-triton-1         | +------------------+------+
fauxpilot-windows-main-triton-1         | +------------------+------+
fauxpilot-windows-main-triton-1         |
fauxpilot-windows-main-triton-1         | I0415 15:35:30.033379 88 server.cc:583]
fauxpilot-windows-main-triton-1         | +-------------------+-----------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
fauxpilot-windows-main-triton-1         | | Backend           | Path                                                                        | Config                                                                                                                                                         |
fauxpilot-windows-main-triton-1         | +-------------------+-----------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
fauxpilot-windows-main-triton-1         | | fastertransformer | /opt/tritonserver/backends/fastertransformer/libtriton_fastertransformer.so | {"cmdline":{"auto-complete-config":"false","min-compute-capability":"6.000000","backend-directory":"/opt/tritonserver/backends","default-max-batch-size":"4"}} |
fauxpilot-windows-main-triton-1         | +-------------------+-----------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+
fauxpilot-windows-main-triton-1         |
fauxpilot-windows-main-triton-1         | I0415 15:35:30.033630 88 server.cc:626]
fauxpilot-windows-main-triton-1         | +-------------------+---------+--------+
fauxpilot-windows-main-triton-1         | | Model             | Version | Status |
fauxpilot-windows-main-triton-1         | +-------------------+---------+--------+
fauxpilot-windows-main-triton-1         | | fastertransformer | 1       | READY  |
fauxpilot-windows-main-triton-1         | +-------------------+---------+--------+
fauxpilot-windows-main-triton-1         |
fauxpilot-windows-main-triton-1         | I0415 15:35:30.045193 88 metrics.cc:650] Collecting metrics for GPU 0: NVIDIA GeForce RTX 3080 Ti
fauxpilot-windows-main-triton-1         | I0415 15:35:30.045348 88 tritonserver.cc:2159]
fauxpilot-windows-main-triton-1         | +----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
fauxpilot-windows-main-triton-1         | | Option                           | Value                                                                                                                                                                                        |
fauxpilot-windows-main-triton-1         | +----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
fauxpilot-windows-main-triton-1         | | server_id                        | triton                                                                                                                                                                                       |
fauxpilot-windows-main-triton-1         | | server_version                   | 2.23.0                                                                                                                                                                                       |
fauxpilot-windows-main-triton-1         | | server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics trace |
fauxpilot-windows-main-triton-1         | | model_repository_path[0]         | /model                                                                                                                                                                                       |
fauxpilot-windows-main-triton-1         | | model_control_mode               | MODE_NONE                                                                                                                                                                                    |
fauxpilot-windows-main-triton-1         | | strict_model_config              | 1                                                                                                                                                                                            |
fauxpilot-windows-main-triton-1         | | rate_limit                       | OFF                                                                                                                                                                                          |
fauxpilot-windows-main-triton-1         | | pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                                    |
fauxpilot-windows-main-triton-1         | | cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                                                     |
fauxpilot-windows-main-triton-1         | | response_cache_byte_size         | 0                                                                                                                                                                                            |
fauxpilot-windows-main-triton-1         | | min_supported_compute_capability | 6.0                                                                                                                                                                                          |
fauxpilot-windows-main-triton-1         | | strict_readiness                 | 1                                                                                                                                                                                            |
fauxpilot-windows-main-triton-1         | | exit_timeout                     | 30                                                                                                                                                                                           |
fauxpilot-windows-main-triton-1         | +----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
fauxpilot-windows-main-triton-1         |
fauxpilot-windows-main-triton-1         | I0415 15:35:30.049895 88 grpc_server.cc:4587] Started GRPCInferenceService at 0.0.0.0:8001
fauxpilot-windows-main-triton-1         | I0415 15:35:30.050969 88 http_server.cc:3303] Started HTTPService at 0.0.0.0:8000
fauxpilot-windows-main-triton-1         | I0415 15:35:30.137683 88 http_server.cc:178] Started Metrics Service at 0.0.0.0:8002
fauxpilot-windows-main-copilot_proxy-1  | [StatusCode.UNAVAILABLE] failed to connect to all addresses
fauxpilot-windows-main-copilot_proxy-1  | WARNING: Model 'fastertransformer' is not available. Please ensure that `model` is set to either 'fastertransformer' or 'py-model' depending on your installation
fauxpilot-windows-main-copilot_proxy-1  | Returned completion in 4.260063171386719 ms
fauxpilot-windows-main-copilot_proxy-1  | INFO:     2023-04-15 15:49:30,615 :: 172.18.0.1:50712 - "POST /v1/engines/codegen/completions HTTP/1.1" 200 OK
fauxpilot-windows-main-copilot_proxy-1  | [StatusCode.UNAVAILABLE] failed to connect to all addresses
fauxpilot-windows-main-copilot_proxy-1  | WARNING: Model 'fastertransformer' is not available. Please ensure that `model` is  set to either 'fastertransformer' or 'py-model' depending on your installation
fauxpilot-windows-main-copilot_proxy-1  | Returned completion in 3.4265518188476562 ms
fauxpilot-windows-main-copilot_proxy-1  | INFO:     2023-04-15 15:51:35,327 :: 172.18.0.1:49786 - "POST /v1/engines/codegen/completions HTTP/1.1" 200 OK

Even using the API webpage, on the /codegen/completions endpoint, the "Try it out" execution fails too.
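To reproduce the failing call outside the "Try it out" page, a request can be sent to the same endpoint by hand. A rough sketch follows; the base URL `http://localhost:5000` is an assumption based on the Uvicorn port in the proxy log, the `prompt` and `max_tokens` fields follow the OpenAI-style completion format, and both function names are hypothetical. Note that in the logs above the proxy answers `200 OK` even when it reports the model as unavailable, so the response body is worth inspecting rather than the status code alone.

```python
import json
import urllib.request


def build_completion_request(base_url, prompt, max_tokens=16):
    """Build a POST request for the completions endpoint seen in the logs."""
    payload = {"prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        f"{base_url}/v1/engines/codegen/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30.0):
    """Send the request and return (HTTP status, raw response body)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8")


# Example call (assumed URL):
# status, body = send(build_completion_request("http://localhost:5000", "def hello"))
# print(status, body)
```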


sfwn commented on May 3, 2024

@TheJambo my server has 12 cores, 90 GB of RAM, and a V100; it takes about 1m30s in total.


TheJambo commented on May 3, 2024
fauxpilot-windows-main-copilot_proxy-1  | INFO:     Uvicorn running on http://0.0.0.0:5000 (Press CTRL+C to quit)
fauxpilot-windows-main-copilot_proxy-1  | INFO:     2023-04-15 16:35:52,778 :: 172.18.0.1:54426 - "GET / HTTP/1.1" 200 OK
fauxpilot-windows-main-copilot_proxy-1  | INFO:     2023-04-15 16:35:53,149 :: 172.18.0.1:54426 - "GET /openapi.json HTTP/1.1" 200 OK
fauxpilot-windows-main-copilot_proxy-1  | [StatusCode.UNAVAILABLE] failed to connect to all addresses
fauxpilot-windows-main-copilot_proxy-1  | WARNING: Model 'fastertransformer' is not available. Please ensure that `model` is set to either 'fastertransformer' or 'py-model' depending on your installation
fauxpilot-windows-main-copilot_proxy-1  | Returned completion in 1.5850067138671875 ms
fauxpilot-windows-main-copilot_proxy-1  | INFO:     2023-04-15 16:35:57,042 :: 172.18.0.1:54426 - "POST /v1/engines/codegen/completions HTTP/1.1" 200 OK
fauxpilot-windows-main-copilot_proxy-1  | [StatusCode.UNAVAILABLE] failed to connect to all addresses
fauxpilot-windows-main-copilot_proxy-1  | WARNING: Model 'fastertransformer' is not available. Please ensure that `model` is set to either 'fastertransformer' or 'py-model' depending on your installation
fauxpilot-windows-main-copilot_proxy-1  | Returned completion in 1.0571479797363281 ms
fauxpilot-windows-main-copilot_proxy-1  | INFO:     2023-04-15 16:36:01,395 :: 172.18.0.1:54426 - "POST /v1/engines/codegen/completions HTTP/1.1" 200 OK
fauxpilot-windows-main-copilot_proxy-1  | INFO:     2023-04-15 16:36:20,462 :: 172.18.0.1:54428 - "GET /copilot_internal/v2/token HTTP/1.1" 200 OK
fauxpilot-windows-main-copilot_proxy-1  | INFO:     2023-04-15 18:10:03,355 :: 172.18.0.1:37994 - "GET / HTTP/1.1" 200 OK
fauxpilot-windows-main-copilot_proxy-1  | INFO:     2023-04-15 18:10:03,724 :: 172.18.0.1:37994 - "GET /openapi.json HTTP/1.1" 200 OK
fauxpilot-windows-main-copilot_proxy-1  | [StatusCode.UNAVAILABLE] failed to connect to all addresses
fauxpilot-windows-main-copilot_proxy-1  | WARNING: Model 'fastertransformer' is not available. Please ensure that `model` is set to either 'fastertransformer' or 'py-model' depending on your installation
fauxpilot-windows-main-copilot_proxy-1  | Returned completion in 8.57996940612793 ms
fauxpilot-windows-main-copilot_proxy-1  | INFO:     2023-04-15 18:10:08,561 :: 172.18.0.1:37994 - "POST /v1/engines/codegen/completions HTTP/1.1" 200 OK

Maybe it can be time based in some instances, but not here! 😆
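Since Triton reports its gRPC service on port 8001 while the proxy keeps logging `StatusCode.UNAVAILABLE`, one way to tell a slow start from a network problem is to test whether that port can be reached from inside the proxy container. The snippet below is a minimal sketch of such a check; the host name `triton` is an assumption (a typical docker-compose service name) and `port_reachable` is a hypothetical helper.

```python
import socket


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False


# Example call (assumed host name):
# print(port_reachable("triton", 8001))
```

A `False` result here would point at container networking or host-name resolution rather than model load time.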

