LocalAI version: v2.10.1-23-gbd25d80


LocalAI sends empty chunk to chatbot_ui and closes stream about localai HOT 5 CLOSED

ga-it commented on June 11, 2024

LocalAI sends empty chunk to chatbot_ui and closes stream

from localai.

Comments (5)

gillbates commented on June 11, 2024

same problem here.


ga-it commented on June 11, 2024

Hi @mudler

I appreciate all your great work and your workload.

Any word on the above? Is it a misconfiguration on my side, or is this a bona fide bug?

I am stuck without a resolution path.

Regards


teto commented on June 11, 2024

I usually wouldn't add anything, but because of the "unconfirmed" label I wanted to say "me too". I haven't been able to find the root cause: the same version used to work and then suddenly stopped. I might have updated my system in between, which could explain it.
I use an NVIDIA GPU with the https://github.com/Robitx/gp.nvim plugin. It now fails every time, even in new sessions.

...
$ nix run .#local-ai-cublas -- --models-path ~/localai-models --autoload-galleries --address ":11111" --debug
....
<|im_start|>assistant

[127.0.0.1]:51000 200 - POST /v1/chat/completions
1:23AM DBG Sending chunk: {"created":1711585346,"object":"chat.completion.chunk","id":"868f2609-0af6-4e96-9e92-ff3d7fc84aca","model":"mistral","choices":[{"index":0,"finish_reason":"","delta":{"role":"assistant","content":""}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
This is the raw response the client receives:

'data: {"created":1711585346,"object":"chat.completion.chunk","id":"868f2609-0af6-4e96-9e92-ff3d7fc84aca","model":"mistral","choices":[{"index":0,"finish_reason":"","delta":{"role":"assistant","content":""}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}\ndata: {"created":1711585346,"object":"chat.completion.chunk","id":"868f2609-0af6-4e96-9e92-ff3d7fc84aca","model":"mistral","choices":[{"index":0,"finish_reason":"stop","delta":{"content":""}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}\ndata: [DONE]\n'

As I was writing this message, I realized that I had recently started passing --autoload-galleries, and without it LocalAI works again \o/ I am not sure what the flag does, but it looks like a tricky one!
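
For anyone who wants to check their own capture for the same symptom, here is a minimal Python sketch that parses a server-sent-events body like the one quoted above and reports the concatenated `delta.content` together with the last `finish_reason`. The function name `collect_stream_text` and the abbreviated sample are my own illustration, not part of LocalAI or gp.nvim. An empty text with `finish_reason` equal to `stop` corresponds to the "empty chunk, then stream closed" behaviour from the title.

```python
import json


def collect_stream_text(sse_body: str) -> tuple[str, str]:
    """Return (concatenated delta content, last non-empty finish_reason)."""
    text_parts = []
    finish_reason = ""
    for line in sse_body.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # ignore blank lines and anything that is not a data field
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text_parts.append(choice.get("delta", {}).get("content") or "")
            finish_reason = choice.get("finish_reason") or finish_reason
    return "".join(text_parts), finish_reason


# Abbreviated copy of the two chunks quoted above (only the fields used here).
sample = (
    'data: {"choices":[{"index":0,"finish_reason":"","delta":{"role":"assistant","content":""}}]}\n'
    'data: {"choices":[{"index":0,"finish_reason":"stop","delta":{"content":""}}]}\n'
    'data: [DONE]\n'
)

text, reason = collect_stream_text(sample)
print(repr(text), reason)
```

If `text` comes back empty while `reason` is `stop`, the server ended the stream without generating anything, so the problem is unlikely to be in the client's rendering.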


s0undy commented on June 11, 2024

Same issue here. I'm able to send 1-2 messages and get responses back, then it just stops.

Logs

2024-04-05 20:21:19 6:21PM DBG Model already loaded in memory: 5c7cd056ecf9a4bb5b527410b97f48cb
2024-04-05 20:21:19 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"role":"assistant","content":""}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:19 
2024-04-05 20:21:19 6:21PM DBG Model '5c7cd056ecf9a4bb5b527410b97f48cb' already loaded
2024-04-05 20:21:19 6:21PM DBG GRPC(5c7cd056ecf9a4bb5b527410b97f48cb-127.0.0.1:43219): stdout {"timestamp":1712341279,"level":"INFO","function":"launch_slot_with_data","line":884,"message":"slot is processing task","slot_id":0,"task_id":58}
2024-04-05 20:21:19 6:21PM DBG GRPC(5c7cd056ecf9a4bb5b527410b97f48cb-127.0.0.1:43219): stdout {"timestamp":1712341279,"level":"INFO","function":"update_slots","line":1783,"message":"kv cache rm [p0, end)","slot_id":0,"task_id":58,"p0":0}
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"\n"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"\n"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"U"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"n"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"d"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"e"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"r"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":" "}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"k"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"o"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"m"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"m"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"u"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"n"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"f"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"u"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"l"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"l"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"m"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:21:23 
2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"ä"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

It keeps going like this until it stops:

2024-04-05 20:23:53 6:23PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"\n"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

2024-04-05 20:23:53 
2024-04-05 20:23:53 6:23PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"\n"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:23:53 
2024-04-05 20:23:53 6:23PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"\n"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:23:53 
2024-04-05 20:23:54 6:23PM DBG Sending chunk: {"created":1712341017,"object":"chat.completion.chunk","id":"d28dfe6e-75ec-4fea-b74a-a69f6e2afafd","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"content":"\n"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
2024-04-05 20:23:54 
2024-04-05 20:23:54 6:23PM DBG GRPC(5c7cd056ecf9a4bb5b527410b97f48cb-127.0.0.1:43219): stdout {"timestamp":1712341434,"level":"INFO","function":"print_timings","line":327,"message":"prompt eval time     =    3762.01 ms /  1559 tokens (    2.41 ms per token,   414.41 tokens per second)","slot_id":0,"task_id":58,"t_prompt_processing":3762.013,"num_prompt_tokens_processed":1559,"t_token":2.413093649775497,"n_tokens_second":414.4057981724146}
2024-04-05 20:23:54 6:23PM DBG GRPC(5c7cd056ecf9a4bb5b527410b97f48cb-127.0.0.1:43219): stdout {"timestamp":1712341434,"level":"INFO","function":"print_timings","line":341,"message":"generation eval time =  150698.70 ms /  2048 runs   (   73.58 ms per token,    13.59 tokens per second)","slot_id":0,"task_id":58,"t_token_generation":150698.697,"n_decoded":2048,"t_token":73.58334814453124,"n_tokens_second":13.59003123961981}
2024-04-05 20:23:54 6:23PM DBG GRPC(5c7cd056ecf9a4bb5b527410b97f48cb-127.0.0.1:43219): stdout {"timestamp":1712341434,"level":"INFO","function":"print_timings","line":351,"message":"          total time =  154460.71 ms","slot_id":0,"task_id":58,"t_prompt_processing":3762.013,"t_token_generation":150698.697,"t_total":154460.71}
2024-04-05 20:23:54 6:23PM DBG GRPC(5c7cd056ecf9a4bb5b527410b97f48cb-127.0.0.1:43219): stdout {"timestamp":1712341434,"level":"INFO","function":"update_slots","line":1594,"message":"slot released","slot_id":0,"task_id":58,"n_ctx":4096,"n_past":3606,"n_system_tokens":0,"n_cache_tokens":3607,"truncated":false}
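
Reading the generated text one character per log line is painful, so here is a rough Python sketch that reassembles it from the `DBG Sending chunk:` lines of a debug log like the one above. The helper name `text_from_debug_log` and the shortened sample lines are hypothetical, and the marker string is simply copied from the log output shown here, so it may differ in other LocalAI versions.

```python
import json

# Marker copied from the debug output above; treat it as an assumption.
MARKER = "DBG Sending chunk: "


def text_from_debug_log(log: str) -> str:
    """Concatenate delta.content from every 'Sending chunk' line in a log."""
    parts = []
    for line in log.splitlines():
        _, found, payload = line.partition(MARKER)
        if not found:
            continue  # not a chunk line (timestamps only, GRPC output, ...)
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # skip lines that were truncated or wrapped when copied
        for choice in chunk.get("choices", []):
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


# Shortened versions of lines from the log above (only the fields used here).
sample_log = "\n".join([
    '2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"choices":[{"index":0,"finish_reason":"","delta":{"content":"U"}}]}',
    '2024-04-05 20:21:23 ',
    '2024-04-05 20:21:23 6:21PM DBG Sending chunk: {"choices":[{"index":0,"finish_reason":"","delta":{"content":"n"}}]}',
    '2024-04-05 20:21:19 6:21PM DBG Model already loaded in memory',
])

reassembled = text_from_debug_log(sample_log)
print(repr(reassembled))
```

Run against the full log, this would show how much of the output is real text and how much is the trailing run of "\n" chunks. The timing line reports 2048 decoded tokens, which may indicate the model kept emitting newlines until it hit a generation limit, but that is only a guess from this log.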

LocalAI version:
Docker using docker-compose:
Image version: 7e498578e3fd

version: "3.9"
services:
  api:
    image: localai/localai:latest-aio-gpu-nvidia-cuda-12
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
      interval: 1m
      timeout: 20m
      retries: 5
    ports:
      - 8080:8080
    environment:
      - DEBUG=true
      # ...
    volumes:
      - ./models:/build/models:cached
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

Environment, CPU architecture, OS, and Version:
WSL2- Ubuntu 22.04
Linux GIBBSTATION 5.15.146.1-microsoft-standard-WSL2 #1 SMP Thu Jan 11 04:09:03 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

CPU info:
2024-04-05 20:33:03 model name : AMD Ryzen 5 5600X 6-Core Processor
2024-04-05 20:33:03 flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext perfctr_core ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload umip vaes vpclmulqdq rdpid fsrm

NVIDIA GPU detected via WSL2
2024-04-05 20:33:03 Fri Apr 5 18:33:03 2024
2024-04-05 20:33:03 +---------------------------------------------------------------------------------------+
2024-04-05 20:33:03 | NVIDIA-SMI 545.23.06 Driver Version: 545.92 CUDA Version: 12.3 |
2024-04-05 20:33:03 |-----------------------------------------+----------------------+----------------------+
2024-04-05 20:33:03 | GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
2024-04-05 20:33:03 | Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
2024-04-05 20:33:03 | | | MIG M. |
2024-04-05 20:33:03 |=========================================+======================+======================|
2024-04-05 20:33:03 | 0 NVIDIA GeForce RTX 3070 On | 00000000:2B:00.0 On | N/A |
2024-04-05 20:33:03 | 56% 46C P3 47W / 270W | 1189MiB / 8192MiB | 27% Default |
2024-04-05 20:33:03 | | | N/A |
2024-04-05 20:33:03 +-----------------------------------------+----------------------+----------------------+
2024-04-05 20:33:03
2024-04-05 20:33:03 +---------------------------------------------------------------------------------------+
2024-04-05 20:33:03 | Processes: |
2024-04-05 20:33:03 | GPU GI CI PID Type Process name GPU Memory |
2024-04-05 20:33:03 | ID ID Usage |
2024-04-05 20:33:03 |=======================================================================================|
2024-04-05 20:33:03 | 0 N/A N/A 31 G /Xwayland N/A |
2024-04-05 20:33:03 | 0 N/A N/A 33 G /Xwayland N/A |
2024-04-05 20:33:03 | 0 N/A N/A 35 G /Xwayland N/A |
2024-04-05 20:33:03 +---------------------------------------------------------------------------------------+
2024-04-05 20:33:03 NVIDIA GPU detected. Attempting to find memory size...
2024-04-05 20:33:03 Total GPU Memory: 8192 MiB

EDIT
After some troubleshooting, it seems the issue only occurs when trying to have a conversation about embedded documents. When I curl the API manually with plain sentences, it works without throwing any errors.
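
For comparison, a plain request without any document context can be built like this in Python. This is only a sketch: the endpoint path and port are taken from the logs and compose file above, the `stream` field follows the OpenAI-compatible chat API that LocalAI exposes, and the function name `build_chat_request`, the model name, and the prompt are placeholders of mine.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, user_text: str) -> urllib.request.Request:
    """Build (but do not send) a streaming chat completion request."""
    body = {
        "model": model,
        "stream": True,  # ask for chat.completion.chunk events
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("http://localhost:8080", "gpt-4", "Hello, can you hear me?")
print(req.full_url)

# To send it against a running instance and print the raw stream:
# with urllib.request.urlopen(req) as resp:
#     for raw in resp:
#         print(raw.decode("utf-8").rstrip())
```

If this works reliably while the chatbot UI fails, the difference is probably in what the UI adds to the request (for example the embedded document text), not in the streaming endpoint itself.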


localai-bot commented on June 11, 2024

It seems like the issue you're experiencing is related to sending multiple messages in a conversation involving embedded documents. This issue might be due to a bug in one of the recent updates of the LocalAI version you are using. However, I have found a workaround for you to continue using the chat functionality while this issue is being investigated and fixed.

Workaround: When sending multiple messages in a conversation involving embedded documents, split the messages into separate chat completion requests with a delay between them. For example, instead of sending:

{
  "object": "chat.completion.chunk",
  "choices": [
  {
    "index": 0,
    "finish_reason": "",
    "delta": {
      "content": 'Message 1...',
      "children": [
        {
          "object

