
Comments (5)

mitya52 commented on May 14, 2024

@Johnz86 hi!

'Required memory exceeds the GPU's memory' is just a warning; it does not affect inference at all. However, CONTRASTcode/3B requires ~8 GB of VRAM at full context, so using a 4 GB GPU with large files can lead to OOM. This warning is unclear and we will fix it in the future.
I think your problem is in infurl: change https to http.
SERVER_API_TOKEN is not used by the new docker container, so you can remove it.
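As a quick sanity check that the server is reachable over plain HTTP, one can probe the /v1/login endpoint that shows up in the webgui logs later in this thread. A minimal sketch, assuming the default host and port from this thread; `check_server` is a hypothetical helper, not part of refact:

```python
import urllib.request
import urllib.error

def check_server(base_url: str) -> str:
    """Probe the server's /v1/login endpoint (the path seen in the webgui logs)."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/login", timeout=5) as resp:
            return f"reachable (HTTP {resp.status})"
    except urllib.error.URLError as exc:
        # Covers connection refused, DNS failures, TLS mismatch, etc.
        return f"unreachable: {exc.reason}"

print(check_server("http://127.0.0.1:8008"))
```

If this prints "unreachable", the problem is connectivity to the container rather than the plugin configuration.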

from refact.

Johnz86 commented on May 14, 2024

I tried it with http, and the 'Invalid HTTP request received.' message no longer appears in the logs. The issue is that no inference is happening, and I cannot determine from the logs or any response what the state of the process is.
(screenshot)
Is there any way to determine whether I should wait for inference, or whether the process is not working at all?


mitya52 commented on May 14, 2024

@Johnz86 you can check the error that occurred in refact.ai below the chat (yellow box). Also, please share the server logs; it is probably OOM or something similar.


Johnz86 commented on May 14, 2024

Here is an example of the docker container logs:

PS C:\Users\z0034zpz> docker run -d --rm -p 8008:8008 -v perm-storage:/perm_storage --gpus all smallcloud/refact_self_hosting
9b1d43ca00bbe3f68e05876e2c18da266348a81cdd851553494643b27ae9afcc
PS C:\Users\z0034zpz> docker logs -f 9b1d

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

20230907 09:06:14 adding job model-contrastcode-3b-multi-0.cfg
20230907 09:06:14 adding job enum_gpus.cfg
20230907 09:06:14 adding job filetune.cfg
20230907 09:06:14 adding job filetune_filter_only.cfg
20230907 09:06:14 adding job process_uploaded.cfg
20230907 09:06:14 adding job webgui.cfg
20230907 09:06:14 CVD=0 starting python -m self_hosting_machinery.inference.inference_worker --model CONTRASTcode/3b/multi --compile
 -> pid 31
20230907 09:06:14 CVD= starting python -m self_hosting_machinery.scripts.enum_gpus
 -> pid 32
20230907 09:06:14 CVD= starting python -m self_hosting_machinery.webgui.webgui
 -> pid 33
-- 33 -- 20230907 09:06:14 WEBUI Started server process [33]
-- 33 -- 20230907 09:06:14 WEBUI Waiting for application startup.
-- 33 -- 20230907 09:06:14 WEBUI Application startup complete.
-- 33 -- 20230907 09:06:14 WEBUI Uvicorn running on http://0.0.0.0:8008 (Press CTRL+C to quit)
-- 31 -- 20230907 09:06:19 MODEL STATUS loading model
-- 31 -- 20230907 09:07:03 MODEL STATUS test batch
20230907 09:07:46 31 finished python -m self_hosting_machinery.inference.inference_worker --model CONTRASTcode/3b/multi @:gpu00, retcode 0
/finished compiling as recognized by watchdog
20230907 09:07:47 CVD=0 starting python -m self_hosting_machinery.inference.inference_worker --model CONTRASTcode/3b/multi
 -> pid 111
-- 111 -- 20230907 09:07:50 MODEL STATUS loading model
-- 33 -- 20230907 09:07:51 WEBUI 172.17.0.1:41986 - "GET /v1/login HTTP/1.1" 200
-- 111 -- 20230907 09:08:17 MODEL STATUS test batch
-- 111 -- 20230907 09:08:52 MODEL STATUS serving CONTRASTcode/3b/multi
-- 111 -- 20230907 09:09:02 MODEL 10008.3ms http://127.0.0.1:8008/infengine-v1/completions-wait-batch WAIT
-- 33 -- 20230907 09:09:02 WEBUI 127.0.0.1:41988 - "POST /infengine-v1/completions-wait-batch HTTP/1.1" 200
-- 111 -- 20230907 09:09:12 MODEL 10004.0ms http://127.0.0.1:8008/infengine-v1/completions-wait-batch WAIT
-- 33 -- 20230907 09:09:12 WEBUI 127.0.0.1:41988 - "POST /infengine-v1/completions-wait-batch HTTP/1.1" 200
-- 111 -- 20230907 09:09:22 MODEL 10003.5ms http://127.0.0.1:8008/infengine-v1/completions-wait-batch WAIT
-- 33 -- 20230907 09:09:22 WEBUI 127.0.0.1:41988 - "POST /infengine-v1/completions-wait-batch HTTP/1.1" 200
-- 33 -- 20230907 09:09:26 WEBUI comp-SvVQSvapACW6 model resolve "gpt3.5" -> error "model is not loaded (2)" from XXX
-- 33 -- 20230907 09:09:26 WEBUI 172.17.0.1:41990 - "POST /v1/chat HTTP/1.1" 400
-- 111 -- 20230907 09:09:32 MODEL 10005.4ms http://127.0.0.1:8008/infengine-v1/completions-wait-batch WAIT
-- 33 -- 20230907 09:09:32 WEBUI 127.0.0.1:41988 - "POST /infengine-v1/completions-wait-batch HTTP/1.1" 200
-- 111 -- 20230907 09:09:42 MODEL 10005.0ms http://127.0.0.1:8008/infengine-v1/completions-wait-batch WAIT
-- 33 -- 20230907 09:09:42 WEBUI 127.0.0.1:41988 - "POST /infengine-v1/completions-wait-batch HTTP/1.1" 200
-- 111 -- 20230907 09:09:52 MODEL 10002.7ms http://127.0.0.1:8008/infengine-v1/completions-wait-batch WAIT
-- 33 -- 20230907 09:09:52 WEBUI 127.0.0.1:41988 - "POST /infengine-v1/completions-wait-batch HTTP/1.1" 200
-- 111 -- 20230907 09:10:02 MODEL 10002.7ms http://127.0.0.1:8008/infengine-v1/completions-wait-batch WAIT
-- 33 -- 20230907 09:10:02 WEBUI 127.0.0.1:41988 - "POST /infengine-v1/completions-wait-batch HTTP/1.1" 200
-- 111 -- 20230907 09:10:12 MODEL 10003.0ms http://127.0.0.1:8008/infengine-v1/completions-wait-batch WAIT
-- 33 -- 20230907 09:10:12 WEBUI 127.0.0.1:41988 - "POST /infengine-v1/completions-wait-batch HTTP/1.1" 200
-- 111 -- 20230907 09:10:22 MODEL 10006.3ms http://127.0.0.1:8008/infengine-v1/completions-wait-batch WAIT
-- 33 -- 20230907 09:10:22 WEBUI 127.0.0.1:41988 - "POST /infengine-v1/completions-wait-batch HTTP/1.1" 200

Here is how it looks in the UI:
(screenshot)
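To answer the earlier question of how to tell whether to keep waiting: the "MODEL STATUS" lines in these logs track the worker's lifecycle (loading model → test batch → serving). A small sketch of pulling the latest status out of a log dump; `latest_model_status` is a hypothetical helper based only on the log format shown above:

```python
import re

def latest_model_status(log_text: str) -> "str | None":
    """Return the most recent 'MODEL STATUS ...' payload from watchdog logs."""
    statuses = re.findall(r"MODEL STATUS (.+)", log_text)
    return statuses[-1].strip() if statuses else None

sample = """\
-- 111 -- 20230907 09:07:50 MODEL STATUS loading model
-- 111 -- 20230907 09:08:17 MODEL STATUS test batch
-- 111 -- 20230907 09:08:52 MODEL STATUS serving CONTRASTcode/3b/multi
"""
print(latest_model_status(sample))  # serving CONTRASTcode/3b/multi
```

A status of `serving <model>` means the worker is up and idling on the completions-wait-batch loop, so further waiting will not change anything; the 400 on /v1/chat here is a separate issue.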


olegklimov commented on May 14, 2024

model is not loaded (2) -- according to the logs, it can't access the model.

I guess the good way to go about solving this is to react to configuration changes faster.

#158

