
c0sogi / llmchat

225 stars · 7 watchers · 37 forks · 59.72 MB

A full-stack WebUI implementation of large language models, such as ChatGPT or LLaMA.

License: MIT License

Python 87.36% Shell 0.03% HTML 2.99% Batchfile 0.07% JavaScript 1.17% Jupyter Notebook 8.38%
chatbot chatgpt fastapi flutter fullstack mysql python redis restapi sqlalchemy

llmchat's People

Contributors

c0sogi

llmchat's Issues

Too many prompts

Why are there so many prompts for a simple one-sentence chat? The embeddings are fine, but I think the app is sending the same chat message to OpenAI in a loop.

Usage log (16 requests in total):

  • 6:05 AM (local time: May 21, 2023, 2:05 AM): gpt-4-0314, 1 request, 271 prompt + 6 completion = 277 tokens
  • 6:05 AM (local time: May 21, 2023, 2:05 AM): text-embedding-ada-002-v2, 4 requests, 600 prompt + 0 completion = 600 tokens
  • 6:10 AM (local time: May 21, 2023, 2:10 AM): gpt-4-0314, 3 requests, 1,498 prompt + 155 completion = 1,653 tokens
  • 6:10 AM (local time: May 21, 2023, 2:10 AM): text-embedding-ada-002-v2, 2 requests, 37 prompt + 0 completion = 37 tokens
  • 6:15 AM (local time: May 21, 2023, 2:15 AM): gpt-4-0314, 4 requests, 5,737 prompt + 148 completion = 5,885 tokens
  • 6:15 AM (local time: May 21, 2023, 2:15 AM): text-embedding-ada-002-v2, 2 requests, 14 prompt + 0 completion = 14 tokens

Are you open to a private chat

I have an idea for a company/product. I think it's a winner-take-all situation, but it requires first-mover advantage. By the way, I live in Canada.

chat box not expandable

The chat box should increase in height as you add more content. The fixed height makes long prompts hard to work with.

Save embeddings metadata to MySQL

The admin should be able to examine and delete redundant vectors by means of stored metadata. Insert date, author, title, etc. should be saved in MySQL; saving it in Redis may pollute the vector store?
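
A minimal sketch of what such a metadata table could look like with SQLAlchemy's declarative mapping (table and column names here are hypothetical, not taken from the llmchat codebase):

# Hypothetical embeddings-metadata table; adjust names to fit the codebase.
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class EmbeddingMetadata(Base):
    __tablename__ = "embedding_metadata"

    id = Column(Integer, primary_key=True, autoincrement=True)
    vector_key = Column(String(255), nullable=False, index=True)  # key of the vector in the vector store
    author = Column(String(255))
    title = Column(String(255))
    created_at = Column(DateTime, default=datetime.utcnow)

An admin page could then list, filter, and delete vectors by joining on vector_key, without touching the vectors themselves.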

docker-compose-lan

I am trying to install this on a LAN so everyone within the network can use it. I added the IP as HOST_MAIN="192.168.2.202". The chat interface is reachable, but authentication is denied.

Can't log in to web UI

Has anyone used it?
I launched it with a docker command, and there is a user/password login page in the left sidebar. I can't register a new user; it always reports an XMLHttpRequest error.

Then I went into the MySQL docker container, created the table info, and found the users/api_keys tables, so I inserted one row into the users table:

INSERT INTO users(status,email,password, marketing_agree, created_at, updated_at)
VALUES ('admin','[email protected]','12341234', 1, now(), now() );

I can see that the new user exists, but I still can't log in from the UI. Are there any detailed instructions?
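
A likely cause, though this is an assumption about the codebase: login almost certainly compares a hashed password, so a plaintext '12341234' inserted directly into MySQL will never match. A minimal sketch of producing a compatible value, assuming the app hashes with bcrypt:

# Assumption: the API verifies passwords with bcrypt (common in FastAPI apps).
# If it uses another scheme, generate the hash with that scheme instead.
import bcrypt

hashed = bcrypt.hashpw(b"12341234", bcrypt.gensalt())
print(hashed.decode())  # store this in the password column instead of the plaintext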

Referencing Models

In llms.py, your example model reference is model_path="./llama_models/ggml/Wizard-Vicuna-7B-Uncensored.ggmlv2.q4_1.bin".
This leads me to believe the model should be at LLMChat/app/llama_models/ggml/Wizard-Vicuna-7B-Uncensored.ggmlv2.q4_1.bin,
but the app is not able to load the model.
I created the llama_models and ggml directories myself, as there were none.
Thank you

API_ENV=?

For production, which value is correct:
API_ENV="prod" or API_ENV="production"?

Modular architecture

So far we have achieved a lot with this codebase. Perhaps it's time we paused new features and refactored the code base to allow for a core/plugin structure. Adopting a modular architecture will give this project incredible flexibility. I want us to be the "Drupal" of this space, but we must structure it now, before the code base grows too big.

Let's standardize what a module is and push:

  • embedding/document querying
  • browsing
  • Prompts (prefix/suffix)
  • all non-core features

This allows developers to add and maintain modules without interfering with core. The project is usable as is. I will send you an email shortly. Your thoughts?
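
To make the proposal concrete, a minimal sketch of a module contract in Python (names are hypothetical; nothing like this exists in the codebase yet):

# Hypothetical plugin contract for the proposed core/plugin split.
from typing import Protocol

class ChatModule(Protocol):
    """A pluggable feature: document querying, browsing, prompt prefix/suffix, etc."""

    name: str

    async def on_user_message(self, message: str) -> str:
        """Inspect or transform the user message before it reaches the LLM."""
        ...

    async def on_llm_response(self, response: str) -> str:
        """Inspect or transform the LLM response before it reaches the user."""
        ...

Core would discover modules (e.g. from a plugins directory) and call these hooks in order, so module authors never have to modify core.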

[Feature Request] Support InternLM

Dear LLMChat developer,

Greetings! I am vansinhu, a community developer and volunteer at InternLM. Your work has been immensely beneficial to me, and I believe it can be effectively utilized in InternLM as well. You are welcome to join our Discord (https://discord.gg/gF9ezcmtM3). I hope to get in touch with you.

Best regards,
vansinhu

Overlap embeddings

The default is chunk_overlap: int = 0; the overlap should be about 100 to preserve meaning that gets cut off at chunk_size boundaries.
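
For illustration, a generic sliding-window splitter (not the app's actual one) showing how a non-zero overlap keeps text that straddles a chunk boundary present in two consecutive chunks:

def split_with_overlap(text: str, chunk_size: int = 500, chunk_overlap: int = 100) -> list[str]:
    # Each chunk starts chunk_size - chunk_overlap after the previous one,
    # so the last chunk_overlap characters of a chunk reappear in the next.
    step = chunk_size - chunk_overlap
    return [text[i : i + chunk_size] for i in range(0, len(text), step)]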

Replace Redis

Let's implement Qdrant for embeddings and use Redis for what it's good at: caching chats. Qdrant is fast and stable, and excellent at search and filtering (see their benchmark).
Redis's single-threaded execution is bad for vertical scaling. Down the road we should allow BYOD (bring your own DB).

Please implement it as a separate Docker container for independent scaling: docker run -p 6333:6333 qdrant/qdrant
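
For reference, basic usage of the official Python client looks roughly like this (the collection name is a placeholder; 1536 dimensions matches text-embedding-ada-002):

from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

client = QdrantClient(host="localhost", port=6333)

client.recreate_collection(
    collection_name="chat_embeddings",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)

client.upsert(
    collection_name="chat_embeddings",
    points=[PointStruct(id=1, vector=[0.0] * 1536, payload={"text": "example chunk"})],
)

hits = client.search(collection_name="chat_embeddings", query_vector=[0.0] * 1536, limit=5)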

/query as default

Implement /bypass to query the LLM without hitting the vector store. Otherwise, chats must check the vector store for embeddings before interacting with the LLM. The purpose of this app, I believe, is to grant longer context memory; being forced to add /query in front of every chat is tiresome. Embeddings can prompt the LLM how to behave on chat initiation.

How can I switch to a local LLM engine?

In the chat UI, there is a long list of LLM models. The default one is GPT-3.5 Turbo, which I guess is OpenAI. I configured the OpenAI API key in .env, so it should be the one in use, as the answers come very fast.

When I try to switch to Llama 7B, it reports:

An error occurred while generating text: Model llama-7b-GGML is currently booting.

I set up another LLM engine, vLLM, based on the llama-2-7b-chat model and exposed on port 3000; it is compatible with the OpenAI API.
How can I configure the app to use this new engine?
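
For what it's worth, the generic way to point the (pre-1.0) OpenAI Python client at an OpenAI-compatible server such as vLLM is to override the base URL; whether llmchat exposes a setting for this is an open question:

import openai

openai.api_base = "http://localhost:3000/v1"  # the vLLM server's OpenAI-compatible endpoint
openai.api_key = "none"  # vLLM ignores the key, but the client requires one

response = openai.ChatCompletion.create(
    model="llama-2-7b-chat",  # must match the model name the vLLM server was launched with
    messages=[{"role": "user", "content": "Hello"}],
)
print(response["choices"][0]["message"]["content"])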

Integrate gptCache

Let's save some $$$ by implementing GPTCache.
There is a Docker image, and I think it may already work with Redis:

$ docker pull zilliz/gptcache:latest
$ docker run -p 8000:8000 -it zilliz/gptcache:latest

The admin can use temperature settings to bypass the cache.

Important: the cache must maintain user privacy. The admin can add a sitewide cache. This will make FAQ generation a breeze, and retrieving cached info will then cost nothing each time.
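
GPTCache's documented quick start wraps the OpenAI client in a caching adapter, roughly like this (a sketch based on the GPTCache docs; wiring it into this app's Redis setup is untested):

from gptcache import cache
from gptcache.adapter import openai  # drop-in replacement for the openai module

cache.init()            # defaults to an exact-match cache with a local store
cache.set_openai_key()  # reads OPENAI_API_KEY from the environment

# Repeated identical prompts are now answered from the cache, not the paid API.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "What is GitHub?"}],
)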

using sub.domain.tld as HOST_MAIN

Hi,
I want to install this in production on a subdomain. Using sub.domain.tld as HOST_MAIN should work, I am guessing?
Thanks for this app; I will test it and give my two cents of feedback.

Message prefix/suffix

Add the ability to attach prefix/suffix instructions to the first message in a chat, ideally as a .env variable. I want to instruct OpenAI about policies. Kindly give me a pointer so I can attempt this.

Chatroles.system default startup TMPL

How do I use a DESCRIPTION_TMPL for the OpenAI "chatroles.system" models? I see DESCRIPTION being passed as "description" in other models. GPT needs this feature; if it exists, how do I trigger it?

Fail to run api in docker

When I run everything in Docker, the api container fails to build llama.cpp because cmake doesn't exist.
Does it require the external gcc to be version 11?

>docker-compose -f docker-compose-local.yaml up api
[+] Running 3/0
 ✔ Container llmchat-cache-1  Running                                                                                                                              0.0s 
 ✔ Container llmchat-db-1     Running                                                                                                                              0.0s 
 ✔ Container llmchat-api-1    Created                                                                                                                              0.0s 
Attaching to llmchat-api-1, llmchat-cache-1, llmchat-db-1
llmchat-api-1    | None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
llmchat-api-1    | [2023-09-19 07:50:40,323] SQLAlchemy:CRITICAL - Current DB connection of LocalConfig: db/traffic@traffic_admin
llmchat-api-1    | INFO:     Started server process [1]
llmchat-api-1    | INFO:     Waiting for application startup.
llmchat-api-1    | [2023-09-19 07:50:41,191] ApiLogger:CRITICAL - ⚙️ Booting up...
llmchat-api-1    | [2023-09-19 07:50:41,191] ApiLogger:CRITICAL - MySQL DB connected!
llmchat-api-1    | [2023-09-19 07:50:41,195] ApiLogger:CRITICAL - Redis CACHE connected!
llmchat-api-1    | [2023-09-19 07:50:41,195] ApiLogger:CRITICAL - uvloop installed!
llmchat-api-1    | [2023-09-19 07:50:41,195] ApiLogger:CRITICAL - Llama CPP server monitoring started!
llmchat-api-1    | INFO:     Application startup complete.
llmchat-api-1    | INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
llmchat-api-1    | [2023-09-19 07:50:41,200] ApiLogger:ERROR - Llama CPP server is not available
llmchat-api-1    | [2023-09-19 07:50:41,200] ApiLogger:CRITICAL - Starting Llama CPP server
llmchat-api-1    | - Loaded .env file successfully.
llmchat-api-1    | - API_ENV: local
llmchat-api-1    | - DOCKER_MODE: True
llmchat-api-1    | - Parsing function for function calling: control_browser
llmchat-api-1    | - Parsing function for function calling: control_web_page
llmchat-api-1    | - Parsing function for function calling: web_search
llmchat-api-1    | - Parsing function for function calling: vectorstore_search
llmchat-api-1    | Using openai embeddings
llmchat-api-1    | 🦙 llama.cpp DLL not found, building it...
llmchat-api-1    | 🦙 Trying to build llama.cpp DLL: /app/repositories/llama_cpp/llama_cpp/build-llama-cpp-cublas.sh
llmchat-api-1    | /app/repositories/llama_cpp/llama_cpp/build-llama-cpp-cublas.sh: line 2: cd: /app/repositories/llama_cpp/vendor/llama.cpp: No such file or directory
llmchat-api-1    | /app/repositories/llama_cpp/llama_cpp/build-llama-cpp-cublas.sh: line 6: cmake: command not found
llmchat-api-1    | /app/repositories/llama_cpp/llama_cpp/build-llama-cpp-cublas.sh: line 7: cmake: command not found
llmchat-api-1    | cp: cannot stat '/app/repositories/llama_cpp/vendor/llama.cpp/build/bin/Release/libllama.so': No such file or directory
llmchat-api-1    | 🦙 Could not build llama.cpp DLL!
llmchat-api-1    | 🦙 Trying to build llama.cpp DLL: /app/repositories/llama_cpp/llama_cpp/build-llama-cpp-default.sh
llmchat-api-1    | /app/repositories/llama_cpp/llama_cpp/build-llama-cpp-default.sh: line 2: cd: /app/repositories/llama_cpp/vendor/llama.cpp: No such file or directory
llmchat-api-1    | /app/repositories/llama_cpp/llama_cpp/build-llama-cpp-default.sh: line 6: cmake: command not found
llmchat-api-1    | /app/repositories/llama_cpp/llama_cpp/build-llama-cpp-default.sh: line 7: cmake: command not found
llmchat-api-1    | cp: cannot stat '/app/repositories/llama_cpp/vendor/llama.cpp/build/bin/Release/libllama.so': No such file or directory
llmchat-api-1    | 🦙 Could not build llama.cpp DLL!
llmchat-api-1    | [2023-09-19 07:50:41,256] ApiLogger:WARNING - 🦙 Could not import llama-cpp-python repository: 🦙 Could not build llama.cpp DLL!
llmchat-api-1    | ...trying to import installed llama-cpp package...
llmchat-api-1    | INFO:     10.101.7.43:42488 - "GET / HTTP/1.1" 304 Not Modified
llmchat-api-1    | INFO:     10.101.7.43:42488 - "GET /main.dart.js HTTP/1.1" 200 OK
llmchat-api-1    | INFO:     10.101.7.43:49942 - "GET / HTTP/1.1" 304 Not Modified
llmchat-api-1    | INFO:     10.101.7.43:49942 - "GET /main.dart.js HTTP/1.1" 304 Not Modified

Token conservation

I propose we use a two-LLM approach to cut down on the cost of using GPT-4 and all expensive future variants.

This mostly applies if you are using GPT-4, but why use anything else :)

  • GPT-4 handles current inquiries and GPT-3 summarizes past history.
  • As the conversation approaches the admin's set token limit (e.g. 2000), use GPT-3 to create a summary.
  • Use this summary for the next turn, except when the user hits regenerate; then use the original full history.
  • Admin on/off switch.
  • Index 0 / the first prompt is never deleted.

You end up with:

  1. Initial prompt
  2. Summary
  3. Last question

This may even get GPT-4 to be more focused and on point.
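
A minimal sketch of the proposed compaction (the count_tokens and summarize_with_gpt3 helpers are hypothetical and would wrap a tokenizer and a cheap-model API call):

TOKEN_LIMIT = 2000  # the admin-configurable threshold

def compact_history(messages: list[dict], count_tokens, summarize_with_gpt3) -> list[dict]:
    # Below the limit (or too short to compact), keep the full history.
    if count_tokens(messages) <= TOKEN_LIMIT or len(messages) < 3:
        return messages
    first, *middle, last = messages
    summary = summarize_with_gpt3(middle)  # the cheap model condenses past turns
    return [
        first,  # index 0 / first prompt is never deleted
        {"role": "system", "content": f"Summary of earlier conversation: {summary}"},
        last,   # the latest question goes to GPT-4 verbatim
    ]

On regenerate, the app would skip compact_history and resend the original full history, per the third bullet above.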

Trash bin

Deleted chats should be moved to a trash bin and permanently deleted after 30 days. Add a "trash" link to the sidebar; it should load a list of deleted chats in the main column. Let's mimic the Gmail interface for this feature. I think people forget that OpenAI will soon change their UI to match the tried and tested email interface.

Default system message

I altered the "user" method and added content=ChatConfig.chat_role_system_message:

async def user(msg: str, translate: bool, buffer: BufferedUserContext) -> None:
    """Handle user message, including translation"""
    if len(buffer.current_user_message_histories) == 0 and UTC.check_string_valid(
        buffer.current_chat_room_name
    ):
        buffer.current_chat_room_name = msg[:20]
        await CacheManager.update_profile(user_chat_context=buffer.current_user_chat_context)

        # Add default system message at the start of a conversation
        await MessageManager.add_message_history_safely(
            user_chat_context=buffer.current_user_chat_context,
            content=ChatConfig.chat_role_system_message,
            role=ChatRoles.SYSTEM,
        )
        await SendToWebsocket.init(buffer=buffer, send_chat_rooms=True, wait_next_query=True)

and in ChatConfig I added:

chat_role_system_message: Optional[str] = environ.get("CHAT_ROLE_SYSTEM_MESSAGE")  # .get() avoids a KeyError when the variable is unset

Now the AI's behaviour is more predictable.

Production fails on Debian 11

This is the error I get:
File "/usr/local/lib/python3.11/importlib/__init__.py", line 126, in import_module return _bootstrap._gcd_import(name[level:], package, level) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "<frozen importlib._bootstrap>", line 1206, in _gcd_import File "<frozen importlib._bootstrap>", line 1178, in _find_and_load File "<frozen importlib._bootstrap>", line 1149, in _find_and_load_unlocked File "<frozen importlib._bootstrap>", line 690, in _load_unlocked File "<frozen importlib._bootstrap_external>", line 940, in exec_module File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed File "/app/main.py", line 39, in <module> from app.common.app_settings import create_app File "/app/app/common/app_settings.py", line 7, in <module> from app.auth.admin import MyAuthProvider File "/app/app/auth/admin.py", line 5, in <module> from app.common.config import config File "/app/app/common/config.py", line 177, in <module> config = Config.get() ^^^^^^^^^^^^ File "/app/app/common/config.py", line 122, in get _config = { ^ KeyError: '"prod"'

Icon tooltips

Add tooltips to icons so that hovering displays hints. Or add labels
