jasonaidm / ai_webui

AI-WEBUI: A universal web interface for AI creation, a handy AI tool for image, audio, and video processing

License: MIT License

Python 20.65% Shell 0.01% Jupyter Notebook 79.35%

ai chatbot inpainting sam segment-anything speech-recognition speech-synthesis chatglm chatgpt video-clip

ai_webui's Introduction

AI-WEBUI: A universal web interface for AI creation, a handy tool for image, audio, and video processing

⭐ If this project helps you, please give it a star. Thank you! 🤗 Chinese documentation

🌟 1. Introduction

ai-webui is a browser-based interface designed to provide a universal AI creation platform.

The project provides basic functions (image segmentation, object tracking, image restoration, speech recognition, and speech synthesis) and advanced features built on them (chatbot, video translation, and video watermark removal), which are intended to speed up short-video creation.

⚡2. Installation

To install and use AI-WebUI, follow these steps:

2.1 Clone this project to your local machine

git clone https://github.com/jasonaidm/ai_webui.git

2.2 Enter the project directory

cd ai_webui

2.3 Create a virtual environment

conda create -n aiwebui python=3.11
conda activate aiwebui

2.4 Install the required dependencies

apt install ffmpeg -y 
pip install -r requirements.txt
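
The step above assumes a Debian/Ubuntu system where apt is available. As a quick sanity check after installation, the following sketch reports whether ffmpeg is reachable on PATH. The helper name missing_tools is hypothetical and not part of the project.

```python
import shutil


def missing_tools(tools=("ffmpeg",)):
    """Return the names of command-line tools that are not found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("ffmpeg found on PATH")
```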

🚀3. Quick Start

Using AI-WebUI is simple: follow the instructions on the interface. You can supply creative material by uploading video, audio, or images, or by entering text, and then interact with the model's output.

python webui.py -c ./configs/webui_configs.yml

After starting, open http://localhost:9090/?__theme=dark in your browser to see the project interface.

3.1 Single Function Demo

Because some users' personal computers have limited GPU capacity, we provide single-function demos that run one specific AI function without starting the entire project.

Image Segmentation

Panorama segmentation
Segmentation based on point coordinates
Segmentation based on textual prompts

python webui.py -c ./configs/segmentation_demo.yml

Speech Recognition

Multilingual speech recognition (e.g., Chinese and English)

python webui.py -c ./configs/asr_demo.yml

Speech Synthesis

Multilingual speech synthesis (e.g., Chinese and English)

python webui.py -c ./configs/tts_demo.yml

3.2 Combined Function Demo

More complex functions combine multiple AI models and therefore require more GPU resources.

Chatbot

Text-based chatbot
Voice-based chatbot

python webui.py -c ./configs/chatbot_demo.yml

Video Restoration

Watermark removal
Mosaic removal
Object tracking
Object removal in videos

python webui.py -c ./configs/video_inpainter_demo.yml

Video Conversion

Audio-video separation
Image cropping
Image noise addition
Frame extraction
Speech recognition
Subtitle translation
Speech synthesis
BGM addition
One-click video generation (automatic video replication from the internet)

python webui.py -c ./configs/video_convertor_demo.yml
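
Every demo is started the same way, with only the config file changing. A minimal sketch of a launcher that maps a short demo name to its config file is shown here. The config paths are the ones listed in this README; the short names and the build_command helper are assumptions made for illustration.

```python
import sys

# Config paths as listed in this README; the dictionary keys are made-up short names.
DEMO_CONFIGS = {
    "segmentation": "./configs/segmentation_demo.yml",
    "asr": "./configs/asr_demo.yml",
    "tts": "./configs/tts_demo.yml",
    "chatbot": "./configs/chatbot_demo.yml",
    "video_inpainter": "./configs/video_inpainter_demo.yml",
    "video_convertor": "./configs/video_convertor_demo.yml",
    "full": "./configs/webui_configs.yml",
}


def build_command(demo, python=sys.executable):
    """Return the argv list that launches webui.py with the config for `demo`."""
    if demo not in DEMO_CONFIGS:
        raise ValueError(f"unknown demo {demo!r}; choose from {sorted(DEMO_CONFIGS)}")
    return [python, "webui.py", "-c", DEMO_CONFIGS[demo]]


print(" ".join(build_command("asr", python="python")))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the project root to start the chosen demo.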

3.3 Full-Function Mode

Enable all AI functions by running the following command:

python webui.py -c ./configs/webui_configs.yml

Because loading the models takes a long time, it is recommended to defer loading each model until its first inference rather than loading everything at startup. You can control the loading strategy of each AI model through the "init_model_when_start_server" option in the configs/base.yml configuration file.
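
The trade-off between loading at startup and loading on first inference can be sketched as follows. This is an illustrative pattern only: the LazyModelHandler class and fake_loader function are hypothetical, and only the option name init_model_when_start_server comes from the project.

```python
class LazyModelHandler:
    """Hypothetical handler that loads its model at startup or on first use."""

    def __init__(self, loader, init_model_when_start_server=False):
        self._loader = loader      # zero-argument callable that returns the model
        self.model = None
        if init_model_when_start_server:
            self.init_model()      # eager: pay the loading cost at startup

    def init_model(self):
        if self.model is None:     # load at most once
            self.model = self._loader()

    def infer(self, *args, **kwargs):
        self.init_model()          # lazy: load on the first inference
        return self.model(*args, **kwargs)


load_calls = []                    # records every (simulated) model load


def fake_loader():
    """Stand-in for an expensive model load; returns a trivial 'model'."""
    load_calls.append("loaded")
    return lambda text: text.upper()


handler = LazyModelHandler(fake_loader)   # nothing is loaded yet
first = handler.infer("hello")            # triggers the single load
second = handler.infer("again")           # reuses the loaded model
print(first, second, len(load_calls))
```

With lazy loading the server starts quickly and the first request to each function is slow; with eager loading the delay moves to startup.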

🔥4. Model Files

4.1 Model File Downloads

Model	Model File Size	Small Model List	Download Link
chatglm2-6b-int4	3.7G	✅	Baidu Netdisk
chatglm2-6b	12G		Tsinghua University Cloud Disk
sam_vit_b	358M	✅	Baidu Netdisk
sam_vit_h	2.4G		Baidu Netdisk
FastSAM-s	23M	✅	Baidu Netdisk
FastSAM-x	138M		Baidu Netdisk
ProPainter	150M	✅	Baidu Netdisk
raft-things	20M	✅	Baidu Netdisk
recurrent_flow_completion	19M	✅	Baidu Netdisk
cutie	134M	✅	Baidu Netdisk
whisper-small	461M	✅	Baidu Netdisk
whisper-large-v3	2.9G		Baidu Netdisk

The extraction code for Baidu Netdisk is: zogk

4.2 Directory Structure of Model Weight Files

model_weights/
├── chatglm
│   └── chatglm2-6b-int4
│       ├── config.json
│       ├── configuration_chatglm.py
│       ├── modeling_chatglm.py
│       ├── pytorch_model.bin
│       ├── quantization.py
│       ├── tokenization_chatglm.py
│       ├── tokenizer.model
│       └── tokenizer_config.json
├── fastsam
│   ├── FastSAM-s.pt
│   └── FastSAM-x.pt
├── propainter
│   ├── ProPainter.pth
│   ├── cutie-base-mega.pth
│   ├── raft-things.pth
│   └── recurrent_flow_completion.pth
├── sam
│   ├── sam_vit_b.pth
│   └── sam_vit_h.pth
└── whisper
    ├── large-v3.pt
    └── small.pt
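
Before starting the server, it can help to confirm that the downloaded files are where the tree above expects them. The following sketch checks the small-model set. The relative paths are taken from the tree; the missing_weights helper and the choice to check only the main weight file of chatglm2-6b-int4 are assumptions.

```python
from pathlib import Path

# Relative paths taken from the directory tree (files marked as small models).
SMALL_MODEL_FILES = [
    "chatglm/chatglm2-6b-int4/pytorch_model.bin",
    "fastsam/FastSAM-s.pt",
    "propainter/ProPainter.pth",
    "propainter/cutie-base-mega.pth",
    "propainter/raft-things.pth",
    "propainter/recurrent_flow_completion.pth",
    "sam/sam_vit_b.pth",
    "whisper/small.pt",
]


def missing_weights(root, expected=SMALL_MODEL_FILES):
    """Return the expected files that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_file()]


if __name__ == "__main__":
    for rel in missing_weights("model_weights"):
        print("missing:", rel)
```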

If your GPU has less than 8 GB of memory, you may need to use the small models to run the project. The small models may produce noticeably weaker results, so run the large models if your hardware allows.
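
The 8 GB rule of thumb can be expressed as a small selection function. The model names come from the download table; grouping them into two sets and the choose_model_set helper are illustrative assumptions, not project code.

```python
# Model names come from the download table; the two-set grouping is an assumption.
MODEL_SETS = {
    "small": ["chatglm2-6b-int4", "sam_vit_b", "FastSAM-s", "whisper-small"],
    "large": ["chatglm2-6b", "sam_vit_h", "FastSAM-x", "whisper-large-v3"],
}


def choose_model_set(gpu_memory_gb, threshold_gb=8.0):
    """Return ("small" or "large", model names) for the given GPU memory in GB."""
    key = "small" if gpu_memory_gb < threshold_gb else "large"
    return key, MODEL_SETS[key]


# With PyTorch installed, the total memory of GPU 0 in GB could be read as
# torch.cuda.get_device_properties(0).total_memory / 1024**3
print(choose_model_set(6.0)[0])
print(choose_model_set(24.0)[0])
```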

5. Contributing

If you have any suggestions or feature requests, please feel free to create an issue.

6. References

Segment-and-Track-Anything
ProPainter
ChatGLM2-6B
segment-anything
FastSAM
whisper


ai_webui's Issues

add new feature for supporting openai api

ModuleNotFoundError: No module named 'tools.visualchat_handler'

~/ai_webui$ python webui.py -c ./configs/webui_configs.yml
Traceback (most recent call last):
  File "/home6/wutuo/ai_webui/webui.py", line 4, in <module>
    from tools import AIWrapper
  File "/home6/wutuo/ai_webui/tools/__init__.py", line 6, in <module>
    from .ai_wrapper import AIWrapper
  File "/home6/wutuo/ai_webui/tools/ai_wrapper.py", line 3, in <module>
    from .visualchat_handler import VisualChatHandler
ModuleNotFoundError: No module named 'tools.visualchat_handler'

Attempting to deserialize object on CUDA device 1 but torch.cuda.device_count() is 1

Traceback (most recent call last):
  File "D:\anaconda3\envs\aiwebui\Lib\site-packages\gradio\routes.py", line 442, in run_predict
    output = await app.get_blocks().process_api(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\aiwebui\Lib\site-packages\gradio\blocks.py", line 1392, in process_api
    result = await self.call_function(
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\aiwebui\Lib\site-packages\gradio\blocks.py", line 1097, in call_function
    prediction = await anyio.to_thread.run_sync(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\aiwebui\Lib\site-packages\anyio\to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\aiwebui\Lib\site-packages\anyio\_backends\_asyncio.py", line 2134, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "D:\anaconda3\envs\aiwebui\Lib\site-packages\anyio\_backends\_asyncio.py", line 851, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\aiwebui\Lib\site-packages\gradio\utils.py", line 703, in wrapper
    response = f(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^
  File "F:\AI\webui\tools\ai_wrapper.py", line 76, in clip_video
    asr_result= self.whisper_handler.infer(audio_stream_file2)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "F:\AI\webui\tools\whisper_handler.py", line 16, in infer
    self.init_model()
  File "F:\AI\webui\tools\whisper_handler.py", line 54, in init_model
    self.model = whisper.load_model(name=self.model_name, device=self.device, download_root=self.model_dir)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\aiwebui\Lib\site-packages\whisper\__init__.py", line 146, in load_model
    checkpoint = torch.load(fp, map_location=device)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\aiwebui\Lib\site-packages\torch\serialization.py", line 1014, in load
    raise pickle.UnpicklingError(UNSAFE_MESSAGE + str(e)) from None
  File "D:\anaconda3\envs\aiwebui\Lib\site-packages\torch\serialization.py", line 1422, in _load
    unpickler.persistent_load = persistent_load
    ^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\aiwebui\Lib\site-packages\torch\serialization.py", line 1392, in persistent_load
    nbytes = numel * torch._utils._element_size(dtype)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\aiwebui\Lib\site-packages\torch\serialization.py", line 1366, in load_tensor
    typed_storage = torch.storage.TypedStorage(
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\aiwebui\Lib\site-packages\torch\serialization.py", line 1296, in restore_location
    def restore_location(storage, location):
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\aiwebui\Lib\site-packages\torch\serialization.py", line 381, in default_restore_location
    result = fn(storage, location)
             ^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\aiwebui\Lib\site-packages\torch\serialization.py", line 274, in _cuda_deserialize
    device = validate_cuda_device(location)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\anaconda3\envs\aiwebui\Lib\site-packages\torch\serialization.py", line 265, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on CUDA device '
RuntimeError: Attempting to deserialize object on CUDA device 1 but torch.cuda.device_count() is 1. Please use torch.load with map_location to map your storages to an existing device.
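
The error message says the checkpoint is being mapped to CUDA device 1 on a machine that has only one GPU (index 0). One possible cause, which is an assumption and not confirmed by the issue, is a configured device of "cuda:1". The sketch below shows one way to map a requested device string onto a device that exists before passing it to the model loader. The resolve_device helper is hypothetical; device_count is meant to be the value of torch.cuda.device_count().

```python
def resolve_device(requested, device_count):
    """Map a requested device string onto one that exists on this machine.

    requested:    e.g. "cuda:1", "cuda", or "cpu"
    device_count: number of visible CUDA devices
    """
    if not requested.startswith("cuda"):
        return requested                      # non-CUDA devices pass through unchanged
    if device_count == 0:
        return "cpu"                          # no GPU available: fall back to CPU
    _, _, index = requested.partition(":")
    idx = int(index) if index else 0          # bare "cuda" means device 0
    return f"cuda:{idx}" if idx < device_count else "cuda:0"


print(resolve_device("cuda:1", 1))
```

On the single-GPU machine from the traceback, the requested "cuda:1" would be mapped to "cuda:0", which could then be used as the device argument when loading the model.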