
SillyTavern - Extras


The Extras project is discontinued and won't receive any new updates or modules. The vast majority of modules are available natively in the main SillyTavern application. You may still install and use Extras, but don't expect immediate support if you face any issues.


Recent news

  • April 24 2024 - The project is officially discontinued.
  • November 20 2023 - The project is relicensed as AGPLv3 to comply with the rest of ST organization policy. If you have any concerns about that, please raise a discussion in the appropriate channel.
  • November 16 2023 - Requirement files were remade from scratch to simplify the process of local installation.
    • Removed requirements-complete.txt, please use requirements.txt instead.
    • Unpinned the versions of all requirements except where strictly necessary.
    • Coqui TTS requirements moved to requirements-coqui.txt.
  • July 25 2023 - Extras now requires Python 3.11 to run; some of the new modules are incompatible with old Python 3.10 installs. To migrate using Conda, remove the old environment with conda remove --name extras --all and reinstall using the instructions below.

What is this

A set of APIs for various SillyTavern extensions.

You need to run the latest version of SillyTavern. Grab it here: How to install, Git repository

All modules, except for Stable Diffusion, run on the CPU by default, but they can be configured to use CUDA (via the --cuda command line option). When running all modules simultaneously, expect approximately 6 GB of RAM usage. Loading Stable Diffusion adds another couple of GB on top of that.

Some modules can be configured to use CUDA separately from the rest (e.g. --talkinghead-gpu, --coqui-gpu command line options). This is useful in low-VRAM setups, such as on a gaming laptop.
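
For example, a low-VRAM setup might keep every module on the CPU except Talkinghead (an illustrative combination of the documented flags, not a prescribed configuration):

python server.py --enable-modules=classify,talkinghead --talkinghead-gpu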

Try on Colab (will give you a link to Extras API): Open In Colab

Colab link: https://colab.research.google.com/github/SillyTavern/SillyTavern/blob/release/colab/GPU.ipynb

Documentation: https://docs.sillytavern.app/

How to run

IMPORTANT! Requirement files explained

  • The default requirements.txt installs PyTorch with CUDA support.
  • If you run on an AMD GPU, use the requirements-rocm.txt file instead.
  • If you run on Apple Silicon (ARM series), use the requirements-silicon.txt file instead.
  • If you want to use Coqui TTS, install requirements-coqui.txt after choosing the requirements from the list above.
  • If you want to use RVC, install requirements-rvc.txt after choosing the requirements from the list above.
  • BE WARNED THAT:
    • The Coqui package is extremely unstable and may break other packages or not work at all in your environment.
    • It's not really worth it.

Common errors when installing requirements

ERROR: Could not build wheels for hnswlib, which is required to install pyproject.toml-based projects

Installing the chromadb package requires one of the following:

  1. Having the Visual C++ build tools installed: https://visualstudio.microsoft.com/visual-cpp-build-tools/
  2. Installing hnswlib from conda: conda install -c conda-forge hnswlib

❗ IMPORTANT! The chromadb package is used only by the chromadb module for the old Smart Context extension, which is deprecated. You will likely not need it.

Missing modules reported by the SillyTavern extensions menu?

You must specify a list of module names to run via the --enable-modules command line option. See the Modules section.
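
For instance, to enable just the caption module:

python server.py --enable-modules=caption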

☁️ Colab

  • Open colab link
  • Select desired "extra" options and start the cell
  • Wait for it to finish
  • Get an API URL link from colab output under the ### SillyTavern Extensions LINK ### title
  • Start SillyTavern with extensions support: set enableExtensions to true in config.conf
  • Navigate to SillyTavern extensions menu and put in an API URL and tap "Connect" to load the extensions

What about mobile/Android/Termux? 🤔

Some folks in the community have had success running Extras on their phones via Ubuntu in Termux. This project wasn't made with mobile support in mind, so this guide is provided strictly for your information: https://rentry.org/STAI-Termux#downloading-and-running-tai-extras

❗ IMPORTANT!

We will NOT provide any support for running Extras on Android. Direct all your questions to the creator of the guide linked above.

💻 Locally

Option 1 - Conda (recommended) 🐍

PREREQUISITES

EXECUTE THESE COMMANDS ONE BY ONE IN THE CONDA COMMAND PROMPT.

TYPE/PASTE EACH COMMAND INTO THE PROMPT, HIT ENTER AND WAIT FOR IT TO FINISH!

  • Before the first run, create an environment (let's call it extras):
conda create -n extras
  • Now activate the newly created env
conda activate extras
  • Install Python 3.11
conda install python=3.11
  • Install the required system packages
conda install git
  • Clone this repository
git clone https://github.com/SillyTavern/SillyTavern-extras
  • Navigate to the freshly cloned repository
cd SillyTavern-extras
  • Install the project requirements
pip install -r requirements.txt
  • Run the Extensions API server
python server.py --enable-modules=caption,summarize,classify
  • Copy the Extras server API URL listed in the console window after it finishes loading up. On local installs, this defaults to http://localhost:5100.
  • Open your SillyTavern config.conf file (located in the base install folder), and look for a line "const enableExtensions". Make sure that line has "= true", and not "= false".
  • Start your SillyTavern server
  • Open the Extensions panel (via the 'Stacked Blocks' icon at the top of the page), paste the API URL into the input box, and click "Connect" to connect to the Extras extension server.
  • To run again, simply activate the environment and run these commands. Be sure to add any additional options for server.py (see below) that your setup requires.
conda activate extras
python server.py

❗ IMPORTANT! Talkinghead

Installation requirements for Talkinghead changed in January 2024. The live mode - i.e. the talkinghead module that powers the Talkinghead mode of Character Expressions - no longer needs any additional packages.

However, a manual poser app has been added, serving two purposes. First, it is a GUI editor for the Talkinghead emotion templates. Second, it can batch-generate static emotion sprites from a single Talkinghead image. The latter is handy if you want AI-powered posing (e.g. if you make new characters often) but don't want to run the live mode.

The manual poser app, and only that app, still requires the installation of an additional package that is not installed automatically due to incompatibility with Colab. If you want to be able to use the manual poser app, then run this after you have installed other requirements:

conda activate extras
pip install wxpython==4.2.1

The installation of the wxpython package can easily take half an hour on a fast CPU, as it needs to compile a whole GUI toolkit.

More information about Talkinghead can be found in its full documentation.

Option 2 - Vanilla 🍦

git clone https://github.com/SillyTavern/SillyTavern-extras
cd SillyTavern-extras
  • Run python -m pip install -r requirements.txt
  • Run python server.py --enable-modules=caption,summarize,classify
  • Get the API URL. Defaults to http://localhost:5100 if you run locally.
  • Start SillyTavern with extensions support: set enableExtensions to true in config.conf
  • Navigate to the SillyTavern extensions menu and put in an API URL and tap "Connect" to load the extensions

Modules

Name         Used by                Description
caption      Image Captioning       Image captioning
chromadb     Smart Context          Vector storage server
classify     Character Expressions  Text sentiment classification
coqui-tts    Coqui TTS              Coqui TTS server
edge-tts     Edge TTS               Microsoft Edge TTS client
embeddings   Vector Storage         The Extras vectorization source
rvc          RVC                    Real-time voice cloning
sd           Stable Diffusion       Image generation (remote A1111 server by default)
silero-tts   Silero TTS             Silero TTS server
summarize    Summarize              The Extras API backend
talkinghead  Character Expressions  AI-powered character animation (see full documentation)
websearch    Websearch              Google or DuckDuckGo search using Selenium headless browser

❗ IMPORTANT!

  • Character Expressions can connect to two Extras modules, classify and talkinghead.
    • classify updates the expression of the AI character's avatar automatically based on text sentiment analysis.
    • talkinghead provides AI-powered character animation. It also takes its expression from the Extras classify module.
      • To use Talkinghead, Extensions ⊳ Character Expressions ⊳ Local server classification in the ST GUI must be off, and classify must be enabled in Extras.
  • Smart Context is deprecated; superseded by Vector Storage.
    • The embeddings module makes the ingestion performance comparable with ChromaDB, as it uses the same vectorization backend.
    • Vector Storage does not use other Extras modules.
  • Summarize: the Main API is generally more capable, as it uses your main LLM to perform the summarization.
    • The summarize module is only used when you summarize with the Extras API. It uses a specialized BART summarization model, with a context size of 1024.

Options

Flag Description
--enable-modules Required option. Which modules to enable.
Expects a comma-separated list of module names. Ordering does not matter. See Modules
Example: --enable-modules=caption,sd
--port Specify the port on which the application is hosted. Default: 5100
--listen Host the app on the local network
--share Share the app on CloudFlare tunnel
--secure Adds API key authentication requirements. Highly recommended when paired with share!
--cpu Run the models on the CPU instead of CUDA. Enabled by default.
--mps or --m1 Run the models on Apple Silicon. Only for M1 and M2 processors.
--cuda Use CUDA (GPU+VRAM) to run modules if it is available. Otherwise, falls back to using CPU.
--cuda-device Specifies a CUDA device to use. Defaults to cuda:0 (first available GPU).
--talkinghead-gpu Use CUDA (GPU+VRAM) for Talkinghead. Highly recommended, 10-30x FPS increase in animation.
--talkinghead-model Load a specific variant of the THA3 AI poser model for Talkinghead.
Default: auto (which is separable_half on GPU, separable_float on CPU).
--talkinghead-models If the THA3 AI poser models are not yet installed, downloads and installs them.
Expects a HuggingFace model ID.
Default: OktayAlpk/talking-head-anime-3
--coqui-gpu Use GPU for coqui TTS (if available).
--coqui-model If provided, downloads and preloads a coqui TTS model. Default: none.
Example: tts_models/multilingual/multi-dataset/bark
--summarization-model Load a custom summarization model.
Expects a HuggingFace model ID.
Default: Qiliang/bart-large-cnn-samsum-ChatGPT_v3
--classification-model Load a custom sentiment classification model.
Expects a HuggingFace model ID.
Default (6 emotions): nateraw/bert-base-uncased-emotion
Other solid option is (28 emotions): joeddav/distilbert-base-uncased-go-emotions-student
For Chinese language: touch20032003/xuyuan-trial-sentiment-bert-chinese
--captioning-model Load a custom captioning model.
Expects a HuggingFace model ID.
Default: Salesforce/blip-image-captioning-large
--embedding-model Load a custom text embedding (vectorization) model. Both the embeddings and chromadb modules use this.
The backend is sentence_transformers, so check there for info on supported models.
Expects a HuggingFace model ID.
Default: sentence-transformers/all-mpnet-base-v2
--chroma-host Specifies a host IP for a remote ChromaDB server.
--chroma-port Specifies an HTTP port for a remote ChromaDB server.
Default: 8000
--sd-model Load a custom Stable Diffusion image generation model.
Expects a HuggingFace model ID.
Default: ckpt/anything-v4.5-vae-swapped
Must have VAE pre-baked in PyTorch format or the output will look drab!
--sd-cpu Force the Stable Diffusion generation pipeline to run on the CPU.
SLOW!
--sd-remote Use a remote SD backend.
Supported APIs: sd-webui
--sd-remote-host Specify the host of the remote SD backend
Default: 127.0.0.1
--sd-remote-port Specify the port of the remote SD backend
Default: 7860
--sd-remote-ssl Use SSL for the remote SD backend
Default: False
--sd-remote-auth Specify the username:password for the remote SD backend (if required)
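
For example, to serve several modules on the GPU through a Cloudflare tunnel with API key authentication (an illustrative combination of the flags above, not a recommendation):

python server.py --enable-modules=caption,summarize,classify --cuda --secure --share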

Coqui TTS

Running on Mac M1

ImportError: symbol not found

If you're getting the following error when running the coqui-tts module on an M1 Mac:

ImportError: dlopen(/Users/user/.../lib/python3.11/site-packages/MeCab/_MeCab.cpython-311-darwin.so, 0x0002): symbol not found in flat namespace '__ZN5MeCab11createModelEPKc'

Do the following:

  1. Install homebrew: https://brew.sh/
  2. Build and install the mecab package
brew install --build-from-source mecab
ARCHFLAGS='-arch arm64' pip install --no-binary :all: --compile --use-pep517 --no-cache-dir --force mecab-python3

ChromaDB

❗ IMPORTANT! ChromaDB is used only by the chromadb module for the old Smart Context extension, which is deprecated. You will likely not need it.

ChromaDB is a blazing fast and open source database that is used for long-term memory when chatting with characters. It can be run in-memory or on a local server on your LAN.

NOTE: You should NOT run ChromaDB on a cloud server. There are no methods for authentication (yet), so unless you want to expose an unauthenticated ChromaDB to the world, run this on a local server in your LAN.

In-memory setup

Run the extras server with the chromadb module enabled (recommended).

Remote setup

Use this if you want to use ChromaDB with docker or host it remotely. If you don't know what that means and only want to use ChromaDB with ST on your local device, use the 'in-memory' instructions instead.

Prerequisites: Docker, Docker compose (make sure you're running in rootless mode with the systemd service enabled if on Linux).

Steps:

  1. Run git clone https://github.com/chroma-core/chroma chromadb and cd chromadb
  2. Run docker-compose up -d --build to build ChromaDB. This may take a long time depending on your system
  3. Once the build process is finished, ChromaDB should be running in the background. You can check with the command docker ps
  4. On your client machine, specify your local server IP in the --chroma-host argument (e.g. --chroma-host=192.168.1.10)

If you are running ChromaDB on the same machine as SillyTavern, you will have to change the port of one of the services. To do this for ChromaDB:

  1. Run docker ps to get the container ID and then docker container stop <container ID>
  2. Enter the ChromaDB git repository cd chromadb
  3. Open docker-compose.yml and look for the line starting with uvicorn chromadb.app:app
  4. Change the --port argument to whatever port you want.
  5. Look for the ports category and change the occurrences of 8000 to whatever port you chose in step 4.
  6. Save and exit. Then run docker-compose up --detach
  7. On your client machine, make sure to specify the --chroma-port argument (e.g. --chroma-port=<your-port-here>) along with the --chroma-host argument.

API Endpoints

This section is developer documentation, containing usage examples of the API endpoints.

This is kept up-to-date on a best-effort basis, but there is a risk of this documentation being out of date. When in doubt, refer to the actual source code.

Get list of enabled modules

GET /api/modules

Input

None

Output

{"modules":["caption", "classify", "summarize"]}

Image captioning

POST /api/caption

Input

{ "image": "base64 encoded image" }

Output

{ "caption": "caption of the posted image" }

Text summarization

POST /api/summarize

Input

{ "text": "text to be summarize", "params": {} }

Output

{ "summary": "summarized text" }

Optional: params object for control over summarization:

Name Default value
temperature 1.0
repetition_penalty 1.0
max_length 500
min_length 200
length_penalty 1.5
bad_words ["\n", '"', "*", "[", "]", "{", "}", ":", "(", ")", "<", ">"]
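
For example, overriding two of the defaults from Python (a sketch assuming the requests library; the text and values are placeholders):

import requests

payload = {
    "text": "Long chat transcript to be summarized...",  # placeholder text
    "params": {"max_length": 300, "min_length": 100},    # override two defaults
}
resp = requests.post("http://localhost:5100/api/summarize", json=payload)
print(resp.json()["summary"])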

Text sentiment classification

POST /api/classify

Input

{ "text": "text to classify sentiment of" }

Output

{
    "classification": [
        {
            "label": "joy",
            "score": 1.0
        },
        {
            "label": "anger",
            "score": 0.7
        },
        {
            "label": "love",
            "score": 0.6
        },
        {
            "label": "sadness",
            "score": 0.5
        },
        {
            "label": "fear",
            "score": 0.4
        },
        {
            "label": "surprise",
            "score": 0.3
        }
    ]
}

NOTES

  1. Sorted in descending score order
  2. The list of categories is defined by the classification model
  3. Scores range from 0.0 to 1.0
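
For example, picking out the dominant emotion from Python (a sketch assuming the requests library):

import requests

resp = requests.post("http://localhost:5100/api/classify",
                     json={"text": "I can't believe we won!"})
top = resp.json()["classification"][0]  # results are sorted by descending score
print(top["label"], top["score"])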

Stable Diffusion image generation

POST /api/image

Input

{ "prompt": "prompt to be generated", "sampler": "DDIM", "steps": 20, "scale": 6, "model": "model_name" }

Output

{ "image": "base64 encoded image" }

NOTES

  1. Only the "prompt" parameter is required
  2. Both "sampler" and "model" parameters only work when using a remote SD backend

Get available Stable Diffusion models

GET /api/image/models

Output

{ "models": [list of all available model names] }

Get available Stable Diffusion samplers

GET /api/image/samplers

Output

{ "samplers": [list of all available sampler names] }

Get currently loaded Stable Diffusion model

GET /api/image/model

Output

{ "model": "name of the current loaded model" }

Load a Stable Diffusion model (remote)

POST /api/image/model

Input

{ "model": "name of the model to load" }

Output

{ "previous_model": "name of the previous model", "current_model": "name of the newly loaded model" }

Generate Silero TTS voice

POST /api/tts/generate

Input

{ "speaker": "speaker voice_id", "text": "text to narrate" }

Output

WAV audio file.
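
For example, from Python (a sketch assuming the requests library; en_0 is a voice_id from the speakers list below):

import requests

resp = requests.post("http://localhost:5100/api/tts/generate",
                     json={"speaker": "en_0", "text": "Hello from Extras!"})
with open("narration.wav", "wb") as f:  # the response body is the WAV data
    f.write(resp.content)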

Get Silero TTS voices

GET /api/tts/speakers

Output

[
    {
        "name": "en_0",
        "preview_url": "http://127.0.0.1:5100/api/tts/sample/en_0",
        "voice_id": "en_0"
    }
]

Get Silero TTS voice sample

GET /api/tts/sample/<voice_id>

Output

WAV audio file.

Compute text embeddings (vectorize)

POST /api/embeddings/compute

This is a vectorization source (text embedding provider) for the Vector Storage built-in extension of ST.

If you have many text items to vectorize (e.g. chat history, or chunks for RAG ingestion), send them in all at once. This allows the backend to batch the input, allocating the available compute resources efficiently, and thus running much faster (compared to processing a single item at a time).

The embeddings are always normalized.

Input

For one text item:

{ "text": "The quick brown fox jumps over the lazy dog." }

For multiple text items, just put them in an array:

{ "text": ["The quick brown fox jumps over the lazy dog.",
           "Lorem ipsum dolor sit amet, consectetur adipiscing elit.",
           ...] }

Output

When the input was one text item, returns one vector (the embedding of that text item) as an array:

{ "embedding": [numbers] }

When the input was multiple text items, returns multiple vectors (one for each input text item) as an array of arrays:

{ "embedding": [[numbers],
                [numbers], ...] }
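
A sketch of batched usage from Python (assuming the requests library). Since the embeddings are normalized, the cosine similarity of two items is simply their dot product:

import requests

texts = ["The quick brown fox jumps over the lazy dog.",
         "Lorem ipsum dolor sit amet, consectetur adipiscing elit."]
resp = requests.post("http://localhost:5100/api/embeddings/compute", json={"text": texts})
vec_a, vec_b = resp.json()["embedding"]

# Normalized vectors: the dot product equals the cosine similarity.
similarity = sum(a * b for a, b in zip(vec_a, vec_b))
print(similarity)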

Add messages to chromadb

POST /api/chromadb

Input

{
    "chat_id": "chat1 - 2023-12-31",
    "messages": [
        {
            "id": "633a4bd1-8350-46b5-9ef2-f5d27acdecb7",
            "date": 1684164339877,
            "role": "user",
            "content": "Hello, AI world!",
            "meta": "this is meta"
        },
        {
            "id": "8a2ed36b-c212-4a1b-84a3-0ffbe0896506",
            "date": 1684164411759,
            "role": "assistant",
            "content": "Hello, Hooman!"
        }
    ]
}

Output

{ "count": 2 }

Query chromadb

POST /api/chromadb/query

Input

{
    "chat_id": "chat1 - 2023-12-31",
    "query": "Hello",
    "n_results": 2,
}

Output

[
    {
        "id": "633a4bd1-8350-46b5-9ef2-f5d27acdecb7",
        "date": 1684164339877,
        "role": "user",
        "content": "Hello, AI world!",
        "distance": 0.31,
        "meta": "this is meta"
    },
    {
        "id": "8a2ed36b-c212-4a1b-84a3-0ffbe0896506",
        "date": 1684164411759,
        "role": "assistant",
        "content": "Hello, Hooman!",
        "distance": 0.29
    }
]

Delete the messages from chromadb

POST /api/chromadb/purge

Input

{ "chat_id": "chat1 - 2023-04-12" }

Get a list of Edge TTS voices

GET /api/edge-tts/list

Output

[{'Name': 'Microsoft Server Speech Text to Speech Voice (af-ZA, AdriNeural)', 'ShortName': 'af-ZA-AdriNeural', 'Gender': 'Female', 'Locale': 'af-ZA', 'SuggestedCodec': 'audio-24khz-48kbitrate-mono-mp3', 'FriendlyName': 'Microsoft Adri Online (Natural) - Afrikaans (South Africa)', 'Status': 'GA', 'VoiceTag': {'ContentCategories': ['General'], 'VoicePersonalities': ['Friendly', 'Positive']}}]

Generate Edge TTS voice

POST /api/edge-tts/generate

Input

{ "text": "Text to narrate", "voice": "af-ZA-AdriNeural", "rate": 0 }

Output

MP3 audio file.

Load a Coqui TTS model

GET /api/coqui-tts/load

Input

  • _model (string, required): The name of the Coqui TTS model to load.
  • _gpu (string, optional): Use the GPU to load the model.
  • _progress (string, optional): Show a progress bar in the terminal.

{ "_model": "tts_models--en--jenny--jenny\model.pth" }
{ "_gpu": "False" }
{ "_progress": "True" }

Output

"Loaded"

Get a list of Coqui TTS voices

GET /api/coqui-tts/list

Output

["tts_models--en--jenny--jenny\\model.pth", "tts_models--en--ljspeech--fast_pitch\\model_file.pth", "tts_models--en--ljspeech--glow-tts\\model_file.pth", "tts_models--en--ljspeech--neural_hmm\\model_file.pth", "tts_models--en--ljspeech--speedy-speech\\model_file.pth", "tts_models--en--ljspeech--tacotron2-DDC\\model_file.pth", "tts_models--en--ljspeech--vits\\model_file.pth", "tts_models--en--ljspeech--vits--neon\\model_file.pth.tar", "tts_models--en--multi-dataset--tortoise-v2", "tts_models--en--vctk--vits\\model_file.pth", "tts_models--et--cv--vits\\model_file.pth.tar", "tts_models--multilingual--multi-dataset--bark", "tts_models--multilingual--multi-dataset--your_tts\\model_file.pth", "tts_models--multilingual--multi-dataset--your_tts\\model_se.pth"]

Get a list of the loaded Coqui model speakers

GET /api/coqui-tts/multspeaker

Output

{"0": "female-en-5", "1": "female-en-5\n", "2": "female-pt-4\n", "3": "male-en-2", "4": "male-en-2\n", "5": "male-pt-3\n"}

Get a list of the loaded Coqui model languages

GET /api/coqui-tts/multlang

Output

{"0": "en", "1": "fr-fr", "2": "pt-br"}

Generate Coqui TTS voice

POST /api/coqui-tts/generate

Input

{
  "text": "Text to narrate",
  "speaker_id": "0",
  "mspker": null,
  "language_id": null,
  "style_wav": null
}

Output

MP3 audio file.

Load a talkinghead character

POST /api/talkinghead/load

Input

A multipart form upload (FormData) with an image file in a field named "file". The posted file should be a PNG image in RGBA format. The optimal resolution is 512x512. See the talkinghead README for details.

Example

'http://localhost:5100/api/talkinghead/load'

Output

'OK'
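
For example, from Python (a sketch assuming the requests library; character.png is a hypothetical 512x512 RGBA PNG):

import requests

with open("character.png", "rb") as f:  # hypothetical talkinghead sprite
    resp = requests.post("http://localhost:5100/api/talkinghead/load",
                         files={"file": f})
print(resp.text)  # 'OK'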

Load talkinghead emotion templates (or reset them to defaults)

POST /api/talkinghead/load_emotion_templates

Input

{"anger": {"eyebrow_angry_left_index": 1.0,
           ...}
 "curiosity": {"eyebrow_lowered_left_index": 0.5895,
               ...}
 ...}

For details, see Animator.load_emotion_templates in talkinghead/tha3/app/app.py. This is essentially the format used by talkinghead/emotions/_defaults.json.

Any emotions NOT supplied in the posted JSON will revert to server defaults. In any supplied emotion, any morph NOT supplied will default to zero. This allows making the templates shorter.

To reset all emotion templates to their server defaults, send a blank JSON.

Output

"OK"

Load talkinghead animator/postprocessor settings (or reset them to defaults)

POST /api/talkinghead/load_animator_settings

Input

{"target_fps": 25,
 "breathing_cycle_duration": 4.0,
 "postprocessor_chain": [["bloom", {}],
                         ["chromatic_aberration", {}],
                         ["vignetting", {}],
                         ["translucency", {"alpha": 0.9}],
                         ["alphanoise", {"magnitude": 0.1, "sigma": 0.0}],
                         ["banding", {}],
                         ["scanlines", {"dynamic": true}]]
 ...}

For a full list of supported settings, see animator_defaults and Animator.load_animator_settings, both in talkinghead/tha3/app/app.py.

Particularly for "postprocessor_chain", see talkinghead/tha3/app/postprocessor.py. The postprocessor applies pixel-space glitch artistry, which can e.g. make your talkinghead look like a sci-fi hologram (the example above does this). The postprocessing filters are applied in the order they appear in the list.

To reset all animator/postprocessor settings to their server defaults, send a blank JSON.

Output

"OK"

Animate the talkinghead character to start talking

GET /api/talkinghead/start_talking

Example

'http://localhost:5100/api/talkinghead/start_talking'

Output

"talking started"

Animate the talkinghead character to stop talking

GET /api/talkinghead/stop_talking

Example

'http://localhost:5100/api/talkinghead/stop_talking'

Output

"talking stopped"

Set the talkinghead character's emotion

POST /api/talkinghead/set_emotion

Available emotions: see talkinghead/emotions/*.json. An emotion must be specified, but if it is not available, this operation defaults to "neutral", which must always be available. This endpoint is the backend behind the /emote slash command in talkinghead mode.

Input

{"emotion_name": "curiosity"}

Example

'http://localhost:5100/api/talkinghead/set_emotion'

Output

"emotion set to curiosity"

Output the animated talkinghead sprite

GET /api/talkinghead/result_feed

Output

Animated transparent image, each frame a 512x512 PNG image in RGBA format.

Perform web search

POST /api/websearch

Available engines: google (default), duckduckgo

Input

{ "query": "what is beauty?", "engine": "google" }

Output

{ "results": "that would fall within the purview of your conundrums of philosophy", "links": ["http://example.com"] }


sillytavern-extras's Issues

Image is in generations folder but won't show up in sillytavern

Steps to Reproduce: Try to generate image

SD inputs:
{'prompt': 'She gasps loudly before speaking out loudly', 'sampler': 'Euler a', 'steps': 20, 'scale': 7, 'width': 512, 'height': 512, 'prompt_prefix': 'best quality, absurdres, masterpiece, detailed, intricate, colorful,', 'negative_prompt': 'lowres, bad anatomy, bad hands, text, error, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry', 'restore_faces': False, 'enable_hr': False, 'karras': False}
[2023-06-11 03:17:01,729] ERROR in app: Exception on /api/image [POST]
Traceback (most recent call last):
File "C:\Users\joshy.conda\envs\extras\lib\site-packages\PIL\JpegImagePlugin.py", line 643, in _save
rawmode = RAWMODE[im.mode]
KeyError: 'RGBA'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "C:\Users\joshy.conda\envs\extras\lib\site-packages\flask\app.py", line 2190, in wsgi_app
response = self.full_dispatch_request()
File "C:\Users\joshy.conda\envs\extras\lib\site-packages\flask\app.py", line 1486, in full_dispatch_request
rv = self.handle_user_exception(e)
File "C:\Users\joshy.conda\envs\extras\lib\site-packages\flask_cors\extension.py", line 165, in wrapped_function
return cors_after_request(app.make_response(f(*args, **kwargs)))
File "C:\Users\joshy.conda\envs\extras\lib\site-packages\flask\app.py", line 1484, in full_dispatch_request
rv = self.dispatch_request()
File "C:\Users\joshy.conda\envs\extras\lib\site-packages\flask\app.py", line 1469, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "C:\Users\joshy\SillyTavern-extras\server.py", line 263, in decorated_view
return fn(*args, **kwargs)
File "C:\Users\joshy\SillyTavern-extras\server.py", line 570, in api_image
base64image = image_to_base64(image, quality=90)
File "C:\Users\joshy\SillyTavern-extras\server.py", line 393, in image_to_base64
image.save(buffered, format="JPEG", quality=quality)
File "C:\Users\joshy.conda\envs\extras\lib\site-packages\PIL\Image.py", line 2431, in save
save_handler(self, fp, filename)
File "C:\Users\joshy.conda\envs\extras\lib\site-packages\PIL\JpegImagePlugin.py", line 646, in _save
raise OSError(msg) from e
OSError: cannot write mode RGBA as JPEG
127.0.0.1 - - [11/Jun/2023 03:17:01] "POST /api/image HTTP/1.1" 500 -

ElevenLabs multilingual issue (please help)

When using ElevenLabs as TTS and enabling the multilingual option, it doesn't seem to work and seems to keep the monolingual model.
So when reading a French text, it keeps the English language and accent.

Please, someone help me fix this.

How do I make it speak French instead of English?

Can't find the google colab API link after install

I downloaded the Colab stuff but I can't find the API link anywhere. I did get an error saying
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
google-colab 1.0.0 requires requests==2.27.1, but you have requests 2.31.0 which is incompatible.
torchdata 0.6.1 requires torch==2.0.1, but you have torch 2.0.0+cu117 which is incompatible.
torchtext 0.15.2 requires torch==2.0.1, but you have torch 2.0.0+cu117 which is incompatible.

But it looked like everything still got downloaded, so I don't really know what happened. Any help will be greatly appreciated. Thank you in advance!

Character definitions are a waste of tokens?

With the introduction of the powerful ChromaDB, which adjusts context to the needs of the user, having even relatively small permanent definitions feels like a waste of tokens. Would it be possible to provide users with an option to "offload" them into the database, making those definitions temporary instead of permanent? Maybe the same thing could be done with world info too?

When ChromaDB is active, even the summarize module feels unnecessary.

Is it possible to run the captioning model with WD1.4 tagger?

Such as this one: https://huggingface.co/SmilingWolf/wd-v1-4-swinv2-tagger-v2
It's a lot better than the default one, but I'm not sure how to make it run on the extension.
--enable-modules=caption --captioning-model=SmilingWolf/wd-v1-4-swinv2-tagger-v2 gives errors like this OSError: SmilingWolf/wd-v1-4-swinv2-tagger-v2 does not appear to have a file named config.json. Checkout 'https://huggingface.co/SmilingWolf/wd-v1-4-swinv2-tagger-v2/main' for available files.

Stable Diffusion prompts being changed before sending

When generating images using /sd, any periods in the prompt are being replaced by a comma and a space, such that if the below is entered:

"/sd a picture of a (black:1.25) cat"

The following is sent to SD

"a picture of a (black:1, 25) cat"

pip install -r requirements-complete.txt

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
extract-msg 0.41.1 requires chardet<6,>=4.0.0, which is not installed.
argilla 1.7.0 requires httpx<0.24,>=0.15, which is not installed.
tb-nightly 2.12.0a20230126 requires google-auth-oauthlib<0.5,>=0.4.1, but you have google-auth-oauthlib 1.0.0 which is incompatible.
tb-nightly 2.12.0a20230126 requires tensorboard-data-server<0.7.0,>=0.6.0, but you have tensorboard-data-server 0.7.0 which is incompatible.
sentry-sdk 1.22.1 requires urllib3<2.0.0, but you have urllib3 2.0.3 which is incompatible.
open-clip-torch 2.7.0 requires protobuf==3.20.0, but you have protobuf 4.23.2 which is incompatible.
mediapipe 0.10.0 requires protobuf<4,>=3.11, but you have protobuf 4.23.2 which is incompatible.
google-auth 2.19.1 requires urllib3<2.0, but you have urllib3 2.0.3 which is incompatible.
clean-fid 0.1.29 requires requests==2.25.1, but you have requests 2.31.0 which is incompatible.
argilla 1.7.0 requires numpy<1.24.0, but you have numpy 1.24.3 which is incompatible.
argilla 1.7.0 requires pandas<2.0.0,>=1.0.0, but you have pandas 2.0.2 which is incompatible.
pyopenssl 23.0.0 requires cryptography<40,>=38.0.0, but you have cryptography 40.0.2 which is incompatible.

Prompt and SD not working?

I can't for the life of me get either of these modules to show up within SillyTavern. Am I missing something or do I just not understand how to use those modules? What triggers the prompt/image generation?

Feature Request: Elevenlabs Default

When using Elevenlabs for TTS, it would be nice if, when "Auto Generation" is checked, Elevenlabs were called for each message from every character, instead of only the characters listed in the voice map. Alternatively, the ability to just set a voice for all characters, like:

All:Bella

[Feature request] Chinese support for summarize and classify modules

After cross-testing with Chinese and English, I found that the summarize and classify modules do not support Chinese. For the summarize module, Chinese input makes the summarized plot completely irrelevant. For the classify module, Chinese input results in completely wrong emotion recognition.

I know that by design, the models or principles of these two modules may only support English, so it would be nice if they could support Chinese or more languages, thank you very much.

Having some issues running start.bat

Every time I run it I get hit with the following error message

C:\Users\lordj\AppData\Local\GitHubDesktop\app-3.2.3\SillyTavern>call npm install
node:internal/modules/cjs/loader:936
throw err;
^

Error: Cannot find module 'C:\Users\lordj\solana-mint\nodestuff\node_modules\npm\bin\npm-cli.js'
at Function.Module._resolveFilename (node:internal/modules/cjs/loader:933:15)
at Function.Module._load (node:internal/modules/cjs/loader:778:27)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12)
at node:internal/main/run_main_module:17:47 {
code: 'MODULE_NOT_FOUND',
requireStack: []
}
node:internal/modules/cjs/loader:936
throw err;
^

Error: Cannot find module 'C:\Users\lordj\solana-mint\nodestuff\node_modules\npm\bin\npm-cli.js'
at Function.Module._resolveFilename (node:internal/modules/cjs/loader:933:15)
at Function.Module._load (node:internal/modules/cjs/loader:778:27)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12)
at node:internal/main/run_main_module:17:47 {
code: 'MODULE_NOT_FOUND',
requireStack: []
}
node:internal/modules/cjs/loader:936
throw err;
^

Error: Cannot find module 'yargs/yargs'
Require stack:

  • C:\Users\lordj\AppData\Local\GitHubDesktop\app-3.2.3\SillyTavern\server.js
    at Function.Module._resolveFilename (node:internal/modules/cjs/loader:933:15)
    at Function.Module._load (node:internal/modules/cjs/loader:778:27)
    at Module.require (node:internal/modules/cjs/loader:1005:19)
    at require (node:internal/modules/cjs/helpers:102:18)
    at Object.<anonymous> (C:\Users\lordj\AppData\Local\GitHubDesktop\app-3.2.3\SillyTavern\server.js:4:15)
    at Module._compile (node:internal/modules/cjs/loader:1101:14)
    at Object.Module._extensions..js (node:internal/modules/cjs/loader:1153:10)
    at Module.load (node:internal/modules/cjs/loader:981:32)
    at Function.Module._load (node:internal/modules/cjs/loader:822:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12) {
    code: 'MODULE_NOT_FOUND',
    requireStack: [
    'C:\Users\lordj\AppData\Local\GitHubDesktop\app-3.2.3\SillyTavern\server.js'

Now I am very dumb so I'm sure I messed something up during the installation but any form of help would be greatly appreciated.

[Feature request] ChromaDB vector database

Add support for chromadb to manage effectively infinite context sizes.

It's already implemented in oobabooga - https://github.com/oobabooga/text-generation-webui/tree/main/extensions/superbooga
I'll try to make a pull request for this in a week, but if you have some suggestions for the API design, please tell me.

For now I want to make these endpoints:

  • /api/chroma/add-message - add message to db
    • Body:
      • id - some id of message, to sort them chronologically
      • chat_id - id of chat to store
      • content - text of message
      • role - user/system/bot
  • /api/chroma/query - query relevant messages
    • Body:
      • query - for example, latest user's message
      • chat_id - id of chat to query
    • Answer:
      • Array of id, chat_id, content and role

And of course, all of this should later be connected to SillyTavern. There is very high demand for this feature in AI roleplay communities.

ERROR: Could not build wheels for hnswlib, which is required to install pyproject.toml-based projects

When running this command (python -m pip install -r requirements-complete.txt) to open the extension, I received the following error:
Building wheels for collected packages: hnswlib
Building wheel for hnswlib (pyproject.toml) ... error
error: subprocess-exited-with-error

× Building wheel for hnswlib (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [5 lines of output]
running bdist_wheel
running build
running build_ext
building 'hnswlib' extension
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for hnswlib
Failed to build hnswlib
ERROR: Could not build wheels for hnswlib, which is required to install pyproject.toml-based projects

How can I fix this error? I know it's related to pip install at the very least.

cannot import name 'PartialState' from 'accelerate'

Following the conda installation guide, while executing
python server.py --enable-modules=caption,summarize,classify
I'm getting the following error:
cannot import name 'PartialState' from 'accelerate'.

I'm sure I'm doing something wrong, but I don't know enough about Python to figure it out :D

Trace:

Traceback (most recent call last):
  File "C:\Users\user\AppData\Roaming\Python\Python310\site-packages\transformers\utils\import_utils.py", line 1153, in _get_module
    return importlib.import_module("." + module_name, self.__name__)
  File "C:\Users\user\.conda\envs\extras\lib\importlib\__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 883, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "C:\Users\user\AppData\Roaming\Python\Python310\site-packages\transformers\pipelines\__init__.py", line 44, in <module>
    from .audio_classification import AudioClassificationPipeline
  File "C:\Users\user\AppData\Roaming\Python\Python310\site-packages\transformers\pipelines\audio_classification.py", line 21, in <module>
    from .base import PIPELINE_INIT_ARGS, Pipeline
  File "C:\Users\user\AppData\Roaming\Python\Python310\site-packages\transformers\pipelines\base.py", line 36, in <module>
    from ..modelcard import ModelCard
  File "C:\Users\user\AppData\Roaming\Python\Python310\site-packages\transformers\modelcard.py", line 48, in <module>
    from .training_args import ParallelMode
  File "C:\Users\user\AppData\Roaming\Python\Python310\site-packages\transformers\training_args.py", line 67, in <module>
    from accelerate import PartialState
ImportError: cannot import name 'PartialState' from 'accelerate' (C:\Users\user\AppData\Roaming\Python\Python310\site-packages\accelerate\__init__.py)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "E:\ai\TavernAI-extras\server.py", line 6, in <module>
    from transformers import AutoTokenizer, AutoProcessor, pipeline
  File "<frozen importlib._bootstrap>", line 1075, in _handle_fromlist
  File "C:\Users\user\AppData\Roaming\Python\Python310\site-packages\transformers\utils\import_utils.py", line 1143, in __getattr__
    module = self._get_module(self._class_to_module[name])
  File "C:\Users\user\AppData\Roaming\Python\Python310\site-packages\transformers\utils\import_utils.py", line 1155, in _get_module
    raise RuntimeError(
RuntimeError: Failed to import transformers.pipelines because of the following error (look up to see its traceback):
cannot import name 'PartialState' from 'accelerate' (C:\Users\user\AppData\Roaming\Python\Python310\site-packages\accelerate\__init__.py)

Characters in groupchat are outputting the prompt text with their own name in all fields

Instead of having:

Person1: Hello
Person2: Hey! How are you?
Person1: I'm doing great!
Person2: That's great!

When the prompt text comes up in the terminal, if Person1 is talking, it looks like

Person1: Hello
Person1: Hey! How are you?
Person1: I'm doing great!
Person1: That's great!

If Person2 is talking, it looks like

Person2: Hello
Person2: Hey! How are you?
Person2: I'm doing great!
Person2: That's great!

This basically makes it impossible for two characters to have a conversation with each other.

memory extra on POE

Hi, if I install the extra that allows memory (summarize), would it work with POE?

Thanks

ModuleNotFoundError: No module named 'flask'

Getting this error when trying to install the extra modules:

$ python server.py --enable-modules=caption
Traceback (most recent call last):
File "N:\AI\Text\TavernAI-extras\server.py", line 2, in
from flask import Flask, jsonify, request, render_template_string, abort
ModuleNotFoundError: No module named 'flask'

[bug]

After the recent update, Extras doesn't start in CPU-only mode. It tries to use CUDA regardless of whether the --cpu option is present.

Commit a533bd5

On commit 5b8d9e2 --cpu still works.

SD Hires - Possible to config?

I was wondering if there's a way to configure the hires option in the menu or configs. The default config is really bad. How can I use custom settings for it?

Problems with group chats

I have found a couple issues with the group chats

  • You cannot edit messages
  • pressing 'regenerate message' generates a new message instead of regenerating the previous one

What about Android?

As SillyTavern itself works on Android perfectly, it's a bit confusing to see TavernAI-extras being unable to complete pip install -r requirements.txt. I would like to get it running in Termux somehow.

[Screenshot of the Termux error output]

[BUG] Very strange behavior of the Stable Diffusion module

I type "/image 1girl" into the chat and SillyTavern instead of sending a request to the SD sends a request to the text model: " + 'Pause your roleplay and provide a detailed and vivid description of 1girl]',"
The request then goes to the SD model, but not 1girl:
"["Bully mAId: Okay, Master! She takes off her shoes, and walks barefoot across the floor, her feet making a slight tapping noise against the wood. Her hair was pulled back into a messy bun, with some strands falling over her face. When she spoke, she would tug at those same messy strands of hair.* You know... my job is pretty good. My boss always says that I'm one of the best maids he has working for him."]"

[question/bug?] Can the chromadb module work on the CPU?

When I start Extras with this command line:
python server.py --cpu --enable-modules=caption,summarize,classify,chromadb
it still uses the GPU (for chromadb only; the other modules use the CPU).

Is this intended behavior or a bug? I already read the help files to find info about this, but didn't get any answer from them...

ST-Extras will not connect to SD host.

So as the title states, SillyTavern Extras will not connect to my Stable Diffusion instances, remote (internal network machine 192.168.xxx.xxx) or host (127.0.0.1). I installed the requirements-complete.txt file without any errors, and it seems like all of the other packages are working. Even the SD module seems like it wants to work, but I am met with "cannot connect to SD backend" during server.py startup.

I'm running automatic1111. I've included "sd" in the --enable-modules command, and I've tried --sd-remote --sd-remote-host 192.168.xxx.xxx --sd-remote-port 7860 and every combination, even going as far as changing the default IP in constants.py to reflect the proper IP address. I can open a browser window on the machine that is hosting the SillyTavern UI and connect to my Automatic1111 SD instance without issue. The same goes for when I spin up automatic1111 on the same machine that ST-extras is running on.

This seems like pretty much a "just me" problem, but this is also a new feature and so far not very well documented. I feel like it should just work, but maybe not. Any help would be greatly appreciated. There is no other error message or debugging info, so that's kind of all the information I have on it.

[Feature Request] Additional UI Windows like Character Expressions

Could we take what is going on with the Character expressions and add a second "window" on the right side? This window could serve a few different purposes:

  • Show a second character in group chats - DONE
  • Show "key word images"
    • Keyword images would just be images that can be set in the same way the character expressions are except they simply respond to a keyword or world info entry (when a world info entry is triggered it could also trigger a keyword image).
    • I'm thinking this is a pretty straightforward functional change that could lead to a lot of really cool interactions and stuff. But it could be as simple as "if bot says keyword, then show keyword_img.png".

Cannot connect to API (Mandatory API key)

UH OH, BUG!!!

SillyTavern is asking me for a non-existent API key to use SillyTavern-extras, but there is no API key, so SillyTavern refuses to connect to the server. I don't know what's going on. Is this a bug, is this a feature or did I do something wrong?

[Feature Request] LoRA support for SD (cloud)

I was wondering if it would be possible to add a LoRA feature for SD (cloud) in future updates.

I'm not sure if it's viable at this point, since Google is quite harsh with things related to SD running in Colab, but it would be nice to have some consistency to the character's look when chatting.

Thanks in advance.

[chromadb] No module named 'sentence_transformers'

Hi,

For some reason I cannot use chromadb, since it looks like it needs a module called "sentence_transformers".

Just in case, I reran requirements.txt and requirements-complete.txt, but no success.

from sentence_transformers import SentenceTransformer
ModuleNotFoundError: No module named 'sentence_transformers'

I'm on windows and in the right env, the rest works fine.
Did I miss something?

Questions about the summarization / memory plugin

It's a bit unclear how the variables Buffer and Summary relate. Through testing this is my assumption:

  • Buffer is the total tokens available for the summarization plugin to use
  • Summary is the amount of tokens reserved for long term memory

So if Buffer is 768 tokens, and Summary is 512 tokens, that would leave 768 - 512 = 256 tokens left over for short term memory. Is this correct?

Also, I'm unsure how short and long term memory is represented in the memory contents. I was expecting short and long term memory to go into different places, but they seem to be combined in a single paragraph.

[Question] Any way to use a second GPU

Hello,

My main GPU is totally used by a loaded model, but I have a second GPU available locally: is it possible to ask the model to use the second GPU for all requests?
If it's possible, is it doable on Windows?

Silero TTS launched but isn't connected to SillyTavern

python3 server.py --enable-modules=summarize,tts,classify,chromadb,
/home/rexommendation/Programs/SillyTavern-extras/server.py:73: DeprecationWarning: Nesting argument groups is deprecated.
local_sd = sd_group.add_argument_group("sd-local")
/home/rexommendation/Programs/SillyTavern-extras/server.py:77: DeprecationWarning: Nesting argument groups is deprecated.
remote_sd = sd_group.add_argument_group("sd-remote")
Initializing a text summarization model...
Initializing a sentiment classification pipeline...
tts module is deprecated. Please use silero-tts instead.
Initializing Silero TTS server
2023-06-05 20:21:07.251 | INFO | silero_api_server.tts:init:43 - TTS Service loaded successfully
Initializing ChromaDB
ChromaDB is running in-memory with persistence. Persistence is stored in .chroma_db. Can be cleared by deleting the folder or purging db.
Successfully pinged ChromaDB! Your client is successfully connected.
No API key given because you are running locally.

  • Serving Flask app 'server'
  • Debug mode: off
    WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
  • Running on http://localhost:5100
    Press CTRL+C to quit
    127.0.0.1 - - [05/Jun/2023 20:21:16] "GET /tts/speakers HTTP/1.1" 404 -

Feature Request: Module to Manage Response Time

SillyTavern 1.6.6
Extras - Latest Update from Git (6/8/23)
OpenAI API

Note: I figured this would be appropriate to post in this area, versus the main thread as this seems more along the lines of an additional versus a constant feature.

Request/Idea:
I've been trying to find out if there is a way to implement a function that allows the user to set a varying time between the bots' outputs, either as a standalone feature for single chat or as an adjustment to the time bots take to respond in Group Chat.

Thoughts on implementation:
For instance, I could see that once a user sends an input, that input is tagged with a timestamp; the bot would then do its usual response, and its response would be tagged the same. As the conversation goes on, and say the user steps away from the running instance of ST or doesn't respond within a certain amount of time, the bot would then auto-respond, similar to Auto-Mode in a group text. However, its response time would be varied by something as simple as a random number of seconds or minutes after the last user input, or something a bit more complex such as a user-selected time frame.

Reason for implementation:
The reason I bring this up is twofold. One, I've found it almost impossible to respond to a group chat properly because, as silly as it is, I can't type as fast as the bots that are talking to each other, or keep up when the bots begin printing out paragraphs asking why I won't answer; I simply can't. I usually leave my instance of ST running while working on other things, and it would be neat for the bot that's currently running to "randomly" text me, giving an illusion of sorts that it's thinking about the user. Two, this would also give a bit of a reason for the response sound effects to be used, such as the chime that plays; currently, unless it is set to play the sound effect for each response, the option to only have it respond when away from the ST UI tab seems pointless. Other ideas that could leverage the time feature: the Chat Memory extra could refer back to events talked about in context previously if the bot were to respond later in the day, based on some random/systematic amount of time. So far I've seen one program use a similar feature, but it seemed too generic, as it responds almost exactly 30 seconds after it texts you the first time if there is no immediate user response, and then maybe another few hours later it does a sloppy summarization of previous chat context to output to the user as a callback to the application, which would usually break the effect due to poor memory management.

Final Thoughts/Reiterated Issue:
Out of all of this, my main issue is that I can't enjoy the auto-response feature, as I can't get my input in before the input box clears out my sentence and another bot begins to respond (or an individual bot, if you convert a chat to a group chat).

Feature request: SD module that uses webUI

I'd like to point sd to an existing stable diffusion server I have up, running automatic1111's webui API. It'd be nice if there's a module that can hand it off rather than running SD locally, especially since it's common to have all sorts of extensions and customizations on SD.

chromadb setup

I'd like to try chromadb locally, so I reinstalled extras with requirements and tried requirements-complete as well but I get this output after enabling it.

127.0.0.1 - - [15/Jun/2023 21:01:23] "OPTIONS /api/modules HTTP/1.1" 200 -
127.0.0.1 - - [15/Jun/2023 21:01:24] "GET /api/modules HTTP/1.1" 200 -
127.0.0.1 - - [15/Jun/2023 21:01:51] "OPTIONS /api/chromadb/query HTTP/1.1" 200 -
[2023-06-15 21:01:52,040] ERROR in app: Exception on /api/chromadb/query [POST]
Traceback (most recent call last):
  File "C:\Users\Imi\AppData\Local\Programs\Python\Python310\lib\site-packages\flask\app.py", line 2528, in wsgi_app
    response = self.full_dispatch_request()
  File "C:\Users\Imi\AppData\Local\Programs\Python\Python310\lib\site-packages\flask\app.py", line 1825, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "C:\Users\Imi\AppData\Local\Programs\Python\Python310\lib\site-packages\flask_cors\extension.py", line 165, in wrapped_function
    return cors_after_request(app.make_response(f(*args, **kwargs)))
  File "C:\Users\Imi\AppData\Local\Programs\Python\Python310\lib\site-packages\flask\app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
  File "C:\Users\Imi\AppData\Local\Programs\Python\Python310\lib\site-packages\flask\app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "D:\textgen\kobold\SillyTavern-extras\server.py", line 285, in decorated_view
    return fn(*args, **kwargs)
  File "D:\textgen\kobold\SillyTavern-extras\server.py", line 764, in chromadb_query
    query_result = collection.query(
  File "C:\Users\Imi\AppData\Local\Programs\Python\Python310\lib\site-packages\chromadb\api\models\Collection.py", line 196, in query
    n_results = validate_n_results(n_results)
  File "C:\Users\Imi\AppData\Local\Programs\Python\Python310\lib\site-packages\chromadb\api\types.py", line 269, in validate_n_results
    raise TypeError(
TypeError: Number of requested results 0, cannot be negative, or zero.
127.0.0.1 - - [15/Jun/2023 21:01:52] "POST /api/chromadb/query HTTP/1.1" 500 -
127.0.0.1 - - [15/Jun/2023 21:01:52] "OPTIONS /api/classify HTTP/1.1" 200 -

Anyone know what could be causing this? Is there anything extra I need to do other than the install steps?

Feature request: "Continue message"

Well, like it can be done in ChatGPT or Ooba:


Sometimes the response isn't bad, just too short, but you can't make the model continue it. Also, if you just send an empty message like in CAI, it responds with a blank message. (Maybe it's trying to impersonate out of turn and getting nuked, dunno.)
