parlance-zz / g-diffuser-bot

Discord bot and Interface for Stable Diffusion

License: MIT License

Python 100.00%

discord-bot stable-diffusion diffusers ai-art artificial-intelligence generative-art image-generation img2img inpainting latent-diffusion

g-diffuser-bot's Introduction

https://www.stablecabal.org

g-diffuser-bot - Discord bot and interface for Stable Diffusion

G-Diffuser / Stable Cabal Discord

Nov 23-2022 Update: The first release of the all-in-one installer version of G-Diffuser is here. This release no longer requires the installation of WSL or Docker, and has a systray icon to keep track of and launch G-Diffuser components. The download link is available under this project's releases.

Nov 20-2022 Update: The infinite zoom scripts have been updated with some improvements, notably a new compositor script that is hundreds of times faster than before.

Nov 19-2022 Update: There are some new g-diffuser CLI scripts that can be used to make infinite zoom videos. Check out /inputs/scripts/ and have a look at zoom_maker and zoom_composite.

Nov 11-2022 Update: I've created a website to showcase a demo gallery of out-painting images made using g-diffuser bot - https://www.g-diffuser.com/

Nov 08-2022 Update: In/out-painting and img2img (aka "riffing") have (finally) been added to the Discord bot. The new Discord bot command 'expand' allows you to change the canvas size of an input image while filling it with transparency, perfect for setting up out-painting.

Nov 07-2022 Update: This update adds support for CLIP-guided models and new parameters to control them. For now CLIP guidance has a heavy performance penalty, but this will improve with optimization. This update also adds negative prompt support to both the CLI and Discord bot, and changes the default loaded models to include SD1.5 and SD1.5 with (small) CLIP. This update also adds several new samplers (dpmspp_1, dpmspp_2, dpmspp_3).

System Requirements:

Windows 10+ (1903+), NVIDIA GPU with at least 8GB VRAM, ~40GB free space for model downloads
You may need to turn on "developer mode" before beginning the install instructions. Look for "developer settings" in the start menu.

G-Diffuser all-in-one

The first release of the all-in-one installer is here. It notably features much easier "one-click" installation and updating, as well as a systray icon to keep track of g-diffuser programs and the server while it is running.

Installation / Setup

Download and extract G-Diffuser AIO Installer (Windows 10+ 64-bit) to a folder of your choice.
Run install_or_update.cmd at least once (once to install, and again later if you wish to update to the latest version)
Edit the file named "config", add your Hugging Face access token, and save the file.
- If you don't have a huggingface token yet:
  - Register for a HuggingFace account at https://huggingface.co/join
  - Follow the instructions to access the repository at https://huggingface.co/CompVis/stable-diffusion-v1-4 (don't worry, this doesn't mean SD1.4 will be downloaded or used, it just grants you the necessary access to download stable diffusion models)
  - Create a token at https://huggingface.co/settings/tokens (if required, choose the "read" role)

Usage

Run run.cmd to start the G-Diffuser system
You should see a G-Diffuser icon in your systray / notification area. Click on the icon to open and interact with the G-Diffuser system. If the icon is missing be sure it isn't hidden by clicking the "up" arrow near the notification area.

GUI is coming soon(tm)

g-diffuser-bot's Issues

add the ability to explicitly specify output folder when using CLI

outputfolder= : create (if it doesn’t exist) a folder named , and place all outputs for the current batch into that folder. Allow for /?

Dynamic file name creation that the user can specify what pieces of data are used to create the file, eg filename= would create a file named 00001_12_dd.mm.yy, 00002_12_dd.mm.yy, 00003_.. etc
Eg filename= → 00001_hh.mm.ss_dd.mm.yy
-folder optional parameter would append the exact string value from filename= into the end of the output folder name
Eg outputfolder=Sexy Dogs
filename=
Would create a file called “00001_9/19/22.png” in a folder called “Sexy Dogs ”

Re-add k_diffusion samplers using existing code extensions to diffusers lib

At least until they are integrated in the core diffusers lib.

Create docker image or something more easily deployable

Never done this before, will investigate later unless someone else would like to do the honors.

add output_filename=etc

allow to specify filenames, including ability to insert values from arguments as part of the filename
eg sample('prompt', output_filename=f"{prompt}.{seed}.{steps}")
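
The request above could be met by treating output_filename as a template that is filled in from the final argument values. The following is only a sketch under that assumption: build_output_filename, its max_len parameter, and the sanitising rule are hypothetical and not part of g_diffuser_lib. Unlike the f-string in the example, the template here is a plain string that is formatted later, once all argument values are known.

```python
import re

def build_output_filename(template: str, args: dict, max_len: int = 100) -> str:
    """Fill a str.format-style template from args and make it filesystem-safe.

    Hypothetical helper for illustration; not an existing g_diffuser_lib function.
    """
    name = template.format(**args)
    # Replace characters that Windows rejects in filenames, plus runs of whitespace.
    name = re.sub(r'[<>:"/\\|?*\s]+', "_", name)
    # Keep the name reasonably short and avoid leading/trailing dots or underscores.
    return name[:max_len].strip("._")

args = {"prompt": "a red fox", "seed": 42, "steps": 30}
filename = build_output_filename("{prompt}.{seed}.{steps}", args)
print(filename)
```

A file extension would still need to be appended by whatever code writes the image.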

Redo readme with gallery, better instructions, explanations

make a much nicer readme with images gallery and stuff

convert CKPT models to diffusers

Setup contribution guide as per github community guidelines

Add Negative Prompts support

I have seen many web UIs that have it, but as far as I know no bot or frontend has implemented it yet. The closest thing I could find is: https://github.com/invoke-ai/InvokeAI/search?q=negative

Clean up DEBUG_MODE

cleanup debug printing, logging, exceptions, DEBUG_MODE

add global catch-all exception handler

Separate models into a models folder, add better output file / folder naming

Add new g-diffuser command "enhance"

Rescale the input image to a higher resolution and use inpainting with a constant mask of some opacity, effectively using SD for super-resolution. The same function could be aliased as a style transfer function, since it would do the same thing depending on opacity value and the prompt supplied.

Create an easy-to-read change log file after merging beta2 into main

Fix prompt folder naming truncation

g_diffuser_lib.get_filename_from_prompt needs to detect if truncation is occurring and, if so, append a short hash of the entire prompt. This will prevent outputs from different prompts from going into the same folder without making the folder names excessively long.
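
One possible shape for this fix is sketched below. It is not the actual g_diffuser_lib.get_filename_from_prompt; the function name, the 32-character limit, and the choice of SHA-1 with an 8-character digest are all assumptions made for illustration.

```python
import hashlib
import re

def get_folder_name_from_prompt(prompt: str, max_len: int = 32) -> str:
    """Return a filesystem-safe folder name, adding a hash only when truncating.

    Hypothetical stand-in for illustration; limits and hash choice are assumed.
    """
    # Collapse anything that is not a letter or digit into single underscores.
    safe = re.sub(r"[^A-Za-z0-9]+", "_", prompt).strip("_")
    if len(safe) <= max_len:
        return safe  # short prompts keep a clean, hash-free name
    # Hash the full, original prompt so different long prompts never collide
    # just because they share the same first max_len characters.
    digest = hashlib.sha1(prompt.encode("utf-8")).hexdigest()[:8]
    return f"{safe[:max_len]}_{digest}"

name_a = get_folder_name_from_prompt("a" * 40 + " one")
name_b = get_folder_name_from_prompt("a" * 40 + " two")
print(name_a != name_b)
```

Hashing the untruncated prompt keeps the mapping stable across runs, so repeated generations of the same prompt still land in the same folder.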

Add fields to Command class

Command class should include a target_pipe field, this message should reflect whatever the target pipe is (important when mixed modalities come)

Also should have a used_pipe field filled out by the command server

Command class should have a used_args dictionary filled out by the command server with the complete list of all final used params, including any clipped params, adjusted resolution, un-changed default params, etc.
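
As a rough illustration of the three fields described above, the sketch below uses a dataclass. Everything other than the field names target_pipe, used_pipe, and used_args is assumed: the args field, the finalize helper, and the max_steps limit are hypothetical and only show how a server might record the final, clipped parameters. It needs Python 3.9 or newer for the dict[str, Any] annotations.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

@dataclass
class Command:
    args: dict[str, Any]                 # parameters as sent by the client (assumed field)
    target_pipe: str                     # pipe the client asked for
    used_pipe: Optional[str] = None      # filled out by the command server
    used_args: dict[str, Any] = field(default_factory=dict)  # final params, filled out by the server

def finalize(cmd: Command, defaults: dict[str, Any], max_steps: int = 100) -> Command:
    """Hypothetical server-side step: merge defaults, clip limits, record the result."""
    final = {**defaults, **cmd.args}                  # unchanged defaults are kept in the record
    final["steps"] = min(final["steps"], max_steps)   # clipped value is what gets reported
    cmd.used_pipe = cmd.target_pipe
    cmd.used_args = final
    return cmd

cmd = finalize(Command(args={"steps": 500}, target_pipe="img2img"),
               defaults={"steps": 50, "scale": 7.5})
print(cmd.used_pipe, cmd.used_args)
```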

feature idea: panoramic or multi-stage outpainting mode

basically if you take a 512x512 image where one half is image and the right half is erased (like the ghibli shack pic you were using during testing) and outpaint a new half for it, it would be nice if there were an easy way (from inside the gallery viewer even?) to take that new half and place it in its own new image all the way on the left so that the right side is once again blank/erased, and outpaint again, eventually creating a panorama (could also be done vertically of course, or maybe even in multiple directions, but that would require more RAM?).
For double extra bonus credits, the gallery viewer should be able to take the original starting image and automatically append it to the left side of the outputs from the second stage (so the second stage would be generating a 512x512 and the gallery would be taking your original 256x512 chunk and glueing it to the side) so you can more easily/quickly see which new images best 'match' the shape or feel of the original.

Originally posted by @lootsorrow in #36 (comment)

Portable version?

Once installed on one PC, the root folder could be copied to another PC and it should work without the need to install anything.

discord bot !top command fails when message is too long

truncate top command output to 2000 chars
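
A minimal sketch of the truncation, assuming the output is built as a single string before it is sent. The helper name and the suffix text are hypothetical; the 2000-character figure is the limit quoted in the issue.

```python
DISCORD_MESSAGE_LIMIT = 2000  # limit quoted in the issue

def truncate_message(text: str,
                     limit: int = DISCORD_MESSAGE_LIMIT,
                     suffix: str = "\n... (truncated)") -> str:
    """Return text unchanged if it fits, otherwise cut it and mark the cut."""
    if len(text) <= limit:
        return text
    # Reserve room for the suffix so the result never exceeds the limit.
    return text[: limit - len(suffix)] + suffix

short = truncate_message("hi")
long_msg = truncate_message("x" * 5000)
print(len(short), len(long_msg))
```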

Build a webui-esque front-end that runs on top of g-diffuser-lib command server

Try to make it actually look and run nice.

JavaScript and CSS are extremely weak areas for me, so if anyone is willing to step up and take a crack at this it would be extremely appreciated. The json command format is finally solidified enough that any work built on this won't need to be changed much if at all.

Add the new arg/param system to discord bot to remember args and input images, allow saving and loading args

Integrate the new savable / loadable arg set system in the discord bot in a friendly way. Preserve saved args and input images by making paths for them under the inputs path.

Add select buttons using discord gui thing in addition to !select for grids, for outputs that are older or made by other users.

Add automatic steps / scale calculation

If scale or steps is omitted but not the other, use steps ~= scale * 4.2 to derive the other
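
The rule above can be sketched as follows. The function name is hypothetical, and rounding steps to the nearest integer (with a floor of 1) is an assumption; the issue only gives the 4.2 ratio.

```python
from typing import Optional, Tuple

STEPS_PER_SCALE = 4.2  # ratio from the issue: steps ~= scale * 4.2

def fill_steps_and_scale(steps: Optional[int] = None,
                         scale: Optional[float] = None
                         ) -> Tuple[Optional[int], Optional[float]]:
    """Derive whichever of steps/scale is missing from the one that was given."""
    if steps is None and scale is not None:
        steps = max(1, round(scale * STEPS_PER_SCALE))  # rounding is an assumption
    elif scale is None and steps is not None:
        scale = steps / STEPS_PER_SCALE
    # If both or neither were supplied, the values are returned untouched.
    return steps, scale

print(fill_steps_and_scale(scale=10.0))
print(fill_steps_and_scale(steps=42))
```

When neither value is given, the caller would fall back to its normal defaults.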

Upgrade discord bot to (optionally) use application or "slash" commands instead of a command prefix

music2music jam session

1) take a ~30 second backing track (or generate one with txt2music or whatever2music!)
2) User listens to it through headphones while playing an instrument into a microphone
3) take the input from the microphone (Stream A) and combine it with the backing track (Stream B) into a single stream (Stream C)
4) music2music with Stream C as the input, creating more of the same/similar backing track (Stream D)
5a) if music2music can generate 1 second of music in under 1 second, then just play Stream D live to the user's headphones
5aii) continue feeding Stream D into step 3 and then step 4?
5b) if music2music cannot generate 1 second of music in under 1 second, append Stream D to the end of Stream A (The Song)
5bii) play The Song to the audio output device while continuously generating new chunks (Stream E, F, G, etc) and appending them to the end of The Song
???
Virtual Live Improvisational Jam Session Band For People With No Friends!

[enhancement] RAM management

I noticed that 4 pipelines are loaded when the bot starts.
Namely: diffuser, txt2img, img2img, img_inp
Using the optimized mode, it took around 8.5 GB of Memory to load the bot.

possible solution:

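import gc
import torch

# ExampleModel is a placeholder for any torch.nn.Module (e.g. a loaded pipeline)
example_model = ExampleModel().cuda()

del example_model  # drop the last reference so the tensors become collectable
gc.collect()
# Freed memory normally stays in PyTorch's cache until something takes its place
torch.cuda.empty_cache()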

https://gist.github.com/ejmejm/1baeddbbe48f58dbced9c019c25ebf71

replacement for 'for x in range' in cli

accept multivalued parameters/arguments in sample(), sample every combination
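
One way to read "sample every combination" is a Cartesian product over any argument that was given as a list. The sketch below assumes that reading; expand_arg_combinations is a hypothetical helper, not part of the existing sample() function.

```python
from itertools import product

def expand_arg_combinations(**kwargs):
    """Return one dict of arguments per combination of the list-valued inputs."""
    keys = list(kwargs)
    # Wrap single values in a one-element list so product() treats them uniformly.
    value_lists = [v if isinstance(v, (list, tuple)) else [v] for v in kwargs.values()]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]

combos = expand_arg_combinations(prompt="a red fox", steps=[20, 40], scale=[7.5, 10.0])
for combo in combos:
    print(combo)  # each dict could be passed on as sample(**combo)
```

The number of samples grows multiplicatively with each multivalued argument, so a cap on the total may be worth adding.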

Important enhancements / bugfixes to outpainting

Is it possible to develop a latent space encoding / decoding for sparse non-linear data? (as opposed to dense linear data). If so, you could use diffusion models for things like text and tilemaps.

Try using the same techniques in _get_shaped_noise on the latent space representations of the src, noise, and masks, maybe try varying str or scale over steps for better annealing

More control params for out-painting

Add brightness, contrast, and color tone adjustment params for shaped noise outpainting

Rename project to g-diffuser-lib

The scope of the project has expanded significantly (the discord bot will remain included and under active development)

Better acknowledgement messages for discord bot

Alter ‘gimme a sec’ message to include acknowledgement of attached image (“Okay @lootsorrow, generating with unmasked image” or “generating with alpha masked image” or “generating with no image input”).

Notify user in response when they exceed param limits

Add bot command to show all default params and limits / ranges

OSError: It looks like the config file at 'J:/SD/models/stable-diffusion-v1-4.ckpt' is not a valid JSON file.

Clean installation. I got this error with the model I usually use with other repos.
Here's the full output
Traceback (most recent call last):
File "J:\MINICONDA\envs\g_diffuser\lib\site-packages\diffusers\configuration_utils.py", line 272, in get_config_dict
config_dict = cls._dict_from_json_file(config_file)
File "J:\MINICONDA\envs\g_diffuser\lib\site-packages\diffusers\configuration_utils.py", line 324, in _dict_from_json_file
text = reader.read()
File "J:\MINICONDA\envs\g_diffuser\lib\codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 64: invalid start byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "J:\SD\g_diffuser_cli.py", line 206, in <module>
main()
File "J:\SD\g_diffuser_cli.py", line 89, in main
gdl.load_pipelines(args)
File "J:\SD\g_diffuser_lib.py", line 639, in load_pipelines
pipe = pipe_map[pipe_name].from_pretrained(
File "J:\MINICONDA\envs\g_diffuser\lib\site-packages\diffusers\pipeline_utils.py", line 290, in from_pretrained
config_dict = cls.get_config_dict(
File "J:\MINICONDA\envs\g_diffuser\lib\site-packages\diffusers\configuration_utils.py", line 274, in get_config_dict
raise EnvironmentError(f"It looks like the config file at '{config_file}' is not a valid JSON file.")
OSError: It looks like the config file at 'J:/SD/models/stable-diffusion-v1-4.ckpt' is not a valid JSON file.

Add fine-tuning / training utilities to g-diffuser-lib

Change -x param in discord bot

Negative values should trigger repeat mode ad infinitum (or up to max repeat limit), until !stop is run.

Do not show repeated commands in the queue more than once.

Add CLI command to process an entire folder of json arg files, regenerating each sample

This is imminently needed because this will form the basis of repeatable "test suites" when optimizing out-painting and other experimental tools.
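
A minimal sketch of such a batch runner, assuming each json file holds a flat dictionary of keyword arguments. Both function names are hypothetical, and sample_fn stands in for whatever callable actually produces a sample.

```python
import json
from pathlib import Path

def load_arg_files(folder):
    """Yield (path, args_dict) for every .json file in folder, in sorted order."""
    for path in sorted(Path(folder).glob("*.json")):
        with path.open("r", encoding="utf-8") as f:
            yield path, json.load(f)

def regenerate_folder(folder, sample_fn):
    """Call sample_fn(**args) once per arg file and collect the results."""
    return [sample_fn(**args) for _, args in load_arg_files(folder)]
```

Sorting the paths keeps the run order deterministic, which matters if the outputs are compared between runs as a test suite.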

start_interactive_cli.bat doesn't have working command history

pushd %0\..\
cmd /k "conda run -n g_diffuser --no-capture-output python g_diffuser_cli.py --interactive"

I've removed this file for now because it is extremely annoying to not have command history. The issue is due to a conda bug which can easily be reproduced (at least on Windows) by running:

conda run -n some_env --no-capture-output python

.. then trying to use the up arrow to browse command history.

eliminate txt2img pipeline

create an identity image/seed/noise for use with img2img/inpainting to functionally act like txt2img, thus removing the need to have/load txt2img pipeline

Seeds are broken

Need to either submit a patch to diffusers or find a workaround

gdl.get_default_args improvement

get_default_args should take optional keyword args that will be injected into the namespace after parse_args()
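
The proposed behaviour might look like the sketch below. The two parser options are placeholders chosen for illustration; the real gdl.get_default_args defines its own arguments.

```python
import argparse

def get_default_args(**overrides) -> argparse.Namespace:
    """Build a namespace of defaults, then inject any keyword overrides."""
    parser = argparse.ArgumentParser()
    # Placeholder options; the real library's argument list is not shown here.
    parser.add_argument("--steps", type=int, default=50)
    parser.add_argument("--scale", type=float, default=7.5)
    args = parser.parse_args([])  # empty argv, so every option takes its default
    for key, value in overrides.items():
        setattr(args, key, value)  # overrides may also add names the parser lacks
    return args

args = get_default_args(steps=20, prompt="a red fox")
print(args)
```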

Unbreak discord bot and command server by updating to use the new command format in g_diffuser_lib

Unbreak shaped noise after code overhaul

This is broken again now, need to fix this after cleaning up code.

move model args to sub-namespace in args

model_name
use_optimized
loaded_pipes
pipe_list

after this is done amend load_pipelines to save these globally, and amend get_samples to overwrite / re-attach these args to incoming args (with warning if any mismatch)

Add operator in discord bot syntax to range parameter values over the number of samples

command param ranges with : operator

distributes reps over the defined (multi-dimensional if more than 1) param space ranges

Remote command server separation and clustering

Add “--remote” server option to command_server to allow accepting connections from non-localhost using a pre-shared secret token for auth.

In remote mode the out_attachments list will use URLs instead of local file paths

G_diffuser_bot.py should support this and download from those URLs in remote mode

You should be able to specify a list of nodes to connect to in your g_diffuser_bot_config.py and have the discord bot use all those nodes, robustly distributing commands
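
For the pre-shared secret check mentioned above, a constant-time comparison is a sensible default. This is only a sketch: is_authorized is a hypothetical helper, and how the token reaches the server (header, json field, and so on) is left open.

```python
import hmac

def is_authorized(presented_token: str, shared_secret: str) -> bool:
    """Compare the client's token to the server's secret in constant time."""
    # hmac.compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(presented_token.encode("utf-8"),
                               shared_secret.encode("utf-8"))

print(is_authorized("s3cret", "s3cret"))
print(is_authorized("guess", "s3cret"))
```

A shared token only authenticates the client; traffic from non-localhost connections would still need transport encryption to keep the token private.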

Create manual and wiki page for frequently asked questions

Give error / warning when model-name in args does not match the loaded model name

At least until the issue of loading models more dynamically is solved, not practical at the moment due to memory constraints.

Command server robustness improvements

Have the command server check for cancellation after every sample to waste less time until diffusers pipes can actually be aborted

Prevent the command server from starting 2 commands simultaneously (as when discord bot is on multiple servers)

queue cmds when cmd server not ready in discord bot

auto restart cmd server if unresponsive

new admin cmd to restart cmd server
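
The per-sample cancellation check in the first item could be sketched as below, assuming samples are generated one at a time in a loop. run_samples and generate_one are hypothetical names, and a threading.Event is just one way to signal cancellation.

```python
import threading

def run_samples(n_samples, generate_one, cancel_event: threading.Event):
    """Generate up to n_samples, stopping early once cancel_event is set."""
    results = []
    for i in range(n_samples):
        # Checked between samples only: a sample already in progress still finishes.
        if cancel_event.is_set():
            break
        results.append(generate_one(i))
    return results
```

Samples finished before the cancellation are still returned, so the caller can decide whether to keep or discard them.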

Cmd server should dynamically re-import g_diffuser_lib in DEBUG_MODE