sd-buddy's Introduction

Stable Diffusion Buddy

Companion desktop app for the self-hosted M1 Mac version of Stable Diffusion.

It is intended to be a lazier way to generate images, by allowing you to focus on writing prompts instead of messing with the command line.

Behind the scenes it executes this command in your local Stable Diffusion directory:

python scripts/txt2img.py \
  --prompt "a red juicy apple floating in outer space, like a planet" \
  --n_samples 1 --n_iter 1 --plms --ddim_steps 10 --H 512 --W 512 --seed 42
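As a rough illustration of how a UI could assemble that command from form fields, here is a minimal Python sketch. The function name `build_txt2img_command` and its default values are hypothetical and not taken from the app itself; only the flags come from the command above.

```python
import shlex


def build_txt2img_command(prompt, steps=10, height=512, width=512,
                          seed=42, n_samples=1, n_iter=1):
    """Return the txt2img invocation as an argument list (hypothetical helper)."""
    return [
        "python", "scripts/txt2img.py",
        "--prompt", prompt,
        "--n_samples", str(n_samples),
        "--n_iter", str(n_iter),
        "--plms",
        "--ddim_steps", str(steps),
        "--H", str(height),
        "--W", str(width),
        "--seed", str(seed),
    ]


cmd = build_txt2img_command(
    "a red juicy apple floating in outer space, like a planet"
)

# Print a shell-safe version of the command for copy/paste.
print(shlex.join(cmd))

# To actually run it, pass the list to subprocess.run with cwd set to
# your local Stable Diffusion directory (not executed in this sketch).
```

Keeping the arguments as a list, rather than one long string, avoids quoting problems when the prompt contains commas or quotes.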

Important Note 1: It's a WIP, so check back often. For now it's barely a step above running the command manually, but I have a lot of things in mind (see the wishlist below) that should make my life easier when generating images with Stable Diffusion.

Important Note 2: This is a spur-of-the-moment passion project that scratches my own itch. As a result, I feel zero pressure or desire to satisfy anyone else's wishlist. So if you feel it lacks anything YOU want, or I'm moving too slowly, go ahead and fork the repo and build your own thing.

Stack

The app is built with:

Prerequisites

  • An M1 Mac.

  • Follow this excellent guide or this equally excellent Twitter thread to install Stable Diffusion on your M1 Mac.

  • Make sure the program runs without errors and it can generate images by running the above command manually, before using the app. The app doesn't have a lot of error checking built in, so try not to abuse it 🙂

Features

The first publicly-available version is pretty thin on features. My goal was to provide a basic UI around the CLI command.

  • Register and persist the location of the Stable Diffusion directory.
  • Clickable links for the Stable Diffusion project and output directories.
  • Generation parameters: prompt (--prompt), steps (--ddim_steps), scale (--scale), batch count (--n_iter), batch size (--n_samples), image height (--H) and width (--W), seed (--seed).
  • Generate multiple images with the same parameters, in sequence.
  • Use a random seed for each generation.
  • Parameter matrix (aka parametric prompts). Use variables prefixed with "$" to trigger a set of inputs for each variable where you can list comma-separated parameters. You can then generate the entire batch of prompt combinations with one click.
  • NEW Prompt queue. Push prompts to a queue so they can be generated in sequence. Delete queue items, or mark them as skipped.
  • Real-time duration timer.
  • Display the output of the command and click it to copy to the clipboard.
  • Display the generated image at 512x512.
  • Delete a run from history.
  • A history of previous runs is displayed below the form.
  • Thumbnails for the generated images (click an image to open it in the associated program).
  • Links to the generated images (each link opens the image in the associated program).
  • Click a previous run to reuse the prompt & all the generation parameters.
  • Each run contains generation parameters metadata.
  • Gallery section which shows all the generated images in a simple grid. Has controls to filter by prompt fragment, sort by generated timestamp, and change thumbnail size.
  • Keyboard shortcuts for quitting the app, copy/paste, etc.
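To make the parameter matrix idea concrete, here is a minimal Python sketch of how "$"-prefixed variables could be expanded into every prompt combination. It is an illustration under stated assumptions, not the app's implementation: the function name `expand_prompt` is hypothetical, and it relies on the standard library's `string.Template`, which happens to use the same `$name` syntax.

```python
from itertools import product
from string import Template


def expand_prompt(template: str, values: dict[str, list[str]]) -> list[str]:
    """Return one prompt per combination of variable values (hypothetical helper)."""
    names = list(values)
    prompts = []
    # product() yields the cartesian product of all value lists.
    for combo in product(*(values[name] for name in names)):
        mapping = dict(zip(names, combo))
        # safe_substitute leaves unknown "$words" untouched instead of raising.
        prompts.append(Template(template).safe_substitute(mapping))
    return prompts


prompts = expand_prompt(
    "a $color $fruit floating in outer space",
    {"color": ["red", "green"], "fruit": ["apple", "pear"]},
)

for p in prompts:
    print(p)
```

Two variables with two values each produce four prompts, so the batch grows multiplicatively with every variable you add.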

Download

Download the compiled app in .dmg format from the Releases page.

The app is unsigned so you will get a security message when you open it the first time.

Go to System Preferences > Security & Privacy > General, and click Open Anyway.

Open the app again and this time you'll get a different security warning. Click Open and you should be set.

Windows and Linux builds

For a short time there were Linux and Windows builds available. I disabled those for several reasons.

  1. This project is optimized around the M1 Mac flavor of Stable Diffusion.
  2. Even though Stable Diffusion can be installed on Windows, certain "magic" I employed around determining which is the last generated image dictates that this will remain a Mac build for now. I'll have to change this at some point since it's not ideal.
  3. The "magic" mentioned in #2 may actually work on Linux, but I don't have a way to test it. If you do have a Linux environment with Stable Diffusion installed and are willing to build this app yourself, please give it a try and let me know in a new Discussions topic.
  4. In the end I kept only the Mac build because it makes GitHub Actions CI faster to have 1 build instead of 3.

Building the app

In addition to the above, if you want to build the Mac binary yourself, first install the Tauri environment + CLI (including the Rust CLI + Cargo), then clone this repo and run:

npm install

# dev mode
cargo tauri dev

# build the production app
cargo tauri build

Generating app icons

Follow the official Tauri icons guide.

npx @tauri-apps/tauricon src/assets/sd-buddy-logo.png

Alpha status

Be aware that if the current version looks like v0.x.x the app is in "alpha" state, meaning that things can and will change drastically between versions. This includes breaking changes, regressions, or new bugs.

Wishlist

I'll tackle these in whatever order I feel is a priority for how I use Stable Diffusion.

  • A gallery of generated images.
  • Light/dark mode.
  • UI improvements including a Help section.
  • Configurable output folder.
  • Embed the metadata in the generated image, optionally. A similar thing can be done by saving the generation parameters to a text file with the same name as the image. Follow the changes in my txt2img.py gist.
  • Support img2img.py as in python scripts/img2img.py --prompt "a red juicy apple floating in outer space, like a planet" --init-img apple-input.jpg --strength 0.8 --skip_grid --n_samples 1.

In addition, I need to sort out various small details around developing with Tauri: global keyboard shortcuts for common actions such as quitting the app, enabling copy/paste in text boxes, and narrowing the file/directory operations scope down to the settings folder.

Known issues

  1. The tauri.allowlist.fs.scope key in tauri.conf.json. This essentially defines what file system locations the fs command is allowed to touch. Currently it is set to ** which means everywhere. Since fs is used only by the tauri-settings package to create and write to settings.json, I'm not worried about it. Nevertheless, I'd like to limit it to just the location it needs once I discover the correct string pattern for that option.

  2. The current (v0.3.0) image thumbnail rendering feature suffers from an issue stemming from the way I implemented it. Essentially, when the program completes, I run a Rust function that executes a system command (find + arguments) to retrieve the name of the newest image in the output folder that was created during the timeframe in which the run completed. Then I read the image contents from disk as binary data and convert it to a base64 string to be rendered on the front-end. Unfortunately, it seems that the Stable Diffusion Python command sometimes overwrites files, so what used to be grid-0032.jpg from a previous run can also be the file name for a new run. This causes the generated thumbnails to look the same, since the metadata stored with each run points to the same location. Workaround: I think my problem appeared because I was deleting images from disk that I didn't like, thus creating gaps. The SD command likes to back-fill those gaps, but it also overwrites images that I hadn't deleted. Avoiding deletions from the output folder should therefore avoid the issue.

  3. Setting Samples to anything other than 1 crashes the Python script with this message: failed assertion [MPSTemporaryNDArray initWithDevice:descriptor:] Error: product of dimension sizes > 2**31' /opt/homebrew/Cellar/python@3.10/3.10.6_2/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown.

  4. The queue cannot be stopped currently (v0.11.0) while it's processing. There is also a minor issue which prevents the same prompt matrix from being added more than once, which would actually be a feature if it was consistent with how simple prompts behave (they can be added multiple times).
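The "newest image within the run's timeframe" lookup described in issue 2 can be sketched as follows. This is a Python illustration of the idea only; the app does this in Rust by shelling out to find, and the function name `newest_image`, the file extensions, and the use of modification time are all assumptions made for the example.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import Optional

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed output formats


def newest_image(output_dir: Path, started_at: float,
                 finished_at: float) -> Optional[Path]:
    """Return the most recently modified image written during the run, if any."""
    candidates = [
        p for p in output_dir.rglob("*")
        if p.suffix.lower() in IMAGE_SUFFIXES
        and started_at <= p.stat().st_mtime <= finished_at
    ]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


# Demo with two fake images: one from an old run, one from the current run.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    old, new = out / "grid-0001.png", out / "grid-0002.png"
    old.write_bytes(b"")
    new.write_bytes(b"")
    now = time.time()
    os.utime(old, (now - 600, now - 600))  # ten minutes ago
    os.utime(new, (now - 5, now - 5))      # five seconds ago
    found = newest_image(out, started_at=now - 60, finished_at=now)
    found_name = found.name if found else None
    print(found_name)
```

Note that a lookup like this still returns a path, so if a later run overwrites the same file name the history entry will show the newer image. Copying the file (or its bytes) at the time of the run would be one way around that.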

Troubleshooting

Image generation fails when disconnected from the internet

While SD Buddy is completely offline and doesn't need the internet to function, it appears that the Stable Diffusion installation does in fact need an internet connection. I first ran into this problem while on a plane in airplane mode. I haven't researched why that's the case. You can verify it by putting your computer in airplane mode (or disconnecting from the internet) and running the Python script manually.

ModuleNotFoundError: no module named "ldm"

If you receive this error when running img2img, follow these instructions.

In /path/to/stable-diffusion/scripts/img2img.py add this line above the ldm imports:

sys.path.append(os.path.join(os.path.dirname(__file__), "..")) # add this line
from ldm.util import instantiate_from_config
from ldm.models.diffusion.ddim import DDIMSampler
from ldm.models.diffusion.plms import PLMSSampler

UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown

Enforce a small number of samples (--n_iter, --n_samples), ideally 1. Example:

python3 scripts/img2img.py --prompt "A fantasy landscape, trending on artstation" --init-img sketch-mountains-input.jpg --strength 0.8 --n_iter 1 --n_samples 1

Security FAQ

  • How secure is the app? It is strictly a local app that doesn't communicate with the network. Tauri disables all system APIs by default. I've enabled the minimum necessary system APIs to allow it to function.
  • How do I know it's not communicating with the internet? Use a network inspector tool to analyze the traffic.
  • How do I know it doesn't do nefarious things on my computer? It's open source. Feel empowered to inspect the source code and build it yourself.
  • Why is the app unsigned? Currently I don't have an Apple developer account because I'm not paying Apple for the privilege since I'm not making any money from this app or any other. When this changes, I would love to be able to sign my apps. If you want to contribute towards this goal, a sponsorship is very much appreciated.

Contributing

At this time Stable Diffusion Buddy is not open to contribution. You may create issues but there's absolutely no guarantee I will tackle or even glance at them.

If you feel strongly that you want to contribute, please focus on these areas:

  • Submit fixes for critical bugs you have encountered
  • Submit PRs that handle the known issues listed above
  • Improvements on the Tauri side (I'm still a noob, please teach me)

New feature requests that are not on the wishlist will probably fall on deaf ears, unless it's something that I personally like, or has been explained really well in Discussions.

Here's an example of a very thoughtful PR by Swyx that fixed actual issues. I added those myself manually since by that time the code had shifted. So use that as a model and all will be good.

License

Until I find the appropriate license to attach, please consider and abide by the following terms of use.

Stable Diffusion Buddy is provided as-is without any guarantees. I assume no liabilities from any potential harm (physical, financial, or otherwise) from using it. In other words, use at your own risk.

Stable Diffusion Buddy is open source and free for personal use.

You may not use Stable Diffusion Buddy for any commercial purpose. That means you may not sell or profit in any way from the compiled app, from compiling the app yourself, from the source code, or a fork of it. The images generated by using this app do not fall under these limitations for obvious reasons.

Unless it's strictly for personal use (on your machine), when forking the source code and/or rebuilding the binary please provide attribution and a link to this repo in a README file.

sd-buddy's People

Contributors

breadthe, caesar, danawoodman, swyxio

sd-buddy's Issues

Roadmap

heya, so i know this project isn't open contributions but i do find the need for a GUI and i like the way you do stuff so am just penning my developing thoughts as i start to explore SD and think about what i want in a GUI

everything here is subject to the recognition that you can/should ignore any of it and i wont take any offense

  1. more cli options. i would like a CFG parameter setting (which I think means the --scale param?). this one is easy, you'll probably do it eventually. being able to tweak --n_samples would also be nice. and seed (max 4294967295)
  2. prevent footguns
  3. track "time elapsed" so we can see how far along we are in the generation...
    • alongside of maybe an "estimated time" thing with heuristics based on past runs and/or "complexity cost" based on the params
  4. seeing the results of the image in the app itself. not suuper sure how exactly to do it but its probably not hard?
  5. (hardest) being able to choose the image and "refine" it - this to me was the killer UX of midjourney - this probably means storing the metadata of each image alongside the image, including a fixed seed. lets you explore a bunch of stuff in a grid and then refine it

something like that? thoughts? obv if we're aligned then i'd be happy to contribute most of these since i want them

Parametric Prompt bugs

  1. deleting parameters
    • the repro steps are:
      1. create a prompt with 2 or more params
      2. fill out the param fields
      3. delete one of the params from the prompt
      4. see how the UI is now buggy - doesnt show the variations, or shows too many variations
    • i think this means that there is some hidden state inside here, and we should create a pure function that parses from prompt to param list
  2. words with $ in them
    • this is minor but we shouldnt pick up on micro$oft
    • maybe using this regex /\B\$[a-zA-Z]+/g
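A pure "prompt to param list" function along those lines might look like the Python sketch below. The name `parse_variables` is hypothetical, and the character class is written as `[a-zA-Z]` so that only letters are accepted (a range ending in lowercase `z` would also let in a few punctuation characters). The leading `\B` is what keeps a "$" in the middle of a word from being treated as a variable.

```python
import re

# "\B" requires that "$" is NOT preceded by a word character,
# so "micro$oft" does not yield a variable named "oft".
VARIABLE_PATTERN = re.compile(r"\B\$([a-zA-Z]+)")


def parse_variables(prompt: str) -> list[str]:
    """Return unique variable names in order of first appearance."""
    seen: list[str] = []
    for name in VARIABLE_PATTERN.findall(prompt):
        if name not in seen:
            seen.append(name)
    return seen


names = parse_variables("a $color $fruit by micro$oft, next to a $color wall")
print(names)
```

Because the function derives the variable list from the prompt text alone, deleting a variable from the prompt removes it from the result on the next call, with no leftover state to get out of sync.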

Doesn't work with direnv (only supports virtualenv)

I know the Readme says you don't want contributions, so feel free to ignore… that said, this is certainly a critical bug for me, so I thought I'd open an issue anyhow.

Regardless of input, when I click Generate, the app gives an output of "image error".
It doesn't give any more details about what the error might be. I've tried running in dev mode in the hopes of some debug output either on the CLI or in the web console, but there's nothing.

Copying the displayed command and manually running it from the terminal from within the SD directory works fine.

I suspect (but have no evidence yet) that this is related to the fact that my installation of SD is using pyenv via direnv (as opposed to virtualenv, as shown in the guide linked from the readme). Perhaps whatever method the app is using to run the python command doesn't load the pyenv.
That's all I can think of anyway… I'll try and do some more debugging when I get a chance.
