linden-li / collage-diffusion-ui

An open source, layer-based web interface for Collage Diffusion - use a familiar Photoshop-like interface and let the AI harmonize the details.

Home Page: http://collagediffusion.stanford.edu

collage-diffusion-ui's Introduction

Collage Diffusion UI

[Demo Website] [Blog Post] [Video Tutorial] [Paper]

Collage Diffusion is a novel interface for interacting with image generation models. It lets you specify the composition of an image in a familiar Photoshop-like interface. Our modified version of Stable Diffusion takes the layers as input and produces a harmonized image, ensuring that everything from perspective to lighting is plausible. Unlike the text prompting supported by traditional diffusion interfaces, Collage Diffusion lets you outline precisely how a scene should be composed, from where objects sit relative to each other to what they look like.

The frontend is a React app written in TypeScript using the Chakra UI library. The server implements a custom scheduler runtime that dispatches requests to the Ray Serve library for inference. The model is a modified version of Stable Diffusion built on HuggingFace Diffusers.

Development Setup

Create a configuration file by running

./configure.sh config_dev.json

Frontend

The frontend is a React app written in TypeScript, with UI components from Chakra UI. All frontend code lives in the frontend directory. To set it up locally, make sure you have node and npm installed, then install the dependencies by running the following commands:

cd frontend
npm install

The app is built using Vite. To start the app, run:

npm start

and navigate to http://localhost:5173. If you want to deploy, run

./start.sh frontend

which will build the app and serve it on port 3000.

Server

The server hosts a modified version of Stable Diffusion v1.5 from the HuggingFace diffusers library. Inference is best run on a node with GPUs. The model weights take approximately 8 GB of GPU VRAM, so most data-center GPUs (NVIDIA A100, A10G, V100, etc.) should handle the workload without running out of memory.
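As a rough pre-flight check, the sketch below parses the output of `nvidia-smi --query-gpu=index,memory.total --format=csv,noheader,nounits` and lists the GPUs that report at least 8 GB of total memory. The 8192 MiB threshold (derived from the approximate figure above) and all function names are assumptions made for illustration; none of this is part of the repository.

```python
import subprocess

MIN_VRAM_MIB = 8 * 1024  # assumed threshold, based on the ~8 GB figure above


def parse_gpu_memory(smi_output: str) -> dict[int, int]:
    """Parse 'index, memory.total' CSV rows into {gpu_index: total_MiB}."""
    gpus = {}
    for line in smi_output.strip().splitlines():
        if not line.strip():
            continue
        index, total = (field.strip() for field in line.split(","))
        gpus[int(index)] = int(total)
    return gpus


def gpus_with_enough_vram(smi_output: str, min_mib: int = MIN_VRAM_MIB) -> list[int]:
    """Return the indices of GPUs whose total memory meets the threshold."""
    memory = parse_gpu_memory(smi_output)
    return sorted(i for i, mib in memory.items() if mib >= min_mib)


def query_nvidia_smi() -> str:
    """Run nvidia-smi and return its CSV output (requires an NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On a GPU node, `gpus_with_enough_vram(query_nvidia_smi())` gives the candidate device indices. Note that this looks at total memory, not free memory, so a GPU already in use by another process may still be listed.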

We provide a configuration file in configs/config_gcp.json that allows you to configure the ports that the app is run on. A crucial field is backend.activeGpus, which adjusts the CUDA_VISIBLE_DEVICES environment variable within the application container.
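To illustrate what a field like backend.activeGpus might do, here is a minimal sketch that reads a JSON config and sets CUDA_VISIBLE_DEVICES from it. The config shape (a list of integer GPU indices under `backend.activeGpus`) and both function names are assumptions; the actual schema in configs/config_gcp.json and the server's own handling may differ.

```python
import json
import os


def cuda_visible_devices(config: dict) -> str:
    """Turn an assumed list such as [0, 2] into the string "0,2"."""
    active = config["backend"]["activeGpus"]  # assumed: list of GPU indices
    return ",".join(str(index) for index in active)


def apply_gpu_config(config_path: str) -> None:
    """Load the config file and export CUDA_VISIBLE_DEVICES for this process."""
    with open(config_path) as f:
        config = json.load(f)
    # This must happen before any CUDA-using library initializes the GPU,
    # otherwise the setting has no effect on device enumeration.
    os.environ["CUDA_VISIBLE_DEVICES"] = cuda_visible_devices(config)
```

With CUDA_VISIBLE_DEVICES set to "0,2", the process sees only those two physical GPUs, renumbered as devices 0 and 1.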

We provide a Dockerfile containing all of the dependencies to run the server. To build the image, run

docker build -t collage-diffusion .

from the project root directory. This will create an image called collage-diffusion, which you can use to start the server by running:

docker run -d --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -p 8009:8009 -p 9007:9007 collage-diffusion

If you modified the ports in the config file above, be sure to adjust the forwarded ports (the -p flags) accordingly.
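Once the container is up, a quick way to confirm that the forwarded ports are reachable is to attempt a TCP connection to each. The sketch below does this with the standard library; the host name and the helper name are assumptions, and the ports are simply the ones from the docker run command above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host: str, ports: list[int]) -> dict[int, bool]:
    """Map each port to whether something is accepting connections on it."""
    return {port: port_is_open(host, port) for port in ports}
```

For the default setup, `check_ports("localhost", [8009, 9007])` should report both ports as open. An open port only shows that something is listening; it does not confirm that the model has finished loading.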

Setting up Google Cloud

Our service uploads to a Google Cloud Bucket. To use your own custom bucket, navigate to backend/utils/gcloud_utils.py and modify the PROJECT_ID and BUCKET_NAME variables accordingly.

Install the Google Cloud storage package:

pip install google-cloud-storage

After that, download and extract the Google Cloud SDK by running the following commands:

curl -O https://dl.google.com/dl/cloudsdk/channels/rapid/downloads/google-cloud-cli-417.0.0-linux-x86_64.tar.gz
tar -xf google-cloud-cli-417.0.0-linux-x86_64.tar.gz

Install via:

./google-cloud-sdk/install.sh

After installation, restart your shell. Then run:

./google-cloud-sdk/bin/gcloud init

which will prompt you to log in to your Google account. Select your project.

You will then have to log in using:

gcloud auth application-default login

To test that uploading to GCP works, run the following command:

python backend/utils/gcloud_utils.py -f {path_to_file}

where {path_to_file} is the path to a file you want to upload. If the upload succeeds, you should see a link to the file in the console.
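For orientation, an upload with the google-cloud-storage client typically looks like the sketch below. This is not the content of backend/utils/gcloud_utils.py; the function name, the optional `client` parameter, and the choice to use the file's base name as the object name are assumptions for illustration.

```python
import os


def upload_file(path: str, bucket_name: str, project_id: str, client=None) -> str:
    """Upload a local file to a bucket and return the object's public URL."""
    if client is None:
        # Imported lazily so the helper can be exercised with a stub client
        # on machines where google-cloud-storage is not installed.
        from google.cloud import storage
        client = storage.Client(project=project_id)
    bucket = client.bucket(bucket_name)
    blob = bucket.blob(os.path.basename(path))  # object name: assumed convention
    blob.upload_from_filename(path)
    return blob.public_url
```

The client picks up the credentials created by `gcloud auth application-default login`. The returned URL only resolves in a browser if the object or bucket is publicly readable.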

Collage Diffusion Implementation

The Collage Diffusion implementation modifies Stable Diffusion and is built on a fork of diffusers. The file backend/pipeline_controlnet.py contains a diffusers pipeline in which the inputs to Collage Diffusion can be easily configured. For an example of how to use the pipeline, see backend/test_controlnet.py.

Acknowledgements

This project was done under the supervision of Prof. Chris Re and Prof. Kayvon Fatahalian. The system was designed and implemented by Vishnu Sarukkai, Arden Ma, and Linden Li.

collage-diffusion-ui's Issues

Missing dependencies in requirements.txt + excess imports in api.py

A brief digression: kudos to everyone involved in this. It's a really neat project.

Issue Description:
After running pip install -r requirements.txt and running python api.py, there are some dependencies which seem to be unsatisfied.

Expected Behavior:
pip install -r requirements.txt should cover all required imports if possible. Running python api.py after the pip install should let the backend code run.

Observed Behavior:
There are a handful of additional packages that need to be installed:

  • ModuleNotFoundError: No module named 'omegaconf' (Resolve with 'pip install omegaconf')
  • ModuleNotFoundError: No module named 'h11' (Resolve with 'pip install h11')
  • ModuleNotFoundError: No module named 'ray' (Resolve with 'pip install ray')
  • ModuleNotFoundError: No module named 'google.cloud' (Resolve with 'pip install google-cloud-storage' [!!!])
  • ModuleNotFoundError: No module named 'dotenv' (Resolve by removing unused import in api.py [!!!])
  • ModuleNotFoundError: No module named 'pytorch_lightning' (Resolve with 'pip install pytorch-lightning')
  • ModuleNotFoundError: No module named 'einops' (Resolve with 'pip install einops')
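A small script can surface all of these at once instead of one ModuleNotFoundError per run. The sketch below checks each module with `importlib.util.find_spec` and reports the pip packages still missing; the module-to-package mapping is taken from the list above (dotenv is left out because the suggested fix there is to remove the import), and the function names are illustrative.

```python
import importlib.util

# Module name -> pip package, as listed in this issue.
REQUIRED = {
    "omegaconf": "omegaconf",
    "h11": "h11",
    "ray": "ray",
    "google.cloud": "google-cloud-storage",
    "pytorch_lightning": "pytorch-lightning",
    "einops": "einops",
}


def is_importable(module: str) -> bool:
    """Return True if the module can be located without fully importing it."""
    try:
        return importlib.util.find_spec(module) is not None
    except (ImportError, ValueError):
        # ImportError: a parent package (e.g. "google") is itself missing.
        # ValueError: the module is loaded but has no usable __spec__.
        return False


def missing_packages(required: dict[str, str]) -> list[str]:
    """List the pip packages whose modules cannot be found."""
    return [pkg for module, pkg in required.items() if not is_importable(module)]
```

Running `print("pip install " + " ".join(missing_packages(REQUIRED)))` in the backend environment prints a single install command for whatever is still absent.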

config.json not found while running npm start

Hi, I was trying to run your codebase. While I have been able to set up the Docker server, I'm having a bit of trouble running the frontend. I followed the steps in the README, but I'm running into this error:

Failed to resolve import "../config.json" from "src/components/CollageEditor.tsx". Does the file exist?

Is there anything I'm missing? How do I generate this config?
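To see which file the bundler is looking for, the relative import can be resolved by hand. The sketch below does so, assuming the paths in the error message are relative to the frontend directory; the helper names are illustrative. Whether the README's `./configure.sh config_dev.json` step is meant to generate this file is not confirmed here.

```python
import posixpath
from pathlib import Path


def resolve_import(importer: str, specifier: str) -> str:
    """Resolve a relative import specifier against the importing file's directory."""
    base = posixpath.dirname(importer)
    return posixpath.normpath(posixpath.join(base, specifier))


def config_exists(frontend_dir: str = "frontend") -> bool:
    """Check whether the config file the error refers to is present on disk."""
    target = resolve_import("src/components/CollageEditor.tsx", "../config.json")
    return (Path(frontend_dir) / target).is_file()
```

Here `resolve_import` yields `src/config.json`, so under the stated assumption the missing file is frontend/src/config.json.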
