pier-maker92 / stable-diffusion-experiments


Stable diffusion experiments

This repo provides some Stable Diffusion experiments covering a textual inversion task and a captioning task.

Installation

Clone the repo, then create a conda environment from environment.yml and install the dependencies.

  conda env create --file=environment.yml
  conda activate sd
  pip install -r requirements.txt

Textual inversion

(image: video-gif)

The textual inversion experiment creates a 20-frame video from the generation of two images, each starting from a different concept provided by the user.
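The README does not say how the 20 frames are produced. One common way to build such a transition is to interpolate between two vectors (for example latents or text embeddings) and render one image per step. The sketch below shows only that interpolation step in plain Python; the function names and the choice of spherical interpolation are assumptions for illustration, not necessarily what textual_inversion.py does.

```python
import math

def lerp(a, b, t):
    """Linear interpolation between two equal-length vectors."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]

def slerp(a, b, t):
    """Spherical interpolation between two non-zero vectors.

    Falls back to linear interpolation when the vectors are nearly parallel.
    """
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))  # guard against rounding
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < 1e-6:
        return lerp(a, b, t)
    weight_a = math.sin((1.0 - t) * omega) / sin_omega
    weight_b = math.sin(t * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]

def interpolate_frames(start, end, num_frames=20):
    """Return num_frames vectors going from start (first) to end (last)."""
    steps = [i / (num_frames - 1) for i in range(num_frames)]
    return [slerp(start, end, t) for t in steps]

# Toy 2-D vectors standing in for the two concept embeddings (hypothetical).
frames = interpolate_frames([1.0, 0.0], [0.0, 1.0])
```

In a real pipeline each interpolated vector would be passed to the diffusion model to render one frame of the video.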

Concepts can be loaded by providing a valid Hugging Face 🤗 concept repo: https://huggingface.co/spaces/sd-concepts-library/stable-diffusion-conceptualizer

Usage

  --model_id MODEL_ID   The Stable Diffusion model checkpoint you want to use
  --from_file, --no-from_file
                        load arguments from file
  -p PROMPT_FILE_PATH, --prompt_file_path PROMPT_FILE_PATH
                        path of the file to read the prompt from
  -s SEED, --seed SEED  Set the random seed
  --from_concept_repo FROM_CONCEPT_REPO
                        The start concept you want to use. (Provide a Hugging Face concept repo)
  --to_concept_repo TO_CONCEPT_REPO
                        The end concept you want to use. (Provide a Hugging Face concept repo)
  --from_prompt FROM_PROMPT
                        Start prompt you want to use
  --to_prompt TO_PROMPT
                        End prompt you want to use
  --num_inference_steps NUM_INFERENCE_STEPS
                        Number of inference steps.
  --guidance_scale GUIDANCE_SCALE
                        The guidance scale value to set.
  --width WIDTH         Canvas width of generated image.
  --height HEIGHT       Canvas height of generated image.
  --use_negative_prompt, --no-use_negative_prompt
                        flag to use negative prompt stored in negative_prompt.txt
  -b BATCH_SIZE, --batch_size BATCH_SIZE
                        Batch size to use
  --mps, --no-mps       Set the device to 'mps' (M1 Apple)
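The paired `--flag, --no-flag` entries above match the help output of `argparse.BooleanOptionalAction` (Python 3.9+). As a rough sketch, a few of the options could be declared as follows; the defaults and types here are assumptions, since the README does not list them, and the real script may be organised differently.

```python
import argparse

def build_parser():
    """Sketch of a parser covering a subset of the options listed above."""
    parser = argparse.ArgumentParser(description="Textual inversion (sketch)")
    # BooleanOptionalAction generates both --from_file and --no-from_file.
    parser.add_argument("--from_file", action=argparse.BooleanOptionalAction,
                        default=False, help="load arguments from file")
    parser.add_argument("-p", "--prompt_file_path", type=str, default=None,
                        help="path of the file to read the prompt from")
    parser.add_argument("-s", "--seed", type=int, default=0,  # assumed default
                        help="Set the random seed")
    parser.add_argument("--num_inference_steps", type=int, default=50,  # assumed default
                        help="Number of inference steps.")
    parser.add_argument("--mps", action=argparse.BooleanOptionalAction,
                        default=False, help="Set the device to 'mps' (M1 Apple)")
    return parser

# Parse an explicit argument list instead of sys.argv so the sketch is self-contained.
args = build_parser().parse_args(
    ["--from_file", "-p", "prompt_close_up.txt", "--mps", "--num_inference_steps", "50"]
)
```

Passing `--no-mps` (or omitting the flag) leaves `args.mps` as `False` in this sketch.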

Example

python textual_inversion.py --from_file -p "prompt_close_up.txt" --mps --num_inference_steps 50
python textual_inversion.py --from_concept_repo "sd-concepts-library/gta5-artwork" --to_concept_repo "sd-concepts-library/low-poly-hd-logos-icons" --from_prompt "A man planting a seed in the <concept> style" --to_prompt "A <concept> of a beautiful tree" --mps --num_inference_steps 60 -s 0
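The prompts in the second command contain a `<concept>` placeholder. The README does not describe how it is resolved; a plausible reading is that the script swaps it for the token learned by the chosen concept repo. The sketch below illustrates that substitution only. The helper name `fill_concept` and the token `<gta5-artwork>` are hypothetical.

```python
def fill_concept(prompt: str, concept_token: str, placeholder: str = "<concept>") -> str:
    """Replace the generic placeholder with a concept-specific token."""
    if placeholder not in prompt:
        raise ValueError(f"prompt does not contain {placeholder!r}")
    return prompt.replace(placeholder, concept_token)

# Hypothetical token; the actual token depends on the concept repo being loaded.
start_prompt = fill_concept("A man planting a seed in the <concept> style", "<gta5-artwork>")
```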

img -> caption -> img

This is more of an evaluation of different image-to-text models, whose captions are used as Stable Diffusion prompts to recreate the original image. It was designed as an investigation task, so I used the notebook captioning_task.ipynb to conduct the experiments.

Three image2caption models have been evaluated:

mscoco_finetuned_CoCa-ViT-L-14-laion2B-s13B-b90k
vit-gpt2-image-captioning
blip-image-captioning-base

There is also a comparison with an image2prompt model, the CLIP-Interrogator:

pharma/CLIP-Interrogator
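The img -> caption -> img loop can be sketched as a small harness that runs every captioner on the same image and feeds each caption back into a generator. This is only a structural sketch: `round_trip`, the stub captioners and the stub generator are hypothetical stand-ins, and the actual experiments live in captioning_task.ipynb.

```python
from typing import Callable, Dict

Image = bytes  # stand-in type; a real run would use PIL images or tensors

def round_trip(
    image: Image,
    captioners: Dict[str, Callable[[Image], str]],
    generate: Callable[[str], Image],
) -> Dict[str, dict]:
    """Caption the image with each model, then regenerate an image from each caption."""
    results = {}
    for name, caption_fn in captioners.items():
        caption = caption_fn(image)
        results[name] = {"caption": caption, "image": generate(caption)}
    return results

# Stubs so the sketch runs without downloading any model.
def fake_captioner(tag: str) -> Callable[[Image], str]:
    return lambda image: f"{tag}: a photo ({len(image)} bytes)"

def fake_generate(prompt: str) -> Image:
    return prompt.encode("utf-8")

results = round_trip(
    b"\x00\x01\x02",
    {
        "vit-gpt2-image-captioning": fake_captioner("vit-gpt2"),
        "blip-image-captioning-base": fake_captioner("blip"),
    },
    fake_generate,
)
```

Swapping the stubs for real captioning models and a Stable Diffusion pipeline keeps the same structure, which makes it straightforward to compare the regenerated images side by side.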

(image: caption_task)
