Comments (1)
Hi, unfortunately there isn't any reference for the "ad hoc" method I used to compensate for CFG, but I can give a quick explanation. If you have more questions, we can discuss this further...
Because the ODEs used in diffusion models are somewhat sensitive to initial conditions, using the CFG "vector" at t-1 to invert and find the latent at t does not give the correct answer (this shows up in the fact that it is not always possible to invert a generated image back to its latent when the CFG is high). The correct answer comes from finding the CFG vector at t that gives the correct latent at t-1, but since we do not know the latent at t in the first place, how can we find the CFG vector?
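To make the problem concrete, here is a rough formalization. The guidance formula is the standard classifier-free guidance expression; the single-step linear update with coefficients a_t and b_t is an assumption made for illustration (a DDIM-like form), and the scheduler actually used in the notebook may differ. The symbols x_t, s, c, a_t and b_t do not appear in the original explanation.

```latex
% Classifier-free guidance with scale s, condition c, and empty prompt \varnothing:
\tilde{\epsilon}(x_t, t) = \epsilon_\theta(x_t, t, \varnothing)
  + s \,\bigl(\epsilon_\theta(x_t, t, c) - \epsilon_\theta(x_t, t, \varnothing)\bigr)

% Assumed generic deterministic step from t to t-1:
x_{t-1} = a_t \, x_t + b_t \, \tilde{\epsilon}(x_t, t)

% Exact inversion is implicit, because x_t appears on both sides:
x_t = \frac{x_{t-1} - b_t \, \tilde{\epsilon}(x_t, t)}{a_t}

% Naive inversion evaluates the guidance at the known latent x_{t-1} instead:
x_t \approx \frac{x_{t-1} - b_t \, \tilde{\epsilon}(x_{t-1}, t)}{a_t}
```

Under this assumed form, the gap between the exact and the naive expression comes from evaluating the guided prediction at the wrong latent, and the guided difference term is multiplied by s, which is consistent with inversion getting worse at high CFG.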
One solution is a gradient descent approximation: we first use the wrong CFG vector (at t-1) to get an approximation of the latent at t, then do a forward diffusion pass to re-obtain our latent at t-1. We can then compute the difference and use gradient descent on the CFG vector.
In my simple implementation, I am assuming that the latent landscape near our point of interest (the latent at t) is a convex and smooth function (which is most likely wrong), so I am directly doing gradient descent on the latent at t using the difference between the ground-truth and predicted latents at t-1. (The numerically correct method would be to backprop through the model twice, but it would be too slow...) The solution provided here is literally an approximation of an approximation, but it works quite well for images generated by Stable Diffusion. In my tests, images that were produced using a CFG of up to 5.5 can be reasonably well inverted. For real images, the results are satisfactory in most cases up to a CFG of 4.5, but some images cannot be inverted at all.
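The idea can be sketched on a toy problem. This is not the code from the notebook: `toy_eps` is a made-up stand-in for the noise-prediction network, `A` and `B` are made-up step coefficients, and `lr` and `steps` are illustrative values rather than the empirically tuned constant mentioned here. The sketch only shows the shape of the correction loop: start from the naive inversion, re-run the denoising step, and nudge the latent at t by the mismatch at t-1.

```python
import math

# Hypothetical toy coefficients for one deterministic step:
# x_{t-1} = A * x_t + B * eps(x_t). Not taken from any real scheduler.
A, B = 1.02, -0.05


def toy_eps(x, cond):
    """Made-up stand-in for the noise-prediction network."""
    return [math.tanh(xi + cond) for xi in x]


def cfg_eps(x, scale):
    """Classifier-free guidance: uncond + scale * (cond - uncond)."""
    e_uncond = toy_eps(x, 0.0)
    e_cond = toy_eps(x, 0.5)
    return [u + scale * (c - u) for u, c in zip(e_uncond, e_cond)]


def denoise_step(x_t, scale):
    """Forward diffusion pass from t to t-1 in the toy model."""
    eps = cfg_eps(x_t, scale)
    return [A * x + B * e for x, e in zip(x_t, eps)]


def naive_invert(x_prev, scale):
    """Invert one step using the CFG vector evaluated at t-1 (the 'wrong' one)."""
    eps = cfg_eps(x_prev, scale)
    return [(x - B * e) / A for x, e in zip(x_prev, eps)]


def corrected_invert(x_prev, scale, steps=10, lr=1.0):
    """Refine the naive guess by descending on the mismatch at t-1.

    The mismatch is applied directly to the latent at t, i.e. the model
    Jacobian is treated as if it were the identity (no backprop).
    """
    x_t = naive_invert(x_prev, scale)
    for _ in range(steps):
        pred_prev = denoise_step(x_t, scale)
        x_t = [x - lr * (p - target)
               for x, p, target in zip(x_t, pred_prev, x_prev)]
    return x_t


def max_abs_err(a, b):
    """Largest element-wise absolute difference between two latents."""
    return max(abs(x - y) for x, y in zip(a, b))


if __name__ == "__main__":
    true_x_t = [0.3, -1.2, 0.8]
    x_prev = denoise_step(true_x_t, scale=5.0)
    print("naive error:    ", max_abs_err(naive_invert(x_prev, 5.0), true_x_t))
    print("corrected error:", max_abs_err(corrected_invert(x_prev, 5.0), true_x_t))
```

In this toy the step is close to the identity, so the update contracts quickly. With a real UNet there is no such guarantee, which is presumably why a tuned step size is needed in practice.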
The magic number was found empirically. If tless is not used, the result sometimes diverges when re-diffusing the inverted latent, and you get a completely grey image.
from crossattentioncontrol.