
freeu's Introduction

Wowchemy Website Builder

Academic Template for Hugo

The Hugo Academic Resumé Template empowers you to create your job-winning online resumé and showcase your academic publications.

Check out the latest demo of what you'll get in less than 10 minutes, or view the showcase.

Wowchemy makes it easy to create a beautiful website for free. Edit your site in Markdown, Jupyter, or RStudio (via Blogdown), generate it with Hugo, and deploy with GitHub or Netlify. Customize anything on your site with widgets, themes, and language packs.

Crowd-funded open-source software

To help us develop this template and software sustainably under the MIT license, we ask all individuals and businesses that use it to help support its ongoing maintenance and development via sponsorship.


freeu's People

Contributors

chenyangsi


freeu's Issues

FreeU with LoRA Stack

Since there is no Discussions tab available, I am asking a question here.

In ComfyUI, should we put the FreeU node before or after the LoRA stack?

How to use it in SimpleCrossAttnUpBlock2D?

I've tried to change your code to support SimpleCrossAttnUpBlock2D, but the shapes don't seem to match up. How can I do this? Thanks! (A guard sketch follows the traceback below.)

  File "/usr/local/lib/python3.9/dist-packages/gradio/routes.py", line 523, in run_predict
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1437, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.9/dist-packages/gradio/blocks.py", line 1109, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.9/dist-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.9/dist-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.9/dist-packages/gradio/utils.py", line 865, in wrapper
    response = f(*args, **kwargs)
  File "/home/ubuntu/mimesis-ml-gan-backend/app.py", line 128, in generate
    image = pipe(image=input_image,
  File "/usr/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/mimesis-ml-gan-backend/src/diffusions/kandinsky/pipeline_kandinsky_img2img_scheduler.py", line 125, in __call__
    noise_pred = self.unet(
  File "/usr/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/lib/python3.9/site-packages/diffusers/models/unet_2d_condition.py", line 1020, in forward
    sample = upsample_block(
  File "/usr/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ubuntu/mimesis-ml-gan-backend/free_lunch_utils.py", line 166, in forward
    hidden_states = torch.cat([hidden_states, res_hidden_states], dim=1)
RuntimeError: Tensors must have same number of dimensions: got 3 and 4
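
A possible reading of this failure (my interpretation, not the authors' statement): the patched forward in free_lunch_utils.py assumes 4-D (B, C, H, W) feature maps, while SimpleCrossAttnUpBlock2D in the Kandinsky UNet can hand it a 3-D (B, sequence, C) tensor, which is why torch.cat sees tensors with 3 and 4 dimensions. A minimal defensive sketch, where fourier_filter stands for whatever filtering callable the patch uses:

def maybe_fourier_filter(x, threshold, scale, fourier_filter):
    # Apply the filter only to 4-D (B, C, H, W) feature maps; pass any
    # other shape (e.g. 3-D attention features) through unchanged.
    if x.ndim != 4:
        return x
    return fourier_filter(x, threshold=threshold, scale=scale)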

Running FreeU on SM_52 cards

Hello, I'm trying to use FreeU with SD1.5 on a Titan X 12 GB card and I get this message:
RuntimeError: cuFFT doesn't support signals of half type with compute capability less than SM_53, but the device containing input half tensor only has SM_52

The strange thing is that it runs perfectly when used with SDXL, so I was wondering about it. (A possible workaround sketch follows the version list below.)
(I'm using diffusers.)

  • diffusers version: 0.27.0
  • Platform: Linux-5.15.0-94-generic-x86_64-with-glibc2.29
  • Python version: 3.8.0
  • PyTorch version (GPU?): 2.2.1+cu121 (True)
  • Huggingface_hub version: 0.21.4
  • Transformers version: 4.38.2
  • Accelerate version: 0.27.2
  • xFormers version: not installed
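
A possible workaround, offered only as a hedged sketch: since cuFFT below compute capability SM_53 cannot transform half-precision signals, the FFT round-trip can be done in float32 and the result cast back, along the lines of the Fourier_filter quoted later on this page:

import torch
import torch.fft as fft

def fourier_filter_fp32(x, threshold, scale):
    # cuFFT rejects half tensors below compute capability SM_53,
    # so upcast to float32 around the FFT and cast back at the end.
    x_freq = fft.fftshift(fft.fftn(x.float(), dim=(-2, -1)), dim=(-2, -1))
    B, C, H, W = x_freq.shape
    mask = torch.ones((B, C, H, W), device=x.device)
    crow, ccol = H // 2, W // 2
    mask[..., crow - threshold:crow + threshold,
              ccol - threshold:ccol + threshold] = scale
    x_freq = fft.ifftshift(x_freq * mask, dim=(-2, -1))
    return fft.ifftn(x_freq, dim=(-2, -1)).real.to(x.dtype)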

Details of the hyperparameters

Thanks for this interesting work. While trying to implement it, I found that some key hyperparameters are missing, for example:

  • What are the radius and radius threshold for the skip features?
  • Why use only half of the channels of the backbone feature x?
  • What are default recommended values for s and b? Are they the same for different models?

Thanks!

What are your UNetModel class and timestep_embedding function?

This question may be a bit silly. When I read your code, I found that your class inherits from UNetModel and uses the function timestep_embedding, but you did not open-source them. Could you please open-source them? It would make it easier to try FreeU.
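
A hedged pointer (my reading of the code, not something the authors have confirmed): the patch quoted later on this page edits ldm/modules/diffusionmodules/openaimodel.py from the Stable Diffusion latent-diffusion codebase, so both pieces appear to be importable from there rather than from this repo:

from ldm.modules.diffusionmodules.openaimodel import UNetModel
from ldm.modules.diffusionmodules.util import timestep_embedding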

You should look at the input blocks

The results from FreeU (with default values) often have colors that are far too intense, frequently causing the kind of burning that is specifically meant to be avoided. Applying your method to the input_blocks provides extra detail without the cost of color burning. I have made a mod of this called FreeU Advanced for ComfyUI that does this (among other experiments), if you want to take a look. (A rough sketch of the idea follows.)
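
A rough sketch of the idea (an illustration of the description above, not FreeU Advanced's actual code; the stage channel count and the 1.1 factor are assumptions): mirror the decoder-side backbone scaling from the patch quoted later on this page onto the encoder loop:

# inside UNetModel.forward, encoder side (illustrative only)
for module in self.input_blocks:
    h = module(h, emb, context)
    if h.shape[1] == 640:              # an illustrative stage to modulate
        h[:, :320] = h[:, :320] * 1.1  # gentler than the decoder factors
    hs.append(h)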

Threshold of factor s

The factor s is applied with threshold = 1, which seems to make s1 and s2 meaningless at inference.
Is this intended, or is it just a little bug? (See the note below on what threshold = 1 actually covers.)
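
For reference, my reading of the Fourier_filter quoted later on this page: with threshold = 1 the mask slice covers the 2x2 block at the centre of the shifted spectrum, so s1 and s2 still rescale the very lowest-frequency components rather than doing nothing at all:

H, W = 8, 8                             # e.g. the deepest stage at 512x512
crow, ccol = H // 2, W // 2
rows = list(range(crow - 1, crow + 1))  # [3, 4]
cols = list(range(ccol - 1, ccol + 1))  # [3, 4]
print(rows, cols)                       # the four spectrum entries scaled by s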

FreeU doesn't work with ZLUDA on auto1111

Hello! I'm not sure if it's an issue with ZLUDA (it probably is), but FreeU doesn't seem to work at all with it. Whenever I try to generate an image with FreeU activated, it returns an error:

*** Error completing request
*** Arguments: ('task(4ctd2qw8925ax5d)', <gradio.routes.Request object at 0x0000020FAF26F460>, "a test to showcase that freeU doesn't work", '', [], 20, 'Euler a', 1, 1, 7, 1152, 768, False, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', '', '', [], 0, False, '', 0.8, -1, False, -1, 0, 0, 0, False, 'MultiDiffusion', False, True, 1024, 1024, 96, 96, 48, 4, 'None', 2, False, 10, 1, 1, 64, False, False, False, False, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 'DemoFusion', True, 128, 64, 4, 2, False, 10, 1, 1, 64, False, True, 3, 1, 1, False, 3072, 192, True, True, True, False, True, 0, 1, 0, 'Version 2', 1.2, 0.9, 0, 0.5, 0, 1, 1.4, 0.2, 0, 0.5, 0, 1, 1, 1, 0, 0.5, 0, 1, False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, False, False, False, 0, False) {}
    Traceback (most recent call last):
      File "F:\SD-Zluda\modules\call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "F:\SD-Zluda\modules\call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "F:\SD-Zluda\modules\txt2img.py", line 110, in txt2img
        processed = processing.process_images(p)
      File "F:\SD-Zluda\modules\processing.py", line 787, in process_images
        res = process_images_inner(p)
      File "F:\SD-Zluda\modules\processing.py", line 1015, in process_images_inner
        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "F:\SD-Zluda\modules\processing.py", line 1351, in sample
        samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
      File "F:\SD-Zluda\modules\sd_samplers_kdiffusion.py", line 239, in sample
        samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "F:\SD-Zluda\modules\sd_samplers_common.py", line 261, in launch_sampling
        return func()
      File "F:\SD-Zluda\modules\sd_samplers_kdiffusion.py", line 239, in <lambda>
        samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "F:\SD-Zluda\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "F:\SD-Zluda\repositories\k-diffusion\k_diffusion\sampling.py", line 145, in sample_euler_ancestral
        denoised = model(x, sigmas[i] * s_in, **extra_args)
      File "F:\SD-Zluda\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "F:\SD-Zluda\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
        return forward_call(*args, **kwargs)
      File "F:\SD-Zluda\modules\sd_samplers_cfg_denoiser.py", line 237, in forward
        x_out = self.inner_model(x_in, sigma_in, cond=make_condition_dict(cond_in, image_cond_in))
      File "F:\SD-Zluda\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "F:\SD-Zluda\venv\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
        return forward_call(*args, **kwargs)
      File "F:\SD-Zluda\repositories\k-diffusion\k_diffusion\external.py", line 112, in forward
        eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
      File "F:\SD-Zluda\repositories\k-diffusion\k_diffusion\external.py", line 138, in get_eps
        return self.inner_model.apply_model(*args, **kwargs)
      File "F:\SD-Zluda\modules\sd_models_xl.py", line 44, in apply_model
        return self.model(x, t, cond)
      File "F:\SD-Zluda\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "F:\SD-Zluda\venv\lib\site-packages\torch\nn\modules\module.py", line 1561, in _call_impl
        result = forward_call(*args, **kwargs)
      File "F:\SD-Zluda\modules\sd_hijack_utils.py", line 18, in <lambda>
        setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
      File "F:\SD-Zluda\modules\sd_hijack_utils.py", line 32, in __call__
        return self.__orig_func(*args, **kwargs)
      File "F:\SD-Zluda\repositories\generative-models\sgm\modules\diffusionmodules\wrappers.py", line 28, in forward
        return self.diffusion_model(
      File "F:\SD-Zluda\venv\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
        return self._call_impl(*args, **kwargs)
      File "F:\SD-Zluda\venv\lib\site-packages\torch\nn\modules\module.py", line 1561, in _call_impl
        result = forward_call(*args, **kwargs)
      File "F:\SD-Zluda\modules\sd_unet.py", line 91, in UNetModel_forward
        return original_forward(self, x, timesteps, context, *args, **kwargs)
      File "F:\SD-Zluda\repositories\generative-models\sgm\modules\diffusionmodules\openaimodel.py", line 997, in forward
        h = th.cat([h, hs.pop()], dim=1)
      File "F:\SD-Zluda\extensions\sd-webui-freeu\lib_free_u\unet.py", line 67, in free_u_cat_hijack
        h_skip = filter_skip(
      File "F:\SD-Zluda\extensions\sd-webui-freeu\lib_free_u\unet.py", line 99, in filter_skip
        x_freq = torch.fft.fftn(x.to(fft_device).float(), dim=(-2, -1))
    RuntimeError: cuFFT error: CUFFT_INTERNAL_ERROR

I'm running auto1111 on an AMD GPU (RX 6800) with the DirectML fork of auto1111 and ZLUDA installed on top, if that helps. I'm not sure what other details I can provide to help troubleshoot the issue, so feel free to ask :) (A workaround sketch follows.)
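
One hedged workaround sketch (not the extension's official fix): the traceback shows lib_free_u/unet.py already routing the tensor through an fft_device before the FFT, so if ZLUDA's cuFFT shim keeps raising CUFFT_INTERNAL_ERROR, a CPU fallback along these lines might help:

import torch
import torch.fft as fft

def fftn_with_cpu_fallback(x, dim=(-2, -1)):
    try:
        return fft.fftn(x.float(), dim=dim)
    except RuntimeError:
        # cuFFT failed (e.g. under ZLUDA); compute on the CPU and move back
        return fft.fftn(x.cpu().float(), dim=dim).to(x.device)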

UNetModel

I saved the code in a .py file, but I get an error when ComfyUI starts:

FileNotFoundError: [Errno 2] No such file or directory: 'C:\ComfyUI_windows_portable\ComfyUI\custom_nodes\FreeU\__init__.py'
Cannot import C:\ComfyUI_windows_portable\ComfyUI\custom_nodes\FreeU module for custom nodes: [Errno 2] No such file or directory: 'C:\ComfyUI_windows_portable\ComfyUI\custom_nodes\FreeU\__init__.py'

The node is available and works in ComfyUI... but I'm not sure I'm getting 100% of the function.
class Free_UNetModel(UNetModel)
There is no UNetModel.py file anywhere, in the ComfyUI or A1111 folders. (A layout sketch follows.)
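
For context, a hedged sketch of the standard ComfyUI custom-node layout (general ComfyUI convention, not the FreeU authors' packaging; the module and class names below are hypothetical): the folder needs an __init__.py that exports NODE_CLASS_MAPPINGS, which is exactly the file the loader reports as missing:

# C:\ComfyUI_windows_portable\ComfyUI\custom_nodes\FreeU\__init__.py
from .freeu_node import FreeU_Node  # hypothetical module and class names

NODE_CLASS_MAPPINGS = {"FreeU": FreeU_Node}
NODE_DISPLAY_NAME_MAPPINGS = {"FreeU": "FreeU"}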

Some feedback

Disclaimer: I'm not an AI researcher so I could've done something wrong.

So I got it working after some minor changes and import fixes. Now when I run it at 512x768 resolution, the issue is: RuntimeError: cuFFT only supports dimensions whose sizes are powers of two when computing in half precision, but got a signal size of[12, 8]. I suppose that's the corresponding layer size, which becomes rectangular because of the base resolution. 512x512 should result in an [8, 8] array here, and then it all works fine.

As expected, running A1111 with --no-half makes it work but the speed is much worse.
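
An alternative to --no-half, offered as an untested sketch (my assumption, not verified here): zero-pad the feature map so its spatial sizes are powers of two before the half-precision FFT, then crop back afterwards; this keeps fp16 speed at the cost of a slightly approximated filter:

import torch.nn.functional as F

def next_pow2(n):
    return 1 << (n - 1).bit_length()

def pad_to_pow2(x):
    # x: (B, C, H, W); returns the padded tensor and the original size
    # so the result can be cropped back after the inverse FFT.
    B, C, H, W = x.shape
    ph, pw = next_pow2(H) - H, next_pow2(W) - W
    return F.pad(x, (0, pw, 0, ph)), (H, W)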

I used the parameters for SD1.4 and simply hardcoded them to quickly test whether it works at all. On the fine-tuned model epiCRealism naturalSin, FreeU makes the images worse: they become more saturated and the skin texture turns to plastic (maybe because we suppress exactly the high-frequency features?). It starts looking more like the base models or the early fine-tuned models:

Original: [image attached]

FreeU: [image attached]

AnimateDiff doesn't seem to work in --no-half mode; it throws a CUDA error. So we're limited to 512x512. Same symptoms of oversaturation; the skin-quality change doesn't show due to grain and artifacts. However, FreeU added a third hand. I tried two slightly different prompts.

Original: [images 02086-751880090, 02085-751880090]

FreeU: [images 02087-751880090, 02084-751880090]

The faces are garbled in all cases, but to be honest I much prefer the results without FreeU. The colors are better, the anatomy is better, and the skirt is much more detailed and moves more naturally.

My patch, applied to stable-diffusion-webui/repositories/stable-diffusion-stability-ai:

diff --git a/ldm/modules/diffusionmodules/openaimodel.py b/ldm/modules/diffusionmodules/openaimodel.py
index cc3875c..ede0b5a 100644
--- a/ldm/modules/diffusionmodules/openaimodel.py
+++ b/ldm/modules/diffusionmodules/openaimodel.py
@@ -4,6 +4,7 @@ import math
 import numpy as np
 import torch as th
 import torch.nn as nn
+import torch.fft as fft
 import torch.nn.functional as F
 
 from ldm.modules.diffusionmodules.util import (
@@ -418,6 +419,24 @@ class Timestep(nn.Module):
         return timestep_embedding(t, self.dim)
 
 
+def Fourier_filter(x, threshold, scale):
+    # FFT
+    x_freq = fft.fftn(x, dim=(-2, -1))
+    x_freq = fft.fftshift(x_freq, dim=(-2, -1))
+
+    B, C, H, W = x_freq.shape
+    mask = th.ones((B, C, H, W)).cuda()
+
+    crow, ccol = H // 2, W // 2
+    mask[..., crow - threshold:crow + threshold, ccol - threshold:ccol + threshold] = scale
+    x_freq = x_freq * mask
+
+    # IFFT
+    x_freq = fft.ifftshift(x_freq, dim=(-2, -1))
+    x_filtered = fft.ifftn(x_freq, dim=(-2, -1)).real
+
+    return x_filtered
+
 class UNetModel(nn.Module):
     """
     The full UNet model with attention and timestep embedding.
@@ -798,8 +817,24 @@ class UNetModel(nn.Module):
             hs.append(h)
         h = self.middle_block(h, emb, context)
         for module in self.output_blocks:
-            h = th.cat([h, hs.pop()], dim=1)
-            h = module(h, emb, context)
+            if True:
+                hs_ = hs.pop()
+
+                # --------------- FreeU code -----------------------
+                # Only operate on the first two stages
+                if h.shape[1] == 1280:
+                    h[:,:640] = h[:,:640] * 1.2
+                    hs_ = Fourier_filter(hs_, threshold=1, scale=0.9)
+                if h.shape[1] == 640:
+                    h[:,:320] = h[:,:320] * 1.4
+                    hs_ = Fourier_filter(hs_, threshold=1, scale=0.2)
+                # ---------------------------------------------------------
+
+                h = th.cat([h, hs_], dim=1)
+                h = module(h, emb, context)
+            else:
+                h = th.cat([h, hs.pop()], dim=1)
+                h = module(h, emb, context)
         h = h.type(x.dtype)
         if self.predict_codebook_ids:
             return self.id_predictor(h)

The simplest way to switch between FreeU and vanilla is to change if True: to if False:. Again, it's just a hack to test if it works.
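
A slightly cleaner toggle (my suggestion, reusing the exact factors from the patch above): gate the FreeU branch on a module-level flag so it can be flipped without re-editing the loop:

FREEU_ENABLED = True  # module-level flag; set False for vanilla behaviour

# inside UNetModel.forward, replacing the `if True:` block from the patch
for module in self.output_blocks:
    hs_ = hs.pop()
    if FREEU_ENABLED:
        if h.shape[1] == 1280:  # first up stage
            h[:, :640] = h[:, :640] * 1.2
            hs_ = Fourier_filter(hs_, threshold=1, scale=0.9)
        if h.shape[1] == 640:   # second up stage
            h[:, :320] = h[:, :320] * 1.4
            hs_ = Fourier_filter(hs_, threshold=1, scale=0.2)
    h = th.cat([h, hs_], dim=1)
    h = module(h, emb, context)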

In conclusion, if everything is correct on my end, it's probably not worth it for the best fine-tuned models. On the contrary, to make it work in all cases you have to run at full 32-bit precision with a roughly 3x slowdown and get images that look worse than without it. The base models surely benefit from it, but honestly, who uses them except researchers and LoRA trainers?

I hope I made a mistake somewhere and these results are all wrong. After all, I just copied the part that differs from the original code and fixed the errors to make it work, but who knows.

How do you create the figures in your research paper?

Could you please show me how Figures 2 and 3 in the paper were created? For example, the figure of the denoising process (Figure 2) and the relative log amplitudes of the Fourier transform for intermediate diffusion steps (Figure 3).

Is there code for the experiment?

Hello Si,
I am very interested in this work. I would also like to know whether this result holds in other models as well, so I would appreciate it if you could provide the details and code of the experiment.
Sincerely, Jiahui Li

Hyperparameter range

Thanks for sharing your hyperparameter values. I am trying to apply FreeU to a diffusion-based TTS model, which is my field of interest.

I have a question because I had difficulty setting hyperparameter ranges during my experiments.
How did you set the range of the skip factors and backbone factors?

Some questions about FreeU code

Thank you for your excellent and intriguing work! However, I still have some questions regarding the application of FreeU to downstream tasks:
1. Since FreeU doesn't have any parameters to train, do I only need to incorporate the FreeU component during reverse denoising, rather than including it in the training phase? (A minimal sketch follows this list.)
2. Have you ever experimented with applying FreeU to sequence-generation tasks? If so, could you provide insights on how to set the values for b and s?
3. The code includes FreeU operations only in the first two up blocks. Have you explored adding FreeU operations to all up blocks? What outcomes or results can be expected from such an approach?
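
On question 1, a hedged sketch: since FreeU is training-free, it only needs to be switched on at sampling time. diffusers ships FreeU enable/disable helpers on its pipelines; the factor values below are the SD1.x ones used in the patch earlier on this page:

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.enable_freeu(s1=0.9, s2=0.2, b1=1.2, b2=1.4)  # reverse denoising only
image = pipe("a photo of an astronaut riding a horse").images[0]
pipe.disable_freeu()  # restore the vanilla UNet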
