
Official pytorch repository for "Diffusion Posterior Sampling for General Noisy Inverse Problems"

Home Page: https://dps2022.github.io/diffusion-posterior-sampling-page/


diffusion-posterior-sampling's Introduction

Diffusion Posterior Sampling for General Noisy Inverse Problems (ICLR 2023 spotlight)


Abstract

In this work, we extend diffusion solvers to efficiently handle general noisy (non)linear inverse problems via approximation of the posterior sampling. Interestingly, the resulting posterior sampling scheme is a blended version of diffusion sampling with the manifold constrained gradient, without a strict measurement consistency projection step, yielding a more desirable generative path in noisy settings compared to previous studies.


Prerequisites

  • python 3.8

  • pytorch 1.11.0

  • CUDA 11.3.1

  • nvidia-docker (if you use GPU in docker container)

A lower CUDA version also works as long as the PyTorch version matches it.

Ex) CUDA 10.2 with pytorch 1.7.0


Getting started

1) Clone the repository

git clone https://github.com/DPS2022/diffusion-posterior-sampling

cd diffusion-posterior-sampling

2) Download pretrained checkpoint

From the link, download the checkpoint "ffhq_10m.pt" and place it in ./models/

mkdir models
mv {DOWNLOAD_DIR}/ffhq_10m.pt ./models/

{DOWNLOAD_DIR} is the directory where you downloaded the checkpoint.

🔈 A checkpoint for ImageNet has also been uploaded.


3) Set environment

[Option 1] Local environment setting

We use external code for motion blurring and non-linear deblurring.

git clone https://github.com/VinAIResearch/blur-kernel-space-exploring bkse

git clone https://github.com/LeviBorodenko/motionblur motionblur

Install dependencies

conda create -n DPS python=3.8

conda activate DPS

pip install -r requirements.txt

pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113

[Option 2] Build Docker image

Install Docker Engine, the GPU driver, and a suitable CUDA version before running the following commands.

The Dockerfile already contains the commands to clone the external code, so you don't have to clone it again.

--gpus=all is required to use the local GPU device (Docker >= 19.03).

docker build -t dps-docker:latest .

docker run -it --rm --gpus=all dps-docker

4) Inference

python3 sample_condition.py \
--model_config=configs/model_config.yaml \
--diffusion_config=configs/diffusion_config.yaml \
--task_config={TASK-CONFIG};

🔈 For ImageNet, use configs/imagenet_model_config.yaml


Possible task configurations

# Linear inverse problems
- configs/super_resolution_config.yaml
- configs/gaussian_deblur_config.yaml
- configs/motion_deblur_config.yaml
- configs/inpainting_config.yaml

# Non-linear inverse problems
- configs/nonlinear_deblur_config.yaml
- configs/phase_retrieval_config.yaml
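
To run several of these tasks in a row, the inference command from step 4 can be assembled once per task config. The following is a minimal sketch, assuming it is run from the repository root. `build_command` is a hypothetical helper that is not part of the repository, and the actual launch is left commented out.

```python
import shlex

# Task configs as listed above.
TASK_CONFIGS = [
    "configs/super_resolution_config.yaml",
    "configs/gaussian_deblur_config.yaml",
    "configs/motion_deblur_config.yaml",
    "configs/inpainting_config.yaml",
    "configs/nonlinear_deblur_config.yaml",
    "configs/phase_retrieval_config.yaml",
]


def build_command(task_config, model_config="configs/model_config.yaml"):
    """Return the argv list for one sample_condition.py run (hypothetical helper)."""
    return [
        "python3",
        "sample_condition.py",
        f"--model_config={model_config}",
        "--diffusion_config=configs/diffusion_config.yaml",
        f"--task_config={task_config}",
    ]


commands = [build_command(cfg) for cfg in TASK_CONFIGS]

for argv in commands:
    # Print the command line that would be executed.
    print(shlex.join(argv))
    # To actually launch the run from the repository root, uncomment:
    # import subprocess; subprocess.run(argv, check=True)
```

For ImageNet, the same helper can be called with model_config="configs/imagenet_model_config.yaml".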

Structure of task configurations

Set data.root to your data directory. The default is ./data/samples, which contains three sample images from the FFHQ validation set.

conditioning:
    method: # check candidates in guided_diffusion/condition_methods.py
    params:
        scale: 0.5

data:
    name: ffhq
    root: ./data/samples/

measurement:
    operator:
        name: # check candidates in guided_diffusion/measurements.py

noise:
    name:   # gaussian or poisson
    sigma:  # if you use name: gaussian, set this.
    (rate:) # if you use name: poisson, set this.
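
The noise block pairs each noise name with one required parameter (sigma for gaussian, rate for poisson). The sketch below checks that pairing on a plain dictionary before a run is started. `check_noise_config` is a hypothetical helper, not a function from the repository, and it only covers the two noise names mentioned above.

```python
def check_noise_config(noise):
    """Return a list of problems found in a noise block (hypothetical helper)."""
    problems = []
    name = noise.get("name")
    if name == "gaussian":
        # Gaussian noise is parameterized by its standard deviation.
        if "sigma" not in noise:
            problems.append("gaussian noise needs 'sigma'")
    elif name == "poisson":
        # Poisson noise is parameterized by its rate.
        if "rate" not in noise:
            problems.append("poisson noise needs 'rate'")
    else:
        problems.append(f"unknown noise name: {name!r}")
    return problems


print(check_noise_config({"name": "gaussian", "sigma": 0.05}))
print(check_noise_config({"name": "poisson"}))
```

The first call reports no problems; the second reports the missing rate.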

Citation

If you find our work interesting, please consider citing

@inproceedings{
chung2023diffusion,
title={Diffusion Posterior Sampling for General Noisy Inverse Problems},
author={Hyungjin Chung and Jeongsol Kim and Michael Thompson Mccann and Marc Louis Klasky and Jong Chul Ye},
booktitle={The Eleventh International Conference on Learning Representations },
year={2023},
url={https://openreview.net/forum?id=OnD9zGAGT0k}
}

diffusion-posterior-sampling's People

Contributors

dps2022, hj-harry, jeongsol-kim

diffusion-posterior-sampling's Issues

Issue with phase retrieval and Poisson noise

Hello, I am trying to reproduce the phase retrieval results with Poisson noise.
I have tried different noise rates but never got a decent result, while the results for Gaussian noise were good.
Are the values needed for phase retrieval with Poisson noise different from those needed for phase retrieval with Gaussian noise?

Thanks!

Impact of batchsize on performance during sampling

Thanks for your nice work. I tried this method on my own dataset and observed a phenomenon. During the testing phase, the performance of the model is affected by the batch size. It seems that a small batch size will give better results, but this will increase the time spent evaluating on the test set. Is this reasonable, or is there any way to fix it?

ModuleNotFoundError: No module named 'models.arch_util'

Hi, I am getting this error while trying to run the code with the nonlinear deblur task. If I try the super resolution task it works.

python3 sample_condition.py \                                              
--model_config=configs/model_config.yaml \
--diffusion_config=configs/diffusion_config.yaml \
--task_config=configs/nonlinear_deblur_config.yaml;
Device set to cpu.
Traceback (most recent call last):
  File "sample_condition.py", line 121, in <module>
    main()
  File "sample_condition.py", line 57, in main
    operator = get_operator(device=device, **measure_config['operator'])
  File "/home/ethan/diffusion-posterior-sampling/guided_diffusion/measurements.py", line 32, in get_operator
    return __OPERATOR__[name](**kwargs)
  File "/home/ethan/diffusion-posterior-sampling/guided_diffusion/measurements.py", line 178, in __init__
    self.blur_model = self.prepare_nonlinear_blur_model(opt_yml_path)     
  File "/home/ethan/diffusion-posterior-sampling/guided_diffusion/measurements.py", line 184, in prepare_nonlinear_blur_model
    from bkse.models.kernel_encoding.kernel_wizard import KernelWizard
  File "/home/ethan/diffusion-posterior-sampling/bkse/models/kernel_encoding/kernel_wizard.py", line 3, in <module>
    import models.arch_util as arch_util
ModuleNotFoundError: No module named 'models.arch_util'
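
The last frame of the traceback shows kernel_wizard.py importing `models.arch_util` as a top-level name, which Python can only resolve if the directory that contains that `models` package is on the import path. One possible cause, stated here as an assumption and not a confirmed diagnosis, is that the cloned bkse directory is not on `sys.path`. The sketch below reproduces the mechanism in a throwaway directory. The names `vendored` and `demo_models` are hypothetical stand-ins for `bkse` and its `models` package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway layout that mimics the situation in the traceback:
#   <root>/vendored/demo_models/arch_util.py
# "vendored" stands in for the cloned bkse directory and "demo_models"
# for its top-level "models" package (both names are hypothetical).
root = Path(tempfile.mkdtemp())
pkg = root / "vendored" / "demo_models"
pkg.mkdir(parents=True)
(pkg / "__init__.py").write_text("")
(pkg / "arch_util.py").write_text("NAME = 'arch_util'\n")


def try_import(name):
    """Return True if `name` can be imported with the current sys.path."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(name)
        return True
    except ModuleNotFoundError:
        return False


before = try_import("demo_models.arch_util")  # parent directory not on sys.path
sys.path.insert(0, str(root / "vendored"))    # analogous to adding ./bkse
after = try_import("demo_models.arch_util")

print(before, after)
```

The import fails before the parent directory is added to `sys.path` and succeeds afterwards.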

Calculating FID

Hello, thanks for publishing this paper and repo.

I am curious about reproducing the results in the paper. I applied the Gaussian blur model to the first 1,000 images of FFHQ-256 as per Issue #4, but when using torch-fidelity I don't reproduce the FID numbers. If I include torch-fidelity's image resizing, I get 29.3. If I don't include image resizing, I get 37.0. Both of these are pretty far away from the paper value of 44.05.

Could you provide some more details on how to reproduce the numbers of Table 1?

Details for sampling from ImageNet images

Thank you for your excellent work.
When I tried to sample Super Resolution results of ImageNet pictures, I found that the results were not ideal, but there isn't a complete guide for ImageNet operations. So, how can I generate ImageNet results?

Details about pretrained neural network

I am trying to use the pretrained neural network with my own inputs. From my understanding, the output has 6 channels, the first 3 of which are the mean, and the last 3 of which are the variance. I believe the network is trained on T=1000 steps. Therefore, when the input is a clean image, at t=999, the output should be almost unchanged. But when I ran it, although the shape of the face and features are there, the coloring and contrast is completely different. So I am wondering what are the scales of the images for the input and output? The most realistic output I get is when I scale the input to be between 0 and 1, but even in that case, the output is mostly between -1 and 1.
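
Another issue on this page notes that the images are stretched from [0,1] to [-1,1] before use, so the question about scales comes down to two affine maps. The sketch below shows them on plain Python lists. The function names are hypothetical, and whether the pretrained network expects exactly this convention is an assumption that should be checked against the data loading code in the repository.

```python
def to_model_range(x01):
    """Map pixel values from [0, 1] to [-1, 1]."""
    return [2.0 * v - 1.0 for v in x01]


def to_image_range(xm11):
    """Map values from [-1, 1] back to [0, 1], clamping out-of-range values."""
    return [min(1.0, max(0.0, (v + 1.0) / 2.0)) for v in xm11]


pixels = [0.0, 0.25, 0.5, 1.0]
stretched = to_model_range(pixels)
restored = to_image_range(stretched)

print(stretched)
print(restored)
```

The clamp in to_image_range matters when a network output slightly overshoots the [-1, 1] range.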

Got exception: invalid load key, '<'.

While running the below task (and others) I get the error Got exception: invalid load key, '<'. but the program proceeds to execute.

python3 sample_condition.py --model_config=configs/model_config.yaml --diffusion_config=configs/diffusion_config.yaml --task_config=configs/gaussian_deblur_config.yaml
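
In general, "invalid load key, '<'" is what Python's unpickler reports when the first byte of the file is `<`, which often means an HTML page was saved in place of a binary file (for example an error or login page from a download link). Whether that is the cause here is an assumption, and which file triggers the message is not stated in the report. The sketch below is a heuristic check of a file's first bytes; `sniff_file` is a hypothetical helper.

```python
import os
import tempfile


def sniff_file(path):
    """Guess what kind of file `path` is from its first bytes (heuristic)."""
    with open(path, "rb") as f:
        head = f.read(16).lstrip()
    if head.startswith(b"<"):
        return "looks like HTML/XML text, not a checkpoint"
    if head.startswith(b"PK"):
        return "zip archive (format written by recent torch.save)"
    if head.startswith(b"\x80"):
        return "pickle stream (legacy torch.save format)"
    return "unknown"


# Demonstration on a fake file that contains an HTML page.
with tempfile.TemporaryDirectory() as tmp:
    fake = os.path.join(tmp, "fake_checkpoint.pt")
    with open(fake, "wb") as f:
        f.write(b"<!DOCTYPE html><html></html>")
    print(sniff_file(fake))
```

Running sniff_file on each downloaded weight file is a quick way to spot one that needs to be downloaded again.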

Paper & implementation differences

Hi,
There are a few differences between the paper and this repository and it will be wonderful if you could clarify for me the reasons behind them:

  1. The reported Gaussian-noise experiments in the paper use sigma_y=0.05, and indeed in the config files config['noise']['sigma']=0.05.
    But while the images are stretched from [0,1] to [-1,1], the sigma is unchanged, meaning that in practice the added noise has std sigma/2 relative to the [0,1] range, i.e. y_n is cleaner than in the settings reported in the paper.
    This can be easily checked by computing torch.std(y-y_n) after the creation of y and y_n in sample_condition.py.
  2. The paper defines the step-size scalar as a constant divided by the norm of the gradient (Appendix C.2), meaning that the gradient is always normalized before it is scaled.
    In the code, the constant is defined in config['conditioning']['params']['scale'] and used in PosteriorSampling.conditioning() to scale the gradient, but the gradient is never normalized in the first place (in PosteriorSampling.grad_and_value(), for example).
    When the gradient normalization is added, the method seems to break.
  3. For the Gaussian FFHQ-SRx4 case, Appendix D.1 defines the scale as 1.0, but configs/super_resolution_config.yaml uses 0.3.
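
The first point can be illustrated without the repository. The sketch below uses random scalars in place of image pixels (an assumption made only for illustration), adds Gaussian noise with sigma=0.05 in the [-1,1] domain, and measures the noise level both there and after mapping back to [0,1].

```python
import random
import statistics

random.seed(0)
sigma = 0.05  # value quoted from the config above
n = 20000

# Clean "pixels" in [0, 1], stretched to [-1, 1] before noise is added.
x01 = [random.random() for _ in range(n)]
y = [2.0 * v - 1.0 for v in x01]
y_n = [v + random.gauss(0.0, sigma) for v in y]

# Noise level in the [-1, 1] domain, and after mapping back to [0, 1].
diff_stretched = [a - b for a, b in zip(y_n, y)]
diff_unit = [(a + 1.0) / 2.0 - b for a, b in zip(y_n, x01)]
std_stretched = statistics.pstdev(diff_stretched)
std_unit = statistics.pstdev(diff_unit)

print(f"std in [-1, 1] units: {std_stretched:.4f}")
print(f"std in [0, 1] units:  {std_unit:.4f}")
```

The second value comes out at roughly half of the first, which is the sigma/2 effect described in point 1.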

Thank you for your time and effort!

Formulation for Gaussian Sampling

In the paper, if I understand correctly, x'_{t-1} is the result of the standard sampling step, before the posterior sampling correction is applied,
[screenshot: the expression for x'_{t-1} from the paper]
but in standard DDIM/DDPM sampling, x_{t-1} is given as
[screenshot: the standard sampling expression for x_{t-1}]
Is there anywhere in the paper that explains how the former is derived from the standard equation?
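
The two expressions are not reproduced here, so the following rests on an assumption: that the first is the DDPM posterior mean written with the predicted clean sample x0_hat, and the second is the usual DDPM mean written with the predicted noise eps. Under that assumption the two are the same quantity, which follows from substituting x0_hat = (x_t - sqrt(1 - alpha_bar_t) * eps) / sqrt(alpha_bar_t). The sketch below checks this numerically on scalars. The linear beta schedule and the function names are illustrative choices, not values taken from the repository.

```python
import math

# Linear beta schedule (an assumption for illustration; any schedule works).
T = 1000
betas = [1e-4 + (0.02 - 1e-4) * i / (T - 1) for i in range(T)]
alphas = [1.0 - b for b in betas]
alpha_bars = []
prod = 1.0
for a in alphas:
    prod *= a
    alpha_bars.append(prod)


def mean_eps_form(x_t, eps, t):
    """Mean of x_{t-1} written with the predicted noise eps."""
    return (x_t - betas[t] / math.sqrt(1.0 - alpha_bars[t]) * eps) / math.sqrt(alphas[t])


def mean_x0_form(x_t, eps, t):
    """Mean of x_{t-1} written with the predicted clean sample x0_hat."""
    ab_t = alpha_bars[t]
    ab_prev = alpha_bars[t - 1] if t > 0 else 1.0
    x0_hat = (x_t - math.sqrt(1.0 - ab_t) * eps) / math.sqrt(ab_t)
    coef_xt = math.sqrt(alphas[t]) * (1.0 - ab_prev) / (1.0 - ab_t)
    coef_x0 = math.sqrt(ab_prev) * betas[t] / (1.0 - ab_t)
    return coef_xt * x_t + coef_x0 * x0_hat


# Compare both forms at every timestep for one arbitrary (x_t, eps) pair.
x_t, eps = 0.3, -1.2
max_diff = max(abs(mean_eps_form(x_t, eps, t) - mean_x0_form(x_t, eps, t)) for t in range(T))
print(f"largest difference over all timesteps: {max_diff:.2e}")
```

Both forms add the same noise term on top of this mean, so only the mean needs to be compared.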

Reproducing results in the paper

Hi, I am trying to reproduce the results from the paper and I cannot find exactly which 1k images of the FFHQ and ImageNet dataset were used for the tables in the paper. Can you please clarify the exact split used for comparing DPS with the other methods?
Thank you!

Using DDIM sampling method

I am trying to use the DDIM sampling method to decrease the number of sampling steps required. When I change the sampler to ddim in diffusion_config.yaml (and change nothing else) with gaussian_deblur_config, I get an output which is just a black image. Do I have to change some of the other parameters too?
