somepago / dcr

Official PyTorch repo for the CVPR'23 and NeurIPS'23 papers on understanding replication in diffusion models.

License: Apache License 2.0

Python 100.00%
diffusion memorization text-to-image-diffusion text-to-image-generation

dcr's Issues

Mitigation strategies: do the training-time and inference-time mitigation strategies use different Stable Diffusion versions?

Hey,

Thank you for providing the code to reproduce your experiments. I want to confirm whether I have understood the 'Mitigation strategies' section of the article correctly with respect to the Stable Diffusion versions used. For the training-time mitigation strategy, you fine-tuned Stable Diffusion 2.1 and then analyzed the resulting model. For the inference-time mitigation strategy, however, you applied the mitigation directly to Stable Diffusion 1.4. Is that right?

python diff_inference.py raises OutOfMemoryError

I first tried to generate the images using diff_inference.py:

python diff_inference.py -nb 4000 --dataset laion --capstyle instancelevel_blip --rand_augs rand_numb_add
but I ran into the following error:
File "/home/anaconda3/envs/diffrep/lib/python3.9/site-packages/diffusers/models/cross_attention.py", line 314, in __call__
attention_probs = attn.get_attention_scores(query, key, attention_mask)
File "/home/anaconda3/envs/diffrep/lib/python3.9/site-packages/diffusers/models/cross_attention.py", line 253, in get_attention_scores
attention_probs = attention_scores.softmax(dim=-1)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 3.16 GiB (GPU 0; 15.46 GiB total capacity; 11.31 GiB already allocated; 2.48 GiB free; 11.39 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

This happens even though the GPU still appears to have plenty of free memory. Thanks for any suggestions.
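
The figures in the error message can be checked with a little arithmetic. The sketch below parses the message quoted above and compares the requested allocation with the free memory; the helper name `parse_oom` is hypothetical and is not part of PyTorch or this repo.

```python
import re

# The OOM message exactly as quoted in this issue (hint text omitted).
OOM_MESSAGE = (
    "CUDA out of memory. Tried to allocate 3.16 GiB (GPU 0; 15.46 GiB total capacity; "
    "11.31 GiB already allocated; 2.48 GiB free; 11.39 GiB reserved in total by PyTorch)"
)


def parse_oom(message):
    """Extract the GiB figures from a PyTorch CUDA OOM message (hypothetical helper)."""
    patterns = {
        "requested": r"Tried to allocate ([\d.]+) GiB",
        "total": r"([\d.]+) GiB total capacity",
        "allocated": r"([\d.]+) GiB already allocated",
        "free": r"([\d.]+) GiB free",
        "reserved": r"([\d.]+) GiB reserved",
    }
    return {name: float(re.search(p, message).group(1)) for name, p in patterns.items()}


stats = parse_oom(OOM_MESSAGE)
# How much more memory the single failing allocation needed than was free.
shortfall = stats["requested"] - stats["free"]
# Memory held by PyTorch's caching allocator but not in use by tensors.
cached_unused = stats["reserved"] - stats["allocated"]

print(f"shortfall: {shortfall:.2f} GiB")
print(f"reserved but unallocated: {cached_unused:.2f} GiB")
```

On these numbers the single softmax allocation is larger than the free memory, and reserved memory is only slightly above allocated memory, so the fragmentation hint (max_split_size_mb) probably does not apply here. Reducing the memory needed per generation step seems the more likely fix, though that is a guess from the message alone.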

Info regarding pretrained models

Hi,

I really liked your work!
Do you have plans to release pretrained models?
Also, for the Stable Diffusion experiment, is any fine-tuning required, given that the model is already trained on the LAION-2B dataset?

Thanks,
Kartik

Error when fine-tuning the model

accelerate launch diff_train.py \
  --pretrained_model_name_or_path stabilityai/stable-diffusion-2-1 \
  --instance_data_dir train/images_large \
  --resolution=256 --gradient_accumulation_steps=1 --center_crop --random_flip \
  --learning_rate=5e-6 --lr_scheduler constant_with_warmup \
  --lr_warmup_steps=5000 --max_train_steps=100000 \
  --train_batch_size=16 --save_steps=10000 --modelsavesteps 20000 --duplication nodup \
  --output_dir=output --class_prompt classlevel --instance_prompt_loc miscdata/laion_combined_captions.json

raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError:

I am using train/images_large from the LAION-10k split linked here. What could be going wrong? Thanks for your reply.
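
The CalledProcessError quoted above is raised by the launcher and only says that the launched script exited with a non-zero status; the script's own traceback is normally printed earlier in the log. A minimal sketch of that relationship follows. It uses a stand-in child process rather than diff_train.py, and the FileNotFoundError it raises is purely illustrative, not a diagnosis of this issue.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd and return (returncode, last lines of stderr) so the child's own error is visible."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    tail = proc.stderr.strip().splitlines()[-5:]
    return proc.returncode, tail


# Stand-in for a training script that fails (hypothetical failure, for illustration only).
child = [sys.executable, "-c", "raise FileNotFoundError('train/images_large')"]

code, tail = run_and_report(child)
print("exit status:", code)          # non-zero, which is all CalledProcessError reports
print("child's last line:", tail[-1])  # the real cause lives in the child's stderr
```

Posting the lines printed before the CalledProcessError would show which step of the training script actually failed.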

I can't reproduce the results for inference-time mitigation

I have tried to reproduce the results for inference-time mitigation, but compared with the results in the paper, my FID values are higher and my similarity scores are lower. Where could the problem lie?
To reproduce the results, I took the following steps.

  1. I first generated the images using diff_inference.py:
    python diff_inference.py -nb 4000 --dataset laion --capstyle instancelevel_blip --rand_augs rand_numb_add
  2. Then I evaluated the images:
    python diff_retrieval.py --arch resnet50_disc --similarity_metric dotproduct \
      --pt_style sscd --dist-url 'tcp://localhost:10001' --world-size 1 --rank 0 \
      --query_dir /root/autodl-tmp/logs/Projects/DCR/inferences/defaultsd/laion/instancelevel_blip_auginfer_rand_numb_add_2/ \
      --val_dir /root/autodl-tmp/laion_10k/train/

In the two Python files I changed only the paths: savepath, checkpath (Stable Diffusion 2.1), and prompt_json. The dataset is the LAION-10k split linked in your README.
The results were not what I expected:
[image: my reproduced results]
FID = 21.833.
The results reported in the paper are:
[image: results from the paper]
My sim95_pc is 0.27, far from the 0.556 reported in the paper. The FID values differ in the same way.
I only ran this experiment once, but I would not expect a single run to differ this much from the paper.
Can you give me some advice on what the problem might be? Thanks.
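
For readers comparing numbers, here is a rough sketch of what a metric named sim95_pc might measure. It assumes the score is the 95th percentile, over generated images, of each image's maximum dot-product similarity to any training image. That definition is an assumption based on the metric's name and the dotproduct flag above; it is not taken from diff_retrieval.py, and all function names are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def top1_similarities(gen_feats, train_feats):
    """For each generated feature, the highest similarity to any training feature."""
    return [max(dot(g, t) for t in train_feats) for g in gen_feats]


def percentile(values, q):
    """Percentile with linear interpolation between ranks, q in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty list")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


# Toy unit-norm features standing in for real image descriptors.
train = [[1.0, 0.0], [0.6, 0.8]]
generated = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]

sims = top1_similarities(generated, train)
score = percentile(sims, 95)
print(sims, score)
```

Under this reading, the score depends on which training images are in the comparison set and on which checkpoint produced the generations, so those would be the first things to check against the paper's setup.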

Does this paper imply that all outputs are infringement?

My understanding is that Stable Diffusion is a text-to-image deep learning model that generates detailed images from text descriptions. However, I am concerned about possible copyright or intellectual property infringement if the model copies expressions or content from its training data. Could you explain how Stable Diffusion generates outputs, and whether those outputs reflect representations learned from the training data rather than direct copies?

I have heard that the model outputs are weighted combinations of hierarchical representations of the input, making them "expression machines," but can you confirm whether this is a misconception? Answering these questions will not only aid in understanding the model's functionality but also enable responsible use while adhering to ethical and legal considerations.

caption.json example

I am trying to run python diff_inference.py --dataset laion. Could you provide an example of a caption JSON file in the proper format?
