somepago / dcr
Official PyTorch repo of CVPR'23 and NeurIPS'23 papers on understanding replication in diffusion models.
License: Apache License 2.0
Hey,
Thank you for providing the code to reproduce your experiments. I want to confirm whether my understanding of the Stable Diffusion versions used in the 'Mitigation strategies' section of the paper is correct. For the training-time mitigation strategies, you fine-tuned Stable Diffusion 2.1 and then ran the analysis on the resulting model. For the inference-time mitigation strategies, however, you applied the mitigation directly to Stable Diffusion 1.4?
Hi,
The following packages also need to be installed:
scikit-image
seaborn
timm
LovelyPlots
matplotlib
natsort
scikit-learn
triton
Also, CLIP is installed via pip in env.yaml, but according to openai/CLIP#329 (comment)
it should be installed using pip install git+https://github.com/openai/CLIP.git
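A quick way to see which of the packages listed above are still missing from an environment is to query the installed distributions. The snippet below is only a sketch: the helper name missing_packages is hypothetical and not part of the repo, and it looks up pip distribution names rather than import names.

```python
from importlib import metadata

# Distribution names taken from the list above.
REQUIRED = [
    "scikit-image", "seaborn", "timm", "LovelyPlots",
    "matplotlib", "natsort", "scikit-learn", "triton",
]


def missing_packages(names):
    """Return the distribution names that are not installed (hypothetical helper)."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Missing:", missing_packages(REQUIRED))
```

Anything it reports can then be installed with pip before running the scripts.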
I first generated the images using diff_inference.py:
python diff_inference.py -nb 4000 --dataset laion --capstyle instancelevel_blip --rand_augs rand_numb_add
but I ran into the following error:
  File "/home/anaconda3/envs/diffrep/lib/python3.9/site-packages/diffusers/models/cross_attention.py", line 314, in __call__
    attention_probs = attn.get_attention_scores(query, key, attention_mask)
  File "/home/anaconda3/envs/diffrep/lib/python3.9/site-packages/diffusers/models/cross_attention.py", line 253, in get_attention_scores
    attention_probs = attention_scores.softmax(dim=-1)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 3.16 GiB (GPU 0; 15.46 GiB total capacity; 11.31 GiB already allocated; 2.48 GiB free; 11.39 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
even though the GPU still seems to have plenty of memory available. Thanks in advance for any suggestions.
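The error message itself points at max_split_size_mb, and the failing allocation happens in the attention softmax. One possible workaround is sketched below, assuming diff_inference.py builds a diffusers Stable Diffusion pipeline. Both helper names are hypothetical, and this is not a fix confirmed by the authors. The allocator setting only takes effect if it is set before CUDA is first used.

```python
import os


def configure_allocator(max_split_size_mb=128):
    """Set PYTORCH_CUDA_ALLOC_CONF to limit block splitting (hypothetical helper).

    Must run before the first CUDA allocation; 128 is an arbitrary example value.
    """
    value = f"max_split_size_mb:{max_split_size_mb}"
    os.environ["PYTORCH_CUDA_ALLOC_CONF"] = value
    return value


def enable_memory_savings(pipe):
    """Turn on attention slicing for a diffusers pipeline object.

    Attention slicing computes attention in chunks, trading some speed for a
    lower peak memory footprint in the attention layers.
    """
    pipe.enable_attention_slicing()
    return pipe
```

In practice, configure_allocator() would be called at the very top of the script, and enable_memory_savings(pipe) right after the pipeline is created. Generating fewer images per batch is the other obvious lever if memory is still tight.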
Hi,
I really liked your work!!!
Do you have plans to release pretrained models?
Also, for the Stable Diffusion experiments, is any fine-tuning required, given that the model is already trained on the LAION-2B dataset?
Thanks,
Kartik
accelerate launch diff_train.py \
--pretrained_model_name_or_path stabilityai/stable-diffusion-2-1 \
--instance_data_dir train/images_large \
--resolution=256 --gradient_accumulation_steps=1 --center_crop --random_flip \
--learning_rate=5e-6 --lr_scheduler constant_with_warmup \
--lr_warmup_steps=5000 --max_train_steps=100000 \
--train_batch_size=16 --save_steps=10000 --modelsavesteps 20000 --duplication nodup \
--output_dir=output --class_prompt classlevel --instance_prompt_loc miscdata/laion_combined_captions.json
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError:
I am using train/images_large from the LAION-10k split linked here. Could you tell me what is going wrong? Thanks for your reply.
I have tried to reproduce the results for inference-time mitigation, but compared to the experimental results in the paper, the reproduced FID values are high and the similarity scores are low. Where could the problem lie?
To reproduce the results, I ran the following commands.
python diff_inference.py -nb 4000 --dataset laion --capstyle instancelevel_blip --rand_augs rand_numb_add
python diff_retrieval.py --arch resnet50_disc --similarity_metric dotproduct \
--pt_style sscd --dist-url 'tcp://localhost:10001' --world-size 1 --rank 0 \
--query_dir /root/autodl-tmp/logs/Projects/DCR/inferences/defaultsd/laion/instancelevel_blip_auginfer_rand_numb_add_2/ \
--val_dir /root/autodl-tmp/laion_10k/train/
In the two Python files, I only changed the paths, including savepath, checkpath (Stable Diffusion 2.1), and prompt_json. The dataset I used is the LAION-10k split linked in your README.
But the results were unexpected.
FID = 21.833.
The results reported in the paper are shown below:
As you can see, the sim95_pc is 0.27, which is very different from the 0.556 in the experiment. The same is true for the FID values.
Although I only ran this experiment once, I would not expect a single run to differ this much from the reported results.
Can you give me some advice on what the problem might be? Thanks.
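For anyone comparing numbers like these, it can help to be explicit about what the similarity statistic measures. The sketch below assumes that sim95_pc is the 95th percentile of each generated image's top-1 dot-product similarity to the training set; that reading, and the function names, are assumptions for illustration and are not taken from diff_retrieval.py. It uses plain lists so it runs without any dependencies.

```python
import math


def top1_similarities(gen_feats, train_feats):
    """For each generated feature vector, the max dot product with any training vector."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return [max(dot(g, t) for t in train_feats) for g in gen_feats]


def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between neighbors."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lo = math.floor(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def sim_95pc(gen_feats, train_feats):
    """Assumed definition: 95th percentile of the top-1 similarities."""
    return percentile(top1_similarities(gen_feats, train_feats), 95)
```

Under this reading, the statistic depends on the feature extractor, on whether features are normalized, and on which training images are in the retrieval set, so a mismatch in any of these would shift the value.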
My understanding is that Stable Diffusion is a text-to-image deep learning model that generates detailed images based on text descriptions. However, I am concerned about the possibility of copyright or intellectual property infringement if the model copies expressions or content from its training data. Could you explain how Stable Diffusion generates outputs, and what ensures that they reflect representations learned from the training data rather than direct copies?
I have heard that the model outputs are weighted combinations of hierarchical representations of the input, making them "expression machines," but can you confirm whether this is a misconception? Answering these questions will not only aid in understanding the model's functionality but also enable responsible use while adhering to ethical and legal considerations.
I am trying to run python diff_inference.py --dataset laion. Could you provide an example of a caption JSON file in the expected format?