zhendongwang6 / dire

[ICCV 2023] Official implementation of the paper: "DIRE for Diffusion-Generated Image Detection"

Python 99.14% Shell 0.86%

diffusion-model fake-image-detection image-forensics

dire's Introduction

Hi there 👋

🔭 I’m currently working on computer vision and image processing.

dire's Issues

Differences between the three dataset packages

Hi, thanks for sharing the dataset and code.

I noticed that three packages (dire, images, recons) are provided. What are the differences between them?

One small suggestion: split (multi-volume) archives would be friendlier for people with poor network connections.

Thanks!

The training set data includes the test set data.

I am reproducing this paper, but I encountered some problems during testing. I tested using the following steps, but found some weird things. Is there anything wrong with my steps?

My steps

from the OneDrive location datasets/DiffsuionForensics/dire/train/lsun_bedroom, I downloaded the real dire images real.tar.gz and unpacked them as training set. (There are 40,000 images in real class. I'm trying to re-train the model)
from the OneDrive location datasets/DiffsuionForensics/dire/test/lsun_bedroom/lsun_bedroom, I downloaded the real dire images real.tar.gz and unpacked them as testing set (There are 1,000 images in real class.)
I found that some images in the testing set are also included in the training set. For example, I can find 022f8c89734486038ba814b5d2b8259cd58695a5.jpg, 022f12f989ed6bc972ed921903cd2e2f996a2a85.jpg, 022f19bf68b72ce134f60d246fd48808b1bc5187.jpg ... in both the training set and the testing set. (I don't know exactly how much data overlaps, but I found quite a bit, about 620 images, in the 'real' class.)
I can get results that are the same as those in the paper:

'test_model:lsun_adm' model testing on...
lsun_adm_data:
ACC: 1.00000
AP: 1.00000
R_ACC: 1.00000
F_ACC: 1.00000

But I am curious, if the testing set is already included in the training set, does it mean that the model has seen it during the training phase, and therefore performs well in the test phase?
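
The overlap described above can be measured by comparing file names across the two unpacked folders. This is a minimal sketch, not part of the repository: the function name `overlapping_names` is hypothetical, and it assumes that identical file names mean identical images, which seems plausible here because the names look like content hashes.

```python
from pathlib import Path


def overlapping_names(train_dir, test_dir):
    """Return the sorted file names that appear in both directories."""
    train_names = {p.name for p in Path(train_dir).iterdir() if p.is_file()}
    test_names = {p.name for p in Path(test_dir).iterdir() if p.is_file()}
    # Set intersection keeps only names present in both splits.
    return sorted(train_names & test_names)
```

Calling `overlapping_names` on the unpacked training and testing `real` folders and taking `len()` of the result gives the number of shared images. A non-zero count means the test score is partly measured on data the model has already seen.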

Virus in dire/test/lsun_bedroom/lsun_bedroom/real.tar.gz

The zip folder of real test dire images from lsun_bedroom seems to be corrupted, as I was unable to download the file and got this error.

It seems like this zip file was uploaded on 12/19/2023, when the change from #22 was fixed. Would it be possible for someone to take a look? We would like to reproduce the results from the paper.

Code Run Error

Why does the code fail with an error? The images have already been placed in the specified folder as described in the README.

Can't download pretrained model

Please correct the link. Can't download the pretrained model. Thanks.

Inference process does not calculate DIRE

As far as I can see, the code in demo.py does not calculate DIRE during inference. Instead, it feeds the image directly into ResNet after several transforms.
Or did I miss something?

I hope you'll share the pre-trained model again, thank you very much.

In actual testing, images generated by Stable Diffusion are not recognized.

Here are samples from the test results; their predicted scores are all 0.0000.

compute_dire.py questions

Does anybody know how to solve this problem?

Are there any other DIRE image preprocessing methods?

I run ./DIRE/guided-diffusion/compute_dire.py using 256x256_diffusion_uncond.pt on my dataset and get the following images (source, recon and dire):

The other results are all similar. As we can see, the DIRE is not as pronounced as in DF. I would like to know whether there are any other preprocessing methods for DIRE images.

About the released dataset and ckpt

Thank you so much for open-sourcing this work. I would like to make a small suggestion: could you replace the folder on the web disk with a compressed package? It is too inconvenient to download as it is.

Currently, the setup is divided into 2 steps.
Step 1: compute_dire.sh to generate a DIRE image from the input image (real or fake), with distributed computing on N GPUs.
Step 2: demo.py to generate probability from DIRE image.

This setup makes it very hard to test and use the model.

I think simplifying it to one batched function call that executes both steps would make the model more usable for other researchers and end users.

Example:
Create inference.py file which takes a directory as input and generates probabilities in a CSV file on a Single GPU or M2 Mac.
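
A rough sketch of what such an inference.py could look like is given below. It is not the repository's code. The callables `load_image`, `reconstruct`, and `classify` are hypothetical placeholders for image loading, the diffusion inversion and reconstruction done by compute_dire.py, and the ResNet-50 classifier used by demo.py. The sketch assumes DIRE is the absolute difference between an image and its reconstruction, as defined in the paper.

```python
import csv
from pathlib import Path

import numpy as np


def compute_dire(image, recon):
    """Absolute per-pixel difference between an image and its reconstruction."""
    # Cast to float first so uint8 subtraction cannot wrap around.
    return np.abs(image.astype(np.float32) - recon.astype(np.float32))


def run_inference(image_dir, out_csv, load_image, reconstruct, classify,
                  exts=(".png", ".jpg", ".jpeg")):
    """Run both steps on every image in image_dir and write one CSV row per file.

    load_image, reconstruct and classify are caller-supplied (hypothetical) hooks.
    """
    rows = []
    for path in sorted(Path(image_dir).iterdir()):
        if path.suffix.lower() not in exts:
            continue
        image = load_image(path)                        # array of shape H x W x C
        dire = compute_dire(image, reconstruct(image))  # step 1: DIRE image
        rows.append((path.name, float(classify(dire)))) # step 2: probability
    with open(out_csv, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["filename", "prob_synthetic"])
        writer.writerows(rows)
    return rows
```

Keeping the DIRE in memory between the two steps also avoids writing it to disk as an intermediate JPG or PNG file.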

It seems that the DIRE save format (JPG or PNG) determines the results of the ResNet-50 detector

My compute_dire.sh is:

## set MODEL_PATH, num_samples, has_subfolder, images_dir, recons_dir, dire_dir
export CUDA_VISIBLE_DEVICES=0
export NCCL_P2P_DISABLE=1
MODEL_PATH="../models/256x256_diffusion_uncond.pt" # "models/lsun_bedroom.pt, models/256x256_diffusion_uncond.pt"

SAMPLE_FLAGS="--batch_size 1 --num_samples 4  --timestep_respacing ddim20 --use_ddim True"
SAVE_FLAGS="--images_dir ../data/single_test --recons_dir ../recons_test/single_test --dire_dir ../dire_test/single_test"
MODEL_FLAGS="--attention_resolutions 32,16,8 --class_cond False --diffusion_steps 1000 --dropout 0.1 --image_size 256 --learn_sigma True --noise_schedule linear --num_channels 256 --num_head_channels 64 --num_res_blocks 2 --resblock_updown True --use_fp16 True --use_scale_shift_norm True"
mpiexec --allow-run-as-root -n 1 python compute_dire.py --model_path $MODEL_PATH $MODEL_FLAGS  $SAVE_FLAGS $SAMPLE_FLAGS --has_subfolder True

The diffusion model is 256x256_diffusion_uncond.pt, but I also tried other models such as lsun_bedroom.pt.
Then I ran compute_dire.py to get the DIRE image.
Then I ran demo.py with the ResNet-50 CNN model, whose weights are lsun_adm.pth:

python demo.py -f /data/github_issue/DIRE/dire_test/single_test/single_test -m /data/github_issue/DIRE/models/lsun_adm.pth

This script outputs the "Prob of being synthetic".
In the test, the fake images are in PNG format and the real images are in JPG format. These images were downloaded from the DiffusionForensics dataset.
My question is: when I use compute_dire.py to save the DIRE tensor in PNG format, the "Prob of being synthetic" is always 1.0000. On the other hand, when I save it in JPG format, the "Prob of being synthetic" is always 0.0000, no matter whether I use a fake or a real image.
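
One way to check whether file format is tied to the class label in a dataset folder is to count file extensions per class sub-folder. The sketch below is only an illustration: the function names are hypothetical, and it assumes a layout with one sub-folder per class, such as 0_real and 1_fake.

```python
from collections import Counter
from pathlib import Path


def extensions_by_class(root):
    """Map each class sub-folder of root to a Counter of its file extensions."""
    summary = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            summary[class_dir.name] = Counter(
                p.suffix.lower() for p in class_dir.iterdir() if p.is_file()
            )
    return summary


def format_is_confounded(summary):
    """True if every extension occurs in exactly one class folder."""
    # For each extension, count how many class folders contain it.
    classes_per_ext = Counter(ext for counts in summary.values() for ext in counts)
    return (
        len(summary) > 1
        and bool(classes_per_ext)
        and all(n == 1 for n in classes_per_ext.values())
    )
```

If `format_is_confounded` returns True, a detector could in principle separate the classes using compression artifacts alone, which would be consistent with the behavior described above.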

wrong ACC/AP in SD-v1

I used lsun_pndm.pth and tested on lsun/lsun/sdv1, but got:
lsun_lsun_sdv1:
ACC: 1.00000
AP: 1.00000
R_ACC: 1.00000
F_ACC: 1.00000
I also trained the model on this dataset and got the same result, but the paper reports 89.4/99.9. Has anyone else gotten the same result?

Cannot get the results in the paper

I ran the test with the following steps, but cannot reproduce the results in the paper.
Is there anything wrong with my steps?

My steps

from the OneDrive location dire/test/lsun_bedroom/lsun_bedroom, I downloaded the real dire images real.tar.gz and unpacked them into data/test/lsun_adm_data/0_real. This set has 1000 JPG images.
from the OneDrive location dire/test/lsun_bedroom/lsun_bedroom, I downloaded the fake dire images adm.tar.gz and unpacked them into data/test/lsun_adm_data/1_fake. This set has 1000 PNG images
I downloaded the model checkpoints/lsun_adm.pth from OneDrive and put it in the folder data/exp/test_model/ckpt
I ran test.py by the following command:
python test.py --gpus 0 --ckpt lsun_adm.pth --exp_name test_model datasets_test lsun_adm_data.
This command points to the downloaded model and the lsun_adm dataset.

My results

Here is the result I got by the command:

'test_model:lsun_adm' model testing on...
lsun_adm_data:
ACC: 0.89900
AP: 0.99890
R_ACC: 0.79800
F_ACC: 1.00000

These results are different from those in the paper. Is there anything wrong with my steps?

My configurations

CPU : Intel(R) Xeon(R) Silver 4210R CPU @ 2.40GHz
GPU : NVIDIA Corporation TU104GL [Tesla T4]
System : 20.04.1-Ubuntu SMP
python : 3.8.18
torch : 2.0.0
cuda : 11.7

Link to pre-trained model broken

I can't access the link to the pretrained model.
It says the link has been removed.

How to run robustness test?

I'd like to know how to run the robustness test.
As I understand it, I first add distortion to the original images, then run compute_dire.sh to get their DIRE images, and finally run test.sh on the DIRE images. Is this correct?
By the way, in compute_dire.sh, which model checkpoint did you actually use for each dataset?

Real Class PNG Format Image Request

I've observed that the images in the "real" folder are in JPG format, while the generated images are in PNG format. I wrote the following code to convert the images in the "adm" folder into JPG:

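import glob
import os

import cv2
from pathos.multiprocessing import ProcessingPool as Pool
from tqdm import tqdm


def png2jpg(png_image):
    # cv2.imread returns None if the file cannot be decoded.
    img = cv2.imread(png_image)
    if img is None:
        return
    # Write the JPG first and delete the PNG only if the write succeeded,
    # so a failed conversion never loses the original file.
    jpg_image = os.path.splitext(png_image)[0] + ".jpg"
    if cv2.imwrite(jpg_image, img):
        os.remove(png_image)


if __name__ == "__main__":
    # Convert every PNG in the current directory using 8 worker processes.
    png_images = sorted(glob.glob("./*.png"))
    with Pool(8) as p:
        list(tqdm(p.imap(png2jpg, png_images), total=len(png_images)))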

After that I ran test.py. The results on lsun_bedroom/lsun_bedroom using lsun_adm.pth were:

ACC: 0.49500
AP: 0.50112
R_ACC: 0.79800
F_ACC: 0.19200

Before converting the images, the results were as follows (as seen in issue #9)

ACC: 0.89900
AP: 0.99890
R_ACC: 0.79800
F_ACC: 1.00000

Could you provide the real-class DIRE images in PNG format? Thanks.

Something wrong when unzipping the data

I downloaded the dataset from OneDrive. When I unzipped the dataset, I got:

mpiexec runs on colab pro?

I changed compute_dire.sh to have only 2 gpus instead of four (as written by default in the sh file):
export CUDA_VISIBLE_DEVICES=0,1 but still my run is slow and not computing dire images fast enough. I'm connected to a T4 High RAM GPU on Google Colab Pro.

Has anyone faced this problem? Pls help.

About the inversion and reconstruction process

Hello! I found your work quite interesting and inspiring, and I have a question about the procedure before calculating dire.
Section 4.1 mentions that ADM is used as the reconstruction model. How did you perform the inversion process? Did you use a specific model, or did you implement it yourself?

Download link invalid

When I tried to download the pretrained model and dataset, the following message was displayed.
This link has been deleted.
Sorry, access to this document has been removed. Please contact the person who shares it with you.

source code for this image reconstruction

Has the source code for the image reconstruction not been released? I can't find it.

something wrong with the dataset

some files in the dataset can not be downloaded correctly

(I have downloaded the dataset twice)

About AP

I used lsun_bed.pth and tested on the DIRE test set lsun_bedroom/lsun_bedroom/adm. The three accuracy values are fine, but the AP is very low, far below 1.0.
ACC: 1.00000
AP: 0.30710
R_ACC: 1.00000
F_ACC: 1.00000
I don't know how this happened. Has anyone encountered a similar problem?

can you share the dataset

Hi, the link to the dataset won't open, can you share the dataset please?

OneDrive dataset link permission denied

Hello, the OneDrive link to the dataset does not seem to grant access permission. Can you update the link? Thank you very much.

Dataset link is broken

When will you release the code?

Nice work!
Is there any plan for code release?
Thanks!

M2 Mac issues

I tried to run this on an M2 Mac. I get the error below when distributed.all_gather is called.

(dire) skoneru@macbook-pro guided-diffusion % PYTORCH_ENABLE_MPS_FALLBACK=1 ./compute_dire.sh
Logging to /Users/skoneru/workspace/DIRE/recons_images/val/imagenet/real
Namespace(images_dir='/Users/skoneru/workspace/DIRE/images/val/imagenet/real', recons_dir='/Users/skoneru/workspace/DIRE/recons_images/val/imagenet/real', dire_dir='/Users/skoneru/workspace/DIRE/dire_images/val/imagenet/real', clip_denoised=True, num_samples=16, batch_size=4, use_ddim=True, model_path='models/256x256_diffusion_uncond.pt', real_step=0, continue_reverse=False, has_subfolder=True, image_size=256, num_channels=256, num_res_blocks=2, num_heads=4, num_heads_upsample=-1, num_head_channels=64, attention_resolutions='32,16,8', channel_mult='', dropout=0.1, class_cond=False, use_checkpoint=False, use_scale_shift_norm=True, resblock_updown=True, use_fp16=False, use_new_attention_order=False, learn_sigma=True, diffusion_steps=1000, noise_schedule='linear', timestep_respacing='ddim20', use_kl=False, predict_xstart=False, rescale_timesteps=False, rescale_learned_sigmas=False)
have created model and diffusion
have created data loader
computing recons & DIRE ...
dataset length: 5000
Traceback (most recent call last):
  File "/Users/skoneru/workspace/DIRE/guided-diffusion/compute_dire.py", line 172, in <module>
    main()
  File "/Users/skoneru/workspace/DIRE/guided-diffusion/compute_dire.py", line 121, in main
    dist.all_gather(gathered_samples, recons)  # gather not supported with NCCL
  File "/Users/skoneru/miniconda/envs/dire/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py", line 1451, in wrapper
    return func(*args, **kwargs)
  File "/Users/skoneru/miniconda/envs/dire/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py", line 2448, in all_gather
    work = default_pg.allgather([tensor_list], [tensor])
RuntimeError: ProcessGroupGloo::allgather: invalid tensor type at index 0 (expected TensorOptions(dtype=unsigned char, device=cpu, layout=Strided, requires_grad=false (default), pinned_memory=false (default), memory_format=(nullopt)), got TensorOptions(dtype=unsigned char, device=mps:0, layout=Strided, requires_grad=false (default), pinned_memory=false (default), memory_format=(nullopt)))

Command used is
mpiexec -n 1 python compute_dire.py --model_path $MODEL_PATH $MODEL_FLAGS $SAVE_FLAGS $SAMPLE_FLAGS --has_subfolder True

Changes:
Set device to "mps"

About the code

Thank you for your excellent work on image forgery identification. The DIRE you proposed is a very valuable reference for my research.
You have not released the code for this project yet. Do you plan to release it?
Thanks again!

Odd results in demo.py with your data

I am running simple tests with your data, something like

python demo.py -f test/celebahq/real/000009.jpg  -m checkpoints/celebahq_sdv2.pth
Testing on image 'test/celebahq/real/000009.jpg'
**************************************************
Prob of being synthetic: 1.0000

Every single run is returning 'Prob of being synthetic: 1.0000'

Can you explain?

'lsun_adm:lsun_adm' model testing on...
lsun_adm:
ACC: 0.89900
AP: 0.99890
R_ACC: 0.79800
F_ACC: 1.00000

Do your reported AP and ACC results refer to F_ACC only, excluding the results on the real dataset?
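
For reference, the log values above are arithmetically consistent with ACC being computed over both classes: with 1,000 real and 1,000 fake images, (798 + 1000) / 2000 = 0.899. The sketch below shows one way such numbers relate. It is an assumption about what the metric names mean (accuracy on real images, accuracy on fake images, overall accuracy at a 0.5 threshold), not a copy of test.py, and it does not cover AP.

```python
def accuracy_report(labels, scores, threshold=0.5):
    """labels: 0 = real, 1 = fake; scores: predicted probability of being fake."""
    preds = [int(s > threshold) for s in scores]
    # Per-class correctness flags.
    real = [p == y for p, y in zip(preds, labels) if y == 0]
    fake = [p == y for p, y in zip(preds, labels) if y == 1]
    return {
        "ACC": (sum(real) + sum(fake)) / (len(real) + len(fake)),
        "R_ACC": sum(real) / len(real),
        "F_ACC": sum(fake) / len(fake),
    }


# Synthetic scores mimicking the log: 798 of 1000 real images classified
# correctly, all 1000 fake images classified correctly.
labels = [0] * 1000 + [1] * 1000
scores = [0.1] * 798 + [0.9] * 202 + [0.9] * 1000
report = accuracy_report(labels, scores)
print(report)
```

Under these assumptions, an ACC below 1.0 alongside F_ACC = 1.0 means the errors come entirely from real images being flagged as fake.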
