
DeeperForensics-1.0: A Large-Scale Dataset for Real-World Face Forgery Detection


This repository provides the dataset and code for the following paper:

DeeperForensics-1.0: A Large-Scale Dataset for Real-World Face Forgery Detection
Liming Jiang, Ren Li, Wayne Wu, Chen Qian and Chen Change Loy
In CVPR 2020.
Project Page | Paper | YouTube Demo

Abstract: We present our ongoing effort to construct a large-scale benchmark for face forgery detection. The first version of this benchmark, DeeperForensics-1.0, represents the largest face forgery detection dataset to date, with 60,000 videos comprising a total of 17.6 million frames, 10 times larger than existing datasets of the same kind. Extensive real-world perturbations are applied to obtain a more challenging benchmark of larger scale and higher diversity. All source videos in DeeperForensics-1.0 are carefully collected, and fake videos are generated by a newly proposed end-to-end face swapping framework. The quality of the generated videos outperforms that of existing datasets, as validated by user studies. The benchmark features a hidden test set containing manipulated videos that achieve high deceptive scores in human evaluations. We further contribute a comprehensive study that evaluates five representative detection baselines and makes a thorough analysis of different settings.


Dataset

The DeeperForensics-1.0 dataset has been made publicly available for non-commercial research purposes. Please visit the dataset download and document page for more details. Before using the DeeperForensics-1.0 dataset for face forgery detection model training, please read these important tips first.

Code

The code to implement the diverse perturbations in our dataset has been released. Please see the perturbation implementation for more details.

Competition

We hosted the DeeperForensics Challenge 2020 based on the DeeperForensics-1.0 dataset. The challenge officially started at the ECCV 2020 SenseHuman Workshop. The prizes total $15,000 (AWS promotional codes). If you are interested in advancing the state of the art in real-world face forgery detection, we look forward to your participation!

The technical report of DeeperForensics Challenge 2020 has been released on arXiv.

Summary

Data Collection

We invited 100 paid actors from 26 countries to record the source videos. The collected high-quality data vary in identities, poses, expressions, emotions, lighting conditions, and 3DMM blendshapes.

Face Manipulation

We also propose a new learning-based many-to-many face swapping method, DeepFake Variational Auto-Encoder (DF-VAE). DF-VAE improves scalability, style matching, and temporal continuity to ensure face swapping quality.
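
The DF-VAE code itself has not been released (see the issue below). As a rough illustration of the autoencoder-style swapping family that DF-VAE belongs to, here is a generic shared-encoder / per-identity-decoder sketch in PyTorch. This is not DF-VAE; the architecture and all layer sizes are placeholders chosen for illustration only.

    # Generic shared-encoder / per-identity-decoder swapping sketch in PyTorch.
    # NOT the authors' DF-VAE: it only illustrates the autoencoder-style
    # swapping family; all layer sizes are arbitrary.
    import torch
    import torch.nn as nn

    class SwapAutoencoder(nn.Module):
        def __init__(self, num_identities):
            super().__init__()
            # A single shared encoder captures identity-independent structure
            # (pose, expression) from any source face.
            self.encoder = nn.Sequential(
                nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(),
            )
            # One decoder per target identity; routing through a chosen decoder
            # is what makes many-to-many swapping possible in a single model.
            self.decoders = nn.ModuleList([
                nn.Sequential(
                    nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
                    nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
                ) for _ in range(num_identities)
            ])

        def forward(self, x, target_id):
            # Encode the source face, decode as the target identity.
            return self.decoders[target_id](self.encoder(x))

    model = SwapAutoencoder(num_identities=3)            # "three-to-three" setting
    swapped = model(torch.randn(1, 3, 64, 64), target_id=1)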

[Animations: several face manipulation results, including many-to-many (three-to-three) face swapping by a single model]

Real-World Perturbation

We apply 7 types of distortions (e.g., transmission errors, compression) at 5 intensity levels. Some videos are subjected to a mixture of more than one distortion. These perturbations make DeeperForensics-1.0 a better simulation of real-world scenarios.
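
For illustration, a minimal sketch of this kind of leveled-distortion pipeline using OpenCV follows. It is not the released implementation (see the perturbation code above); the level-to-parameter mappings and the input filename are made up.

    # Illustrative sketch of a leveled-distortion pipeline (NOT the released
    # implementation); the level-to-parameter mappings here are made up.
    import cv2
    import numpy as np

    def gaussian_blur(frame, level):
        # Kernel size grows with the intensity level (1-5).
        k = 2 * level + 1
        return cv2.GaussianBlur(frame, (k, k), 0)

    def jpeg_compression(frame, level):
        # Lower JPEG quality at higher levels.
        quality = 100 - 18 * level
        ok, buf = cv2.imencode('.jpg', frame, [cv2.IMWRITE_JPEG_QUALITY, quality])
        return cv2.imdecode(buf, cv2.IMREAD_COLOR)

    def white_noise(frame, level):
        # Additive Gaussian noise with level-scaled standard deviation.
        noise = np.random.normal(0, 4.0 * level, frame.shape)
        return np.clip(frame.astype(np.float64) + noise, 0, 255).astype(np.uint8)

    def apply_mixed(frame, distortions, level):
        # Some videos receive a mixture of more than one distortion.
        for distort in distortions:
            frame = distort(frame, level)
        return frame

    frame = cv2.imread('example_frame.png')  # placeholder input frame
    out = apply_mixed(frame, [gaussian_blur, jpeg_compression], level=3)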

Benchmark

We benchmark five representative forgery detection methods using the DeeperForensics-1.0 dataset. Please refer to our paper for more information.

Citation

If you find this work useful for your research, please cite our papers:

@inproceedings{jiang2020deeperforensics1,
  title={{DeeperForensics-1.0}: A Large-Scale Dataset for Real-World Face Forgery Detection},
  author={Jiang, Liming and Li, Ren and Wu, Wayne and Qian, Chen and Loy, Chen Change},
  booktitle={CVPR},
  year={2020}
}
@article{jiang2021dfc20,
  title={{DeeperForensics Challenge 2020} on Real-World Face Forgery Detection: Methods and Results},
  author={Jiang, Liming and Guo, Zhengkui and Wu, Wayne and Liu, Zhaoyang and Liu, Ziwei and Loy, Chen Change and Yang, Shuo and Xiong, Yuanjun and Xia, Wei and Chen, Baoying and Zhuang, Peiyu and Li, Sili and Chen, Shen and Yao, Taiping and Ding, Shouhong and Li, Jilin and Huang, Feiyue and Cao, Liujuan and Ji, Rongrong and Lu, Changlei and Tan, Ganchao},
  journal={arXiv preprint},
  volume={arXiv:2102.09471},
  year={2021}
}

Acknowledgments

This work is supported by the SenseTime-NTU Collaboration Project, Singapore MOE AcRF Tier 1 (2018-T1-002-056), NTU SUG, and NTU NAP. We gratefully acknowledge the exceptional help from Hao Zhu and Keqiang Sun for their contributions to source data collection and coordination.

Contact

If you have any questions, please contact us by sending an email to [email protected].

Terms of Use

The use of DeeperForensics-1.0 is governed by the Terms of Use: DeeperForensics-1.0 Dataset.
The code is released under the MIT license.

Copyright (c) 2020


deeperforensics-1.0's Issues

About the std/x dataset settings in the paper's experiments

Hi! Thank you for your wonderful work. I noticed that you used different data settings in your experiments (std/sing, std/rand, std/mix), and I am wondering whether you added the same perturbations to the original video data as you did to the manipulated data in these experiments.

Script to download them all

Dear DeeperForensics authors,
great work! Thank you so much. In an effort to download the entire dataset automatically, I created a bash script using gdown. It worked at first, but after a while it breaks with the following message, which appears for large files only:

Access denied with the following error:

        Too many users have viewed or downloaded this file recently. Please
        try accessing the file again later. If the file you are trying to
        access is particularly large or is shared with many people, it may
        take up to 24 hours to be able to view or download the file. If you
        still can't access a file after 24 hours, contact your domain
        administrator. 

You may still be able to access the file from the browser:

Update: I realized that I was posting the download script here, which is not appropriate. @EndlessSora, let me know if you want me to share the script with you privately so you can provide it to people who get access to the dataset.
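
For readers hitting the same quota error, a minimal Python sketch of a retry wrapper around gdown is shown below. The file ID and output name are placeholders; the real links come from the dataset download page, and how gdown reports quota failures varies by version (older versions return None, newer ones raise).

    # Retry wrapper around gdown for Google Drive quota errors.
    # FILE_ID below is a placeholder; real links come from the download page.
    # Depending on the gdown version, failures either return None or raise.
    import time
    import gdown

    def download_with_retry(file_id, output, retries=5, wait_s=3600):
        url = f'https://drive.google.com/uc?id={file_id}'
        for attempt in range(retries):
            try:
                if gdown.download(url, output, quiet=False):
                    return output
            except Exception as err:
                print(f'attempt {attempt + 1} failed: {err}')
            # Quota errors can take up to 24 hours to clear; back off.
            time.sleep(wait_s)
        raise RuntimeError(f'could not download {file_id}')

    download_with_retry('FILE_ID_PLACEHOLDER', 'source_videos.zip')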

About DF-VAE

Thank you for your work!

Do you intend to release the relevant code and training scripts for DF-VAE?

Unable to reproduce the experimental results of the paper

Can you provide the training log corresponding to the experiment of the paper?

I applied the same distortions to the source videos of FaceForensics++ and used this dataset to train the face forgery detection model. The model converges quickly during training, but its performance on the hidden test set is very poor. Do you know what the problem might be?

In addition, this is my training log. Thank you very much!

    2020-09-22 10:27:08,954 - INFO: Epoch:0 || Iter:0/549 || Loss:0.69330(0.69330) || Accuracy:0.56250(0.56250)
    2020-09-22 10:27:13,801 - INFO: Epoch:0 || Iter:10/549 || Loss:0.33249(0.54624) || Accuracy:0.83594(0.72301)
    2020-09-22 10:27:18,548 - INFO: Epoch:0 || Iter:20/549 || Loss:0.10217(0.38276) || Accuracy:0.96875(0.81659)
    2020-09-22 10:27:23,361 - INFO: Epoch:0 || Iter:30/549 || Loss:0.09741(0.29400) || Accuracy:0.93750(0.86139)
    2020-09-22 10:27:28,143 - INFO: Epoch:0 || Iter:40/549 || Loss:0.09310(0.24562) || Accuracy:0.96094(0.88472)
    2020-09-22 10:27:32,883 - INFO: Epoch:0 || Iter:50/549 || Loss:0.05807(0.20966) || Accuracy:0.98438(0.90227)
    2020-09-22 10:27:37,603 - INFO: Epoch:0 || Iter:60/549 || Loss:0.07660(0.18957) || Accuracy:0.98438(0.91342)
    2020-09-22 10:27:42,391 - INFO: Epoch:0 || Iter:70/549 || Loss:0.05925(0.17513) || Accuracy:0.96875(0.92066)
    2020-09-22 10:27:47,103 - INFO: Epoch:0 || Iter:80/549 || Loss:0.07028(0.16092) || Accuracy:0.96875(0.92824)
    2020-09-22 10:27:51,832 - INFO: Epoch:0 || Iter:90/549 || Loss:0.06247(0.14881) || Accuracy:0.96094(0.93389)
    2020-09-22 10:27:56,724 - INFO: Epoch:0 || Iter:100/549 || Loss:0.09728(0.13896) || Accuracy:0.96875(0.93858)
    2020-09-22 10:28:01,467 - INFO: Epoch:0 || Iter:110/549 || Loss:0.04423(0.13061) || Accuracy:0.97656(0.94264)
    2020-09-22 10:28:06,290 - INFO: Epoch:0 || Iter:120/549 || Loss:0.09134(0.12317) || Accuracy:0.96875(0.94602)
    2020-09-22 10:28:11,023 - INFO: Epoch:0 || Iter:130/549 || Loss:0.02772(0.11808) || Accuracy:0.99219(0.94853)
    2020-09-22 10:28:15,779 - INFO: Epoch:0 || Iter:140/549 || Loss:0.02995(0.11276) || Accuracy:0.98438(0.95063)
    2020-09-22 10:28:20,518 - INFO: Epoch:0 || Iter:150/549 || Loss:0.02055(0.10843) || Accuracy:1.00000(0.95266)
    2020-09-22 10:28:25,399 - INFO: Epoch:0 || Iter:160/549 || Loss:0.04992(0.10441) || Accuracy:0.96094(0.95414)
    2020-09-22 10:28:30,255 - INFO: Epoch:0 || Iter:170/549 || Loss:0.02497(0.10071) || Accuracy:0.99219(0.95587)
    2020-09-22 10:28:35,166 - INFO: Epoch:0 || Iter:180/549 || Loss:0.03729(0.09727) || Accuracy:0.98438(0.95740)
    2020-09-22 10:28:39,957 - INFO: Epoch:0 || Iter:190/549 || Loss:0.03673(0.09374) || Accuracy:0.97656(0.95877)
    2020-09-22 10:28:44,687 - INFO: Epoch:0 || Iter:200/549 || Loss:0.03946(0.09064) || Accuracy:0.99219(0.96028)
    2020-09-22 10:28:49,426 - INFO: Epoch:0 || Iter:210/549 || Loss:0.02468(0.08788) || Accuracy:0.98438(0.96131)
    2020-09-22 10:28:54,239 - INFO: Epoch:0 || Iter:220/549 || Loss:0.04746(0.08512) || Accuracy:0.98438(0.96249)
    2020-09-22 10:28:58,963 - INFO: Epoch:0 || Iter:230/549 || Loss:0.03039(0.08289) || Accuracy:0.98438(0.96341)
    2020-09-22 10:29:03,685 - INFO: Epoch:0 || Iter:240/549 || Loss:0.08809(0.08134) || Accuracy:0.95312(0.96398)
    2020-09-22 10:29:08,470 - INFO: Epoch:0 || Iter:250/549 || Loss:0.02432(0.07950) || Accuracy:0.97656(0.96473)
    2020-09-22 10:29:13,233 - INFO: Epoch:0 || Iter:260/549 || Loss:0.02534(0.07781) || Accuracy:1.00000(0.96558)
    2020-09-22 10:29:18,048 - INFO: Epoch:0 || Iter:270/549 || Loss:0.03035(0.07645) || Accuracy:0.97656(0.96616)
    2020-09-22 10:29:22,891 - INFO: Epoch:0 || Iter:280/549 || Loss:0.01610(0.07478) || Accuracy:0.99219(0.96694)
    2020-09-22 10:29:27,696 - INFO: Epoch:0 || Iter:290/549 || Loss:0.02178(0.07328) || Accuracy:0.98438(0.96770)
    2020-09-22 10:29:32,495 - INFO: Epoch:0 || Iter:300/549 || Loss:0.01254(0.07157) || Accuracy:1.00000(0.96846)
    2020-09-22 10:29:37,226 - INFO: Epoch:0 || Iter:310/549 || Loss:0.01840(0.06979) || Accuracy:0.99219(0.96928)
    2020-09-22 10:29:41,951 - INFO: Epoch:0 || Iter:320/549 || Loss:0.02171(0.06842) || Accuracy:0.98438(0.96982)
    2020-09-22 10:29:46,696 - INFO: Epoch:0 || Iter:330/549 || Loss:0.00467(0.06701) || Accuracy:1.00000(0.97035)
    2020-09-22 10:29:51,466 - INFO: Epoch:0 || Iter:340/549 || Loss:0.01416(0.06609) || Accuracy:1.00000(0.97063)
    2020-09-22 10:29:56,222 - INFO: Epoch:0 || Iter:350/549 || Loss:0.01028(0.06513) || Accuracy:1.00000(0.97106)
    2020-09-22 10:30:00,983 - INFO: Epoch:0 || Iter:360/549 || Loss:0.02263(0.06392) || Accuracy:0.99219(0.97156)
    2020-09-22 10:30:05,720 - INFO: Epoch:0 || Iter:370/549 || Loss:0.03179(0.06285) || Accuracy:0.97656(0.97197)
    2020-09-22 10:30:10,445 - INFO: Epoch:0 || Iter:380/549 || Loss:0.03527(0.06230) || Accuracy:0.98438(0.97240)
    2020-09-22 10:30:15,216 - INFO: Epoch:0 || Iter:390/549 || Loss:0.00949(0.06134) || Accuracy:1.00000(0.97279)
    2020-09-22 10:30:20,041 - INFO: Epoch:0 || Iter:400/549 || Loss:0.05724(0.06046) || Accuracy:0.97656(0.97317)
    2020-09-22 10:30:24,888 - INFO: Epoch:0 || Iter:410/549 || Loss:0.00370(0.05961) || Accuracy:1.00000(0.97354)
    2020-09-22 10:30:29,716 - INFO: Epoch:0 || Iter:420/549 || Loss:0.04780(0.05885) || Accuracy:0.96875(0.97391)
    2020-09-22 10:30:34,467 - INFO: Epoch:0 || Iter:430/549 || Loss:0.04402(0.05810) || Accuracy:0.96875(0.97419)
    2020-09-22 10:30:39,184 - INFO: Epoch:0 || Iter:440/549 || Loss:0.05830(0.05733) || Accuracy:0.98438(0.97456)
    2020-09-22 10:30:43,892 - INFO: Epoch:0 || Iter:450/549 || Loss:0.02611(0.05658) || Accuracy:0.97656(0.97490)
    2020-09-22 10:30:48,628 - INFO: Epoch:0 || Iter:460/549 || Loss:0.02152(0.05582) || Accuracy:0.98438(0.97519)

Dataset split for the standard set, and training

Hi, thank you for your work.
According to your paper, the standard set only includes 1k YouTube videos and 1k manipulated videos (end_to_end), right?
And if one wants to train a model on "std+std/sing", they need to apply the same perturbations to the real videos (the 1k from FF++ and the 100 actors' videos), since the provided real videos have no perturbations, right?

Looking forward to your reply

Another question about dataset split

Thanks for your interest in our work.

  1. Yes. The standard set only includes 1k YouTube videos and 1k manipulated videos (end_to_end).
  2. Almost correct. Perturbations with a similar distribution should be applied to both the real videos (the 1k YouTube videos from FF++; the 100 actors' videos are not needed, since they are source videos used for face manipulation) and the fake videos if one would like to train a model on "std+std/sing".

Originally posted by @EndlessSora in #10 (comment)

Eagerly awaiting the release

When will it actually be released? If the download only opens around Chinese New Year, I won't be able to relax over the whole holiday... ha.

Questions of benchmark

As the title says, I am having difficulty reproducing the results of the XceptionNet baseline.
I hope you could share some non-private details of your experiments if you still remember them, or point out the errors in my own process.

Thank you anyway.

Our full process is as follows (a minimal sketch of the training configuration in step 4 appears after this list):

  1. Use a face detection method (MTCNN) on all frames of the FF++_C23 videos to get the original face bounding boxes [boxes only from FF++_C23].

  2. Enlarge each bounding box by a scale factor of 1.3 (also making it roughly square), then use these boxes to extract faces from both the FF++_C23 videos and the corresponding DF-1.0 end_to_end fake videos [1.3x faces from both]. This yields two large folders, each with 1,000 sub-folders of images (1,000 + 1,000 = 2,000).

  3. Train XceptionNet on the two folders (train:val:test is about 7:1:2, so about 0.7 x 2,000 = 1,400 sub-folders for training); each video/sub-folder contributes 270 frames sampled at regular intervals (e.g., frame_0, frame_2, ..., frame_538 if the total frame count exceeds 540) [270 frames from each video].

  4. The XceptionNet parameters are:
    4.1) batch_size = 32, epochs = 40
    4.2) optimizer_ft = optim.Adam(model.parameters(), lr=0.0002)  # others default
    4.3) exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=2, gamma=0.9)
    4.4) Validation is run after each training epoch

  5. Testing uses all images of the test sub-folders (about 0.2 x 2,000 = 400).

  6. When testing on another dataset, e.g., end_to_end_level_1, the test set is built the same way (about 0.2 x 2,000 = 400 sub-folders).
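
For concreteness, the hyperparameters in step 4 assemble into a minimal PyTorch loop like the sketch below. It is an illustration under stated assumptions rather than the actual script used: torchvision's ResNet-18 stands in for XceptionNet (which torchvision does not ship), and the dataset path and transforms are placeholders.

    # Minimal training setup matching the hyperparameters in step 4.
    # NOTE: ResNet-18 is a stand-in backbone for XceptionNet, and the
    # 'faces/train' path (real/fake class sub-folders) is a placeholder.
    import torch
    import torch.nn as nn
    from torch import optim
    from torch.optim import lr_scheduler
    from torchvision import datasets, models, transforms

    transform = transforms.Compose([
        transforms.Resize((299, 299)),  # Xception's native input size
        transforms.ToTensor(),
    ])
    train_set = datasets.ImageFolder('faces/train', transform=transform)
    loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

    model = models.resnet18(num_classes=2)  # real vs. fake
    criterion = nn.CrossEntropyLoss()
    optimizer_ft = optim.Adam(model.parameters(), lr=0.0002)  # other args default
    exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=2, gamma=0.9)

    for epoch in range(40):
        for images, labels in loader:
            optimizer_ft.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer_ft.step()
        exp_lr_scheduler.step()  # decay LR by 0.9 every 2 epochs
        # validation after each epoch would go here (step 4.4)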

About training set and test set

Do you use the entire DeeperForensics_1.0\source_videos as the training set? If so, when generating the swapped dataset, the model has actually been trained on all of the source images, and the whole model is not subject-agnostic. Is that true?
