
doctamper's Introduction

DocTamper

The DocTamper dataset is now available at BaiduDrive and Google Drive (part1 and part2).

The DocTamper dataset is only available for non-commercial use. You can request a password for it by sending an email from an educational email address to [email protected], explaining the purpose.

To visualize the images and their corresponding ground truths from the provided .mdb files, run the command "python vizlmdb.py --input DocTamperV1-FCD --i 0".


The official implementation of the paper Towards Robust Tampered Text Detection in Document Image: New Dataset and New Solution is in the "models" directory.

I am delaying the release of the training code, as required by my supervisor and the cooperating enterprise that bought it. My training pipeline for the DocTamper dataset and the IoU metric are borrowed heavily from a famous project in this area. The results of the paper can be easily reproduced with it: you just need to adjust the loss functions and the learning rate decay curve. I also used its augmentation pipeline, except for RandomBrightnessContrast, ShiftScaleRotate and CoarseDropout.
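For readers unfamiliar with the metric mentioned above, the following is a minimal sketch of a binary IoU (intersection over union) computation for tamper masks. It is not the implementation from the project the author refers to; the function name binary_iou and the nested-list mask representation are assumptions made for illustration.

```python
from typing import Sequence


def binary_iou(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """IoU between two binary masks given as equally sized nested sequences.

    Non-zero values mark tampered pixels. Two empty masks are treated as a
    perfect match (1.0), which is a convention chosen for this sketch.
    """
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    return intersection / union if union else 1.0


if __name__ == "__main__":
    predicted = [[0, 1, 1], [0, 1, 0]]
    ground_truth = [[0, 1, 0], [1, 1, 0]]
    # 2 pixels are tampered in both masks, 4 pixels in at least one of them.
    print(binary_iou(predicted, ground_truth))  # 0.5
```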

Open Source Scheme:
1. Inference models and code: June 2023.
2. Training code: TBD.
3. Data synthesis code: within 2024.


For any questions about this work, please contact [email protected].


If you find this work useful in your research, please consider citing:

@inproceedings{qu2023towards,
  title={Towards Robust Tampered Text Detection in Document Image: New Dataset and New Solution},
  author={Qu, Chenfan and Liu, Chongyu and Liu, Yuliang and Chen, Xinhong and Peng, Dezhi and Guo, Fengjun and Jin, Lianwen},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={5937--5946},
  year={2023}
}

doctamper's People

Contributors

qcf-568


doctamper's Issues

details of data augmentation

Hi, besides JPEG compression, is any other augmentation used? If so, in which order are they applied: augmentations then JPEG, or JPEG then augmentations?

The image is not reopened after the last compression

https://github.com/qcf-568/DocTamper/blob/db5189013c9ab4ab4404b0a7a3e64f388dcde18b/dataloader.py#L76C23-L76C23
Hello,
In the dataloader.py file, it appears that the image isn’t reopened after the final compression. This means that if there are three compressions, the image provided to the model has only undergone two compressions. However, the DCT coefficients and the quantization table are from the final compression, which is problematic. Could you clarify the reasoning behind this approach, or is this simply an oversight in the code?
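The concern can be illustrated with a small sketch. It uses a toy integer quantizer as a stand-in for JPEG (an assumption, so that the example runs without any imaging library); toy_encode, toy_decode, recompress and the reopen_last flag are hypothetical names, not code from dataloader.py. With reopen_last=False the returned pixels come from the second-to-last round while the returned encoding comes from the last one, which is the mismatch described above.

```python
from typing import List, Sequence, Tuple

Encoded = Tuple[int, List[int]]  # (quantization step, quantized levels)


def toy_encode(pixels: Sequence[int], quality: int) -> Encoded:
    """Stand-in for JPEG encoding: lower quality means a coarser step."""
    step = (100 - quality) // 5 + 1
    return step, [(p + step // 2) // step for p in pixels]


def toy_decode(encoded: Encoded) -> List[int]:
    """Stand-in for JPEG decoding: map the levels back to pixel values (lossy)."""
    step, levels = encoded
    return [level * step for level in levels]


def recompress(pixels: Sequence[int], qualities: Sequence[int],
               reopen_last: bool = True) -> Tuple[List[int], Encoded]:
    """Compress once per entry in qualities; return (pixels, last encoding).

    reopen_last=True decodes after every round, so the pixels match the last
    encoding. reopen_last=False skips the final decode, mimicking the
    behaviour reported in the issue.
    """
    pixels = list(pixels)
    encoded: Encoded = (1, list(pixels))
    last_round = len(qualities) - 1
    for i, quality in enumerate(qualities):
        encoded = toy_encode(pixels, quality)
        if i < last_round or reopen_last:
            pixels = toy_decode(encoded)
    return pixels, encoded


if __name__ == "__main__":
    original = [10, 23, 37, 200]
    fixed, enc = recompress(original, [90, 75, 80])
    stale, _ = recompress(original, [90, 75, 80], reopen_last=False)
    print(fixed)                      # [10, 25, 35, 205]
    print(stale)                      # [12, 24, 36, 204]
    print(fixed == toy_decode(enc))   # True
```

With a real JPEG pipeline the same idea would mean decoding the pixels from the final compressed file before handing them to the model, so that they agree with the DCT coefficients and quantization table read from that file.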

About doctamper

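Are all images in the DocTamper dataset tampered? What is the ratio of authentic to tampered images, and is there a label indicating whether an image is tampered or authentic?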

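About recompression

Hello, during training did you compress once with quality q and then compress 1-3 times at test time, or did you apply a random 1-3 compressions in both training and testing?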

Tampering detection in an image captured from a tampered document.

Hi, does this model work for use cases where a person captures an image of a document (using a scanner, for example), makes changes to the document, then captures an image of the tampered document with a smartphone and submits it to the system? I tried this model on untampered images captured with a smartphone camera, but the model reports a lot of tampering. Please see the attached image.

prediction

Any guidance is truly appreciated.

FileNotFoundError: [Errno 2] No such file or directory: 'pks/DocTamperV1-TrainingSet/_75.pk'

Traceback (most recent call last):
  File "/mnt/data/experiments/DocTamper-main/models/train.py", line 114, in <module>
    train_data = TamperDataset(train_imgs_dir, 'train')
  File "/mnt/data/experiments/DocTamper-main/models/train.py", line 36, in __init__
    with open('pks/'+roots+'_%d.pk'%minq,'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'pks/DocTamperV1-TrainingSet/_75.pk'

I followed your hints and wrote train.py, but I don't know how to generate the .pk file for the training set.
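Two things may be worth checking here, sketched below under stated assumptions. First, the path in the traceback, 'pks/DocTamperV1-TrainingSet/_75.pk', is what the string concatenation in train.py produces when roots ends with a slash, so normalizing the root yields a name of the form "xxx_75.pk". Second, the real layout of the .pk files is not documented on this page; the sketch assumes (hypothetically) one list of 1 to 3 JPEG quality factors in [minq, 100] per sample. The names pk_path, make_quality_record and write_pk are illustrative, not part of the repository.

```python
import os
import pickle
import random


def pk_path(root, minq, pk_dir="pks"):
    """Build '<pk_dir>/<dataset name>_<minq>.pk', ignoring a trailing slash in root."""
    name = os.path.basename(os.path.normpath(root))
    return os.path.join(pk_dir, "%s_%d.pk" % (name, minq))


def make_quality_record(num_samples, minq, seed=0):
    """Assumed layout: for each sample, 1 to 3 quality factors in [minq, 100]."""
    rng = random.Random(seed)
    return [
        [rng.randint(minq, 100) for _ in range(rng.randint(1, 3))]
        for _ in range(num_samples)
    ]


def write_pk(root, num_samples, minq, pk_dir="pks"):
    """Pickle the quality record under pk_dir and return the file path."""
    path = pk_path(root, minq, pk_dir)
    os.makedirs(pk_dir, exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(make_quality_record(num_samples, minq), f)
    return path


if __name__ == "__main__":
    # The trailing slash no longer ends up inside the file name.
    print(pk_path("DocTamperV1-TrainingSet/", 75))
    print(make_quality_record(3, 75))
```

Whether a dataset class can consume such a file depends on how it indexes the loaded object, so the record layout should be adapted to the loader actually in use.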

public pristine set

Hi, I really appreciate your work. I was wondering whether you plan to release a pristine dataset for further study, since all the samples in the DocTamper dataset are forgeries.

Question regarding VPH and Swin for feature extraction

Hi, thanks for providing the code for digital tamper detection, great work!

I have a question: were the VPH and Swin layers trained end-to-end with the full DTD network, or were they pre-trained on ImageNet separately and plugged in as feature extractors?

I ask because you use the last three stages of the Swin Transformer.

T-SROIE dataset link

Hello

I know you didn't make the dataset but please can you point out where to find it? Do they also share code/weight? Are the links in the publication? I don't speak Chinese and I didn't find them when translating...

full capacity

I encountered an issue with the training code I wrote: the memory usage kept increasing until it reached full capacity and stopped (around the 4th epoch), preventing the completion of multiple training iterations.
For confidentiality purposes, could you please provide an email address so that I can privately send you this Python file?

self.swin = torch.load('swin_imagenet.pt')

self.vph = torch.load('vph_imagenet.pt')
self.swin = torch.load('swin_imagenet.pt')
self.fph = FPH()
In line 294 of the dtd.py file, why is it not necessary to define self.vph = VPN() and self.swin = SwinTransformerV2()? How is "from swins import *" in line 29 used, and why can't I see any debug information for the self.swin variable?

The compressed image is not re-read

In line 76, the image is saved in JPEG format, and the JPEG information is read from the file (line 77). However, the RGB data is not re-read from the file.


Specific tampering type of DocTamper Dataset

As shown in Table 2 of the paper, copy-move, splicing and generation are all included, and the number of images for each tampering type is given. Is the tampering type of each image labeled? If not, how were the numbers in Table 2 counted, and is there any way to re-split the dataset by tampering type?
Thanks!

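Question about compression factors in training and testing

Hello, I would like to ask the following questions:
1. How should the compression factor for images be chosen during training?
2. During testing, doesn't the image need to be compressed one final time with quality factor q? (In the code, the compressed image is written to temp but is not saved.)
3. I tested on DocTamper with the code and weights you provided. With a single compression at quality factor 75, P, R and F are 0.588, 0.487 and 0.533 respectively, which is lower than when the quality factor is chosen from 75-100 and the image is compressed 1 to 3 times. Is this normal?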
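As a quick sanity check on the numbers in this question, the reported F value is consistent with the harmonic mean of the reported P and R. The sketch below assumes F denotes the standard F1 score; the function name f1_score is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    total = precision + recall
    return 2.0 * precision * recall / total if total else 0.0


if __name__ == "__main__":
    # 2 * 0.588 * 0.487 / (0.588 + 0.487) = 0.572712 / 1.075
    print(round(f1_score(0.588, 0.487), 3))  # 0.533
```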

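About the dataset

Hello, do you plan to release a PNG version of the dataset? Besides the masks and the tampered images, does the dataset contain any other information? Are the DCT coefficients stored in the dataset? Thanks for your reply!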

sss

How do I run the code? Do I need to train on the data myself?

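Question about random compression factors for the training set

I read in the paper that all models are trained with dynamic JPEG compression to match the configuration of the test sets. However, the training set does not come with files like the test set's "xxx_75.pk" that record the random compression factors. Has the training set you provide already been compressed?
If it has not been compressed, what is the minq parameter for the training set's random compression factors? Is it 75, as mentioned in the paper? Should a model trained with this fixed minq be evaluated on test sets with different minq values, or should the training minq match the test set, for example training with minq=85 to test on the minq=85 test set?
Thank you very much for your answer!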

Dataset password

After the dataset is downloaded from Baidu Cloud, why do you need a password to decompress it? What is the password set by the author?

About training CATNET on Doctamper

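Hello, sorry to bother you, I would like to ask another question. I am currently using CAT-Net (https://github.com/mjkwon2021/CAT-Net) with the project's default settings and training script to train and test on the full DocTamper training and test sets. During training the loss does decrease, but when validating on the test set the output values are exactly identical: the IoU is 49.2 in the first epoch and does not change even in the decimals in later epochs. Adding a Lovasz loss made no difference, and not loading the pre-trained RGB and DCT weights gives much the same result. I checked that the model used for validation had indeed been updated. I am not familiar with this area and find it very strange. Could you help analyse where the problem might be? Many thanks.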

model.load_state_dict(ckpt['state_dict']) error

I have a question:

model = seg_dtd('',2).cuda()
ckpt = torch.load("./pths/dtd_sroie.pth",map_location='cpu')

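The lines above run without error (in the DTD class, the paths of the two weight files have been changed so that they are loaded from the ./pths directory).

On reaching: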
model.load_state_dict(ckpt['state_dict'])

the following error is raised:
model.load_state_dict(ckpt['state_dict'])
File "/home/vipuser/anaconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for seg_dtd:
Missing key(s) in state_dict: "model.vph.downsample_layers.0.0.weight", "model.vph.downsample_layers.0.0.bias", "model.vph.downsample_layers.0.1.weight", "model.vph.downsample_layers.0.1.bias", "model.vph.downsample_layers.1.0.weight", "model.vph.downsample_layers.1.0.bias", "model.vph.downsample_layers.1.1.weight", "model.vph.downsample_layers.1.1.bias", "model.vph.stages.0.0.gamma", "model.vph.stages.0.0.dwconv.weight", "model.vph.stages.0.0.dwconv.bias", "model.vph.stages.0.0.norm.weight", "model.vph.stages.0.0.norm.bias", "model.vph.stages.0.0.pwconv1.weight", ........

The weight files were downloaded from Baidu Netdisk.

Could you tell me what the problem might be? Thank you.

Test demo, missing files

Hello, which file should I run if I want to check my own images? Also, where can I download the three files "sroie_test_1011.json", "data.mdb" and "lock.mdb"?

Regarding dataset access

Respected Sir,
I am a student from IIT Kharagpur, India, working on a Problem Statement relating to fraud detection in medical invoices and bills. We have our competition starting today. I had mailed you earlier regarding this but did not get any reply.

I urgently need access to the dataset.

Please provide me with the password to access the dataset. You can contact me at [email protected].

Invalid JPEG file structure: two SOI markers

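During data preprocessing in the dataset, after some DocTamper images have been compressed multiple times (usually 3), the error "Invalid JPEG file structure: two SOI markers" occurs. However, I am currently unable to catch this error, so for now I cannot locate which line of code causes it.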
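One way to narrow this down without relying on catching the error is to inspect the JPEG bytes before they reach the decoder. The sketch below walks the marker segments in the header and reports whether a second SOI marker (0xFFD8) appears before the first scan. It is a heuristic written for illustration, not code from the repository: has_duplicate_soi is a hypothetical helper, it only checks the header section up to the first SOS or EOI marker, and it assumes the error is triggered by a marker-level duplicate rather than by bytes inside a segment payload (such as an embedded thumbnail).

```python
def has_duplicate_soi(data: bytes) -> bool:
    """Return True if a second SOI marker appears among the header segments."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream: missing SOI marker")
    pos = 2
    while pos + 1 < len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:            # fill byte before a marker, skip it
            pos += 1
            continue
        if marker == 0xD8:            # a second SOI at marker level
            return True
        if marker in (0xD9, 0xDA):    # EOI or SOS: end of the header section
            return False
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:
            pos += 2                  # standalone markers carry no length field
            continue
        if pos + 4 > len(data):
            raise ValueError("truncated segment at offset %d" % pos)
        # The 2-byte big-endian length includes the length field itself.
        length = int.from_bytes(data[pos + 2:pos + 4], "big")
        pos += 2 + length
    return False


if __name__ == "__main__":
    app0 = b"\xff\xe0\x00\x04\x00\x00"  # minimal APP0 segment with a 2-byte payload
    print(has_duplicate_soi(b"\xff\xd8" + app0 + b"\xff\xda"))              # False
    print(has_duplicate_soi(b"\xff\xd8" + app0 + b"\xff\xd8" + b"\xff\xda"))  # True
```

Running such a check on the bytes produced after each compression round, and logging the sample index when it returns True, would at least identify which image and which round produce the malformed stream.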

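Could you release your training dataset generation code?

This is undoubtedly an excellent paper that greatly advances research on document tampering detection, especially by releasing a high-quality document tampering dataset. However, for people like us, the part of the paper that describes in detail how to build the dataset is hard to fully grasp, and doing it ourselves is even harder. So, as a kindness to those of us still struggling, could you release your training dataset generation code, so that we can reproduce this work from scratch, find more inspiration while reproducing and learning from it, and contribute more to document tampering detection research? Of course, we understand if you really cannot release it. In that case, could you explain the dataset generation process in more detail than the paper does, so that we can follow it step by step? Many thanks, we would be very grateful.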

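Question about the training script

Hello, I am currently trying to train on DocTamper's FCD and SCD datasets, following the Tianchi competition training scheme you recommended, with SoftCrossEntropy and Diceloss as the loss functions. I have tried efficientnet_b6+unet++ and resnet101+deeplabV3, but the loss does not seem to decrease and the F1-score does not improve. I have little experience in this area, so could you advise what might be misconfigured and how I should tune it? Thanks.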
