
Comments (3)

qcf-568 commented on June 26, 2024

Hello,

Resizing is not supported, because this method relies on Block Artifact Grids (BAGs) for tampering localization. Resizing the input image, whether up or down, destroys its BAGs, so the method does not work in that case.

However, by using the BAGs, the model gains a much better fine-grained perception ability for detecting visually consistent tampering, which is important for tampered text detection in documents. Using the BAGs also gives the model much better cross-domain generalization.
So this method sacrifices robustness to resizing in exchange for better detection and generalization.
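
To make the resize limitation concrete, here is a toy sketch. It is not the DocTamper code and does not use real JPEG data: the synthetic image of constant 8x8 blocks is only a stand-in for block artifacts, and `grid_contrast` and `nearest_resize_width` are hypothetical helpers written for this illustration. The sketch measures how much stronger the column-to-column jumps are on the 8-pixel grid than off it, before and after a 75% horizontal resize.

```python
import numpy as np

BLOCK = 8  # JPEG codes images in 8x8 blocks

def grid_contrast(gray, block=BLOCK):
    """Mean column-to-column jump on the block grid divided by the mean jump off it."""
    jumps = np.abs(np.diff(gray.astype(np.float64), axis=1)).mean(axis=0)
    # jumps[i] compares column i with column i + 1, so block edges sit at i % block == block - 1
    on_grid = np.arange(jumps.size) % block == block - 1
    return jumps[on_grid].mean() / (jumps[~on_grid].mean() + 1e-9)

def nearest_resize_width(gray, new_w):
    """Nearest-neighbour horizontal resize using integer index arithmetic."""
    src = (np.arange(new_w) * gray.shape[1]) // new_w
    return gray[:, src]

rng = np.random.default_rng(0)
# Toy stand-in for block artifacts (an assumption): constant 8x8 blocks plus mild noise.
blocks = rng.uniform(0, 255, size=(32, 32))
image = np.kron(blocks, np.ones((BLOCK, BLOCK))) + rng.normal(0, 1.0, size=(256, 256))

original = grid_contrast(image)
resized = grid_contrast(nearest_resize_width(image, 192))  # 75% of the original width
print(f"grid contrast: original={original:.1f}, resized={resized:.1f}")
```

After the resize, the block edges no longer fall on multiples of 8 pixels, so a measure tied to the 8-pixel grid drops sharply. A model that depends on that grid loses its signal in the same way.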

It is also worth noting that an image that has been resized, or that never had BAGs (e.g. a smartphone screen capture), can easily be distinguished in the Fourier frequency domain from images that have BAGs, using a simple binary classification model.
Therefore, in real-world applications, tampering localization can be done in a Mixture of Experts manner. First, use a classifier to decide whether the input image has BAGs. If it does, crop it into patches and feed the patches into a frequency-based model such as this one. If it does not, feed it into a common pure-RGB model that is robust to resizing. This way, the advantages of both model types are maximized.
In some business-facing (B2B) cases, we can also tell users not to resize images before uploading, and ask them to re-upload if the original is detected as resized.
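
The routing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's pipeline: instead of a trained binary classifier, it uses a hand-written spectral heuristic (`bag_score`) that looks for a peak at the 8-pixel period in the Fourier spectrum of the column-jump profile. The threshold of 5.0, the patch size of 64, the expert names, and the synthetic test images are all placeholders chosen for this example.

```python
import numpy as np

def bag_score(gray, block=8):
    """Spectral peak at the block period, relative to the median spectral magnitude."""
    jumps = np.abs(np.diff(gray.astype(np.float64), axis=1)).mean(axis=0)
    n = (jumps.size // block) * block  # trim so the block period lands on an exact FFT bin
    profile = jumps[:n] - jumps[:n].mean()
    spectrum = np.abs(np.fft.rfft(profile))
    peak = spectrum[n // block]  # bin whose period is exactly `block` samples
    return peak / (np.median(spectrum[1:]) + 1e-9)

def route(gray, threshold=5.0):
    """Pick an expert by name; the threshold is an untuned, hypothetical value."""
    return "frequency_expert" if bag_score(gray) > threshold else "rgb_expert"

def grid_aligned_patches(gray, size=64, block=8):
    """Non-overlapping patches whose corners stay on the block grid."""
    assert size % block == 0, "patch size must be a multiple of the block size"
    h, w = gray.shape
    return [gray[y:y + size, x:x + size]
            for y in range(0, h - size + 1, size)
            for x in range(0, w - size + 1, size)]

rng = np.random.default_rng(0)
# Synthetic inputs (assumptions): constant 8x8 blocks mimic BAGs, plain noise has no grid.
blocky = np.kron(rng.uniform(0, 255, (32, 32)), np.ones((8, 8))) + rng.normal(0, 1.0, (256, 256))
noise = rng.uniform(0, 255, (256, 256))

for name, img in [("blocky", blocky), ("noise", noise)]:
    expert = route(img)
    patches = grid_aligned_patches(img) if expert == "frequency_expert" else [img]
    print(name, "->", expert, f"({len(patches)} input(s))")
```

In practice the heuristic would be replaced by a trained classifier, as the comment suggests. The point of the sketch is only the control flow: check for BAGs, then send grid-aligned patches to the frequency-based model or the whole image to the RGB model.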

For question 2, the performance is not related to "clarity" at all; it depends entirely on whether BAGs are present.

from doctamper.

Nomiluks commented on June 26, 2024

@RobotDouble Could you please help me to understand how to test it on our own dataset?


qcf-568 commented on June 26, 2024

@Nomiluks Please refer to L39~L137 of https://github.com/qcf-568/DocTamper/blob/main/models/tsroie/infer_sroie.py

and this colab notebook https://colab.research.google.com/drive/1rWaSKy2Rsy5welyvj6FbzF01o2zv8ips?usp=sharing

