hwding / pdf-unstamper

Remove textual watermark of any font, any encoding and any language with pdf-unstamper now!

License: GNU General Public License v3.0

Languages: Java 83.25%, Shell 16.75%
Topics: pdf, pdf-merge, stamp, tool, command-line-tool, pdfbox

pdf-unstamper's Introduction

pdf-unstamper


Disclaimer

This software (pdf-unstamper.jar) is provided ONLY for technical exchange; its author CANNOT be held responsible for any consequences of PDF modification.

Effect

[Figure: Before | After]

Manual

Running the tool without arguments prints the following usage message.

Usage:
   [OPTION] -i [INPUT PDF] -k [KEYWORDS...] (-o [OUTPUT PDF])
   [OPTION] -I [INPUT DIR] -k [KEYWORDS...] (-O [OUTPUT DIR])

Options:
   -d,  --direct            directly modify the input file(s), option o/O is
                            unnecessary when this option is on
   -r,  --recursive         process files in the given dir recursively
   -s,  --strict            use strict mode, a text area is considered as watermark
                            only if its content strictly equals one of the keywords
   -c,  --clear             clear all annotations in pages which contains the target
                            textual watermark(s), if you encounter bordered frame issues,
                            enable this switch
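
The difference between default and strict matching can be sketched in shell. This is only an illustration, not the tool's implementation: is_watermark is a hypothetical helper, and the idea that non-strict mode treats a text area as a watermark when it merely contains a keyword is an assumption inferred from the option description above.

```shell
# Hypothetical sketch of keyword matching; not pdf-unstamper's actual code.
# Usage: is_watermark MODE TEXT KEYWORD...
#   MODE "strict": TEXT must equal one keyword exactly (as -s is described).
#   any other MODE: TEXT only needs to contain a keyword (an assumption).
is_watermark() {
  mode=$1
  text=$2
  shift 2
  for kw in "$@"; do
    if [ "$mode" = "strict" ]; then
      if [ "$text" = "$kw" ]; then return 0; fi
    else
      case "$text" in
        *"$kw"*) return 0 ;;
      esac
    fi
  done
  return 1
}

# A line that only mentions the keyword matches in loose mode but not in strict mode.
if is_watermark loose "Visit www.allitebooks.com today" "www.allitebooks.com"; then
  echo "loose: match"
fi
if ! is_watermark strict "Visit www.allitebooks.com today" "www.allitebooks.com"; then
  echo "strict: no match"
fi
```

Under this reading, strict mode is the safer choice when a keyword may also appear inside ordinary body text.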

Get it now

As a command

Install (or update to) the latest version as the command unstamp in ~/bin/, with a symlink at /usr/local/bin/unstamp for convenience.

Install

Make sure you have wget installed (the one-line installer below also uses curl).

➜ sudo bash -c "$(curl -s https://raw.githubusercontent.com/hwding/pdf-unstamper/master/script/install)"

# If using a proxy
➜ sudo proxychains bash -c "$(curl -s https://raw.githubusercontent.com/hwding/pdf-unstamper/master/script/install)"

# Script options
#
# -d [DIR]  install binary into a specified directory
# -w        install without creating a soft-link to /usr/local/bin/

# To install with options, download the script first
➜ wget https://raw.githubusercontent.com/hwding/pdf-unstamper/master/script/install

# Then execute with or without proxy
➜ sudo bash install -d ~/my-bins/
➜ sudo bash install -w
➜ sudo bash install -d ~/my-bins/ -w
➜ sudo proxychains bash install -d ~/my-bins/
➜ sudo proxychains bash install -w
➜ sudo proxychains bash install -d ~/my-bins/ -w

Run

# For single file processing
➜ unstamp -i "C Recipes.pdf" -o "C Recipes.unstamped.pdf" -k www.allitebooks.com -s -c
➜ unstamp -i RoR.pdf -o RoR.unstamped.pdf -k 图灵社区会员
# Or
➜ unstamp -i "C Recipes.pdf" -d -k www.allitebooks.com -s -c
➜ unstamp -i RoR.pdf -d -k 图灵社区会员
 
# For batch processing of many files
➜ unstamp -I pdfs/ -O unstampedPdfs/ -r -k 图灵社区会员 www.allitebooks.com -c
# Or
➜ unstamp -I pdfs/ -d -r -k 图灵社区会员 www.allitebooks.com -c

As a JAR

Download

Get pdf-unstamper.jar.

Run

# For single file processing
➜ java -jar pdf-unstamper.jar -i "C Recipes.pdf" -o "C Recipes.unstamped.pdf" -k www.allitebooks.com -s -c
➜ java -jar pdf-unstamper.jar -i RoR.pdf -o RoR.unstamped.pdf -k 图灵社区会员
# Or
➜ java -jar pdf-unstamper.jar -i "C Recipes.pdf" -d -k www.allitebooks.com -s -c
➜ java -jar pdf-unstamper.jar -i RoR.pdf -d -k 图灵社区会员
 
# For batch processing of many files
➜ java -jar pdf-unstamper.jar -I pdfs/ -O unstampedPdfs/ -r -k 图灵社区会员 www.allitebooks.com -c
# Or
➜ java -jar pdf-unstamper.jar -I pdfs/ -d -r -k 图灵社区会员 www.allitebooks.com -c

pdf-unstamper's People

Contributors

dependabot[bot], hwding

pdf-unstamper's Issues

Not removing Border/Link

Hi,

Thanks for the script. The Java version I tested is not removing the border and the link from allitebooks pdfs.

Thanks

[Bug / Performance / Enhancement] NullPointerException on many files

Exception in thread "main" java.lang.NullPointerException
	at java.base/java.util.ArrayDeque.addLast(ArrayDeque.java:303)
	at java.base/java.util.ArrayDeque.add(ArrayDeque.java:494)
	at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:174)
	at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169)
	at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169)
	at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.<init>(PDPageTree.java:159)
	at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.<init>(PDPageTree.java:153)
	at org.apache.pdfbox.pdmodel.PDPageTree.iterator(PDPageTree.java:123)
	at java.base/java.lang.Iterable.forEach(Iterable.java:74)
	at com.amastigote.unstamper.core.Processor.process(Processor.java:59)
	at com.amastigote.unstamper.util.TaskRunner.submitToProcessor(TaskRunner.java:53)
	at com.amastigote.unstamper.util.TaskRunner.procSingleFile(TaskRunner.java:44)
	at com.amastigote.unstamper.Main.main(Main.java:54)

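[Enhancement] Support slanted watermarks

Describe the issue
On my device, this program can handle horizontally arranged watermarks.
I hope it can go further and support removing slanted watermarks.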

Version
0.2.5/pdf-unstamper.jar

Arguments

java -jar pdf-unstamper.jar -i [A.pdf] -o [A.unstamped.pdf] -k [slanted watermark content]
## OR
java -jar pdf-unstamper.jar -i [A.pdf] -o [A.unstamped.pdf] -k [slanted watermark content] -c -s

Search whole sentence instead of per word

The problem I have is when the watermark is made of common words, like "TO BE".

With the current software, running

"pdf-unstamper -i in.pdf -k TO BE -o out.pdf"

removes all the content with "TO BE" in it, while strict mode does not remove my watermark.

It would be nice if I could run

pdf-unstamper -i in.pdf -k 'TO BE' -o out.pdf

so that only the combination of 'TO' and 'BE' is removed instead of each individual word, since removing individual words can delete real content of the PDF.

Thank you
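
For reference, the shell itself decides how many keywords the program receives. The sketch below uses a hypothetical helper, count_args, to show that an unquoted phrase is split into separate arguments while a quoted phrase arrives as one. Whether pdf-unstamper then matches a single-argument phrase as a whole is not stated in this thread, so this only illustrates the shell side.

```shell
# count_args is a hypothetical helper that reports how many arguments it received.
count_args() {
  echo "$#"
}

count_args -k TO BE     # prints 3: -k, TO, BE
count_args -k 'TO BE'   # prints 2: -k, "TO BE"
```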

[Enhancement] Remove text only in a specific area?

Describe the issue
Is it possible to remove text only in a particular area?

Taking the top-left corner as the origin, the area could be specified like this:

(width_start_percent, height_start_percent, width_end_percent, height_end_percent)

For example, the given area might be (10, 20, 20, 40).
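
To make the proposal concrete, here is a rough sketch of how such percentages could map to a rectangle on a page. It is not an existing pdf-unstamper feature: area_to_box is a hypothetical helper, and the 600 x 800 page size is an arbitrary example. The conversion flips the vertical axis because PDF's default coordinate system has its origin at the bottom-left corner.

```shell
# Hypothetical helper, not a pdf-unstamper feature.
# Usage: area_to_box PAGE_W PAGE_H X0% Y0% X1% Y1%
# Percentages are measured from the top-left corner, as in the proposal.
# Output: left bottom right top, with the origin at the bottom-left corner.
# Integer arithmetic is used, so fractional results are truncated.
area_to_box() {
  page_w=$1
  page_h=$2
  left=$(( page_w * $3 / 100 ))
  right=$(( page_w * $5 / 100 ))
  top=$(( page_h - page_h * $4 / 100 ))
  bottom=$(( page_h - page_h * $6 / 100 ))
  echo "$left $bottom $right $top"
}

area_to_box 600 800 10 20 20 40   # prints "60 480 120 640"
```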

Version
0.2.5

No effect when used

Hello, I used your JAR, but after running it the text was not removed. Could you please take a look?

cannot open PDF file

C:>java -jar pdf-unstamper.jar -i 111.PDF -d -k "更多电子书资料请搜索「书行天下」:http://www.sxpdf.com" -s -c
[Processor] Processing PDF file '111.PDF' ...
[Processor] Error loading file '111.PDF' as PDF, skipping
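
The log does not say why loading failed. One quick check that rules out a single possible cause is whether the file starts with the PDF header at all. The sketch below is a hypothetical pre-check, not part of pdf-unstamper; looks_like_pdf is an illustrative name, and it only inspects the first five bytes, so an encrypted or damaged PDF can still pass it.

```shell
# Hypothetical pre-check, not part of pdf-unstamper.
# Succeeds only if the file is readable and begins with the "%PDF-" header.
looks_like_pdf() {
  [ -r "$1" ] && [ "$(head -c 5 < "$1")" = "%PDF-" ]
}

if looks_like_pdf "111.PDF"; then
  echo "header looks like a PDF"
else
  echo "missing, unreadable, or no %PDF- header"
fi
```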
