pdftowrite's Introduction

pdftowrite: Annotate PDFs with Stylus Labs Write

A utility that converts PDFs to Stylus Labs Write documents, and vice versa.

Annotate PDFs

There are two ways to annotate PDFs.

A. Convert PDF -> SVG -> PDF (literally)

  1. pdftowrite example.pdf: Convert *.pdf to *.svgz
  2. (Open example.svgz with Stylus Labs Write and write your notes)
  3. writetopdf example.svgz -o example-annot.pdf: Convert *.svgz to *.pdf

pdftowrite converts PDF pages to SVG paths with invisible but selectable text layers, so the text remains selectable.

Use writetopdf rather than Write's built-in PDF exporter, which does not support some features (e.g. Unicode text, tspans with multiple coordinates).

The resulting PDF (excluding annotations) is, however, not identical to the original PDF, because:

  • PDF and SVG are not fully compatible
  • Write does not support the entire SVG spec, so some modifications are required for compatibility with Write
  • The original text elements are deleted; a text layer is added to the page instead, as mentioned above

B. Annotation mode

  1. pdftowrite example.pdf: Convert *.pdf to *.svgz
  2. (Open example.svgz with Stylus Labs Write and write your notes)
  3. writetopdf --annot example.svgz -o example-annot.pdf: New PDF = Original PDF + Annotations

Note the --annot option in step 3. With this option, writetopdf creates a new PDF by overlaying the annotations on top of the original PDF pages. This is similar to Xournal's method.

You can annotate a different PDF file with the --pdf-file FILE option, e.g.:

writetopdf --annot --pdf-file example2.pdf example.svgz -o example2-annot.pdf
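
The two workflows above can be scripted. The following is a minimal sketch, not part of pdftowrite itself: the helper name `annotation_commands` and the `-annot.pdf` naming rule are assumptions taken from the examples above, and only flags shown in this README (`-o`, `--annot`, `--pdf-file`) are used.

```python
from pathlib import Path
from typing import List, Optional


def annotation_commands(pdf: str, annot: bool = True,
                        pdf_file: Optional[str] = None) -> List[List[str]]:
    """Build the pdftowrite and writetopdf command lines for one PDF.

    Hypothetical helper: it only assembles argument lists and runs nothing.
    """
    src = Path(pdf)
    svgz = str(src.with_suffix(".svgz"))
    # Name the output after the PDF that ends up annotated, as in the examples.
    target = Path(pdf_file) if pdf_file else src
    out = str(target.with_name(target.stem + "-annot.pdf"))

    to_write = ["pdftowrite", str(src), "-o", svgz]
    to_pdf = ["writetopdf"]
    if annot:
        to_pdf.append("--annot")          # overlay mode (method B)
    if pdf_file:
        to_pdf += ["--pdf-file", str(pdf_file)]
    to_pdf += [svgz, "-o", out]
    return [to_write, to_pdf]


if __name__ == "__main__":
    for cmd in annotation_commands("example.pdf", pdf_file="example2.pdf"):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)`. The second command should only be run after the .svgz file has been annotated in Write.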

Install

pip install --user pdftowrite

Requirements

pdftowrite:

  • Poppler (pdfinfo)
  • Inkscape (either native or flatpak)
  • ImageMagick (convert)
  • gzip
  • lxml (libxml2, libxslt)

writetopdf:

  • wkhtmltopdf
  • PDFtk (pdftk-java)
  • librsvg (rsvg-convert)
  • gzip

You need to install these packages manually, e.g.:

  • Debian/Ubuntu: sudo apt install poppler-utils inkscape imagemagick gzip libxml2-dev libxslt-dev wkhtmltopdf pdftk librsvg2-bin
  • Fedora: sudo dnf install poppler inkscape ImageMagick gzip libxml2-devel libxslt-devel wkhtmltopdf pdftk librsvg2-tools
  • Arch: sudo pacman -S poppler inkscape imagemagick gzip libxslt wkhtmltopdf pdftk librsvg
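
Before running either tool, it can help to confirm that the external programs are on PATH. This is a rough sketch, not something pdftowrite provides: the helper name `missing_tools` is hypothetical, and the executable names for Inkscape, gzip, wkhtmltopdf and PDFtk are assumed to match the package names. A flatpak Inkscape will usually not appear as `inkscape` on PATH, so a miss there is not conclusive. lxml is a Python package and is not checked here.

```python
import shutil
from typing import Callable, Dict, List, Optional

# Executables per tool, following the Requirements lists above.
REQUIRED: Dict[str, List[str]] = {
    "pdftowrite": ["pdfinfo", "inkscape", "convert", "gzip"],
    "writetopdf": ["wkhtmltopdf", "pdftk", "rsvg-convert", "gzip"],
}


def missing_tools(tool: str,
                  which: Callable[[str], Optional[str]] = shutil.which) -> List[str]:
    """Return the required executables for `tool` that `which` cannot find."""
    return [exe for exe in REQUIRED[tool] if which(exe) is None]


if __name__ == "__main__":
    for name in REQUIRED:
        missing = missing_tools(name)
        print(name, "missing:", ", ".join(missing) if missing else "nothing")
```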

Usage

pdftowrite

usage: pdftowrite [-h] [-v] [-o OUTPUT] [-f] [-m {mixed,poppler,inkscape}]
                  [-C] [-d DPI] [-g PAGES] [-u NODUP_PAGES] [-Z] [-s SCALE]
                  [-x X] [-y Y] [-X XRULING] [-Y YRULING] [-l MARGIN_LEFT]
                  [-p PAPERCOLOR] [-r RULECOLOR]
                  FILE

Convert PDF to Stylus Labs Write document

positional arguments:
  FILE                  A pdf file

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -o OUTPUT, --output OUTPUT
                        Specify output filename
  -f, --force           Overwrite existing files without asking
  -m {mixed,poppler,inkscape}, --mode {mixed,poppler,inkscape}
                        Specify render mode (default: mixed)
  -C, --no-compat-mode  Turn off Write compatibility mode
  -d DPI, --dpi DPI     Specify resolution for bitmaps and rasterized filters
                        (default: 96)
  -g PAGES, --pages PAGES
                        Specify pages to convert (e.g. "1 2 3", "1-3")
                        (default: all)
  -u NODUP_PAGES, --nodup-pages NODUP_PAGES
                        Specify no-dup pages (e.g. "1 2 3", "1-3") (default:
                        all)
  -Z, --nozip           Do not compress output
  -s SCALE, --scale SCALE
                        Scale page size (default: 1.0)
  -x X                  Specify the x coordinate of the viewport of <svg>
                        (default: 10.0)
  -y Y                  Specify the y coordinate of the viewport of <svg>
                        (default: 10.0)
  -X XRULING, --xruling XRULING
                        Specify x rulling (default: 0.0)
  -Y YRULING, --yruling YRULING
                        Specify y rulling (default: 40.0)
  -l MARGIN_LEFT, --margin-left MARGIN_LEFT
                        Specify margin left (default: 100.0)
  -p PAPERCOLOR, --papercolor PAPERCOLOR
                        Specify paper color (default: #FFFFFF)
  -r RULECOLOR, --rulecolor RULECOLOR
                        Specify rule color (default: #9F0000FF)
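
The help text gives two example forms for PAGES, "1 2 3" and "1-3". As an illustration of how such a specification expands to page numbers, here is a small sketch. It is not pdftowrite's actual parser, which is not shown here; the function name `parse_pages` is hypothetical, and accepting a mix of both forms in one string is an assumption.

```python
from typing import List


def parse_pages(spec: str) -> List[int]:
    """Expand a page spec such as "1 2 3" or "1-3" into a list of page numbers."""
    pages: List[int] = []
    for token in spec.split():
        if "-" in token:
            # Inclusive range, e.g. "1-3" -> 1, 2, 3
            start, end = token.split("-", 1)
            pages.extend(range(int(start), int(end) + 1))
        else:
            pages.append(int(token))
    return pages


if __name__ == "__main__":
    print(parse_pages("1 2 3"))
    print(parse_pages("1-3"))
```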

writetopdf

usage: writetopdf [-h] [-v] [--annot] [--pdf-file PDF_FILE] [-o OUTPUT] [-f]
                  [-g PAGES] [-s SCALE]
                  FILE

Convert Stylus Labs Write document to PDF

positional arguments:
  FILE                  A Write document

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  --annot               Use annotation mode
  --pdf-file PDF_FILE   Specify the PDF file to be annotated
  -o OUTPUT, --output OUTPUT
                        Specify output filename
  -f, --force           Overwrite existing files without asking
  -g PAGES, --pages PAGES
                        Specify pages to convert (e.g. "1 2 3", "1-3")
                        (default: all)
  -s SCALE, --scale SCALE
                        Scale page size (default: 1.0)

pdftowrite's People

Contributors

apebl

pdftowrite's Issues

Error when converting PDF generated by pdfLaTeX if there are big vertical lines

How to reproduce:

echo '\documentclass{article}\\begin{document}$\\big|$\\end{document}' | pdflatex
pdftowrite texput.pdf

The error:

Traceback (most recent call last):
  File "/home/ulysses/.pyenv/versions/3.10.6/bin/pdftowrite", line 8, in <module>
    sys.exit(main())
  File "/home/ulysses/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pdftowrite/pdftowrite.py", line 187, in main
    run(sys.argv[1:])
  File "/home/ulysses/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pdftowrite/pdftowrite.py", line 164, in run
    pages = loop.run_until_complete( convert_to_pages(filename, page_nums, ns) )
  File "/home/ulysses/.pyenv/versions/3.10.6/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/home/ulysses/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pdftowrite/pdftowrite.py", line 114, in convert_to_pages
    result = await asyncio.gather(*tasks)
  File "/home/ulysses/.pyenv/versions/3.10.6/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/ulysses/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pdftowrite/pdftowrite.py", line 104, in process_page
    return Background(page_num, svg, text_layer_svg, not ns.no_compat_mode)
  File "/home/ulysses/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pdftowrite/docs.py", line 70, in __init__
    self.__process_svg(svg, text_layer_svg, compat_mode, uniquify)
  File "/home/ulysses/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pdftowrite/docs.py", line 92, in __process_svg
    self.text_layer = self.__create_text_layer(text_layer_svg)
  File "/home/ulysses/.pyenv/versions/3.10.6/lib/python3.10/site-packages/pdftowrite/docs.py", line 284, in __create_text_layer
    tree = ET.ElementTree( ET.fromstring(text_layer_svg) )
  File "/home/ulysses/.pyenv/versions/3.10.6/lib/python3.10/xml/etree/ElementTree.py", line 1342, in XML
    parser.feed(text)
xml.etree.ElementTree.ParseError: not well-formed (invalid token): line 24, column 22

Environment:

  • OS: Ubuntu 20.04.4 LTS
  • Python: 3.10.6
  • pdftowrite: 2021.5.3
  • Inkscape: 1.2 (1:1.2.1+202207142221+cd75a1ee6d)
  • TeX: 3.14159265 (TeX Live 2019/Debian)
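
The traceback ends in xml.etree.ElementTree rejecting the text-layer SVG, but the report does not show what sits at line 24, column 22. The sketch below is only a diagnostic aid for locating such a position in an intermediate SVG string; the helper name `first_xml_error` is hypothetical, and the control character in the sample input is just one input known to produce this class of error, not a confirmed cause of this issue.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple


def first_xml_error(xml_text: str) -> Optional[Tuple[int, int]]:
    """Return (line, column) of the first parse error, or None if well-formed."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return err.position
    return None


if __name__ == "__main__":
    good = "<svg><text>|</text></svg>"
    bad = "<svg><text>\x01</text></svg>"   # control characters are not valid XML 1.0
    print(first_xml_error(good))
    print(first_xml_error(bad))
```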

'type' object is not subscriptable

Using Python 3.7, when I run this I get:

Traceback (most recent call last):
  File "/usr/local/bin/pdftowrite", line 5, in <module>
    from pdftowrite.pdftowrite import main
  File "/usr/local/lib/python3.7/site-packages/pdftowrite/pdftowrite.py", line 5, in <module>
    import pdftowrite.utils as utils
  File "/usr/local/lib/python3.7/site-packages/pdftowrite/utils.py", line 16, in <module>
    def apply_vars(text: str, vars: dict[str,Any]) -> str:
TypeError: 'type' object is not subscriptable

Output svgz file makes Write crash (segmentation fault)

Here is the PDF file that I wanted to convert. Opening the output file with Write and scrolling to page 8 makes Write crash. Writing some strokes on page 7 and saving the file also makes Write crash. Write prints this message to the console:

[1]    2758388 segmentation fault (core dumped)  Write

Not sure whether this is a bug in Write.

Environment:

  • OS: Ubuntu 20.04
  • Python 3.10.6
  • pdftowrite 2021.05.03
  • Inkscape 1.2.2 (1:1.2.2+202212051550+b0a8486541)

typing module throws error

Hello,
I use Debian 10 with Python 3.7. After installation with pip, the program throws an error:

def apply_vars(text: str, vars: dict[str,Any]) -> str:
TypeError: 'type' object is not subscriptable

I think this comes from the typing module. In the typing documentation it is Dict[] with a capital letter.

I then tried on Ubuntu 21.04, where the other syntax is accepted.

What is the reason for the failure on Debian 10?

Sincerely, Helge
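
For context on the error above: subscripting the built-in `dict` (as in `dict[str, Any]`) only became valid in Python 3.9. On Python 3.7 the annotation is evaluated when the function is defined, which raises the TypeError. The sketch below shows a signature that also works on 3.7 by using `typing.Dict`. The function body is a placeholder for illustration only, since the real implementation of `apply_vars` is not shown in this report.

```python
from typing import Any, Dict


def apply_vars(text: str, vars: Dict[str, Any]) -> str:
    """Signature compatible with Python 3.7; the body is a hypothetical placeholder."""
    for key, value in vars.items():
        # Assumed placeholder syntax "{name}", chosen only for this example.
        text = text.replace("{" + key + "}", str(value))
    return text


if __name__ == "__main__":
    print(apply_vars("width={w}", {"w": 10}))
```

Another option on Python 3.7 and 3.8 is to put `from __future__ import annotations` at the top of the module, which defers evaluation of annotations so the `dict[str, Any]` spelling no longer fails at definition time.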
