Comments (2)

wildhart commented on August 24, 2024

@rkusa Have you been able to look into this yet?

I've done some more experiments with different files. The source file was generated with another library, jsPDF, and contains text, lines and images.

The full file contains 5 pages with a header image on each page plus two pages with photos.

Here's a summary of my findings:

| Description | Original | Orig Size | Converted | Converted Size |
|---|---|---|---|---|
| Full | orig.pdf | 2.15 MB | converted.pdf | 10.7 MB |
| No headers | orig no headers.pdf | 2.09 MB | converted no headers.pdf | 10.4 MB |
| No images | orig no images.pdf | 71.8 kB | converted no images.pdf | 341 kB |

Why is pdfjs increasing the output file size so much?

  • Are the fonts getting duplicated for each page?
  • Is the image resolution getting upscaled?
  • Is compression being removed?
  • In the source file, the same header image data is reused on each page. Is pdfjs duplicating it?
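
To put numbers on the size increase, here is a small sketch that computes the growth factor for each row of the summary above. The sizes are copied from my findings; the `results` array and the `growthFactor` helper are just illustrative names, not part of any library.

```javascript
// Sizes copied from the summary above. Each row uses one unit for both
// columns, so the ratio does not depend on the unit.
const results = [
  { name: 'Full',       orig: 2.15, converted: 10.7, unit: 'MB' },
  { name: 'No headers', orig: 2.09, converted: 10.4, unit: 'MB' },
  { name: 'No images',  orig: 71.8, converted: 341,  unit: 'kB' },
];

// Hypothetical helper: how many times larger the converted file is.
function growthFactor(orig, converted) {
  return converted / orig;
}

const factors = results.map((r) => ({
  name: r.name,
  factor: growthFactor(r.orig, r.converted),
}));

for (const { name, factor } of factors) {
  console.log(`${name}: ${factor.toFixed(2)}x`);
}
// Full: 4.98x
// No headers: 4.98x
// No images: 4.75x
```

All three files grow by a factor of roughly 5, with or without images. The full file has 5 pages, which may or may not be a coincidence. It does not prove that anything is being duplicated per page.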

wildhart commented on August 24, 2024

Just FYI, I've moved away from using pdfjs for merging PDFs because of this issue with excessive file sizes, and also because of #312, where certain PDF files throw errors when they are merged.

Instead I'm using pdf-lib, which makes it really easy to copy pages from one PDF to another. It has no problems with the files we provided in #312 that throw errors in pdfjs, and the output file size is never bigger than the original files. It also seems a bit faster.

I'm still using pdfjs to generate PDFs from HTML, but I then use pdf-lib to combine the result with other PDF files.
