fdawgs / node-poppler

Asynchronous node.js wrapper for the Poppler PDF rendering library

Home Page: https://npmjs.com/package/node-poppler

License: MIT License

JavaScript 100.00%
pdf-converter pdf async attach cairo converter detach html pdf-to-cairo pdf-to-html

node-poppler's Introduction

Well Bonjour!

🏥 I work for a clinical software developer working with the NHS.

Previously worked within the NHS over six years across Taunton and Somerset NHSFT, Yeovil District Hospital NHSFT, and Somerset NHSFT, as an information analyst and then an interface developer.

Have a knack for automating myself out of employment.

node-poppler's People

Contributors

arthurdenner, dependabot-preview[bot], dependabot[bot], fdawgs, github-actions[bot], greenkeeper[bot], harm-less, michaelleehobbs, multics, rntrp

node-poppler's Issues

Flatten PDF

The API docs do not mention how to flatten a PDF.

I installed this package and get an error: Error: spawn ./usr/bin/pdftocairo ENOENT at Process.ChildProcess._handle.onexit (internal/child_process.js:240:19). OS: CentOS 8, node-poppler version 4.1.1

Want to use Poppler by importing it as WASM

Prerequisites

  • I have written a descriptive title

  • I have searched existing feature requests to ensure it has not already been proposed

  • I agree to follow the Code of Conduct that this project adheres to

Description

Poppler provides a C++ API for PDF manipulation.
I can write some wrapper functions to export these APIs
and compile them to WASM using Emscripten.

If there is an existing WASM package, I hope to use it directly.
Otherwise, I can submit a PR.

poppler.pdfInfo always reports 0 bytes in fileSize if the PDF file is a Buffer

Prerequisites

  • I have written a descriptive issue title

  • I have searched existing issues to ensure it has not already been reported

  • I agree to follow the Code of Conduct that this project adheres to

API/app/plugin version

5.1.5

Node.js version

16

Operating system

Linux

Operating system version (i.e. 20.04, 11.3, 10)

20.04

Description

poppler.pdfInfo always reports 0 bytes in fileSize if the PDF file is passed as a Buffer.

Steps to Reproduce

Add the code below to index.test.js and run it:

test("Should list info of PDF file as Buffer as a JSON object", async () => {
	const poppler = new Poppler(testBinaryPath);
	const attachmentFile = await fs.promises.readFile(file);

	const res = await poppler.pdfInfo(attachmentFile, {
		printAsJson: true,
	});

	expect(res).toMatchObject({
		tagged: "yes",
		userProperties: "no",
		suspects: "no",
		form: "AcroForm",
		javaScript: "no",
		pages: "16",
		encrypted: "no",
		pageSize: "595.276 x 841.89 pts (A4)",
		pageRot: "0",
		fileSize: "583094 bytes",
		optimized: "no",
		pdfVersion: "1.3",
	});
});

The test report is:

  ● Node-Poppler Module › pdfInfo Function › Should list info of PDF file as Buffer as a JSON object

    expect(received).toMatchObject(expected)

    - Expected  - 1
    + Received  + 1

    @@ -1,8 +1,8 @@
      Object {
        "encrypted": "no",
    -   "fileSize": "583094 bytes",
    +   "fileSize": "0 bytes",
        "form": "AcroForm",
        "javaScript": "no",
        "optimized": "no",
        "pageRot": "0",
        "pageSize": "595.276 x 841.89 pts (A4)",

      414 |                     });
      415 |
    > 416 |                     expect(res).toMatchObject({
          |                                 ^
      417 |                             tagged: "yes",
      418 |                             userProperties: "no",
      419 |                             suspects: "no",

      at Object.toMatchObject (src/index.test.js:416:16)

Test Suites: 1 failed, 1 total
Tests:       1 failed, 90 passed, 91 total
Snapshots:   0 total
Time:        32.347 s, estimated 33 s
Ran all test suites.

Expected Behaviour

It should report correct fileSize.

Problem with encoding when returning buffer instead of output file

Hello, if I choose not to save to file directly, but rather get the output to a buffer to do whatever I need to do with it, it seems that the returned buffer is corrupted.

const pdfBuffer = await fs.promises.readFile(inputPdfFile)
const result = await poppler.pdfToCairo(pdfBuffer, null, { singleFile: true, pngFile: true }) // jpg too!
const pngBuffer = Buffer.from(result, 'utf-8') // this buffer is always broken
await fs.promises.writeFile(outputPath, pngBuffer);

I tried setting the encoding of Buffer.from() to binary as well, but when the file is saved it is always broken. From a quick look at the code it seems that the problem comes from the fact that the png contents are converted to utf-8 on the way... one clue about this is that SVG output works, because SVG is a text-based format, while PNG (and JPEG, for that matter) is a binary format, and they get corrupted.

node-poppler/src/index.js

Lines 749 to 753 in 3cfb17b

if (Buffer.isBuffer(file)) {
child.stdin.setDefaultEncoding("utf-8");
child.stdin.write(file);
child.stdin.end();
}
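
A minimal sketch of why a UTF-8 round trip is lossy for binary data, using only Node's Buffer API. The eight bytes are the standard PNG signature. Nothing here calls node-poppler, and the variable names are illustrative rather than taken from the module.

```javascript
// The first eight bytes of every PNG file (the PNG signature).
const pngSignature = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Decoding as UTF-8 replaces the invalid byte 0x89 with U+FFFD,
// so re-encoding cannot restore the original bytes.
const viaUtf8 = Buffer.from(pngSignature.toString("utf-8"), "utf-8");

// latin1 (alias "binary") maps each byte to one code point and back unchanged.
const viaLatin1 = Buffer.from(pngSignature.toString("latin1"), "latin1");

console.log(viaUtf8.equals(pngSignature)); // false
console.log(viaLatin1.equals(pngSignature)); // true
```

This matches the observation that SVG output survives while PNG and JPEG output does not: text-only output contains no byte sequences that are invalid in UTF-8.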

Allow for each function's bin path to be separately defined

Description

At present, all functions use the popplerPath property for their bin path.

This should be broken down to an individual level so that it can be modified if needs be:

const { Poppler } = require('node-poppler');

const poppler = new Poppler('/usr/bin');

poppler.pdfToTextPath = '/totallydifferentpath/bin';
poppler.pdfToHtmlPath = '/anotherpath';

await poppler.pdfToText(Buffer.from('bleh'));

error dyld: Library not loaded: @rpath/libpoppler.100.dylib

Describe the bug

When I try to run Node.js code with the node-poppler lib, I get the
"Library not loaded: @rpath/libpoppler.100.dylib" error, as can be seen in the screenshot below.

To Reproduce

I have this simple function for converting PDF to JPG, just like one of the examples on the lib's page. When run on macOS, it produces the error.

Expected behavior

It was expected to convert the files as normal, as it does on my Windows system.

Screenshots

image

Desktop (please complete the following information)

  • Mac OS 10.13
  • Node.js Version 12.16

Documentation Fix?

Describe the bug

The first example in the documentation returns an error when run because an output file is required. I suggest changing it to include the parameter, something like ./filepath.png.

Can the user decide whether to export images or HTML to a directory?

Description

Currently, when I use the poppler.pdfToCairo method to export images, I don't want to export them directly to the specified directory. I want to get a file stream, buffer, etc., and upload it directly to OSS, so that I don't have to read local files. I looked through the relevant documentation but haven't found a similar API. Can the user decide whether to export to a directory?

Why is the binPath needed?

Description

First, thank you for making this!

It took me a while to figure out what I was looking for in terms of the poppler-utils directory, but I figured it out. It does seem like brew install poppler put the utils onto the path, though, so I'm just wondering why this path is needed. On my local machine this path will be different from the one in CI, so I'm curious whether there is a trick to get around providing the path.
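
One way to avoid hard-coding a machine-specific path is to look the binary up on PATH at startup. The sketch below is only an illustration: `findBinaryDir` is a hypothetical helper, not part of node-poppler, and it assumes a POSIX-style setup where the binary has no file extension.

```javascript
const fs = require("fs");
const path = require("path");

// Return the first directory in `pathVar` that contains `binaryName`,
// or undefined when no directory does. Windows would also need ".exe".
function findBinaryDir(binaryName, pathVar = process.env.PATH || "") {
  return pathVar
    .split(path.delimiter)
    .filter(Boolean)
    .find((dir) => fs.existsSync(path.join(dir, binaryName)));
}

const binDir = findBinaryDir("pdftotext");
console.log(binDir === undefined ? "pdftotext is not on PATH" : binDir);
```

If a directory is found, it could be handed to the constructor in the same way as the fixed paths used elsewhere on this page, for example `new Poppler(binDir)`.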

The 'undefined' option for pdfToCairo's second parameter does not produce valid output for tiff files.

API/app/plugin version

5.1.6

Node.js version

v16.13.2

Operating system

macOS

Operating system version (i.e. 20.04, 11.3, 10)

12.4

Description

I am trying to fetch a single page from a pdf without writing it down in a separate image file. This works beautifully for jpg's and png's as documented in https://github.com/Fdawgs/node-poppler/blob/master/README.md#popplerpdftocairo.
For TIFF files it's a different story, though: it works only if I give an output file as the second parameter of the pdfToCairo function, not when I use 'undefined'. The result is way too small, as if it were only the header of the TIFF file.

I checked whether my Poppler version (22.05.0) works correctly on the command line. It does.
pdftocairo -tiff -f 1 -l 1 -singlefile example.pdf - > example.tiff works perfectly in the shell.

As far as I can see, node-poppler does send the correct params to the spawned child process. But the result is a very short string, something like this:
[image]

From debugging index.js in node-poppler I can see that

[image]

is only called once for the whole child process. This could be the problem.
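
For reference, a child process's stdout can deliver its output across many `data` events, so a wrapper generally has to keep every chunk until the stream closes. The following is a generic sketch of that pattern, not node-poppler's actual code, and it does not confirm the cause of the TIFF problem. `collectStdout` is a hypothetical helper and the child is a Node stand-in for pdftocairo.

```javascript
const { spawn } = require("child_process");

// Run a command and resolve with its complete stdout as a single Buffer.
function collectStdout(command, args) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    const chunks = [];
    // "data" may fire many times; keep every chunk as raw bytes.
    child.stdout.on("data", (chunk) => chunks.push(chunk));
    child.on("error", reject);
    // "close" fires after the stdio streams have ended.
    child.on("close", (code) =>
      code === 0
        ? resolve(Buffer.concat(chunks))
        : reject(new Error(`exited with code ${code}`))
    );
  });
}

// Stand-in for pdftocairo: a Node child that writes 300,000 bytes of 0xff.
const script = "process.stdout.write(Buffer.alloc(300000, 0xff))";
collectStdout(process.execPath, ["-e", script]).then((out) => {
  console.log(out.length);
});
```

Keeping the chunks as Buffers and concatenating once at the end also avoids decoding binary image data as text along the way.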

Steps to Reproduce

This code should be enough to see that foo.tif is not a valid TIFF file.
You can use any PDF file or the provided one: example.pdf

import { Poppler } from 'node-poppler';
import fs from 'fs';
import path from 'path';

const file = fs.readFileSync(path.join(__dirname, 'example.pdf'));

(async () => {
  const poppler = new Poppler('/usr/bin');
  const res: string | Error = await poppler.pdfToCairo(file, undefined, {
    firstPageToConvert:1,
    lastPageToConvert: 1,
    singleFile:true,
    tiffCompression: 'jpeg',
    tiffFile: true
    //pngFile: true

  });
  if (res instanceof Error) {
    console.log('Error: ' + JSON.stringify(res));
    return;
  }
  fs.writeFileSync('foo.tif', res, { encoding: 'binary' })
})();

Additional information:
Though I wrote this code on OSX I also tried it on a docker container with alpine linux expecting the behaviour to be an OSX glitch. But I could also reproduce the problem on linux successfully.

Expected Behaviour

The expected behaviour should be equal for all possible output formats - meaning when using an 'undefined' outputfile and the -singleFile Option the resulting string should contain valid image data.

How to handle errors?

API/app/plugin version

Any

Node.js version

16

Operating system

Windows

Operating system version (i.e. 20.04, 11.3, 10)

10

Description

How to handle errors?

Error on server Error: Command failed: C:\helloworld\node_modules\node-poppler\src\lib\win32\poppler-21.03.0\Library\bin\pdfseparate C:\helloworld\cache\main.pdf C:\helloworld\cache\pdfs\small-pdf-%d.pdf
Syntax Warning: May not be a PDF file (continuing anyway)
Syntax Error (1): Illegal character '{'
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table
Syntax Error: Could not extract page(s) from damaged file ('C:\helloworld\cache\main.pdf')

    at ChildProcess.exithandler (node:child_process:397:12)
    at ChildProcess.emit (node:events:390:28)
    at maybeClose (node:internal/child_process:1062:16)
    at Process.ChildProcess._handle.onexit (node:internal/child_process:301:5) {
  killed: false,
  code: 99,
  signal: null,
  cmd: 'C:\\helloworld\\node_modules\\node-poppler\\src\\lib\\win32\\poppler-21.03.0\\Library\\bin\\pdfseparate C:\\helloworld\\cache\\main.pdf C:\\helloworld\\cache\\pdfs\\small-pdf-%d.pdf',
  stdout: '',
  stderr: 'Syntax Warning: May not be a PDF file (continuing anyway)\r\n' +
    "Syntax Error (1): Illegal character '{'\r\n" +
    "Syntax Error: Couldn't find trailer dictionary\r\n" +
    "Syntax Error: Couldn't find trailer dictionary\r\n" +
    "Syntax Error: Couldn't read xref table\r\n" +
    "Syntax Error: Could not extract page(s) from damaged file ('C:\\helloworld\\cache\\main.pdf')\r\n"

Steps to Reproduce

Just used function like this:

 await poppler.pdfSeparate(mainPdf, pdfsDir)

Expected Behaviour

It should have thrown an error that we can handle with try/catch, but I can't catch it.
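
The trace above has the shape of a rejected `child_process.execFile` call, with `code`, `stdout` and `stderr` attached to the error. Assuming the wrapper surfaces that as a rejected promise, awaiting it inside try/catch should be enough. The sketch below shows the pattern with Node's own promisified `execFile`. `runTool` is a hypothetical helper, and the child is a Node stand-in for a failing pdfseparate run.

```javascript
const { execFile } = require("child_process");
const { promisify } = require("util");

const execFileAsync = promisify(execFile);

// Run a command and report failure as data instead of an unhandled rejection.
async function runTool(command, args) {
  try {
    const { stdout } = await execFileAsync(command, args);
    return { ok: true, stdout };
  } catch (err) {
    // For a non-zero exit, err.code is the exit status and
    // err.stderr holds whatever the tool printed to stderr.
    return { ok: false, code: err.code, stderr: err.stderr };
  }
}

// Stand-in for a failing pdfseparate call: a Node child that exits with code 99.
const failingScript = "process.stderr.write('Syntax Error: damaged file'); process.exit(99)";
runTool(process.execPath, ["-e", failingScript]).then((result) => {
  console.log(result.ok, result.code);
});
```

The important detail is the `await` inside the `try` block: without it, the rejection happens after the block has already been left and the `catch` never runs.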

pdfToText response limit to 1MB

API/app/plugin version

node-poppler:5.1.1 | poppler:21.11.0

Node.js version

v17.9.0

Operating system

Linux

Operating system version (i.e. 20.04, 11.3, 10)

node:17-alpine

Description

Text is truncated to 1MB.
Is there some limit?

Steps to Reproduce

const { Poppler } = require('node-poppler');
const prettyBytes = require('pretty-bytes');

let popplerOptions = {};
popplerOptions.firstPageToConvert = 3; // 3rd page of 700

const poppler = new Poppler('/usr/bin/');
poppler.pdfToText('my-pdf-with-700-pages-and-30mb.pdf', undefined, popplerOptions)
.then((res) => {
    // res is only 1MB - text is cut off
    let textSize = prettyBytes(res.length);
    console.log(JSON.stringify([res.length, textSize]));
 });

// [1028143, "1.03 MB"]

# running as cli works OK and extracts all text in file
pdftotext -f 3 my-pdf-with-700-pages-and-30mb.pdf

Expected Behaviour

Returns all text found in PDF.
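
One possible explanation, not confirmed by the report, is the `maxBuffer` option of Node's `child_process` functions, which defaults to 1024 * 1024 bytes in current Node versions. Whether node-poppler 5.1.1 relied on that default is an assumption. The sketch below only demonstrates the limit itself, using `spawnSync` and a Node child as a stand-in for pdftotext.

```javascript
const { spawnSync } = require("child_process");

// Stand-in for a tool that prints 2 MiB of text to stdout.
const TWO_MIB = 2 * 1024 * 1024;
const args = ["-e", `process.stdout.write("x".repeat(${TWO_MIB}))`];

// Default maxBuffer: Node terminates the child once the limit is exceeded
// and reports the problem through `error`.
const limited = spawnSync(process.execPath, args);
console.log(limited.error ? limited.error.code : "no error");

// Raised maxBuffer: the full output is captured.
const full = spawnSync(process.execPath, args, { maxBuffer: 4 * TWO_MIB });
console.log(full.stdout.length === TWO_MIB);
```

Writing the text to an output file instead of returning it, as the CLI invocation above does, sidesteps any in-memory limit.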

`singleFile` option for pdfToCairo produces corrupt files when writing to stdout

API/app/plugin version

5.0.2

Node.js version

16.13.0

Operating system

Windows

Operating system version (i.e. 20.04, 11.3, 10)

10

Description

Use of singleFile option when combined with any image file option in pdfToCairo produces corrupted files when writing to stdout.

Steps to Reproduce

const file = 'test_file.pdf';
const poppler = new Poppler();
const options = {
	jpegFile: true,
	singleFile: true,
};

const res = await poppler.pdfToCairo(file, undefined, options);

fs.writeFileSync(`${testDirectory}test1.jpg`, res);

Resulting file also ends up being double the size of what would be generated if passing an output path to pdfToCairo.

Corruption does not occur when output path is defined.
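
The size increase would be consistent with a binary string being written using the default UTF-8 encoding. That is an assumption, since the report does not say how `res` is encoded. The sketch below measures the effect on 256 sample bytes without touching the file system.

```javascript
// 256 bytes covering every possible byte value, standing in for JPEG data.
const original = Buffer.from(Array.from({ length: 256 }, (_, i) => i));

// A "binary" (latin1) string holds one character per byte.
const binaryString = original.toString("latin1");

// Written as UTF-8 (the default for strings), bytes 0x80-0xff take two bytes each.
console.log(Buffer.byteLength(binaryString, "utf8")); // 384
// Written with the "binary" encoding, the byte count is unchanged.
console.log(Buffer.byteLength(binaryString, "binary")); // 256
```

If that assumption holds, passing `{ encoding: "binary" }` to `fs.writeFileSync`, as the TIFF report on this page does, would keep the byte count intact.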

Expected Behaviour

No response

I think you might want /usr/bin in place of ./usr/bin in the docs

API/app/plugin version

No response

Node.js version

16.16.0

Operating system

Linux

Operating system version (i.e. 20.04, 11.3, 10)

Ubuntu

Description

Just a heads up... I spent some time debugging and feel like the docs should say

/usr/bin

in place of

./usr/bin

At least that is what I needed to get it to work, but I am no Linux expert :)
Cheers

Steps to Reproduce

Just a documentation note.

Expected Behaviour

No response

Binaries not found when packaging app as asar archive with Electron

Describe the bug

I consider this more as a caveat than an actual bug:

When using this library in an app that is packaged as asar archive, e.g. an Electron application packaged with electron-packager, electron-forge or electron-builder, the binaries will not be found when running the installed application.

This is due to execa using child_process.spawn and not child_process.execFile. Only the latter unpacks the binaries and executes the unpacked copies, while the former tries to execute a path like ...\resources\app.asar\node_modules\node-poppler\src\lib\win32\poppler-0.90.1\bin. This behaviour is described in the Electron docs.

This can be mitigated by setting the popplerPath in the Poppler constructor manually.
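
A small sketch of that mitigation, assuming the packager has been configured to leave the Poppler binaries outside the archive (for example via electron-builder's `asarUnpack` setting), in which case they sit under `app.asar.unpacked`. `toUnpackedPath` is a hypothetical helper, not part of node-poppler or Electron.

```javascript
// Rewrite a path inside app.asar to its app.asar.unpacked counterpart.
// Paths that do not contain an app.asar segment are returned unchanged.
function toUnpackedPath(binDir) {
  return binDir.replace(/([\\\/])app\.asar([\\\/])/, "$1app.asar.unpacked$2");
}

const packed = "/opt/MyApp/resources/app.asar/node_modules/node-poppler/src/lib";
console.log(toUnpackedPath(packed));
// /opt/MyApp/resources/app.asar.unpacked/node_modules/node-poppler/src/lib
```

The rewritten directory, extended with the platform-specific bin folder, is what would be passed as popplerPath to the Poppler constructor.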

Expected behavior

  • Throw a proper error when the binaries are not found
  • Add an option to use execFile instead of spawn / migrate to execFile - this could be an option if the actual output is written to a file and not stdout.

Desktop (please complete the following information)

  • OS: Windows 10 / OSX
  • Node.js Version: 12.16.3

Additional context

Electron: ^9.0.0 and ^10.0.0

Drop support for Node 14 and 16

Node 14 is already EOL and Node 16 becomes EOL on 2023-09-11.
It's a waste of time and CI resources/electricity to continue to support these as users should be moving off of them.

Will drop support on 2023-10-01.

Poppler win32 binaries are missing language packs

When pdftocairo converts to an image, some characters cannot be displayed, with the error messages below:

Missing language pack for 'Adobe-GB1' mapping
Missing language pack for 'Adobe-CNS1' mapping
No font in show

Suggest building the win32 Poppler binaries with poppler-data.

The win32 build below includes poppler-data, for your reference:
https://tm23forest.com/contents/poppler-for-windows

Without poppler-data.
$ pdfinfo -listenc

Available encodings are:
ASCII7
Latin1
Symbol
UTF-16
UTF-8
ZapfDingbats

With poppler-data.
$ pdfinfo -listenc

Available encodings are:
ASCII7
Big5
Big5ascii
EUC-CN
EUC-JP
GBK
ISO-2022-CN
ISO-2022-JP
ISO-2022-KR
ISO-8859-6
ISO-8859-7
ISO-8859-8
ISO-8859-9
KOI8-R
Latin1
Latin2
Shift-JIS
Symbol
TIS-620
UTF-16
UTF-8
Windows-1255
ZapfDingbats

Put windows poppler binaries in a separate package

Description

Hello, I've noticed that this package bundles all the windows binaries regardless of what platform it's being installed on.

I suggest putting all the windows binaries into a separate package and adding

  "os": [
    "win32"
  ]

to that package's package.json.

You can then include that package using optionalDependencies which will only install the package if we are on windows and ignore it on any other platform.

Doing this will remove over 99% of the unpacked package size.

Having to include almost 50MB of unused files in a docker image is not great (especially since you need to install the Linux binaries on top of that anyway).

Also might as well include

  "cpu": [
    "x64"
  ]

while you're at it since the binaries won't work on x86 or arm builds of windows.
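
If the binaries moved to an optional, Windows-only package, the main module would have to cope with that package being absent. A rough sketch of the lookup follows. The package name `hypothetical-poppler-win32` and the helper `resolveBundledDir` are invented for illustration and do not exist in node-poppler.

```javascript
const path = require("path");

// "hypothetical-poppler-win32" is an invented package name for illustration.
const WINDOWS_BINARY_PACKAGE = "hypothetical-poppler-win32";

// Return the directory of the optional Windows package, or undefined when it
// is not installed (as expected on Linux/macOS or on non-x64 Windows).
function resolveBundledDir(platform, arch, resolve = require.resolve) {
  if (platform !== "win32" || arch !== "x64") return undefined;
  try {
    return path.dirname(resolve(`${WINDOWS_BINARY_PACKAGE}/package.json`));
  } catch {
    return undefined; // optional dependency was skipped or failed to install
  }
}

console.log(resolveBundledDir(process.platform, process.arch));
```

Returning undefined rather than throwing lets the caller fall back to a user-supplied path, which Linux and macOS users need in any case.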

Poppler library not supported in Linux

When I run the application on Windows, it works as expected (PDF-to-image conversion using poppler.pdfToCairo).
When the application is deployed in a Linux environment, PDF-to-image conversion does not work.
I am getting the error below. Please let me know how to fix it. If any package needs to be added, please share the procedure.
image

PDF-to-TIFF conversion with `pdfToCairo()` throws error, only converts first page

Describe the bug

PDF-to-TIFF conversion with pdfToCairo() throws Internal process error, only converts first page

To Reproduce

Steps to reproduce the behavior:

const poppler = new Poppler();
const options = {
	tiffFile: true,
};
const outputFile = `${testDirectory}pdf_1.3_NHS_Constitution`;

const res = await poppler.pdfToCairo(file, outputFile, options);

Desktop (please complete the following information)

  • OS: Windows
  • Version: 10
  • Node.js Version: v14.17.6

Additional context

Appears to be a known issue.
See Belval/pdf2image#206

Add ability to read PDF files from stdin

Describe the solution you'd like
Poppler Utils support the ability to read PDF files from stdin for most of the binaries (besides pdfattach, pdfdetach, pdfseparate, pdfunite).

The functionality for this would ideally be reflected in this wrapper module as well.
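
As a rough illustration of the mechanics, Node can hand a Buffer to a child's stdin through the `input` option of `spawnSync`. The child below is a Node stand-in that echoes stdin. With a real Poppler binary the file argument would be `-`, and how this module would expose that is left open here. The synchronous call is used only to keep the sketch short.

```javascript
const { spawnSync } = require("child_process");

// Stand-in for a Poppler binary invoked with "-" as the input file:
// this Node child simply copies stdin to stdout.
const echoScript = "process.stdin.pipe(process.stdout)";
const pdfBytes = Buffer.from("%PDF-1.3 placeholder bytes");

// `input` is written to the child's stdin, which is then closed.
const result = spawnSync(process.execPath, ["-e", echoScript], { input: pdfBytes });

console.log(result.stdout.equals(pdfBytes)); // true
```

Because no `encoding` option is set, `result.stdout` is a Buffer, so the bytes come back without any text decoding.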

Support for streams

Description

Thank you for a really nice library.

I have been wrapping parts of Poppler myself for some time, but I will probably switch over to node-poppler. Node-poppler is more thought through with argument handling and overall more neat than my code.

Is there any reason you don't support streams? Using streams is quite neat since you can build a pipeline with for example Sharp to convert output files and then stream them to storage (for example AWS S3).

I think stream support could fit nicely in your API, but I'm wondering if there are any drawbacks that I'm not aware of.

Update OSX Poppler binaries to 20.12.1

Is your feature request related to a problem? Please describe.
OSX Poppler binaries included with this module are currently at v0.89.0, (found in ./src/lib/darwin/poppler-0.89.0).

The latest at time of writing is v20.12.1 and provides the following fixes/enhancements to the binaries:

  • pdftoppm: Add option to set display profile
  • pdftoppm: report error and exit if output file cannot be written
  • pdftops: Add a -rasterize option with values always, never, or when needed
  • Document that PDF-file can be '-' to read it from stdin

pdfToCairo result "Couldn't open 'nameToUnicode' file"

API/app/plugin version

22.04.0

Node.js version

14.18

Operating system

Windows

Operating system version (i.e. 20.04, 11.3, 10)

21H2

Description

Converting a PDF file to JPG files fails with the following errors:

I/O Error: Couldn't open 'nameToUnicode' file 'node_modules\node-poppler\src\lib\win32\poppler-22.04.0\share\poppler\nameToUnicode\Bulgarian'
I/O Error: Couldn't open 'nameToUnicode' file 'node_modules\node-poppler\src\lib\win32\poppler-22.04.0\share\poppler\nameToUnicode\Greek'
I/O Error: Couldn't open 'nameToUnicode' file 'node_modules\node-poppler\src\lib\win32\poppler-22.04.0\share\poppler\nameToUnicode\Thai'

Steps to Reproduce

const _opts = {
        scalePageTo:3072,
        jpegFile:true
    };
poppler.pdfToCairo(pdfFilePath,undefined, _opts)
            .then(res => {
               ...
            })
            .catch(error => {
                console.error(error);
                reject(error);
            })

Expected Behaviour

The call should produce JPG files.

Add version checking of binary to determine what commands are available

Describe the solution you'd like
New releases of Poppler introduce new options/args to the util binaries, which are subsequently added to this module's functions.
Users may be using an older version of the Poppler util binaries with this module, and may attempt to use the new options.

The module should determine whether the Poppler util binaries provided to this module have the options passed to the functions, and throw an error if not.
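
A sketch of the comparison step such a check would need. It assumes the binaries report a line like `pdftotext version 22.05.0` when asked for their version, and that sample format is an assumption here. Both helper names are hypothetical.

```javascript
// Extract [major, minor, patch] from version output such as
// "pdftotext version 22.05.0". The sample format is an assumption.
function parsePopplerVersion(output) {
  const match = /version\s+(\d+)\.(\d+)\.(\d+)/.exec(output);
  return match ? match.slice(1, 4).map(Number) : undefined;
}

// True when `actual` is the same as or newer than `required`.
function isAtLeast(actual, required) {
  for (let i = 0; i < required.length; i += 1) {
    if (actual[i] !== required[i]) return actual[i] > required[i];
  }
  return true;
}

const installed = parsePopplerVersion("pdftotext version 0.89.0");
console.log(isAtLeast(installed, [20, 12, 1])); // false
```

Each option would then need the minimum version that introduced it, which the module would have to maintain as a lookup table.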

pdfToCairo: Error opening output file fd://0.png

API/app/plugin version

6.0.3

Node.js version

16.18.1

Operating system

Linux

Operating system version (i.e. 20.04, 11.3, 10)

Amazon Linux 2

Description

I moved my server from Heroku to AWS EC2 instance. On heroku everything worked fine, but on the AWS instance I get this error:

Error: Error opening output file fd://0.png
    at ChildProcess.<anonymous> (/home/ec2-user/repo/node_modules/node-poppler/src/index.js:774:14)
    at ChildProcess.emit (node:events:513:28)
    at ChildProcess.emit (node:domain:489:12)
    at maybeClose (node:internal/child_process:1100:16)
    at Process.ChildProcess._handle.onexit (node:internal/child_process:304:5)

To install poppler dependencies on heroku instance I added a buildpack in heroku settings: https://github.com/amitree/heroku-buildpack-poppler

To install poppler dependencies on AWS EC2 instance I installed them with:

sudo yum install poppler-data
sudo yum install poppler-utils

Related issue

I found this StackOverflow issue, which says that this is a bug in pdfToCairo. But the same code worked on Heroku.
Do you think this is caused by the different Linux OS? Or is there something I am missing, and maybe I just need to install some dependencies for this to work?

Differences between Heroku and AWS Linux:

On Heroku, this is called the "stack": an operating system image curated and maintained by Heroku. The stack is based on Ubuntu, the open source Linux distribution.
AWS's Amazon Linux will be based on Red Hat's Fedora community Linux.

Steps to Reproduce

I just created a AWS EC2 instance with default settings, installed node, installed poppler dependencies and tried running the code below.
I need to generate a png file from pdf which has a single page.

First I generate the pdf buffer:

    const PdfOptions = {
      base: `file:///${base}/`,
      format: 'letter',
      height: 2551,
      localUrlAccess: true,
      orientation: 'landscape',
      timeout: '100000',
      width: 3295,
    };

    const html = this.getHtml();
    const fileName = await PdfService.GenerateFileName(FileExtension.Pdf);

    return new Promise((resolve, reject) => {
      pdf.create(html, PdfOptions).toBuffer(function (err, buffer) {
        if (err) {
          reject(err);
          return logger.error(err);
        }

        resolve({ buffer, fileName });
      });
    });

Then I try to generate the png from the pdf buffer:

     const pdfToCairoOptions = {
      pngFile: true,
      singleFile: true,
      resolutionXYAxis: 72,
    };

    const pngBuffer = await poppler.pdfToCairo(pdfPath, undefined, pdfToCairoOptions); // Crashes here

    const binaryBuffer = Buffer.from(pngBuffer, 'binary');

    return { pngBuffer: binaryBuffer };

Expected Behaviour

Expected behaviour should be that the png file is generated. On my development machine(macOS) and heroku it works. But on AWS EC2 instance it doesn't work.

Support for Linux/ Ubuntu support without os dependencies

Please add support for running pdftotext and the other utilities on Linux/Ubuntu without OS-level dependencies. This is really in demand and I can't find it anywhere. PDF.js is one alternative, but it doesn't extract all of the text, whereas pdftotext from Poppler does.

Please add support. The OS dependency is a problem because we want to use it in our backend server in Firebase Functions, which uses Ubuntu 18.04 LTS and won't allow installing system-wide libraries.

Thank you. Hoping for a great response.
