bitcrowd / chromic_pdf
Convenient HTML to PDF/A rendering library for Elixir based on Chrome & Ghostscript
License: Apache License 2.0
This is probably a dumb question but I'll ask anyway.
My goal:
My users have multiple records, and each record on display has a download button. When the user clicks the button, it should generate a PDF containing that specific record and trigger a download in the browser.
Is there a way to do that using this library? When I try to pipe this in my controller, it only saves a file in the project folder. Thank you so much!!
UPDATE:
I was able to trigger the browser download by passing the base64-encoded PDF to an a tag:
/template.html.eex
<% html = (render Web.SharedView, "_pdf_template.html", assigns) %>
<a href="data:application/pdf;base64,<%= render_pdf(html) %>" download>DOWNLOAD</a>
/render_view.ex
def render_pdf(html) do
ChromicPDF.print_to_pdf({:html, html})
|> elem(1)
end
PROBLEM:
Images inside the pdf file are all corrupted.
I've already started this discussion on this elixir forum post.
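For the controller-based download asked about above, a minimal sketch could look like the following. It assumes that ChromicPDF.print_to_pdf/2, called without an :output option, returns {:ok, base64_string} (as the render function further down in this thread also assumes). The PdfDownload module and the load_record/1 and render_html/1 placeholders are hypothetical names, not part of the library, and this sketch does not address the corrupted images.

```elixir
defmodule PdfDownload do
  @moduledoc "Hypothetical helper: turns a print result into raw PDF bytes."

  # Without an :output option the print call is assumed to return
  # {:ok, base64_string}; decode it before sending it to the browser.
  def to_binary({:ok, base64}) when is_binary(base64) do
    case Base.decode64(base64) do
      {:ok, pdf} -> {:ok, pdf}
      :error -> {:error, :invalid_base64}
    end
  end

  # Anything else (for example an error tuple) is passed through as an error.
  def to_binary(other), do: {:error, other}
end

# Assumed usage inside a Phoenix controller (load_record/1 and render_html/1
# are placeholders for your own code):
#
#   def download(conn, %{"id" => id}) do
#     html = id |> load_record() |> render_html()
#     {:ok, pdf} = PdfDownload.to_binary(ChromicPDF.print_to_pdf({:html, html}))
#     send_download(conn, {:binary, pdf}, filename: "record-#{id}.pdf")
#   end
```

Sending the decoded binary with send_download/3 avoids both the file in the project folder and the data: URL in the template.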
What do you exactly mean by "users should be well aware of the security implications"?
In my opinion, having to wait a couple of seconds while the PDF is generated is totally fine (it is what happens when you try to export your document to a PDF format on Google Docs/Google Sheets for example).
I also like to print PDF from an url instead of a file/string because you do not need to write inline CSS (as far as I know the styles are not loaded when you print to PDF from an HTML file or when you render a view as a string in Phoenix).
Setting cookies is easily doable with the Ferrum library written in Ruby, as well as with Puppeteer. It'll be great if we could also set cookies in Chrome with Elixir, since, to the best of my knowledge, this feature is neither implemented in pdf-generator nor in puppeteer-pdf.
I've been searching around and don't really see support for fillable forms in the generated PDF, is this something that's possible to implement?
I've tried creating the forms manually but it's a huge p.i.t.b. trying to get everything aligned properly with the generated output, or even from scratch.
Would be a life-saver if we could peep form inputs and convert them to fillable fields.
Hi,
We just started running into this issue in production:
2023-01-19T20:31:56.032465131Z 20:31:55.994 [error] Error during ChromicPDF.Browser.SessionPool.terminate_worker/3 callback:
2023-01-19T20:31:56.032514431Z ** (FunctionClauseError) no function clause matching in ChromicPDF.Browser.SessionPool.terminate_worker/3
2023-01-19T20:31:56.032520231Z (chromic_pdf 1.6.0) lib/chromic_pdf/pdf/browser/session_pool.ex:191: ChromicPDF.Browser.SessionPool.terminate_worker(:DOWN, %{session: %{session_id: "B4192E860C6DB1C83D3A4404A257EFD0", target_id: "B3EDCFD2C0E3D3349617B0A1F59AEC0F"}, uses: 14}, %{browser: #PID<0.2847.0>, init_timeout: 5000, max_session_uses: 1000, spawn_protocol: %ChromicPDF.Protocol{steps: [call: &ChromicPDF.SpawnSession.create_browser_context/2, await: &ChromicPDF.SpawnSession.browser_context_created/2, call: &ChromicPDF.SpawnSession.create_target/2, await: &ChromicPDF.SpawnSession.target_created/2, call: &ChromicPDF.SpawnSession.attach/2, await: &ChromicPDF.SpawnSession.attached/2, call: &ChromicPDF.SpawnSession.set_user_agent/2, call: &ChromicPDF.SpawnSession.offline_mode/2, call: &ChromicPDF.SpawnSession.enable_page/2, await: &ChromicPDF.SpawnSession.page_enabled/2, call: &ChromicPDF.ResetTarget.reset_history/2, await: &ChromicPDF.ResetTarget.history_reset/2, call: &ChromicPDF.ResetTarget.blank/2, await: &ChromicPDF.ResetTarget.blanked/2, await: &ChromicPDF.ResetTarget.fsl_after_blank/2, output: &ChromicPDF.SpawnSession.output/1], state: %{protocol: ChromicPDF.SpawnSession, chrome_args: "--disable-dev-shm-usage", discard_stderr: false, ignore_certificate_errors: false, no_sandbox: true, offline: true, on_demand: false}}, timeout: 5000})
2023-01-19T20:31:56.032529731Z (nimble_pool 0.2.6) lib/nimble_pool.ex:932: NimblePool.do_apply_worker_callback/4
2023-01-19T20:31:56.032532831Z (nimble_pool 0.2.6) lib/nimble_pool.ex:867: NimblePool.maybe_terminate_worker/3
2023-01-19T20:31:56.032536031Z (nimble_pool 0.2.6) lib/nimble_pool.ex:769: NimblePool.remove_worker/3
2023-01-19T20:31:56.032539231Z (nimble_pool 0.2.6) lib/nimble_pool.ex:651: NimblePool.cancel_request_ref/3
2023-01-19T20:31:56.032542331Z (stdlib 4.0.1) gen_server.erl:1120: :gen_server.try_dispatch/4
2023-01-19T20:31:56.032545331Z (stdlib 4.0.1) gen_server.erl:1197: :gen_server.handle_msg/6
2023-01-19T20:31:56.032549331Z (stdlib 4.0.1) proc_lib.erl:240: :proc_lib.init_p_do_apply/3
I'm not sure if it's related but we recently upgraded to ChromicPDF 1.6.0. I'm not sure how to diagnose from here. If it's any help, this is when trying to generate about 15 PDFs at the same time. This error is not happening locally for the same set of data, only on our production server.
Hello,
thanks for the work.
I have HTML code that I'd like to convert to PDF. It works well, but fonts do not seem to be rendered; instead I just see the default one (Times New Roman, I guess). Also, Font-Awesome icons are not visible.
Code for the job (inside my controller):
...
Phoenix.View.render_to_string(MyAppWeb.MenuView, "menu-page.print.html", merged_assigns)
|> prepare_menu_pdf()
|> ChromicPDF.print_to_pdf(
output: fn path ->
conn
|> send_download({:file, path}, filename: merged_assigns.filename <> ".pdf", disposition: :inline)
end
)
...
and prepare_menu_pdf():
defp prepare_menu_pdf(string_content) when is_binary(string_content) do
{:ok, styles} =
Path.join(:code.priv_dir(:my_app_web), "static/css/menu-print-styles.css")
|> File.read()
{:ok, font_styles} =
Path.join(:code.priv_dir(:my_app_web), "static/fonts/fira/fira.css")
|> File.read()
[
content: [
"<style>" <> styles <> "</style>",
"<style>" <> font_styles <> "</style>",
string_content
]
]
|> ChromicPDF.Template.source_and_options()
end
I have tried to download the woff files mentioned in fira/fira.css (where those @font-family-ies are specified) so they are available locally, but no luck.
FontAwesome is inserted with a remote URL.
Am I doing anything wrong?
Thanks in advance.
See https://elixirforum.com/t/chromicpdf-pdf-generator/29473/41
** (RuntimeError) /usr/local/bin/gs exited with status 1!
GPL Ghostscript 9.56.1: Unrecoverable error, exit code 1
(chromic_pdf 1.2.0) lib/chromic_pdf/utils.ex:53: ChromicPDF.Utils.system_cmd!/3
Haven't confirmed it yet.
Now with the "multiple sources" feature being in place, it becomes apparent that the print_to_pdf/2 / print_to_pdfa/2 separation wasn't a good call. Refactor as follows:
- […] print_to_pdf/2, dependent on presence of new pdfa: true flag
- […] print_to_pdfa/2 (route it to print_to_pdf/2 with pdfa: true)
Getting the schedulers at compile time doesn't make sense. Also probably we want to set it to a minimum of 1.
See #100 (comment)
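To illustrate the point about schedulers, here is a small sketch of reading the scheduler count at runtime with a floor of 1. The PoolSize module name and the choice to halve the scheduler count are assumptions made for illustration only, not the library's actual code.

```elixir
defmodule PoolSize do
  @moduledoc "Hypothetical example of a runtime default for a pool size."

  # Evaluated on every call, i.e. on the machine that runs the release.
  # A module attribute would instead be fixed on the build machine.
  def default do
    # Halving is an assumption for this sketch; max/2 enforces the minimum of 1.
    max(div(System.schedulers_online(), 2), 1)
  end
end
```

On a single-core machine div(1, 2) is 0, which is why the floor matters.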
Hi. Thanks for this awesome library. In general, it works really well. We have no issues in development.
I tried to get ChromicPDF working in github actions CI and am getting this error/warning:
Error: t it renders a pdf blob [L#12]08:58:31.509 [error] Task #PID<0.627.0> started from #PID<0.626.0> terminating
Warning: ** (RuntimeError) Timeout in Channel.run_protocol/3!
The underlying GenServer.call/3 exited with a timeout. This happens when the browser was
not able to complete the current operation (= PDF print job) within the configured
5000 milliseconds.
If you are printing large PDFs and expect long processing times, please consult the
documentation for the `timeout` option of the session pool.
If you are *not* printing large PDFs but your print jobs still time out, this is likely a
bug in ChromicPDF. Please open an issue on the issue tracker.
(chromic_pdf 1.2.2) lib/chromic_pdf/pdf/browser/channel.ex:24: ChromicPDF.Browser.Channel.run_protocol/3
Warning: (chromic_pdf 1.2.2) lib/chromic_pdf/pdf/browser/session_pool.ex:120: [481](https://github.com/westarete/novo/runs/8127312057?check_suite_focus=true#step:9:482)
(elixir 1.13.4) lib/task/supervised.ex:89: Task.Supervised.invoke_mfa/2
Warning: (elixir 1.13.4) lib/task/supervised.ex:34: Task.Supervised.reply/4
(stdlib 3.17.2) proc_lib.erl:226: :proc_lib.init_p_do_apply/3
Warning: Function: #Function<0.14377584/0 in ChromicPDF.Browser.SessionPool.init_worker/1>
Args: []
I'm using these options to configure Chromic:
on_demand: false,
offline: true,
discard_stderr: false,
no_sandbox: true,
session_pool: [timeout: 30_000]
In development, rendering takes around half a second and that's with on_demand set to true. This is for a one page PDF that's mostly just text.
I'm using an ubuntu-latest GitHub Actions runner. The runner image includes Google Chrome, Chromium and ChromeDriver so I did not do anything to download another version of Chromium, etc. When checking which version Chromic was using, it looks like it selected "/usr/bin/chromium-browser" and I don't get any errors on boot about not being able to find Chromium.
Of note, when I SSH into the runner and try running the test suite manually, it passes without any issues. I also tried rendering a pdf manually in IEx on the runner image and I got a blob back almost instantly. But for some reason, as part of CI, I'm getting flaky tests that take a long time to run and sometimes timeout. Sometimes the tests are marked as "passed" even when I see these errors. Other times, I see these errors and the test suite is marked as failed. I can't discern any difference in output between those two though.
Here's the test I'm running:
test "it renders a pdf blob" do
student = StudentsFixtures.student_fixture()
student = LegacyData.get_student_for_transcript!(student.id)
transcript = Transcript.get_transcript(student.id)
assert {:ok, _} = TranscriptPDFRenderer.render(student, transcript)
end
Here's the render function:
def render(student, transcript) do
opts =
options(
content: content(student, transcript),
header: header(student),
footer: footer()
)
with {:ok, data} <- ChromicPDF.print_to_pdf(opts),
{:ok, binary} <- Base.decode64(data) do
{:ok, binary}
else
error ->
{:error, error}
end
end
defp options(opts) do
[
size: :us_letter,
header_height: "75mm",
footer_height: "20mm"
]
|> Keyword.merge(opts)
|> ChromicPDF.Template.source_and_options()
end
# ... snip
I've fiddled with this for many hours and can't seem to get a configuration that is reliable on github actions CI. Any thoughts?
Hey,
I recently switched to this library for pdf generation from wkhtmltopdf, and the cpu usage went up from 5-10% generally to 60-80+%.
Did I miss something?
Any tips?
chromic_pdf is working in production but a few times a day we get errors like below. We are using Chromium. We are not using sandbox mode and none of the page assets are loaded via network. They are all embedded via base64. If you have any insights or debugging tips, let me know.
GenServer ChromicPDF.Browser terminating
** (FunctionClauseError) no function clause matching in ChromicPDF.Browser.handle_info/2
(chromic_pdf 0.5.2) lib/chromic_pdf/pdf/browser.ex:74: ChromicPDF.Browser.handle_info({:EXIT, #PID<0.7925.0>, :chrome_has_crashed}, %{dispatch: #Function<1.51776475/1 in ChromicPDF.Browser.init/1>, protocols: []})
(stdlib 3.13) gen_server.erl:680: :gen_server.try_dispatch/4
(stdlib 3.13) gen_server.erl:756: :gen_server.handle_msg/6
(stdlib 3.13) proc_lib.erl:226: :proc_lib.init_p_do_apply/3
Last message: {:EXIT, #PID<0.7925.0>, :chrome_has_crashed}
We noticed that in some cases it is useful to delay the printing until dynamic content on the page is ready. This is admittedly rare but sometimes handy. The way we've approached it is to wait until a specific element has a defined attribute set.
Does the approach sound sane and would it make sense to include it in Chromic PDF? See the PR #87 for the approach.
Just adding this as an issue, to not get lost :)
Implementation ideas:
- […] GenServer.call in Channel)
Hi!
Currently we are using paged.js (pagedjs-cli) to generate pdfs using https://www.pagedjs.org/ since it supports a lot of new css to control the output that chrome doesn't support natively.
However, it's pretty slow to start a new node/chrome instance each time so I'm looking for alternatives.
I tried instead to use the pagedjs polyfill JS in the web page together with the following option (since that node is generated by pagedjs):
wait_for: %{selector: ".pagedjs_pages", attribute: "style"}
However, it seems it does not get a reply and times out. I guess this is because the node with class "pagedjs_pages" does not exist at the start.
The corresponding code in pagedjs-cli is here: https://gitlab.pagedmedia.org/tools/pagedjs-cli/blob/master/src/printer.js#L220
await page.waitForSelector(".pagedjs_pages");
Any help would be much appreciated!
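One way to mirror page.waitForSelector is the Runtime.evaluate based evaluate option that is proposed in another post in this thread. The sketch below only builds the JavaScript expression; the WaitScript module is hypothetical, and whether your installed ChromicPDF version accepts an evaluate option (and awaits the returned promise) is an assumption you would need to check.

```elixir
defmodule WaitScript do
  @moduledoc "Hypothetical helper that builds a JS expression waiting for a selector."

  # Only plain selectors are accepted so the value can be embedded in a
  # single-quoted JS string without any escaping.
  def for_selector(selector) when is_binary(selector) do
    if String.contains?(selector, ["'", "\\", "\n"]) do
      raise ArgumentError, "selector must not contain quotes, backslashes or newlines"
    end

    # The async function polls once per animation frame until the element exists.
    """
    (async function () {
      while (!document.querySelector('#{selector}')) {
        await new Promise(resolve => requestAnimationFrame(resolve));
      }
    })();
    """
  end
end

# Assumed usage, following the evaluate proposal quoted in this thread:
#
#   ChromicPDF.print_to_pdf(source,
#     evaluate: %{expression: WaitScript.for_selector(".pagedjs_pages")}
#   )
```

Unlike the wait_for attempt above, this waits for the element to exist rather than for an attribute change on an element that may not be there yet.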
Hello! First of all, thanks for the awesome work!
I am unable to display font-awesome icons in the exported pdf. In a normal website lifecycle, the font-awesome icon will display after some time. That is why I thought the wait_for option would be of great use. But it seems it always times out and can't find whatever selector and attr I set. This is how I set it up:
Version: 0.7.1
config.exs
config :my_app, ChromicPDF, on_demand: false, session_pool: [timeout: 60_000], offline: false
my_pdf_template.html.eex
<head>
<link rel="stylesheet" href="<%= Routes.static_url(@conn, "/css/tailwind.css") %>">
</head>
<body class="bg-white">
<div id="print-ready"></div>
<i class="far fa-envelope"></i>
... more contents ...
<script defer src="<%= Routes.static_url(@conn, "/js/app.js") %>" ></script>
</body>
app.js
import { faEnvelope } from "@fortawesome/pro-regular-svg-icons";
import { dom, library } from "@fortawesome/fontawesome-svg-core";
function handleDOMContentLoaded() {
library.add(faEnvelope);
dom.watch();
$("#print-ready").attr("ready-to-print", "");
}
window.addEventListener("DOMContentLoaded", handleDOMContentLoaded, false);
Then I just call an endpoint that will execute the download of the pdf via a controller:
pdf_controller.ex
def export(conn, params) do
... some code to build assigns ...
template = Phoenix.HTML.safe_to_string(PdfView.render("my_pdf_template.html", assigns))
[content: template]
|> ChromicPDF.Template.source_and_options()
|> ChromicPDF.print_to_pdf(
wait_for: %{selector: "#print-ready", attribute: "ready-to-print"},
output: fn path ->
conn
|> put_resp_content_type("application/pdf")
|> send_download({:file, path}, filename: "export.pdf")
|> halt()
end
)
end
Even if I add an inline script in the template that adds the attribute ready-to-print, it seems it still cannot find the element and eventually times out.
Without wait_for the PDF can be downloaded successfully, although the font-awesome icons are not visible.
Hello,
I'm attempting to generate a large pdf (100 pages) however the GenServer times out after only 5 (!) seconds. Is there a way to override this? I don't see a supervisor option to do so, and I can't seem to track down the timeout in the source.
Regards,
Dakora
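Other reports in this thread start ChromicPDF with a session_pool timeout, and the library's own timeout error message points to "the `timeout` option of the session pool". A configuration along those lines might look like this; the 60_000 ms value is an arbitrary example, and whether the option exists in the version used by the poster above is not confirmed here.

```elixir
# In the application supervisor's child list (application.ex).
# The timeout is given in milliseconds; 60_000 is only an example value.
children = [
  {ChromicPDF, session_pool: [timeout: 60_000]}
]
```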
From @jarimatti in #104
Would be nice if we could log the DevTools protocol messages for diagnostics via some means, e.g. config flag or some other setting: the messages should shed some light into this. Not sure if Chrome can do that natively?
Agreed. I'm also constantly going back to the Channel and Connection to put in IO.inspects. I suggest adding a compile time switch that conditionally compiles in some debug code, either Logger.debug or just direct IO.inspects.
Hi, could you please help me with this one? I'm trying to use your library on the free Gigalixir tier but it keeps failing on application startup with the following error:
** (RuntimeError) Timeout in Channel.run_protocol/3!
16:53:07.958 [error] Task #PID<0.4273.0> started from #PID<0.4239.0> terminating
The underlying GenServer.call/3 exited with a timeout. This happens when the browser was
not able to complete the current operation (= PDF print job) within the configured
If you are printing large PDFs and expect long processing times, please consult the
5000 milliseconds.
documentation for the `timeout` option of the session pool.
bug in ChromicPDF. Please open an issue on the issue tracker.
If you are *not* printing large PDFs but your print jobs still time out, this is likely a
(chromic_pdf 1.1.0) lib/chromic_pdf/pdf/browser/session_pool.ex:110: ChromicPDF.Browser.SessionPool.do_init_worker/2
(chromic_pdf 1.1.0) lib/chromic_pdf/pdf/browser/channel.ex:24: ChromicPDF.Browser.Channel.run_protocol/3
(elixir 1.12.1) lib/task/supervised.ex:35: Task.Supervised.reply/5
(elixir 1.12.1) lib/task/supervised.ex:90: Task.Supervised.invoke_mfa/2
(stdlib 3.13) proc_lib.erl:226: :proc_lib.init_p_do_apply/3
Function: #Function<0.33738222/0 in ChromicPDF.Browser.SessionPool.init_worker/1>
Args: []
[0824/165309.842598:ERROR:zygote_host_impl_linux.cc(263)] Failed to adjust OOM score of renderer with pid 1731: Permission denied (13)
16:53:13.065 [error] Task #PID<0.4275.0> started from #PID<0.4239.0> terminating
I'm using mix releases deployment option to Gigalixir. Here is my config:
env.sh.eex
#!/bin/sh
apt-get update
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
apt-get -y install ./google-chrome-stable_current_amd64.deb
application.ex
defmodule Flip.Application do
# See https://hexdocs.pm/elixir/Application.html
# for more information on OTP Applications
@moduledoc false
use Application
def start(_type, _args) do
children = [
# Start the Ecto repository
Flip.Repo,
# Start the Telemetry supervisor
FlipWeb.Telemetry,
# Start the PubSub system
{Phoenix.PubSub, name: Flip.PubSub},
# Start the Endpoint (http/https)
FlipWeb.Endpoint,
# Start a worker by calling: Flip.Worker.start_link(arg)
# {Flip.Worker, arg}
{ChromicPDF, no_sandbox: true, discard_stderr: false, session_pool: [timeout: 10_000]}
]
# See https://hexdocs.pm/elixir/Supervisor.html
# for other strategies and supported options
opts = [strategy: :one_for_one, name: Flip.Supervisor]
Supervisor.start_link(children, opts)
end
# Tell Phoenix to update the endpoint configuration
# whenever the application is updated.
def config_change(changed, _new, removed) do
FlipWeb.Endpoint.config_change(changed, removed)
:ok
end
end
So basically, I use env.sh.eex to download and install Chrome and set no_sandbox as Gigalixir uses Docker. Moreover, sometimes I get no errors after deployment and everything works perfectly. But most of the time it keeps failing. I have no idea why this happens, could you have a look please?
https://community.fly.io/t/cant-install-chrome-chromium-via-dockerfile-or-ssh/5303/5
Not sure how this happened ^^, theoretically System.find_executable/1 should be able to pick up the /usr/bin/google-chrome path from $PATH 🤷
Hey! Sorry for not taking a look at this when the PR was originally in review but there are a few things you should know related to merging chromic_pdf generated documents with Ghostscript:
- In my testing, I found that, with a large PDF file (~200 pages), Ghostscript took 12 seconds to handle the request, and pdfunite took... 4.47 seconds! This should therefore speed up merging PDFs significantly. Turns out, Ghostscript is a significant and slow bottleneck.
- […] --export-tagged-pdf. This seems like something that you don't want happening implicitly
Should we allow users to override the default timeout of 5 seconds for NimblePool.checkout/4? It seems like there's no way to do this currently (we still sometimes have some CI runs failing because of this).
Thanks for the awesome job
Really useful lib
I'm bringing here what we discussed on the forum:
Sometimes I need to print very large pdf files and the memory consumption jumps too high
It would be nice if we could support transferMode as ReturnAsStream
(I've tried to look at the code but, because of my lack of experience in elixir, I could not find an easy way to implement IO.read/IO.close in the current macro schema)
We run our umbrella project's test suite with mix cmd mix test. ChromicPDF is started as a child of an umbrella app's supervisor automatically. The process hangs indefinitely when mix test is finished.
Mix.Tasks.Cmd opens a port for the nested command and waits for an :EXIT message for the port in a receive block in Mix.Shell.cmd/3.
Since ChromicPDF's test suite starts it with ExUnit's start_supervised, mix cmd mix test succeeds in this project. However, simply starting the supervisor with mix run breaks.
mix cmd mix run -e "\"ChromicPDF.start_link()\""
# hangs forever...
iex(1)> Mix.Shell.cmd("mix run -e \"ChromicPDF.start_link()\"", [], &IO.puts(&1))
# hangs forever...
$ mix cmd mix run -e "\"Agent.start_link(fn -> nil end)\""
# exits immediately with exit code 0
Hi again!
In puppeteer you can pass ignoreHTTPSErrors: true to the launch config which sends the command 'Security.setIgnoreCertificateErrors', {ignore: true}.
It would be great to have this option in chromic_pdf too.
Hi there,
I'm currently struggling to generate a PDF from a URL in landscape mode.
I tried with the following options:
ChromicPDF.print_to_pdf(
%{
source: {:url, "http://acme.com/pdf"},
opts: [
print_to_pdf: %{landscape: true}
]
}
)
or
ChromicPDF.print_to_pdf(
%{
source: {:url, "http://acme.com/pdf"},
opts: [
print_to_pdf: %{paperWidth: 11, paperHeight: 8.5}
]
}
)
None is working. Any idea?
Maybe we want to deal with those:
08:16:15.577 [info] Function passed as a handler with ID "print_to_pdf_start" is local function.
This mean that it is either anonymous function or capture of function without module specified. That may cause performance penalty when calling such handler. For more details see note in `telemetry:attach/4` documentation.
https://hexdocs.pm/telemetry/telemetry.html#attach-4
08:16:15.577 [info] Function passed as a handler with ID "print_to_pdf_stop" is local function.
This mean that it is either anonymous function or capture of function without module specified. That may cause performance penalty when calling such handler. For more details see note in `telemetry:attach/4` documentation.
https://hexdocs.pm/telemetry/telemetry.html#attach-4
.
08:16:16.047 [info] Function passed as a handler with ID "convert_to_pdfa_start" is local function.
This mean that it is either anonymous function or capture of function without module specified. That may cause performance penalty when calling such handler. For more details see note in `telemetry:attach/4` documentation.
https://hexdocs.pm/telemetry/telemetry.html#attach-4
08:16:16.047 [info] Function passed as a handler with ID "convert_to_pdfa_stop" is local function.
This mean that it is either anonymous function or capture of function without module specified. That may cause performance penalty when calling such handler. For more details see note in `telemetry:attach/4` documentation.
https://hexdocs.pm/telemetry/telemetry.html#attach-4
.
08:16:16.246 [info] Function passed as a handler with ID "convert_to_pdfa_start" is local function.
This mean that it is either anonymous function or capture of function without module specified. That may cause performance penalty when calling such handler. For more details see note in `telemetry:attach/4` documentation.
https://hexdocs.pm/telemetry/telemetry.html#attach-4
08:16:16.246 [info] Function passed as a handler with ID "convert_to_pdfa_stop" is local function.
This mean that it is either anonymous function or capture of function without module specified. That may cause performance penalty when calling such handler. For more details see note in `telemetry:attach/4` documentation.
https://hexdocs.pm/telemetry/telemetry.html#attach-4
08:16:16.246 [info] Function passed as a handler with ID "print_to_pdf_start" is local function.
This mean that it is either anonymous function or capture of function without module specified. That may cause performance penalty when calling such handler. For more details see note in `telemetry:attach/4` documentation.
https://hexdocs.pm/telemetry/telemetry.html#attach-4
08:16:16.246 [info] Function passed as a handler with ID "print_to_pdf_stop" is local function.
This mean that it is either anonymous function or capture of function without module specified. That may cause performance penalty when calling such handler. For more details see note in `telemetry:attach/4` documentation.
https://hexdocs.pm/telemetry/telemetry.html#attach-4
.
08:16:16.715 [info] Function passed as a handler with ID "capture_screenshot_start" is local function.
This mean that it is either anonymous function or capture of function without module specified. That may cause performance penalty when calling such handler. For more details see note in `telemetry:attach/4` documentation.
https://hexdocs.pm/telemetry/telemetry.html#attach-4
08:16:16.716 [info] Function passed as a handler with ID "capture_screenshot_stop" is local function.
This mean that it is either anonymous function or capture of function without module specified. That may cause performance penalty when calling such handler. For more details see note in `telemetry:attach/4` documentation.
https://hexdocs.pm/telemetry/telemetry.html#attach-4
This is more of a question I suppose. I currently generate a PDF for a specific user based on their account information.
I have the need to generate a single PDF of a group of these individual PDFs for archival purposes. Currently I'm rendering them individually to disk and then concatenating them via a System.cmd call to qpdf.
I thought about rendering one long PDF but I'm currently using both the header (for user profile info) and footer (for page numbers) to generate the pdf so I don't believe that'll work since headers and footers span the entire document.
Is there any way for me to leverage chromicPDF to generate this combined file directly?
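For reference, the qpdf concatenation described above can be sketched as follows. The QpdfMerge module is a hypothetical wrapper and assumes qpdf is installed and on $PATH; it does not answer whether ChromicPDF can produce the combined file itself, although another post in this thread mentions a "multiple sources" feature for print_to_pdf/2.

```elixir
defmodule QpdfMerge do
  @moduledoc "Hypothetical wrapper around the qpdf CLI for concatenating PDFs."

  # Builds the argument list for: qpdf --empty --pages a.pdf b.pdf -- out.pdf
  def args(inputs, output) when is_list(inputs) and inputs != [] do
    ["--empty", "--pages"] ++ inputs ++ ["--", output]
  end

  # Runs qpdf and returns :ok, or {:error, {exit_status, output}} otherwise.
  def merge(inputs, output) do
    case System.cmd("qpdf", args(inputs, output), stderr_to_stdout: true) do
      {_out, 0} -> :ok
      {out, status} -> {:error, {status, out}}
    end
  end
end
```

Because each input is rendered separately, every document keeps its own header and footer, including page numbers that restart per user.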
Hi again!
We need to pass a custom option to Chromium on startup, in this case we need "--font-render-hinting=none". I don't see an immediate way to do this.
See puppeteer/puppeteer#2410 for why we need font-render-hinting=none. In any case, it would be useful to be able to pass custom options.
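A stack trace from ChromicPDF 1.6.0 elsewhere in this thread shows a chrome_args: "--disable-dev-shm-usage" entry in the session pool state, which suggests that later versions accept such a setting. Assuming it is passed as a start option like the others shown in this thread (an assumption, not confirmed by the posts here), the configuration might look like this:

```elixir
# In the application supervisor's child list (application.ex).
# chrome_args as a start option is assumed from the 1.6.0 state dump above.
children = [
  {ChromicPDF, chrome_args: "--font-render-hinting=none"}
]
```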
Now that we're calling Ghostscript from print_to_pdf/2 (optionally, for multiple sources), we may as well benefit from it.
- […] GhostscriptWorker/Interface/Impl mess
- […] info_option to pdf_option and add it as optional flag to print_to_pdf/2
When specifying the HTML content directly to print_to_pdf, the rendered PDF doesn't contain the image. It works ok when specifying the image through base64. Going through the code, it seems that when in :html mode, there is no wait for the Page.frameStoppedLoading notification like there is for the :url mode.
chromic_pdf/lib/chromic_pdf/pdf/protocols/print_to_pdf.ex
Lines 13 to 34 in 29a23ac
Reproduction:
path = System.tmp_dir() |> Path.join("output.pdf")
result = ChromicPDF.Template.source_and_options(content: content()) |> ChromicPDF.print_to_pdf(output: path)
IO.inspect(result)
defp content() do
:erlang.iolist_to_binary(["<img src=\"https://homepages.cae.wisc.edu/~ece533/images/peppers.png\">"])
end
Hey, thanks for creating this project. I like how it works almost right out of the box.
I'm in the middle of testing, and the html file I want to convert to a pdf contains a single image.
[content: ["<div><img src=\"potato.png\"></div>"]]
|> ChromicPDF.Template.source_and_options()
|> ChromicPDF.print_to_pdf(output: "lib/pdftest_web/templates/pdf_templates/potatoResult.pdf")
The potato.png file is located in the outer most phoenix directory (next to /lib, /priv, etc..) If I run that in the IEx terminal, the resulting pdf is empty.
However, if I put some other html as input, it functions normally.
See stacktrace at #104
- await_response and await_notification both call extract_from_payload
- extract_from_payload doesn't error when it can't find the desired key in the payload, but instead sets it to nil in the state
- get_in_state! errors out when the value is nil.
- […] await_response or await_notification can't satisfy their payload extractions
Hello, I'm getting a rather odd error spam out of the blue here. About 5000 error events popped up in Sentry after running an update, which resolved with a reboot:
11:30:31.085 [error] Task #PID<0.2538.0> started from #PID<0.2530.0> terminating
** (stop) exited in: GenServer.call(#PID<0.2525.0>, {:run_protocol, %ChromicPDF.Protocol{result_fun: nil, state: %{ignore_certificate_errors: false, offline: true, session_pool: [size: 10, timeout: 10000]}, steps: [call: &ChromicPDF.SpawnSession.create_browser_context/2, await: &ChromicPDF.SpawnSession.browser_context_created/2, call: &ChromicPDF.SpawnSession.create_target/2, await: &ChromicPDF.SpawnSession.target_created/2, call: &ChromicPDF.SpawnSession.attach/2, await: &ChromicPDF.SpawnSession.attached/2, call: &ChromicPDF.SpawnSession.set_user_agent/2, call: &ChromicPDF.SpawnSession.offline_mode/2, call: &ChromicPDF.SpawnSession.enable_page/2, await: &ChromicPDF.SpawnSession.page_enabled/2, call: &ChromicPDF.SpawnSession.blank/2, await: &ChromicPDF.SpawnSession.blanked/2, await: &ChromicPDF.SpawnSession.fsl_after_blank/2, output: &ChromicPDF.SpawnSession.output/1]}}, 5000)
** (EXIT) :connection_terminated
(elixir 1.11.4) lib/gen_server.ex:1027: GenServer.call/3
(chromic_pdf 0.7.2) lib/chromic_pdf/pdf/browser/channel.ex:21: ChromicPDF.Browser.Channel.run_protocol/3
(chromic_pdf 0.7.2) lib/chromic_pdf/pdf/browser/session_pool.ex:110: ChromicPDF.Browser.SessionPool.do_init_worker/2
(elixir 1.11.4) lib/task/supervised.ex:90: Task.Supervised.invoke_mfa/2
(elixir 1.11.4) lib/task/supervised.ex:35: Task.Supervised.reply/5
(stdlib 3.10) proc_lib.erl:249: :proc_lib.init_p_do_apply/3
Function: #Function<0.13638364/0 in ChromicPDF.Browser.SessionPool.init_worker/1>
Args: []
11:30:31.209 [error] GenServer #PID<0.2550.0> terminating
** (stop) :connection_terminated
Last message: {:EXIT, #Port<0.63>, :normal}
It's weird because chromium wasn't updated D:
Perhaps the reconnect frequency could be toned down? I got 100s of these errors every second.
With #153 we can see that it is sometimes useful to test not only against one but multiple versions of chrome, elixir, erlang & ghostscript. Maybe we can use a build matrix in the CI, so new versions can easily be added.
We could also use multiple versions of alpine images, since the chrome and ghostscript parts are pretty separate and there is always an update in major versions in a new alpine major version for chrome (and often ghostscript as well) and they reflect what is state of the art pretty well.
Hi @jarimatti,
posting this here so we can discuss it and you feel involved and not replaced (because replacing is what I intend to do):
wait_for
I looked into the wait_for option a bit more and found out that in its current implementation it's definitely a bit cumbersome to use, and fixing it is unfortunately non-trivial.
- […] querySelector call, we won't get any attributeModified notification and run into a timeout. In fact, the "Example HTML" we have in the docs has this error if used as-is (i.e. without any setTimeout or JS lib doing some work in between).
- […] DOM.getAttributes instead of waiting for attributeModified -> this should work, but feels a little clunky.
- […] DOM.setChildNodes notifications we get right after DOM.getDocument and filter out the relevant node and its attributes. This would be my preferred approach, but turned out to be rather tricky unfortunately
  - […] DOM.setChildNodes messages come after DOM.getDocument, i.e. theoretically before we get the result of DOM.querySelector -> hence at this point in time we might not know the nodeId of the element in question. We could instead match the node based on the id attribute here.
  - DOM.setChildNodes notifications traverse the tree, so technically we will get between one and many of them, and would need to read them all. That's another thing that the Protocol machine can't do right now.
In summary: Either way outlined above, in combination with some kind of if_state runtime conditional, is a whole lot of complexity we would introduce only to make sure the wait_for logic is race-free, i.e. can deal with elements which have the attribute already set. Therefore, even if I like the API interface quite a lot (as it covers a lot and doesn't need help from the client-side), IMO it is not worth the added complexity in the code. So, at this point I discarded the entire idea and looked for alternatives.
Actually I had planned to support this devtools call even before you added wait_for, but didn't move on with it as I personally had no use for it at the time: Runtime.evaluate. It might be a bit of a sledgehammer solution to replace wait_for, but of course it has potential to solve other use-cases as well. Looking into it this morning, I wanted to specifically see what it takes to mimic wait_for though, and here it is:
# protocols/print_to_pdf.ex
if_option :evaluate do
  call(:evaluate, "Runtime.evaluate", [{"expression", [:evaluate, :expression]}], %{awaitPromise: true})
  await_response(:evaluated, [])
end

# client call
@wait_for_js """
async function waitForAttribute(selector, attribute) {
  // `selector` is passed to getElementById, so it must be an element id
  while (!document.getElementById(selector).hasAttribute(attribute)) {
    // yield to the browser until the next frame, then check again
    await new Promise(resolve => requestAnimationFrame(resolve));
  }
};
waitForAttribute('testdiv', 'ready-to-print');
"""
ChromicPDF.print_to_pdf(..., evaluate: %{expression: @wait_for_js})
Of course I didn't come up with this script myself: stackoverflow link. In my test setup, this worked for both cases: when I had the attribute already set on the element as well as with setTimeout(... setAttribute..., 1000).
In my opinion, this is quite neat. What do you think? Would you be ok with replacing the wait_for option with this instead?
One could theoretically also emulate the wait_for behaviour with this script underneath, i.e. put this script into ChromicPDF and make the wait_for option set an evaluate option instead. But I'm not yet convinced maintaining this in code has much added benefit over just mentioning it in the docs.
I'm trying to run ChromicPDF in a docker container.
My Docker container looks something like this:
# runtime is built first to prevent invalidating cache, as this is one of the slower stages.
FROM alpine:3.13 as runtime
# Install Chromium for PDF generation
RUN echo "http://dl-cdn.alpinelinux.org/alpine/edge/main" > /etc/apk/repositories \
    && echo "http://dl-cdn.alpinelinux.org/alpine/edge/community" >> /etc/apk/repositories \
    && echo "http://dl-cdn.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories \
    && echo "http://dl-cdn.alpinelinux.org/alpine/v3.12/main" >> /etc/apk/repositories \
    && apk upgrade -U -a \
    && apk add \
      libstdc++ \
      chromium \
      harfbuzz \
      nss \
      freetype \
      ttf-freefont \
      font-noto-emoji \
      wqy-zenhei \
    && rm -rf /var/cache/* \
    && mkdir /var/cache/apk
ENV CHROME_BIN=/usr/bin/chromium-browser \
    CHROME_PATH=/usr/lib/chromium/
FROM elixir:1.11-alpine as build
ENV MIX_ENV=prod
WORKDIR /tmp/app
RUN mix local.hex --force && mix local.rebar --force
COPY mix.exs mix.lock ./
RUN mix do deps.get --only prod
FROM node:15-alpine as frontend
WORKDIR /tmp/app_front
COPY --from=build /tmp/app/deps/phoenix/ ./deps/phoenix/
COPY --from=build /tmp/app/deps/phoenix_html/ ./deps/phoenix_html/
COPY --from=build /tmp/app/deps/phoenix_live_view/ ./deps/phoenix_live_view/
COPY assets/ ./assets/
RUN npm ci --progress=false --no-audit --loglevel=error --prefix assets
ENV NODE_ENV=production
RUN npm run deploy --prefix assets
FROM build as release
COPY config ./config
COPY lib ./lib
COPY priv ./priv
COPY --from=frontend /tmp/app_front/priv/static ./priv/static/
RUN mix do phx.digest, compile, release
FROM runtime as app
WORKDIR /app
RUN chown -R nobody:nobody /app
RUN apk add --no-cache openssl ncurses-libs
USER nobody:nobody
COPY --from=release --chown=nobody:nobody /tmp/app/_build/prod/rel/app ./
ENV HOME=/app
CMD ["bin/app", "start"]
I simply added ChromicPDF to my mix dependencies, and in the supervision tree without additional parameters.
However, when running the application I get the following in the output (keeps spamming until the supervisor gives up and terminates the entire runtime):
16:37:26.489 [error] GenServer #PID<0.3047.0> terminating
app_1 | ** (stop) :connection_terminated
app_1 | Last message: {:EXIT, #Port<0.12>, :normal}
app_1 | 16:37:26.489 [error] Task #PID<0.3049.0> started from #PID<0.3048.0> terminating
app_1 | ** (stop) exited in: GenServer.call(#PID<0.3046.0>, {:run_protocol, %ChromicPDF.Protocol{result_fun: nil, state: %{ignore_certificate_errors: false, offline: true}, steps: [call: &ChromicPDF.SpawnSession.create_browser_context/2, await: &ChromicPDF.SpawnSession.browser_context_created/2, call: &ChromicPDF.SpawnSession.create_target/2, await: &ChromicPDF.SpawnSession.target_created/2, call: &ChromicPDF.SpawnSession.attach/2, await: &ChromicPDF.SpawnSession.attached/2, call: &ChromicPDF.SpawnSession.set_user_agent/2, call: &ChromicPDF.SpawnSession.offline_mode/2, call: &ChromicPDF.SpawnSession.enable_page/2, await: &ChromicPDF.SpawnSession.page_enabled/2, call: &ChromicPDF.ResetTarget.reset_history/2, await: &ChromicPDF.ResetTarget.history_reset/2, call: &ChromicPDF.ResetTarget.blank/2, await: &ChromicPDF.ResetTarget.blanked/2, await: &ChromicPDF.ResetTarget.fsl_after_blank/2, output: &ChromicPDF.SpawnSession.output/1]}}, 5000)
I've tried setting debug_protocol: true in the config, but that did not reveal additional information.
The Chromium binary is definitely present and detected (System.find_executable returns the correct path).
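For reference, the kind of lookup meant here can be sketched as below. The module name ChromeLookup and the list of candidate binary names are assumptions for illustration, not something taken from ChromicPDF. Note that a non-nil result only shows that the binary is on the PATH, not that Chrome can actually start inside the container.

```elixir
defmodule ChromeLookup do
  # Candidate binary names are assumptions; adjust them to the image in use.
  @candidates ["chromium-browser", "chromium", "chrome", "google-chrome"]

  # Returns a map of binary name => absolute path, or nil when the
  # binary cannot be found on the PATH.
  def locate(names \\ @candidates) do
    Map.new(names, fn name -> {name, System.find_executable(name)} end)
  end
end

IO.inspect(ChromeLookup.locate(), label: "chrome candidates")
```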
When I start my app with iex -S mix or mix phx.server and kill it with an interrupt (ctrl+c), the Chrome process isn't cleaned up. In fact, three processes remain.
13039 0.0 0.1 4920020 24212 ?? S 9:41AM 0:00.06 /Applications/Google Chrome.app/Contents/Frameworks/Google Chrome Framework.framework/Versions/86.0.4240.75/Helpers/Google Chrome Helper.app/Contents/MacOS/Google Chrome Helper --type=utility --utility-sub-type=network.mojom.NetworkService --field-trial-handle=1718379636,10962548680910408603,528382615615534866,131072 --lang=en-US --service-sandbox-type=network --use-mock-keychain --use-gl=swiftshader-webgl --headless --shared-files --seatbelt-client=23
13038 0.0 0.4 6001936 120940 ?? S 9:41AM 0:00.20 /Applications/Google Chrome.app/Contents/Frameworks/Google Chrome Framework.framework/Versions/86.0.4240.75/Helpers/Google Chrome Helper (GPU).app/Contents/MacOS/Google Chrome Helper (GPU) --type=gpu-process --field-trial-handle=1718379636,10962548680910408603,528382615615534866,131072 --headless --headless --gpu-preferences=MAAAAAAAAAAgAAAAAAAAAAAAAAAAAAAAAABgAAAAAAAQAAAAAAAAAAAAAAAAAAAA6AAAABwAAADgAAAAAAAAAOgAAAAAAAAA8AAAAAAAAAD4AAAAAAAAAAABAAAAAAAACAEAAAAAAAAQAQAAAAAAABgBAAAAAAAAIAEAAAAAAAAoAQAAAAAAADABAAAAAAAAOAEAAAAAAABAAQAAAAAAAEgBAAAAAAAAUAEAAAAAAABYAQAAAAAAAGABAAAAAAAAaAEAAAAAAABwAQAAAAAAAHgBAAAAAAAAgAEAAAAAAACIAQAAAAAAAJABAAAAAAAAmAEAAAAAAACgAQAAAAAAAKgBAAAAAAAAsAEAAAAAAAC4AQAAAAAAABAAAAAAAAAAAAAAAAAAAAAQAAAAAAAAAAAAAAAGAAAAEAAAAAAAAAAAAAAABwAAABAAAAAAAAAAAAAAAAgAAAAQAAAAAAAAAAAAAAAKAAAAEAAAAAAAAAAAAAAACwAAABAAAAAAAAAAAAAAAA0AAAAQAAAAAAAAAAEAAAAAAAAAEAAAAAAAAAABAAAABgAAABAAAAAAAAAAAQAAAAcAAAAQAAAAAAAAAAEAAAAIAAAAEAAAAAAAAAABAAAACgAAABAAAAAAAAAAAQAAAAsAAAAQAAAAAAAAAAEAAAANAAAAEAAAAAAAAAAEAAAAAAAAABAAAAAAAAAABAAAAAYAAAAQAAAAAAAAAAQAAAAHAAAAEAAAAAAAAAAEAAAACAAAABAAAAAAAAAABAAAAAoAAAAQAAAAAAAAAAQAAAALAAAAEAAAAAAAAAAEAAAADQAAABAAAAAAAAAABgAAAAAAAAAQAAAAAAAAAAYAAAAGAAAAEAAAAAAAAAAGAAAABwAAABAAAAAAAAAABgAAAAgAAAAQAAAAAAAAAAYAAAAKAAAAEAAAAAAAAAAGAAAACwAAABAAAAAAAAAABgAAAA0AAAA= --use-gl=swiftshader-webgl --shared-files 13035 0.0 0.1
5461308 41252 ?? Ss 9:41AM 0:00.23 /Applications/Google Chrome.app/Contents/MacOS/Google Chrome --headless --disable-gpu --remote-debugging-pipe
@andreasknoepfle confirmed this again and will try to fix it soon, unless you're eager to dig into it... It's usually not noticeable until you reach the point where you have 500 chrome instances running and your OS starts doing funky things.
Hi, how can I solve this issue? "function :telemetry.span/3 is undefined or private"
I tried doing it in Postman. TIA
Running a deploy on fly.io with zero activity and no pdfs being generated. I'm unsure if this is from the pool shutting down or starting up.
00:19:00.549 [error] Task #PID<0.510.0> started from #PID<0.509.0> terminating
** (RuntimeError) Timeout in Channel.run_protocol/3!
The underlying GenServer.call/3 exited with a timeout. This happens when the browser was
not able to complete the current operation (= PDF print job) within the configured
5000 milliseconds.
If you are printing large PDFs and expect long processing times, please consult the
documentation for the `timeout` option of the session pool.
If you are *not* printing large PDFs but your print jobs still time out, this is likely a
bug in ChromicPDF. Please open an issue on the issue tracker.
(chromic_pdf 1.3.0) lib/chromic_pdf/pdf/browser/channel.ex:26: ChromicPDF.Browser.Channel.run_protocol/3
(chromic_pdf 1.3.0) lib/chromic_pdf/pdf/browser/session_pool.ex:122: ChromicPDF.Browser.SessionPool.do_init_worker/3
(elixir 1.13.4) lib/task/supervised.ex:89: Task.Supervised.invoke_mfa/2
(elixir 1.13.4) lib/task/supervised.ex:34: Task.Supervised.reply/4
(stdlib 4.0.1) proc_lib.erl:240: :proc_lib.init_p_do_apply/3
Function: #Function<0.72123262/0 in ChromicPDF.Browser.SessionPool.init_worker/1>
Args: [
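The error message points at the timeout option of the session pool. A minimal sketch of how that option might be passed when ChromicPDF is added to a supervision tree follows; the module name MyApp.PDFChildren and the 10-second value are placeholders. Since the trace shows the timeout during worker initialisation with no print jobs running, raising it may only hide the underlying problem rather than fix it.

```elixir
defmodule MyApp.PDFChildren do
  # Hypothetical helper returning the child spec list for the app supervisor.
  # The timeout is in milliseconds; the log above shows 5000 in effect.
  def children(timeout_ms \\ 10_000) do
    [
      {ChromicPDF, session_pool: [timeout: timeout_ms]}
    ]
  end
end

# In the application module this would be used roughly as:
#   Supervisor.start_link(MyApp.PDFChildren.children(), strategy: :one_for_one)
```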