Widow and orphan support (CSS property) (mpdf issue, OPEN)

mpdf commented on September 23, 2024
Widow and orphan support (CSS property)

from mpdf.

Comments (29)

danielhjames commented on September 23, 2024

Hi Marc, thanks for filing this. We would be prepared to offer, or contribute to, a bug bounty for this issue to be fixed.

gnunicorn commented on September 23, 2024

After quite some digging, I was able to add the configuration (CSS params) and even have the code in place to detect whether we'd create orphans or widdows – see here. My problem now is that the detection happens during reflow, while we are already printing to the buffer, and I was simply not able to get it to put a page break before the block currently being processed: changes to margin and padding no longer had any effect (they only work before flowing), and any break I added still flushed the buffered text before doing the page break. I am super close, just not familiar enough with mpdf...

If someone wants to give me a hint about which of the many things I have to tweak, and how, to make a page break before the block I am currently processing, I am sure I can get this done in no time.

danielhjames commented on September 23, 2024

Thanks, great work! It may be that you've uncovered an aspect of the mPDF design which requires a rethink for the 7.0 branch. By the way, in the CSS spec http://www.w3.org/wiki/CSS/Properties/widows the word 'widows' is spelled with only one 'd'.

gnunicorn commented on September 23, 2024

Ha. right. Widows – one 'd'. Will fix that.

I was wondering about that in general. Since the CSS spec states that the default value is '2' (if not specified), a bunch of previously rendered PDFs that contained widows and orphans will render differently once this has been implemented. One could consider that a breaking change. I was wondering if we maybe want to put the feature behind a command-line flag or something – but that only matters once it can be finished ;) .

danielhjames commented on September 23, 2024

In general I don't think mPDF is used with command line flags. What I would suggest is that in the mPDF configuration file config.php there is a default widows and orphans setting of 1 which produces the same behaviour as the previous mPDF 6.0 release, i.e. no widows and orphans protection. This would therefore not be a breaking change for users. However there might be unintended consequences of changing the page/column break model.

What if when a widow or orphan is detected, the page break or column break is applied within the current block rather than before the current block? That might mean the block has to be at least as large as the number of lines specified in the widows and orphans settings.

gnunicorn commented on September 23, 2024

> What if when a widow or orphan is detected, the page break or column break is applied within the current block rather than before the current block? That might mean the block has to be at least as large as the number of lines specified in the widows and orphans settings.

Like, let's say we have a paragraph of 7 lines and 6 would fit on one page, leaving a widow on the page after? Well, as of now, I am only able to tell that once we have already printed line 6, and as we can't move back and "unprint" those lines at that point (it seems), it is still the same issue.

But yes, I was thinking of adding support for more complex behaviors (which might possibly be configuration options). So in this case, that would mean breaking after line 5 and printing lines 6 and 7 on the second page (if configured to do so). Not actually that hard to do, once I am able to put a page break where I want 😝 .
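
To make the 7-line example concrete, here is a minimal sketch of the arithmetic involved. The function name and parameters are hypothetical and not part of mPDF; it assumes the line count of the paragraph and the number of lines that still fit on the page are already known, which is exactly the part that is hard to obtain in mPDF today.

```php
<?php
/**
 * Hypothetical helper, not part of mPDF: decide how many lines of a paragraph
 * stay on the current page so that neither an orphan nor a widow is produced.
 *
 * @param int $totalLines   Lines in the paragraph
 * @param int $linesThatFit Lines that fit in the space left on the current page
 * @param int $orphans      Minimum lines to leave at the bottom of the page
 * @param int $widows       Minimum lines to carry to the top of the next page
 * @return int Lines to keep on the current page (0 = break before the paragraph)
 */
function linesToKeepOnPage(int $totalLines, int $linesThatFit, int $orphans = 2, int $widows = 2): int
{
    if ($totalLines <= $linesThatFit) {
        return $totalLines; // the whole paragraph fits, no break needed
    }
    // Never keep so many lines that fewer than $widows remain for the next page.
    $keep = min($linesThatFit, $totalLines - $widows);
    // Keeping fewer than $orphans lines would strand an orphan: move the whole paragraph.
    return $keep >= $orphans ? $keep : 0;
}

echo linesToKeepOnPage(7, 6), "\n"; // 5: lines 1-5 stay, lines 6 and 7 move
```

With both limits set to 1 the same call returns 6, which matches the unprotected behaviour discussed earlier in the thread.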

> What I would suggest is that in the mPDF configuration file config.php there is a default widows and orphans setting of 1 which produces the same behaviour as the previous mPDF 6.0 release, i.e. no widows and orphans protection.

In light of that, I'd actually suggest making the configuration option a number that defines the way to deal with them (which also allows for more complex behaviors later, like letter spacing ;) ). The default would be 0 (don't deal with widows or orphans), with 1 meaning "break before paragraph" and 2 meaning "smart break the paragraph if possible"...
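
A rough sketch of how such a numeric mode could be dispatched is shown below. All names here are hypothetical (no such constants or function exist in mPDF), and for brevity a single limit is used for both orphans and widows.

```php
<?php
// Hypothetical constants for the proposed setting; none of these exist in mPDF.
const WIDOW_ORPHAN_IGNORE = 0;       // legacy behaviour: break wherever the page ends
const WIDOW_ORPHAN_BREAK_BEFORE = 1; // move the whole paragraph to the next page
const WIDOW_ORPHAN_SMART_BREAK = 2;  // split at a line that satisfies the limit on both sides

/**
 * Returns how many lines of the paragraph stay on the current page.
 * $limit is used as both the orphans and the widows minimum.
 */
function linesOnCurrentPage(int $mode, int $totalLines, int $linesThatFit, int $limit = 2): int
{
    $split = min($totalLines, $linesThatFit);
    $remaining = $totalLines - $split;
    // A violation only exists if the paragraph is actually split.
    $violates = $remaining > 0 && ($split < $limit || $remaining < $limit);

    switch ($mode) {
        case WIDOW_ORPHAN_IGNORE:
            return $split;
        case WIDOW_ORPHAN_BREAK_BEFORE:
            return $violates ? 0 : $split;
        case WIDOW_ORPHAN_SMART_BREAK:
            if (!$violates) {
                return $split;
            }
            // Pull lines down to the next page until the widow limit is met.
            $candidate = min($linesThatFit, $totalLines - $limit);
            return $candidate >= $limit ? $candidate : 0;
        default:
            throw new InvalidArgumentException("Unknown widow/orphan mode: $mode");
    }
}
```

For the 7-line paragraph with room for 6 lines, mode 0 keeps 6 lines, mode 1 keeps 0 (the paragraph moves as a whole) and mode 2 keeps 5.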

danielhjames commented on September 23, 2024

Your proposed configuration option seems like a totally different setting to a CSS default, I can see the usefulness of it though.

That implies that each block cannot be fully committed to the document before it has passed a widows and orphans check, if that check is enabled.

There's an interesting comment at line 23599 of https://github.com/mpdf/mpdf/blob/development/mpdf.php as follows:

If page-break-inside:avoid section has broken to new page but fits on one side - then move

Maybe debugging page-break-inside:avoid would shed some light on how this is meant to be done.

gnunicorn commented on September 23, 2024

> Your proposed configuration option seems like a totally different setting to a CSS default, I can see the usefulness of it though.

Not entirely. The spec states that orphans and widows have to be taken care of whenever you break, but it also contains a "best breaking practice" section describing the desired behavior for exactly how to break. That would closely resemble what I outlined as 2. Breaking before the entire paragraph would still be fully according to spec, though, as that section is a "should", not a "must".

> If page-break-inside:avoid section has broken to new page but fits on one side - then move

That is one of the starting points I had. The thing is that here it is directly defined at parsing level that this paragraph can't be broken. The (ahem, rather hackish) way this is achieved is by forcing a page break before processing and then moving the block back up one page if there is still enough space left on that page. I was thinking of doing something similar, but that means we'd have to break before every paragraph and move each one back up after we are done processing, according to the rules of orphans and widows. While that could be seen as a sufficient approach, it is a rather big change in the way processing works right now and I didn't want to do that. Also, it feels wrong – creating each paragraph on its own page and then moving it back once you know its size ....

marclaporte commented on September 23, 2024

Thank you Benjamin for giving this a go!

Everyone: progress can be followed here:
https://github.com/ligthyear/mpdf/commits/orphan-support

gnunicorn commented on September 23, 2024

thanks, @marclaporte .

But unless new ideas on how to solve this come up in this conversation, I don't really have any way to continue ... so work is on standby until then ...

danielhjames commented on September 23, 2024

Hi Ben, hi Marc, I have talked to the Booktype developer team about holding an online workshop to devise the best solution. Would you have some time this week to participate?

In principle, as we know the starting position of a paragraph, the number of characters, the column width and the line height, do you think it would be possible to determine in advance which paragraphs are going to break over the column or page end? Or would it be simpler to create every paragraph on the next column or page and move it back up, as Ben mentions above? What I like about the latter solution is that we will be very sure about what the paragraph contains before we finalise its placement.
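
The "determine in advance" idea could be sketched roughly as follows. This is only an illustration under a strong assumption: it uses a single average character width, whereas real text would need per-glyph font metrics, hyphenation and inline formatting to be taken into account. All function names and parameters are hypothetical.

```php
<?php
// Hypothetical estimate of how many lines a paragraph needs, assuming an
// average character width (real layout would need actual font metrics).
function estimateLineCount(int $charCount, float $columnWidth, float $avgCharWidth): int
{
    $charsPerLine = max(1, (int) floor($columnWidth / $avgCharWidth));
    return (int) ceil($charCount / $charsPerLine);
}

// Number of whole lines that fit between the paragraph start and the page or column bottom.
function linesAvailable(float $startY, float $bottomY, float $lineHeight): int
{
    return max(0, (int) floor(($bottomY - $startY) / $lineHeight));
}

// True if the paragraph is expected to run over the end of the page or column.
function willBreak(int $charCount, float $columnWidth, float $avgCharWidth,
                   float $startY, float $bottomY, float $lineHeight): bool
{
    return estimateLineCount($charCount, $columnWidth, $avgCharWidth)
        > linesAvailable($startY, $bottomY, $lineHeight);
}

// 400 characters in an 80 mm column at ~2 mm per character need about 10 lines;
// starting at y = 250 mm with the bottom at 280 mm and 5 mm lines, 6 lines fit.
var_dump(willBreak(400, 80.0, 2.0, 250.0, 280.0, 5.0)); // bool(true)
```

An estimate like this can only flag candidates; the exact line count still has to come from the real layout code, which is the appeal of the move-back approach.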

marclaporte commented on September 23, 2024

To all: There will be a conference call Thursday. Check time for your city:
http://www.timeanddate.com/worldclock/fixedtime.html?msg=mPDF%20Widows%20and%20Orphans&iso=20151112T15

The call will be here (or perhaps another solution)
https://meet.jit.si/mPDF

jakejackson1 commented on September 23, 2024

It'll be 2am on my side of the world so I'll miss it (would love to read a summary of how the meeting went though).

danielhjames commented on September 23, 2024

I'll be there, thanks for arranging!

marclaporte commented on September 23, 2024

Hi everyone!

I did record the conference with intent (and permission) to upload for the community, but I didn't set my screen recorder properly, and the default setting is to record my voice, but not the others. I am sorry about this. I will make sure it works next time.

We were 4:

  • Benjamin Kampmann (ligthyear)
  • Daniel James, Booktype
  • Aleksandar Erkalović, Booktype lead developer (aerkalov)
  • Marc Laporte, Tiki Wiki CMS Groupware

There was a great explanation of the challenge to do this properly. If it was easy, it would already have been done.

As a follow-up:

  • Aco and Benjamin have some things to try out.
  • Daniel will do some research on how other software handles this

We'll do another such conference call in a few weeks to discuss community organisation and roadmap. Picking a time for this will be tricky given Jake is in Australia. Please see:
https://github.com/mpdf/mpdf/wiki/Community-Conference-Calls

jakejackson1 commented on September 23, 2024

Great. Nice work @marclaporte!

marclaporte commented on September 23, 2024

@ligthyear: any interesting news? Thanks!

danielhjames commented on September 23, 2024

Hi Marc, I've been discussing the issue with Mark Lewis @thnkloud9 from our team, we will report back shortly.

thnkloud9 commented on September 23, 2024

Hi @marclaporte. As @danielhjames mentioned, I've been looking into the issue and catching up on the progress so far. I've forked and started estimating a solution that does not involve adding extraneous pages or post-processing. It will calculate the required and available space for each block, accounting for current line height, padding, borders, and the orphans and widows limit config vars, before printing any of the block elements ... at least in theory so far ;-)

Still not certain of the level of effort at the moment, as this thing is a beast of a "party like it's 1999" PHP 4 hot mess. I am guessing, though, that I should have a PR to submit for review early next week.

thnkloud9 commented on September 23, 2024

#71

proposed solution uses:

function EstimateFlowingBlockWriteLines($s, $sOTLdata)

which just clones the current object ($this) and runs WriteFlowingBlock($s, $sOTLdata) on the cloned copy ($mock) to determine required lines before actually running them on the current object ($this).
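
The clone-and-dry-run idea can be illustrated with a small self-contained sketch. The LineWriter class below is a hypothetical stand-in, not mPDF's real class, and its method names only loosely echo the ones in the PR; it wraps plain text with PHP's wordwrap() instead of doing real PDF layout.

```php
<?php
// Hypothetical stand-in for a PDF writer; this is not mPDF's real class.
class LineWriter
{
    /** @var string[] Lines committed to the document so far */
    public array $lines = [];
    private int $width;

    public function __construct(int $width)
    {
        $this->width = $width;
    }

    // Wraps $text to the configured width and commits the resulting lines.
    public function writeFlowingBlock(string $text): void
    {
        foreach (explode("\n", wordwrap($text, $this->width, "\n", true)) as $line) {
            $this->lines[] = $line;
        }
    }

    // Dry run on a clone: counts the lines the block would need, leaving $this untouched.
    public function estimateLines(string $text): int
    {
        $mock = clone $this;
        $before = count($mock->lines);
        $mock->writeFlowingBlock($text);
        return count($mock->lines) - $before;
    }
}

$writer = new LineWriter(20);
$needed = $writer->estimateLines('The quick brown fox jumps over the lazy dog.');
echo $needed, " lines needed, ", count($writer->lines), " lines committed\n";
```

One caveat of this design: PHP's clone makes a shallow copy, so any nested objects held by the writer are shared between the original and the mock unless a __clone() method copies them too. Arrays and scalars, as in this sketch, are copied by value.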

marclaporte commented on September 23, 2024

Any news? Thanks!

thnkloud9 commented on September 23, 2024

Hi guys,

There has been some progress here:

#71

This PR fixes all the reported issues in examples/example02_CSS_styles.php. However, there are still several issues parsing the examples/amnesty/amnesty2014-report-english-litho-full.php file, which is a multi-column document. I've fixed many of these issues, but currently I have not been able to fix issues on pages:

38, 51, 55, 56, 71, 75, 84, 86, and several others.

All of these appeared only after resolving other issues with multi-column widow and orphan support. I've added additional debug output to help with troubleshooting; however, I'm finding that I'm on a slippery slope of new issues related to multi-column documents, and I'm not sure I have the time to identify and fix all of them.

jakejackson1 commented on September 23, 2024

I've only used multi-columns a handful of times and I always have to compromise on style and aesthetics because of bugs with it. It's one component of mPDF that needs a full overhaul.

danielhjames commented on September 23, 2024

I've had no problems with styles in multi-column PDFs when using mPDF; they worked as expected. My only tip would be that if you want the baseline grid to align reliably between columns, any fonts larger or smaller than the body font size need to have a line height which is the same as, or an exact multiple of, the body font line height.
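
As a small illustration of the baseline-grid tip, a line height can be rounded up to the next whole multiple of the body line height before it goes into the stylesheet. The helper name is hypothetical, and the units are whatever the stylesheet uses (points here).

```php
<?php
// Hypothetical helper: round a desired line height up to the nearest whole
// multiple of the body line height so that baselines stay on a shared grid.
function snapToBaselineGrid(float $desiredLineHeight, float $bodyLineHeight): float
{
    $multiple = max(1, (int) ceil($desiredLineHeight / $bodyLineHeight));
    return $multiple * $bodyLineHeight;
}

// Body text set on a 12pt grid: a heading that wants 20pt of leading gets 24pt,
// and small print that wants 8pt still occupies one full 12pt grid line.
echo snapToBaselineGrid(20.0, 12.0), "\n"; // 24
echo snapToBaselineGrid(8.0, 12.0), "\n";  // 12
```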

marclaporte commented on September 23, 2024

Where can I find a copy of examples/amnesty/amnesty2014-report-english-litho-full.php ?

Thanks!

danielhjames commented on September 23, 2024

Hi Marc, you can find this file here: https://github.com/thnkloud9/mpdf/tree/85c674200c0b28586c9b3174af95999145114857/examples/amnesty

I made a pull request (now merged) to remove examples/amnesty/ from the development branch, as it's not appropriate to have such large example files in the mPDF source itself. This particular example does not use the correct custom fonts for the book and the CSS is not up to date; it's strictly for testing widows and orphans in a two-column layout, but it may give you a useful insight into how we use mPDF with Booktype.

We could use this separate repo for longform test materials, as long as they are under an appropriate copyright licence which enables redistribution: https://github.com/mpdf/mpdf-examples

marclaporte commented on September 23, 2024

Thank you Daniel!

The Amnesty report is a fantastic showcase of the power of mPDF. That would be great for that new repo.

danielhjames commented on September 23, 2024

The 2016 edition is Creative Commons licensed, so that is a possibility for a separate repo. I wouldn't like to mix it into the repo of a GPL-licensed program like mPDF; that could get complicated :-)

marclaporte commented on September 23, 2024

For the record, here is a branch to attempt to address this challenge:
https://github.com/thnkloud9/mpdf/commits/widows_and_orphans

But it was not felt to be at a level where it could be merged.
