
Comments (3)

metebalci commented on July 30, 2024

You have to use the --page-number argument. pdftitle does not check the whole file; it checks only a single page (the first page by default).

$ pdftitle -p test.pdf --page-number 2
C++/CLI in Action

from pdftitle.

dufferzafar commented on July 30, 2024

@metebalci Since it can't be known beforehand which PDFs have the title on the first page, don't you think a better option would be to specify the last page that is checked?

By default --last-page-number would be 1, so only the first page would be checked. But I could set --last-page-number to something like 2 or 3, and the title would then be detected anywhere in the first 3 pages.
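As a rough illustration of this suggestion, here is a minimal Python sketch of the "check pages 1 through N and stop at the first hit" behaviour. It does not use pdftitle's real API: `extract_title` is a hypothetical callable standing in for whatever extracts a title from a single page, and `first_title_in_pages` is an illustrative name.

```python
from typing import Callable, Optional


def first_title_in_pages(
    extract_title: Callable[[int], Optional[str]],
    last_page_number: int = 1,
) -> Optional[str]:
    """Return the first non-empty title found on pages 1..last_page_number.

    `extract_title` is a hypothetical stand-in for a single-page title
    extractor; it takes a 1-based page number and returns a title or None.
    """
    for page_number in range(1, last_page_number + 1):
        title = extract_title(page_number)
        # Skip pages where nothing (or only whitespace) was detected.
        if title and title.strip():
            return title.strip()
    return None


# Usage with a fake extractor: the title sits on page 2, as in the example above.
fake_pages = {2: "C++/CLI in Action"}
print(first_title_in_pages(fake_pages.get, last_page_number=1))
print(first_title_in_pages(fake_pages.get, last_page_number=3))
```

With the default of 1 this behaves like today's single-page check, which is why the proposal would be backwards compatible.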

BTW, I use pdftitle in a script that renames PDFs with their titles: https://github.com/dufferzafar/.scripts/blob/master/pdf-titles


metebalci commented on July 30, 2024

For an ultimate tool that extracts a title from anywhere in a PDF file, this would be the right approach, but I think it is pretty difficult to do with traditional methods (I mean without using something smarter, from gestalt theory etc.). The main purpose of the tool is to extract the titles of (peer-reviewed) articles, which do not have a cover page and usually have a simple layout. On the other hand, I am not 100% sure, but what you suggest might not be difficult to implement and it might have some use. So I am reopening the issue and will look into it the next time I work on the implementation. The changes could be:

  • deprecate but do not remove --page-number, defaults to 1
  • introduce --first-page-number, defaults to --page-number
  • introduce --last-page-number (inclusive), defaults to --first-page-number. If --last-page-number is different and the actual number of pages is less than this, I guess it makes sense to terminate the process silently at the end of the document.
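The defaulting rules listed above can be sketched in Python as follows. This is only an illustration of the proposal, not pdftitle's actual implementation: the function name `resolve_page_range` and the `total_pages` parameter are assumptions, and the real option handling may differ.

```python
from typing import Optional


def resolve_page_range(
    page_number: int = 1,
    first_page_number: Optional[int] = None,
    last_page_number: Optional[int] = None,
    total_pages: Optional[int] = None,
) -> range:
    """Resolve the proposed options into the 1-based pages to check."""
    # --first-page-number defaults to the (deprecated) --page-number.
    first = page_number if first_page_number is None else first_page_number
    # --last-page-number is inclusive and defaults to --first-page-number.
    last = first if last_page_number is None else last_page_number
    if first < 1 or last < first:
        raise ValueError("page numbers must satisfy 1 <= first <= last")
    # If the document is shorter than requested, stop silently at its end.
    if total_pages is not None:
        last = min(last, total_pages)
    return range(first, last + 1)


print(list(resolve_page_range()))                                   # [1]
print(list(resolve_page_range(page_number=2)))                      # [2]
print(list(resolve_page_range(last_page_number=5, total_pages=2)))  # [1, 2]
```

Returning an empty range when the first page lies beyond the end of the document matches the "terminate silently" idea, though raising an error there would be an equally defensible choice.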
