Comments (17)

mjordan commented on June 21, 2024

I think this has more to do with using TIF as an extension than with using a period. The validation error is coming from https://github.com/MarcusBarnes/mik/blob/master/src/inputvalidators/CsvBooks.php#L140. Coincidentally, I opened a PR (#496) two weeks ago that will let you indicate TIF as a valid file extension.

Just to be sure, can you change the filenames back to using periods as separators but leave the (currently invalid until we merge #496) TIF extension as is and see what happens?

from mik.

mjordan commented on June 21, 2024

Sorry, my logic is wrong. Change the extension to tif and try using periods as separators.

bondjimbond commented on June 21, 2024

OK, a couple of tests:

Separator _, extension tif = success
Separator _, extension TIF = fail
Separator ., extension TIF = fail
Separator ., extension tif = success

So it looks like the uppercase/lowercase extension is the culprit.

But interestingly, it behaves very unpredictably in my set -- because the first record is not .0001.tif but instead .0000.tif.

The presence of this file results in weird behaviours. Typically directory 0 just isn't generated, but it can also cause MIK to skip, say, directory 2, or more directories.
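
The four test results above are consistent with a case-sensitive comparison of the file extension. A minimal sketch of a case-insensitive check follows. The function name `hasAllowedExtension` and the allowed-extension list are hypothetical and are not taken from MIK's validator code.

```php
<?php
// Hypothetical helper, not part of MIK: returns true when the file's
// extension matches one of the allowed extensions, ignoring case.
function hasAllowedExtension(string $filename, array $allowed): bool
{
    // pathinfo() returns the text after the last period, e.g. 'TIF'.
    $extension = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
    return in_array($extension, array_map('strtolower', $allowed), true);
}

// Illustrative filenames using period separators.
var_dump(hasAllowedExtension('1987.0019.0039.0001.TIF', ['tif'])); // bool(true)
var_dump(hasAllowedExtension('1987.0019.0039.0001.tif', ['tif'])); // bool(true)
```

Lowercasing both sides makes TIF and tif equivalent without having to list every case variant in a configuration file.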

mjordan commented on June 21, 2024

@bondjimbond when you say "can also", are you saying that it behaves differently across runs of MIK with the same input data?

bondjimbond commented on June 21, 2024

> when you say "can also", are you saying that it behaves differently across runs of MIK with the same input data?

Yes... I was running with a test directory of just four images (0000.tif through 0003.tif).

The first few runs, it produced directories 1, 2, and 3.

Later runs, just 1 and 3.

Another later run, just 3.

After removing 0000.tif, it consistently produced 1, 2, and 3.

mjordan commented on June 21, 2024

Could you zip up all your data and config files and send them to me so I can try to replicate that?

bondjimbond commented on June 21, 2024

Sure, here's my test directory and ini file: https://vault.sfu.ca/index.php/s/ChGaez7NLOygY3w

mjordan commented on June 21, 2024

OK, got it, I'll give it a try this evening.

mjordan commented on June 21, 2024

Can you send me your mappings file?

bondjimbond commented on June 21, 2024

Ack, of course, sorry
barkerville_mapping.txt

mjordan commented on June 21, 2024

@bondjimbond Strangely, I can't replicate the behavior you are seeing. I ran MIK about 10 times and always got the same thing: page objects for pages 1-3 and an error indicating a problem with the 0000 file ([...] added by me):

"message":"mkdir(): File exists" [...] "filename_segments":["1987","0019","0039","0000"],"page_number":""

The problem is coming from https://github.com/MarcusBarnes/mik/blob/master/src/writers/CsvBooks.php#L132-L134: since we trim all left-padding 0s, a segment of 0000 is reduced to an empty string (note the empty "page_number" in the error above), so we need something other than 0000 as the page number. I'm not sure a fix to allow 0000 as a page number would be trivial.
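
To illustrate the trimming behaviour described above, here is a small sketch using the filename segments from the quoted error message. The output directory `/tmp/brandon_books/1987` is an assumption for illustration, and the final comment is a likely explanation of the `mkdir()` error rather than a confirmed trace of MIK's execution.

```php
<?php
// Segments copied from the error message quoted above.
$filename_segments = ['1987', '0019', '0039', '0000'];

// ltrim() with '0' as the character list removes every leading zero,
// so a segment made only of zeros becomes an empty string.
$page_number = ltrim(end($filename_segments), '0');
var_dump($page_number); // string(0) ""

// Assumed book-level directory, for illustration only.
$book_level_output_dir = '/tmp/brandon_books/1987';
$page_level_output_dir = $book_level_output_dir . DIRECTORY_SEPARATOR . $page_number;

// The page path is now just the book directory plus a trailing separator.
// If that book directory already exists, calling mkdir() on this path would
// plausibly fail with "File exists".
echo $page_level_output_dir, PHP_EOL;
```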

mjordan commented on June 21, 2024

Although a check to see if $page_number is an empty string, and if it is, assign it a value of 0 to create a 0 directory, might be a simple fix. But do you want 0 to be the first page number instead of 1?

Something like:

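// Strip left-padding zeros from the last filename segment to get the page number.
$page_number = ltrim(end($filename_segments), '0');
// A segment of all zeros (e.g. '0000') trims to an empty string, so fall back to '0'.
if (strlen($page_number) === 0) {
  $page_number = '0';
}
// Create the page-level directory inside the book-level directory.
$page_level_output_dir = $book_level_output_dir . DIRECTORY_SEPARATOR . $page_number;
mkdir($page_level_output_dir);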
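
As a quick check of how the proposed fallback maps filename segments to directory names, the same logic can be wrapped in a standalone function. The name `pageDirName` is hypothetical and does not appear in MIK. This is only a sketch of the idea.

```php
<?php
// Hypothetical helper illustrating the proposed fix; not MIK code.
function pageDirName(string $segment): string
{
    // Remove left-padding zeros; an all-zero segment becomes ''.
    $page_number = ltrim($segment, '0');
    // Fall back to '0' so that page 0000 gets its own directory.
    return $page_number === '' ? '0' : $page_number;
}

foreach (['0000', '0001', '0010'] as $segment) {
    echo $segment, ' => ', pageDirName($segment), PHP_EOL;
}
// 0000 => 0
// 0001 => 1
// 0010 => 10
```

Only leading zeros are removed, so a segment such as 0010 still maps to 10 rather than 1.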

mjordan commented on June 21, 2024

Just tried that, it worked:

/tmp/brandon_books/
└── 1987
    ├── 0
    │   ├── MODS.xml
    │   └── OBJ.tif
    ├── 1
    │   ├── MODS.xml
    │   └── OBJ.tif
    ├── 2
    │   ├── MODS.xml
    │   └── OBJ.tif
    ├── 3
    │   ├── MODS.xml
    │   └── OBJ.tif
    └── MODS.xml

MODS.xml for page 0 is:

  <titleInfo>
    <title>This is a title, page 0</title>
  </titleInfo>
</mods>

MODS.xml for page 1 is:

  <titleInfo>
    <title>This is a title, page 1</title>
  </titleInfo>
</mods>

bondjimbond commented on June 21, 2024

That's exactly what I need! :)

mjordan commented on June 21, 2024

OK, I can open a PR for this if you want.

bondjimbond commented on June 21, 2024

Please do!

mjordan commented on June 21, 2024

I've made the same change to the CsvNewspapers writer and pushed up the issue-498 branch. I'll need to assemble some test data later but once I do that I'll open a PR.
