
Comments (11)

artulab commented on August 19, 2024

Hello. This issue was a problem on large cluster systems, where the number of open files could grow so high that it exceeded the limit on open file descriptors. In the single-file version it should not be considered a bug, because the operating system automatically closes any open file handles when the program terminates. If your intention is to read output files while the program is running, I recommend waiting until the program completes its computation.

from taudem.

dakcarto commented on August 19, 2024

Hmm. Apparently, when I was messing around with the open-mpi network interfaces, I ended up with some open files that I mistook for the result of the regular file writing. On second look, it seems that the TIFF writing does not always create a valid TIFF.

If I run the following from the tutorial using the single-file 5.1.1:

mpiexec -n 8 PitRemove logan.tif

Then the output file often cannot be opened. Occasionally, and seemingly at random, it is written correctly. The same is true of other tools that write TIFF files.

I am using open-mpi (stable 1.7.3) installed via the Mac Homebrew project, with a new 'formula' I am writing for TauDEM, which will be used via the Processing plugin in QGIS.


dtarb commented on August 19, 2024

Elaborating a bit on Ahmet's comment: prior to 4568661, the multifile version opened all files on all processes at the same time, which resulted in too many file handles in some cases. This should not be an issue in the single-file version, as there is only one input file, which has to be opened and read by all processes. This is done in parallel using MPI reads. If the problem you are having is with output files, the cause is probably something else.

I suggest looking into the tiffIO::write code and the arguments being passed to it. Parallel writes are used for different parts of the file, and the headers are actually at the end of the file. If one of the data blocks being written overflows into the header, it may corrupt the file. Whether this happens would appear random, because it depends on the order in which the processes involved do their writing.

The write function has the signature
tiffIO::write(long xstart, long ystart, long numRows, long numCols, void* source)
If the numRows and xstart passed to it are not consistent with the partitioning TauDEM assumes, overwriting may happen. More robust code would check for this, but at present tiffIO::write does not.
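A check of the kind described above might look like the following minimal sketch. It is not TauDEM code: the `Partition` struct, its fields, and `writeBlockFits` are hypothetical names introduced for illustration, and the sketch assumes that `ystart` is a global row index and that each process owns a contiguous band of full-width rows.

```cpp
#include <cassert>

// Hypothetical description of the band of rows one process owns.
// None of these names come from TauDEM; they are for illustration only.
struct Partition {
    long totalX;    // total number of columns in the raster
    long firstRow;  // first global row owned by this process (assumed)
    long rowCount;  // number of rows owned by this process
};

// Returns true only when the requested block lies entirely inside the
// partition, so a write of that block cannot spill into another process's
// rows or past the end of the data area.
bool writeBlockFits(const Partition& p, long xstart, long ystart,
                    long numRows, long numCols) {
    if (xstart < 0 || ystart < 0 || numRows <= 0 || numCols <= 0) return false;
    if (numCols > p.totalX - xstart) return false;   // too wide
    if (ystart < p.firstRow) return false;           // starts above the band
    if (numRows > p.firstRow + p.rowCount - ystart) return false; // runs past it
    return true;
}
```

A caller could evaluate `writeBlockFits(...)` before invoking the write and abort with a clear error message when it returns false, rather than silently producing a corrupt file.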


snorfalorpagus commented on August 19, 2024

@dakcarto I seem to be having the same problem with taudem installed via homebrew. PitRemove appears to run correctly, and the output file is a similar size to the input, but it is unreadable (for instance, by gdalinfo).


dtarb commented on August 19, 2024

@snorfalorpagus without more details it is unclear what the exact problem is, so I do not know what to suggest as an immediate fix. Over the long term (as noted in the Some-additional-functionality-and-programming-needs part of the wiki) I would like to migrate TauDEM away from its own TiffIO library to using GDAL directly. This would avoid problems in the TiffIO library and allow the two separate threads of code, the multifile version (in this git repo) and the single-file version (on the TauDEM website), to be merged back together.


snorfalorpagus commented on August 19, 2024

@dtarb I tried running pitremove on two different machines: once on OS X, and once in a virtual machine running Arch Linux. Both used the same version of taudem, built from this repo. The differences between the output files are localised to one part of the file; see the attached screenshot showing a hex comparison of the binaries. The (broken) Mac output is on the left, the (working) Linux output on the right. Also, the output from the Mac is consistent.

[Screenshot, 2014-10-07 22:45:24: hex comparison of the Mac and Linux output files]

The input data was downloaded from the URL below. This isn't something I'd normally run through taudem, but it was the first hit on Google for a GeoTIFF elevation model. I am trying to rule out a problem with GDAL.

http://www.eea.europa.eu/data-and-maps/data/world-digital-elevation-model-etopo5/zipped-dem-geotiff-raster-geographic-tag-image-file-format-raster-data/zipped-dem-geotiff-raster-geographic-tag-image-file-format-raster-data/download

I'll try and debug this further, but I don't have any knowledge of the TIFF specification so it might take a while.


dtarb commented on August 19, 2024

@snorfalorpagus it is hard to compare pitremove results in binary. Pitremove uses a 32-bit floating point representation of the numbers, and there may be differences across platforms (and compilers) in how some numbers from the 16-bit integer source are rounded. I suggest looking at the difference using something like Raster Calculator in ArcGIS, or by raster subtraction using R. An example of working with TauDEM outputs in R is in http://hydrology.usu.edu/taudem/taudem5/TauDEMRScript.txt.

Also, since these differences are so localized for a large file, they may not even be in the data part of the file. They could be in the metadata, perhaps in the tag used to represent no data, which is actually an ASCII string. AsTiffTagViewer (http://www.awaresystems.be/imaging/tiff/astifftagviewer.html) is a useful tool for examining TIFF tags and may help here. Alternatively, if you send both files to me I can take a look.


snorfalorpagus commented on August 19, 2024

@dtarb The file produced by the Mac (felr0c0_mac.tif) isn't readable. QGIS can't open it, and neither can the standard "Preview.app" on OS X. So I suspect the issue is in the metadata or the TIFF structure, rather than in the data itself. I don't have access to the files now, but I'll upload them this evening when I get home.

Edit: The files are uploaded here: http://snorf.net/pitremove.tar.bz2


snorfalorpagus commented on August 19, 2024

Any more thoughts on this issue?


dtarb commented on August 19, 2024

@snorfalorpagus sorry it took me so long to get to this. Your files appear to be prepended with some Mac metadata. When I look at your file in hex I get
[Screenshot c1: hex view of the start of the file]

This is header information giving the file name and your username.

Looking further down, I see the signature that marks the start of a TIFF file, 49492A00:
[Screenshot c2: hex view showing the 49492A00 signature]

After removing everything before 49492A00 from the file, I am able to read it in ArcMap and use it as an input to TauDEM, which then works.
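The manual repair described above can be sketched in code. This is only an illustration, not a tool from TauDEM: `stripBeforeTiffSignature` is a hypothetical helper that works on bytes already loaded into memory, and it handles only the little-endian signature 49 49 2A 00 seen in these files (big-endian TIFFs begin with 4D 4D 00 2A instead).

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the bytes from the first occurrence of the little-endian TIFF
// signature (49 49 2A 00) to the end of the input, or an empty vector if
// the signature does not occur. Hypothetical helper, for illustration only.
std::vector<std::uint8_t> stripBeforeTiffSignature(
        const std::vector<std::uint8_t>& bytes) {
    static const std::array<std::uint8_t, 4> signature = {0x49, 0x49, 0x2A, 0x00};
    auto it = std::search(bytes.begin(), bytes.end(),
                          signature.begin(), signature.end());
    if (it == bytes.end()) return {};
    return std::vector<std::uint8_t>(it, bytes.end());
}
```

Because the sketch takes the first match, it would cut in the wrong place if the same four bytes happened to occur inside the prepended metadata, so the result is worth checking with a tool such as gdalinfo.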


dtarb commented on August 19, 2024

The questions above appear to be answered. The GDAL implementation changes the file writing strategy to be the following

(rank indicates processor number)
if rank == 0
    create and initialize raster file
    write partition
    close raster
    send message to rank 1
if rank > 0
    wait till message received from rank - 1
    open raster
    write partition
    flush and close raster
    if rank < size - 1 send message to rank + 1

This effectively performs writing in series with only one process having the raster open at a time and should resolve these issues. So I am closing this issue.
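The ordering that this strategy enforces can be illustrated with a small simulation. This is not the GDAL implementation and it does not use MPI: threads stand in for the processes, a shared `turn` counter stands in for the message passed from one rank to the next, and a vector stands in for the raster file. All names are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Runs `size` workers that each append their rank to a shared log, one at a
// time and in rank order. Returns the log so the order can be inspected.
std::vector<int> writeInRankOrder(int size) {
    std::vector<int> log;        // stands in for the raster file
    std::mutex m;
    std::condition_variable cv;
    int turn = 0;                // rank currently allowed to write

    auto worker = [&](int rank) {
        std::unique_lock<std::mutex> lock(m);
        // Rank 0 proceeds at once; every other rank waits for rank - 1.
        cv.wait(lock, [&] { return turn == rank; });
        log.push_back(rank);     // "open raster, write partition, close raster"
        ++turn;                  // "send message to rank + 1"
        cv.notify_all();
    };

    std::vector<std::thread> threads;
    // Start the workers in reverse order on purpose, to show that the
    // hand-off, not the start order, decides who writes when.
    for (int rank = size - 1; rank >= 0; --rank)
        threads.emplace_back(worker, rank);
    for (auto& t : threads) t.join();
    return log;
}
```

Calling `writeInRankOrder(4)` yields the ranks in ascending order regardless of how the threads are scheduled. In an MPI program the hand-off would presumably be a point-to-point message between neighbouring ranks rather than a shared counter.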
