
darshan-hpc / darshan


Darshan I/O characterization tool

License: Other

C 49.60% Makefile 0.62% Shell 2.05% Perl 8.58% CSS 0.66% C++ 0.06% Fortran 0.20% Gnuplot 0.16% TeX 0.13% Python 16.01% Batchfile 0.03% Jupyter Notebook 18.79% M4 3.04% HTML 0.08%

darshan's Introduction

Darshan is a lightweight I/O characterization tool that transparently captures I/O access pattern information from HPC applications. Darshan can be used to tune applications for increased scientific productivity or to gain insight into trends in large-scale computing systems.

Please see the Darshan web page for more in-depth news and documentation.

The Darshan source tree is divided into two main parts:

  • darshan-runtime: to be installed on systems where you intend to instrument MPI applications. See darshan-runtime/doc/darshan-runtime.txt for installation instructions.

  • darshan-util: to be installed on systems where you intend to analyze log files produced by darshan-runtime. See darshan-util/doc/darshan-util.txt for installation instructions.

The darshan-test directory contains various test harnesses, benchmarks, patches, and unsupported utilities that are mainly of interest to Darshan developers.

darshan's People

Contributors

alexsim23, carns, congxu-ml, csimarro2, glennklockwood, jeanbez, morrone, nawtrey, roblatham00, shanedsnyder, snell1224, sudheerchunduri, tapplencourt, tylerjereddy, wkliao, yzanhua, zimmercj


darshan's Issues

poor error message if bzip support not available

In GitLab by @shanedsnyder on Sep 24, 2015, 16:25

If you compile darshan-util without bzip support and then run darshan-job-summary on a bzip'd log file, this message is displayed:

Error: incompatible darshan file.
Error: expected version 2.01, but got BZh91AY&SY
Error: unable to read job information from log file.
Use of uninitialized value $starttime in localtime at /tmp/darshan-util/bin/darshan-job-summary.pl line 285.
This darshan log has no file records. No summary was produced.
Use of uninitialized value $jobid in concatenation (.) or string at /tmp/darshan-util/bin/darshan-job-summary.pl line 288.
    jobid:
Use of uninitialized value $uid in concatenation (.) or string at /tmp/darshan-util/bin/darshan-job-summary.pl line 289.
      uid:
Use of uninitialized value $starttime in concatenation (.) or string at /tmp/darshan-util/bin/darshan-job-summary.pl line 290.
starttime: Wed Dec 31 19:00:00 1969 ( )
Use of uninitialized value $runtime in concatenation (.) or string at /tmp/darshan-util/bin/darshan-job-summary.pl line 291.
  runtime: (seconds)
Use of uninitialized value $nprocs in concatenation (.) or string at /tmp/darshan-util/bin/darshan-job-summary.pl line 292.
   nprocs:
Use of uninitialized value $version in concatenation (.) or string at /tmp/darshan-util/bin/darshan-job-summary.pl line 293.
  version: 

The utilities should detect this gracefully (by checking magic numbers in header, for example) and provide a more helpful error message.
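The magic-number check suggested above could look roughly like the following. This is a minimal sketch, not the darshan-util implementation: the function names and the message text are made up for illustration, and only the well-known gzip and bzip2 leading bytes are assumed. Note that the "BZh91AY&SY" string in the error output above is exactly such a bzip2 header.

```python
# Sketch only: function and constant names here are illustrative, not Darshan's.
GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of a gzip stream
BZIP2_MAGIC = b"BZh"       # first three bytes of a bzip2 stream


def detect_compression(header: bytes) -> str:
    """Classify a log by the leading bytes of the file."""
    if header.startswith(GZIP_MAGIC):
        return "gzip"
    if header.startswith(BZIP2_MAGIC):
        return "bzip2"
    return "unknown"


def explain_unreadable_log(header: bytes, bzip2_supported: bool) -> str:
    """Return a helpful message, or an empty string if the log can be read."""
    kind = detect_compression(header)
    if kind == "bzip2" and not bzip2_supported:
        return "Error: log is bzip2-compressed, but this build lacks bzip2 support."
    return ""
```

A utility would call explain_unreadable_log() on the first few bytes of the file before trying to interpret them as a version string.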

optimize darshan startup routines

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

Darshan scans mount points and collects information about them at startup. In the past we were forced to do this on every process in order to correctly match files to mount points at run time. This is no longer necessary, however.

We should eliminate the logic that collects device ids from each mount point. We should also modify the algorithm that collects the remaining information (mount point path and default block size) so that it is only collected at rank 0 and then broadcast to ranks 1 through N-1.

darshan-job-summary.pl fails when lots of files created

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

The darshan-job-summary.pl as installed on eureka fails to generate a pdf for log files containing many (93,000+) records.

you can sort of see for yourself (except the error reporting is horrible here: i'll open a new bug for that):

darshan-job-summary.pl /intrepid-fs0/logs/darshan/2010/10/1/mmin_nekcem_id314635_10-1-13849_4.darshan.gz

after 3 minutes there will be an error about unable to move summary.pdf

I had to hack up the perl script to keep the temp dir handy and run pdflatex myself. the error from pdflatex (once it is no longer diverted to the output file) is the fairly terse


Overfull \hbox (44.47777pt too wide) in paragraph at lines 95--96
 [][]
! Dimension too large.
\@currbox ->\bx@C

l.96 \end{figure*}

track per-process information in addition to per-file information

In GitLab by @shanedsnyder on Sep 24, 2015, 16:25

This ticket would be best done after modularizing the log format (see #46). In addition to storing per-file records, we could store a summary record per process (but spanning files) as well.

For example, we might want to know the total amount of time a process spent doing metadata, reads, and writes, as well as the number of bytes it read and wrote, regardless of how many files it opened or how many threads it used.

We could use this per-process data (in conjunction with a reduction step) to produce an immediate performance estimate without much post processing.

This also depends on accurate thread accounting in #81.

address cuserid() issue on BG/Q

In GitLab by @shanedsnyder on Sep 24, 2015, 16:25

The current BG/Q environment doesn't support cuserid(), which Darshan uses to identify the userid for each job. There is a ticket open with IBM to track this issue. If IBM doesn't address it, our fallback plan on BG/Q will be to use the $USER environment variable, and possibly hash it to generate a unique numerical id for each user.
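As a rough illustration of the fallback, the sketch below hashes the $USER value into a 32-bit number. The choice of CRC-32 and the function names are assumptions made for this example; nothing here is taken from Darshan. A hash of this kind is deterministic but cannot guarantee a unique id per user, since two names may collide.

```python
import os
import zlib


def numeric_id_from_user(username: str) -> int:
    """Map a user name to a 32-bit number using CRC-32 (an arbitrary choice)."""
    return zlib.crc32(username.encode("utf-8")) & 0xFFFFFFFF


def fallback_uid() -> int:
    """Use $USER when no better source of the user id is available."""
    return numeric_id_from_user(os.environ.get("USER", ""))
```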

mpi wrappers no longer report mpich version

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

expected behavior: 'mpicc -v' should tell me something like "mpicc for MPICH2 version 1.4a1" or "mpicc for 1.1".

observed behavior: 'mpicc -v' gives me only compiler flags, mentioning nothing about the mpi implementation version.

fallout: the HDF5 configure checks do not find the string they expect and then fail to enable some significant optimizations.

add darshan-configure utility to show link flags and other parameters

In GitLab by @shanedsnyder on Sep 24, 2015, 16:25

In particular, darshan-configure could at least show --pre-ld-flags and --post-ld-flags to indicate the manual link flags that must be added before and after existing link flags in order to manually add Darshan instrumentation.

Right now there isn't any good way to find these settings except to inspect a script produced by darshan-gen-cc.pl.

environment variable to enable timing

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

The darshan shutdown routine includes the ability to time various steps and print the results. This is normally used only by the cp-shutdown-bench program.

It would probably be a good idea to add an environment variable that can be checked at run time to enable this timing on any job run.
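One way to picture the run-time check is the sketch below. The variable name EXAMPLE_SHUTDOWN_TIMING and the helper functions are hypothetical and exist only for this example; the real shutdown routine is written in C and may be structured quite differently.

```python
import os
import time

# Hypothetical variable name, chosen for this sketch only.
TIMING_ENV_VAR = "EXAMPLE_SHUTDOWN_TIMING"


def timing_enabled() -> bool:
    """Timing is on whenever the variable is present in the environment."""
    return TIMING_ENV_VAR in os.environ


def timed_step(label, step, report):
    """Run one shutdown step, recording its duration only when timing is enabled."""
    if not timing_enabled():
        return step()
    start = time.perf_counter()
    result = step()
    report.append((label, time.perf_counter() - start))
    return result
```

Checking the environment at run time means any job can opt in without being relinked.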

evaluate how many jobs hit wall time on Intrepid

In GitLab by @shanedsnyder on Sep 24, 2015, 16:25

This is just an exercise to better understand why we lack coverage of some jobs/projects. One reason is that the job ends without calling MPI_Finalize. We should be able to detect these jobs based on job database information and cross reference against darshan coverage information.

Use of off_t in darshan wrapper

In GitLab by @shanedsnyder on Sep 24, 2015, 16:25

Since off_t can be defined as 32 or 64 bits in the calling program, some interfaces that use off_t could potentially fail. I'm not sure what happens in each scenario, but we should evaluate this.

use statfs() to detect file alignment when --enable-stat-at-open is not set

In GitLab by @shanedsnyder on Sep 24, 2015, 16:25

Right now we set alignment to -1 if we aren't able to stat the file at open time. As an alternative, we can call statfs() on all mounted file systems to get the default block size. This can be done on rank 0 at startup and broadcast to all processes.

We also need to add an exception for Lustre, because it appears to always set the block size in statfs() to 4K.
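The lookup described above might be sketched as follows. This uses Python's os.statvfs (the POSIX statvfs() call) as a stand-in for statfs(), and the set of file system types to distrust is an assumption for this example. The rank 0 collection and broadcast step is not shown.

```python
import os

# Assumption for this sketch: file systems reporting these type names are skipped.
UNTRUSTED_FS_TYPES = {"lustre"}


def default_block_size(mount_point: str, fs_type: str = "") -> int:
    """Return the block size reported for a mount point, or -1 if unusable."""
    if fs_type in UNTRUSTED_FS_TYPES:
        return -1
    try:
        return os.statvfs(mount_point).f_bsize
    except OSError:
        return -1
```

Returning -1 keeps the existing convention for "alignment unknown", so callers do not need a second sentinel.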

concurrent I/O from threads gets counted twice in timing

In GitLab by @shanedsnyder on Sep 24, 2015, 16:25

If two threads (in the same MPI process) access the same file concurrently, then the cumulative time counters are incremented too far.

We need to add a reference count to the run-time data structure to tell how many threads are accessing the same file at once. The time should not be incremented until the reference counter hits zero.
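The reference-counting idea can be sketched like this. The class and method names are invented for the example, and the injectable clock is there only to make the behavior easy to check; the actual run-time data structure lives in C.

```python
import threading
import time


class FileTimer:
    """Accumulates wall time during which at least one thread is in an I/O call."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._lock = threading.Lock()
        self._active = 0        # reference count of threads inside an I/O call
        self._started = 0.0     # time at which the count last left zero
        self.cumulative = 0.0

    def begin(self):
        with self._lock:
            if self._active == 0:
                self._started = self._clock()
            self._active += 1

    def end(self):
        with self._lock:
            self._active -= 1
            if self._active == 0:
                # Only the last thread to finish adds to the counter.
                self.cumulative += self._clock() - self._started
```

With two overlapping accesses, the counter grows by the length of the combined interval rather than by the sum of both durations.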

This does not require a log format change.

install darshan-job-summary.pl with make install target

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

The darshan-job-summary.pl is not currently installed by default. If it were, then we could place the associated perl modules etc. in a consistent location (the install prefix) and prevent problems that occur when the darshan-job-summary.pl is moved.

The script itself should be updated to print helpful messages if pdflatex is not found, gnuplot is not found, or gnuplot does not have pdf support.

Collisions on darshan generated file name

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

Ray Loy reported this with cobalt-subrun but could also happen in a normal mpirun environment.

Jobs started with cobalt-subrun all have the same jobid and user and possibly the same start/end time to the second resolution. This will cause one darshan log to be written but other mpiruns will have errors in the standard error complaining that the darshan log could not be written.

use darshan-gen-* scripts at make install time by default

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

The current Darshan releases install hand-coded mpi scripts for the IBM BG/P. We should convert to using the darshan-gen-cc.pl, darshan-gen-cxx.pl, and darshan-gen-fortran.pl scripts instead.

We should also make sure to include the XL compiler wrappers on BG/P.

detect and reduce partially shared files

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

At runtime, Darshan detects files that are shared across MPI_COMM_WORLD and reduces them to a single record. It also finds the min, max, and variance of both time and bytes moved for each process.

We do not have the same functionality on partially shared files (i.e., files in which only a subset of nodes open the same file). Detecting partially shared files and reducing them may be difficult at run time; we might want to consider building that functionality into darshan-parser.
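If the detection were done in post-processing, it could be as simple as counting how many ranks reported each file. The sketch below assumes the per-rank records are available as (rank, file id) pairs, which is a simplification invented for this example and not the darshan-parser data model.

```python
from collections import defaultdict


def classify_files(records, nprocs):
    """Group (rank, file_id) pairs and label each file by how many ranks opened it."""
    ranks_by_file = defaultdict(set)
    for rank, file_id in records:
        ranks_by_file[file_id].add(rank)

    labels = {}
    for file_id, ranks in ranks_by_file.items():
        if len(ranks) == nprocs:
            labels[file_id] = "shared"
        elif len(ranks) > 1:
            labels[file_id] = "partially shared"
        else:
            labels[file_id] = "unique"
    return labels
```

Files labeled "partially shared" are the ones that would need a reduction over only the participating ranks.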

update darshan-gen-* scripts to use darshan-configure

In GitLab by @shanedsnyder on Sep 24, 2015, 16:25

I'm not certain about this ticket yet; we need to re-evaluate after implementing darshan-configure. It may make the existing compiler wrappers simpler if they do something like:

ld `darshan-configure --pre-ld-flags` NORMAL_ARGUMENTS `darshan-configure --post-ld-flags`

rather than hardcoding the full list in the generated script itself.

modularize log file format

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

In the future we need to track more high level library and data model information in Darshan. Right now, however, the log file format has to change every time we modify any counters and it gets larger with each interface we instrument.

It would be nice if the counters in darshan weren't just an array of integers, but instead had a concept of dividing the information into opaque sections for each type of instrumentation. That would allow us to experiment with new instrumentation (and parsers for that portion of the instrumentation) without breaking the entire file format. Old parsers could ignore any optional sections of the log file that they don't know how to parse.
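To make the "opaque sections" idea concrete, here is a minimal sketch of a sectioned layout. The header format (a 2-byte module id followed by a 4-byte payload length) is purely hypothetical and is not the Darshan log format; the point is only that a length prefix lets an old parser step over sections it does not recognize.

```python
import struct

# Hypothetical layout: each section is a 2-byte module id, a 4-byte payload
# length, then the payload itself (all little-endian).
SECTION_HEADER = struct.Struct("<HI")


def pack_section(module_id: int, payload: bytes) -> bytes:
    return SECTION_HEADER.pack(module_id, len(payload)) + payload


def parse_sections(blob: bytes, known_parsers: dict) -> dict:
    """Parse the sections this reader understands and skip the rest."""
    parsed = {}
    offset = 0
    while offset < len(blob):
        module_id, length = SECTION_HEADER.unpack_from(blob, offset)
        offset += SECTION_HEADER.size
        payload = blob[offset:offset + length]
        offset += length
        parser = known_parsers.get(module_id)
        if parser is not None:
            parsed[module_id] = parser(payload)
    return parsed
```

A reader that knows only module 1 would still load a log that also contains an experimental module 99, because the unknown payload is skipped by length.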

darshan-job-summary.pl variance table un-renderable if log file contains too many records

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

darshan-job-summary.pl should do a better job of reporting errors. this log file (/intrepid-fs0/logs/darshan/2010/10/1/mmin_nekcem_id314635_10-1-13849_4.darshan.gz) has a lot of records. the perl script generates the variance table, but that table contains 16954 lines.

the perl script runs pdflatex (twice) but does not check for errors. Ok, so you call it with -halt-on-error, but if there is an error, the subsequent move of the output file fails, and fails in a very cryptic way.

suggested fix: either check the exit status of the final pdflatex or check for the existence of the summary.pdf file. in case of error, at least dump the latex.output2 file. since darshan deletes the temp dir, there's no record of what went wrong.
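The suggested fix amounts to checking both the exit status and the output file before moving anything, and showing the captured log on failure. The real script is Perl; the sketch below shows the same control flow in Python with invented function and parameter names, and it takes the command as an argument so that it does not depend on pdflatex being installed.

```python
import os
import subprocess


def run_and_verify(command, expected_output, log_path):
    """Run a command; return (ok, diagnostic) instead of failing later on a move."""
    with open(log_path, "w") as log:
        status = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT).returncode
    if status == 0 and os.path.exists(expected_output):
        return True, ""
    # Surface the captured output before any temp dir cleanup discards it.
    with open(log_path) as log:
        captured = log.read()
    return False, f"{command[0]} failed (exit status {status}):\n{captured}"
```

The caller would print the diagnostic and skip the move whenever the first element of the result is False.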

Darshan support for Cray

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

Darshan can be made to work on Cray already by using the LD_PRELOAD method, but ideally we would support static linking as well via modifications to the cc and ftn scripts.

support mpi 1.x

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

Darshan does a few things that are particular to mpi 2.x. Notably the MPI_Type_get_envelope (and corresponding #defines for types that it produces) are only available in 2.x.

Note that on systems that have both 1.x and 2.x flavors of MPI installed, we will need to install multiple darshan libraries to support them.

valgrind error in log compression

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

Need to see if there is an off-by-one error in darshan or if libz is doing something unusual:

==5852== Invalid read of size 4
==5852==    at 0x404A27D: crc32 (in /lib/libz.so.1.2.3.3)
==5852==    by 0x404C85A: ??? (in /lib/libz.so.1.2.3.3)
==5852==    by 0x404E1BA: ??? (in /lib/libz.so.1.2.3.3)
==5852==    by 0x404CB16: deflate (in /lib/libz.so.1.2.3.3)
==5852==    by 0x805BF26: cp_log_compress (darshan-mpi-io.c:1687)
==5852==    by 0x804C173: darshan_shutdown (darshan-mpi-io.c:429)
==5852==    by 0x804C4A6: MPI_Finalize (darshan-mpi-io.c:504)
==5852==    by 0x804B430: main (in /home/pcarns/working/darshan-examples/mpi-io-test)
==5852==  Address 0x55db614 is 964 bytes inside a block of size 967 alloc'd
==5852==    at 0x4024F20: malloc (vg_replace_malloc.c:236)
==5852==    by 0x805C321: darshan_get_exe_and_mounts (darshan-mpi-io.c:1818)
==5852==    by 0x804BC3C: darshan_shutdown (darshan-mpi-io.c:303)
==5852==    by 0x804C4A6: MPI_Finalize (darshan-mpi-io.c:504)
==5852==    by 0x804B430: main (in /home/pcarns/working/darshan-examples/mpi-io-test)
==5852== 
==5852== Invalid read of size 1
==5852==    at 0x4026979: memcpy (mc_replace_strmem.c:497)
==5852==    by 0x404C6C9: ??? (in /lib/libz.so.1.2.3.3)
==5852==    by 0x404E1BA: ??? (in /lib/libz.so.1.2.3.3)
==5852==    by 0x404CB16: deflate (in /lib/libz.so.1.2.3.3)
==5852==    by 0x805BF26: cp_log_compress (darshan-mpi-io.c:1687)
==5852==    by 0x804C173: darshan_shutdown (darshan-mpi-io.c:429)
==5852==    by 0x804C4A6: MPI_Finalize (darshan-mpi-io.c:504)
==5852==    by 0x804B430: main (in /home/pcarns/working/darshan-examples/mpi-io-test)
==5852==  Address 0x55db617 is 0 bytes after a block of size 967 alloc'd
==5852==    at 0x4024F20: malloc (vg_replace_malloc.c:236)
==5852==    by 0x805C321: darshan_get_exe_and_mounts (darshan-mpi-io.c:1818)
==5852==    by 0x804BC3C: darshan_shutdown (darshan-mpi-io.c:303)
==5852==    by 0x804C4A6: MPI_Finalize (darshan-mpi-io.c:504)
==5852==    by 0x804B430: main (in /home/pcarns/working/darshan-examples/mpi-io-test)

track exact number of bytes moved via MPI

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

Right now we only track the total number of bytes read and written at the POSIX level. This may be different than the number of bytes read and written at the MPI level due to various MPI-IO optimizations.

For files opened via MPI, we could get more accurate performance estimates by using the bytes transferred at the MPI level rather than at the POSIX level.

retain full file paths in darshan logs

In GitLab by @shanedsnyder on Sep 24, 2015, 16:25

Right now darshan only records the last N characters (12?) of each file name. This was done mainly because we were overly conservative out of concern for memory overhead.

Modify Darshan to record complete paths, either by expanding the name field to PATH_MAX or by malloc'ing on demand.

We also need to record the CWD, so that in post processing we can make a good guess as to the full path even when the application opens relative paths.

realpath() and similar functions are not an option because they walk the path and stat each directory.
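The post-processing guess could be done purely on strings, without touching the file system. The following is a sketch with an invented function name; it collapses ".." lexically, so the result can be wrong when symbolic links are involved, which is why it is only a guess.

```python
import posixpath


def guess_full_path(cwd: str, opened_path: str) -> str:
    """Combine a recorded CWD with the path the application passed to open()."""
    if posixpath.isabs(opened_path):
        return posixpath.normpath(opened_path)
    return posixpath.normpath(posixpath.join(cwd, opened_path))
```

Unlike realpath(), this never walks the path or stats a directory, so it adds no file system traffic.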

Handle jobs that run to Wall Time limit

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

Some users choose to run jobs that will run to the wall time limit and allow the job to be killed by the scheduler. Since these jobs never call MPI_Finalize, a darshan log is never produced. It would be useful if darshan logs could be captured for these kinds of jobs.

document how to modify the Cray compiler scripts for Darshan support

In GitLab by @shanedsnyder on Sep 24, 2015, 16:25

This will be fairly simple after #74 is complete, but we should add notes to the darshan-runtime documentation indicating how to add Darshan instrumentation to the Cray compiler scripts for statically linked executables.

The short story is that cc invokes linux-cc which ends with a link command. The additional arguments must be added to that link command.

There is probably a similar change to make for fortran and c++ compilation.

improve performance measurement methodology

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

Right now there is a fair amount of work involved in computing the time spent performing I/O on a given process, and certain use scenarios (i.e., multithreaded concurrent I/O) can obfuscate the calculation.

We should add explicit support for measuring I/O time per process at least, and maybe also directly calculate a performance estimate at runtime.

mismatch in variance table

In GitLab by @shanedsnyder on Sep 24, 2015, 16:24

To reproduce using attached log file:

$ darshan-job-summary.pl darshan-summary-table-bug-example.darshan.gz --output darshan-summary-table-bug-example.pdf

$ darshan-parser darshan-summary-table-bug-example.darshan.gz |grep 392949181 |grep RANK_BYTES

-1	17827256362278409223	CP_FASTEST_RANK_BYTES	8388608	...392949181	/intrepid-fs0	gpfs
-1	17827256362278409223	CP_SLOWEST_RANK_BYTES	8388608	...392949181	/intrepid-fs0	gpfs
-1	17827256362278409223	CP_F_VARIANCE_RANK_BYTES	0.000000	...392949181	/intrepid-fs0	gpfs

If you compare that darshan-parser output to the 2nd entry in the variance table in the summary pdf, then you can see that the number of bytes doesn't match up.

performance test overhead across platforms

In GitLab by @shanedsnyder on Sep 24, 2015, 16:25

Run some tests to give a current snapshot of the % overhead that darshan introduces on BG/P, Cray, and Linux clusters. We have access to at least one example of all three. The test plan:

  • use IOR with a relatively small access size (256K)
  • test both shared and unique files
  • do 5 samples per data point (to combat noise across runs)
  • aim for 60 second runs
  • weak scaling
  • compare with and without darshan, both in terms of run time and ior reported performance
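For the comparison step, the % overhead for one data point can be computed from the samples taken with and without darshan. This sketch assumes the mean of the samples is the statistic of interest, which the test plan does not actually specify; the function name is illustrative.

```python
from statistics import mean


def percent_overhead(baseline_seconds, instrumented_seconds):
    """Percent increase in mean run time when instrumentation is enabled."""
    baseline = mean(baseline_seconds)
    return (mean(instrumented_seconds) - baseline) / baseline * 100.0
```

The same function applies to IOR-reported bandwidth if the sign convention is adjusted, since higher bandwidth is better while higher run time is worse.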
