score-p / scorep_plugin_fileparser

This repository contains the Score-P fileparser Plugin capable of logging system parameters from file descriptors.

License: BSD 3-Clause "New" or "Revised" License

CMake 8.19% C 90.82% Shell 0.98%

scorep_plugin_fileparser's People

Contributors

blastmaster, bmario, quimoniz, tilsche

scorep_plugin_fileparser's Issues

clang-format

Add the tu-zih-energy .clang-format and apply it.

Memory of blobholder leaks.

The allocated memory of blobholder leaks.
Memory is allocated in measurement_blob.c on line 140, see

struct blob_holder* blobarray_create(uint64_t initial_capacity, uint64_t initial_value)
{
    /* try a calloc */
    struct blob_holder* container = calloc(1, sizeof(struct blob_holder));
   ...
}

but there is no matching free().
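
A possible fix is a destroy function that mirrors blobarray_create(). The following is only a minimal sketch: the function name blobarray_destroy and the struct layout (a single heap buffer called data) are assumptions, since the real definition in measurement_blob.c is not shown here.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical layout: the real struct blob_holder may have other members. */
struct blob_holder
{
    uint64_t* data;    /* assumed heap buffer owned by the container */
    uint64_t capacity; /* assumed number of allocated elements */
    uint64_t length;   /* assumed number of used elements */
};

/* Hypothetical counterpart to blobarray_create(): frees buffer and container. */
void blobarray_destroy(struct blob_holder* container)
{
    if (container == NULL)
    {
        return;
    }
    free(container->data); /* free(NULL) is a no-op, so an empty holder is fine */
    free(container);
}
```

Such a function would have to be called once for every holder when the plugin finalizes.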

Delimiter is limited

At present, the delimiter that can be specified for parsing files is severely limited, because ";" and "," are interpreted as variable specification syntax. The plugin should therefore allow the delimiter to be given as a hex code, e.g. "0x41" for the delimiter "A".
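
A minimal sketch of how such a hex code could be decoded with strtol(). The helper name parse_delimiter and its return convention are assumptions and not part of the plugin.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: interprets a delimiter specification.
 * A "0x.." hex code yields the character with that code, any other
 * non-empty text yields its first character.
 * Returns 0 on success and -1 on error.
 */
int parse_delimiter(const char* spec, char* out)
{
    if (spec == NULL || out == NULL || spec[0] == '\0')
    {
        return -1;
    }
    if (spec[0] == '0' && (spec[1] == 'x' || spec[1] == 'X'))
    {
        char* end = NULL;
        errno = 0;
        long code = strtol(spec + 2, &end, 16);
        /* reject trailing garbage, the NUL character and out-of-range codes */
        if (errno != 0 || *end != '\0' || code <= 0 || code > 255)
        {
            return -1;
        }
        *out = (char)code;
        return 0;
    }
    *out = spec[0];
    return 0;
}
```

With this, "0x3b" and "0x2c" would make ";" and "," usable as delimiters without clashing with the specification syntax.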

Optimize performance

This section, which currently uses strdup(), should be optimized, since the implicit malloc() inside strdup() is quite expensive.

Instead of using strdup(), just repair the string on the fly, e.g.:

*(curToken-1) = DELIMITER;

This heals the \0 characters within the string by writing the delimiter back over them.
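
The idea can be sketched as follows. This is not the plugin's code: the function name copy_field_in_place and its parameters are assumptions, chosen only to show a field being terminated temporarily and then healed without any strdup().

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: copies the column-th (0-based) field of line into out
 * and leaves line exactly as it was. The field is terminated temporarily
 * with '\0' and then healed by writing the delimiter back.
 * Returns 0 on success and -1 if the field is missing or does not fit.
 */
int copy_field_in_place(char* line, char delimiter, size_t column,
                        char* out, size_t out_size)
{
    if (delimiter == '\0')
    {
        return -1;
    }
    char* cur = line;
    for (size_t i = 0; i < column; ++i)
    {
        cur = strchr(cur, delimiter);
        if (cur == NULL)
        {
            return -1; /* fewer fields than requested */
        }
        ++cur; /* step over the delimiter */
    }
    char* end = strchr(cur, delimiter);
    if (end != NULL)
    {
        *end = '\0'; /* temporary terminator */
    }
    size_t len = strlen(cur);
    int rc = -1;
    if (len < out_size)
    {
        memcpy(out, cur, len + 1);
        rc = 0;
    }
    if (end != NULL)
    {
        *end = delimiter; /* heal the string */
    }
    return rc;
}
```

Because the input buffer is restored before the function returns, the same line can be parsed again for the next metric without a copy.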

Segfault in `pthread_join` when running with more than one task per node.

This plugin crashes with a segmentation fault in pthread_join, called from fini(), when run with more than one process per node.

The following backtrace is reported:

==== backtrace ====
 2 0x000000000006bc9c mxm_handle_error()  /var/tmp/OFED_topdir/BUILD/mxm-3.7.3112/src/mxm/util/debug/debug.c:641
 3 0x000000000006c1ec mxm_error_signal_handler()  /var/tmp/OFED_topdir/BUILD/mxm-3.7.3112/src/mxm/util/debug/debug.c:616
 4 0x0000000000036400 killpg()  ??:0
 5 0x0000000000008f81 pthread_join()  ??:0
 6 0x00000000000018d0 fini()  /home/soeste/code/scorep_plugin_fileparser/fileparser_plugin.c:217
 7 0x000000000003f0f6 finalize_source()  /home/h0/soeste/score-p/sources.616d7962/build/build-backend/../../build-backend/../src/services/metric/scorep_metric_plugins.c:631
 8 0x000000000003b782 metric_subsystem_finalize()  /home/h0/soeste/score-p/sources.616d7962/build/build-backend/../../build-backend/../src/services/metric/scorep_metric_management.c:575
 9 0x000000000002f3c4 scorep_subsystems_finalize()  /home/h0/soeste/score-p/sources.616d7962/build/build-backend/../../build-backend/../src/measurement/scorep_subsystem_management.c:355
10 0x0000000000022715 scorep_finalize()  /home/h0/soeste/score-p/sources.616d7962/build/build-backend/../../build-backend/../src/measurement/SCOREP_RuntimeManagement.c:967
11 0x0000000000039ce9 __run_exit_handlers()  :0
12 0x0000000000039d37 __GI_exit()  :0
13 0x000000000002255c __libc_start_main()  ??:0
14 0x000000000040415f _start()  ??:0
===================
srun: error: taurusi6488: tasks 1-15: Segmentation fault

After a quick look I found that only one process per host starts a metric thread to gather information, since this plugin runs in per-host mode (SCOREP_METRIC_PER_HOST), which is correct. Therefore, Score-P calls the add_counter function on only one process. The other processes never initialize logging_thread but still try to join it later in fini(), which results in the segfault.
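
One way to avoid the crash is to remember whether the thread was ever created and to join only in that case. The sketch below assumes a flag named logging_thread_started and helper functions start_logging_thread() and stop_logging_thread(); these names and the empty thread body are illustrative, not taken from fileparser_plugin.c.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical plugin state; only logging_thread is named in the report. */
static pthread_t logging_thread;
static int logging_thread_started = 0;

/* Placeholder thread body; the real one would sample the files. */
static void* logging_loop(void* arg)
{
    (void)arg;
    return NULL;
}

/* Would be called from add_counter, i.e. only on one process per host. */
int start_logging_thread(void)
{
    if (logging_thread_started)
    {
        return 0;
    }
    if (pthread_create(&logging_thread, NULL, logging_loop, NULL) != 0)
    {
        return -1;
    }
    logging_thread_started = 1;
    return 0;
}

/*
 * Would be called from fini() on every process.
 * Returns 1 if a thread was joined, 0 if there was nothing to join,
 * and -1 if pthread_join failed.
 */
int stop_logging_thread(void)
{
    if (!logging_thread_started)
    {
        return 0; /* this process never created the thread */
    }
    if (pthread_join(logging_thread, NULL) != 0)
    {
        return -1;
    }
    logging_thread_started = 0;
    return 1;
}
```

Processes on which add_counter was never called then skip the join instead of passing an uninitialized pthread_t to pthread_join.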

Specify multiple separators

For some files it is necessary to specify multiple field separators, e.g. \t and ' '.
The plugin should allow for that too.
Thanks to @tilsche for reporting this.
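
A minimal sketch of column lookup with a set of separators, using strspn() and strcspn(). The function name find_field is an assumption, and so is the choice to treat a run of separators as a single one.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical sketch: returns a pointer to the column-th (0-based) field of
 * line, where any character in separators ends a field and a run of
 * separators counts as one. The field length is stored in *len.
 * Returns NULL if the line has fewer fields.
 */
const char* find_field(const char* line, const char* separators,
                       size_t column, size_t* len)
{
    const char* cur = line + strspn(line, separators); /* skip leading separators */
    for (size_t i = 0; i < column && *cur != '\0'; ++i)
    {
        cur += strcspn(cur, separators); /* skip the field itself */
        cur += strspn(cur, separators);  /* skip the separator run after it */
    }
    if (*cur == '\0')
    {
        return NULL;
    }
    *len = strcspn(cur, separators);
    return cur;
}
```

Called with the separator set "\t ", this finds the same columns in a line regardless of whether the fields are divided by tabs, spaces, or a mix of both.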

Heap corruption

Observed error:

*** Error in `./lo2s/install/bin/lo2s': realloc(): invalid next size: 0x0000000002918790 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x70bfb)[0x7fb954777bfb]
/lib/x86_64-linux-gnu/libc.so.6(+0x76fc6)[0x7fb95477dfc6]
/lib/x86_64-linux-gnu/libc.so.6(+0x7a13c)[0x7fb95478113c]
/lib/x86_64-linux-gnu/libc.so.6(realloc+0x159)[0x7fb954782719]
/home/service/scorep/install/lib/libfileparser_plugin.so(+0x3f39)[0x7fb954503f39]
/home/service/scorep/install/lib/libfileparser_plugin.so(blobarray_append+0xb4)[0x7fb954503cd7]
/home/service/scorep/install/lib/libfileparser_plugin.so(+0x1a2f)[0x7fb954501a2f]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7494)[0x7fb954aad494]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7fb9547efacf]

Looks like heap corruption. Not trivially reproducible.

Valgrind gives some hints (reproducible, maybe unrelated?!)

==8941== Invalid write of size 8
==8941==    at 0x4C32765: memmove (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8941==    by 0x5CFE125: tryInsertingVarParamsSorted (fileparser_plugin.c:488)
==8941==    by 0x5CFDDB9: tryInsertingFileParams (fileparser_plugin.c:415)
==8941==    by 0x5CFDC04: get_event_info (fileparser_plugin.c:350)

More detailed errors

When parsing the variable specifications, give a more detailed explanation than "syntax incorrect" and point the user to the mistake they made, e.g.

  • tell the user that a '+' is missing

Also, make the '+' specification optional.
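
A rough sketch of what a more specific diagnosis could look like. It assumes the layout name:type@path+options, which is inferred from the example specifications in this issue list rather than from the parser itself, and the function name describe_spec_error is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical check for the assumed "name:type@path+options" layout.
 * Returns NULL if all markers are present in order, otherwise a message
 * naming the first problem found.
 */
const char* describe_spec_error(const char* spec)
{
    const char* colon = strchr(spec, ':');
    if (colon == NULL)
    {
        return "missing ':' between metric name and data type";
    }
    if (colon == spec)
    {
        return "metric name is empty";
    }
    const char* at = strchr(colon + 1, '@');
    if (at == NULL)
    {
        return "missing '@' between data type and file path";
    }
    const char* plus = strchr(at + 1, '+');
    if (plus == NULL)
    {
        return "missing '+' between file path and options";
    }
    return NULL;
}
```

If the '+' part becomes optional as requested, the last check would simply be dropped.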

Cannot record multiple metrics per line

If multiple metrics per line are specified, only the first one (in the list of metrics) gets values.

service@igel:~$ export SCOREP_METRIC_FILEPARSER_PLUGIN="nfs_write_ops:int@/proc/self/mountstats+r=48;c=2;s= ;a;d,nfs_write_trans:int@/proc/self/mountstats+r=48;c=3;s= ;a;d,nfs_write_sent:int@/proc/self/mountstats+r=48;c=5;s= ;a;d,nfs_write_queue:int@/proc/self/mountstats+r=48;c=7;s= ;a;d,nfs_write_rtt:int@/proc/self/mountstats+r=48;c=8;s= ;a;d,nfs_write_execute:int@/proc/self/mountstats+r=48;c=9;s= ;a;d"
service@igel:~$ ./lo2s/install/bin/lo2s -v -e cpu-clock -a --metric-leader cpu-clock
[1829909624381373][pid: 10872][tid: 139648365047488][ INFO]: Enabling log-level 'info'
[1829909624458133][pid: 10872][tid: 139648365047488][ INFO]: checking available events...
[1829909638099402][pid: 10872][tid: 139648365047488][ INFO]: Using trace directory: lo2s_trace_2018-10-16T19-24-32
[1829909639039803][pid: 10872][tid: 139648365047488][ INFO]: Plugin 'fileparser_plugin' recording channels: nfs_write_ops, nfs_write_trans, nfs_write_sent, nfs_write_queue, nfs_write_rtt, nfs_write_execute, 
[1829909639188964][pid: 10872][tid: 139648365047488][ INFO]: Initialization done. Start recording...
^C[1830242154823202][pid: 10872][tid: 139648365047488][ INFO]: Recording done. Start finalization...
[1830242154904757][pid: 10872][tid: 139648365047488][ INFO]: In plugin: fileparser_plugin received for channel 'nfs_write_ops' 190 data points.
[1830242154941225][pid: 10872][tid: 139648365047488][ INFO]: In plugin: fileparser_plugin received for channel 'nfs_write_trans' 0 data points.
[1830242154965295][pid: 10872][tid: 139648365047488][ INFO]: In plugin: fileparser_plugin received for channel 'nfs_write_sent' 0 data points.
[1830242154988760][pid: 10872][tid: 139648365047488][ INFO]: In plugin: fileparser_plugin received for channel 'nfs_write_queue' 0 data points.
[1830242155012689][pid: 10872][tid: 139648365047488][ INFO]: In plugin: fileparser_plugin received for channel 'nfs_write_rtt' 0 data points.
[1830242155035564][pid: 10872][tid: 139648365047488][ INFO]: In plugin: fileparser_plugin received for channel 'nfs_write_execute' 0 data points.

Remove the first one, and the second one suddenly has values.

service@igel:~$ export SCOREP_METRIC_FILEPARSER_PLUGIN="nfs_write_trans:int@/proc/self/mountstats+r=48;c=3;s= ;a;d,nfs_write_sent:int@/proc/self/mountstats+r=48;c=5;s= ;a;d,nfs_write_queue:int@/proc/self/mountstats+r=48;c=7;s= ;a;d,nfs_write_rtt:int@/proc/self/mountstats+r=48;c=8;s= ;a;d,nfs_write_execute:int@/proc/self/mountstats+r=48;c=9;s= ;a;d"
service@igel:~$ ./lo2s/install/bin/lo2s -v -e cpu-clock -a --metric-leader cpu-clock
[1830265208970119][pid: 10904][tid: 140326624992960][ INFO]: Enabling log-level 'info'
[1830265209041432][pid: 10904][tid: 140326624992960][ INFO]: checking available events...
[1830265226914799][pid: 10904][tid: 140326624992960][ INFO]: Using trace directory: lo2s_trace_2018-10-16T19-30-27
[1830265228275974][pid: 10904][tid: 140326624992960][ INFO]: Plugin 'fileparser_plugin' recording channels: nfs_write_trans, nfs_write_sent, nfs_write_queue, nfs_write_rtt, nfs_write_execute, 
[1830265228459904][pid: 10904][tid: 140326624992960][ INFO]: Initialization done. Start recording...
^C[1830285950172117][pid: 10904][tid: 140326624992960][ INFO]: Recording done. Start finalization...
[1830285950223695][pid: 10904][tid: 140326624992960][ INFO]: In plugin: fileparser_plugin received for channel 'nfs_write_trans' 4 data points.
[1830285950253160][pid: 10904][tid: 140326624992960][ INFO]: In plugin: fileparser_plugin received for channel 'nfs_write_sent' 0 data points.
[1830285950278237][pid: 10904][tid: 140326624992960][ INFO]: In plugin: fileparser_plugin received for channel 'nfs_write_queue' 0 data points.
[1830285950294802][pid: 10904][tid: 140326624992960][ INFO]: In plugin: fileparser_plugin received for channel 'nfs_write_rtt' 0 data points.
[1830285950310828][pid: 10904][tid: 140326624992960][ INFO]: In plugin: fileparser_plugin received for channel 'nfs_write_execute' 0 data points.

Swap the order to see that it depends on the position in the list, not on the smallest column number:

service@igel:~$ export SCOREP_METRIC_FILEPARSER_PLUGIN="nfs_write_sent:int@/proc/self/mountstats+r=48;c=5;s= ;a;d,nfs_write_queue:int@/proc/self/mountstats+r=48;c=7;s= ;a;d,nfs_write_rtt:int@/proc/self/mountstats+r=48;c=8;s= ;a;d,nfs_write_execute:int@/proc/self/mountstats+r=48;c=9;s= ;a;d,nfs_write_ops:int@/proc/self/mountstats+r=48;c=2;s= ;a;d,nfs_write_trans"
service@igel:~$ ./lo2s/install/bin/lo2s -v -e cpu-clock -a --metric-leader cpu-clock
[1830806124834395][pid: 10961][tid: 140548152229568][ INFO]: Enabling log-level 'info'
[1830806124922792][pid: 10961][tid: 140548152229568][ INFO]: checking available events...
[1830806151435702][pid: 10961][tid: 140548152229568][ INFO]: Using trace directory: lo2s_trace_2018-10-16T19-39-28
Score-P Fileparser Plugin: Could not parse variable specification "nfs_write_trans". Syntax incorrect?
[1830806153313523][pid: 10961][tid: 140548152229568][ INFO]: Plugin 'fileparser_plugin' recording channels: nfs_write_sent, nfs_write_queue, nfs_write_rtt, nfs_write_execute, nfs_write_ops, 
[1830806153701017][pid: 10961][tid: 140548152229568][ INFO]: Initialization done. Start recording...
^C[1830847377970078][pid: 10961][tid: 140548152229568][ INFO]: Recording done. Start finalization...
[1830847378026422][pid: 10961][tid: 140548152229568][ INFO]: In plugin: fileparser_plugin received for channel 'nfs_write_sent' 13 data points.
[1830847378063890][pid: 10961][tid: 140548152229568][ INFO]: In plugin: fileparser_plugin received for channel 'nfs_write_queue' 0 data points.
[1830847378082205][pid: 10961][tid: 140548152229568][ INFO]: In plugin: fileparser_plugin received for channel 'nfs_write_rtt' 0 data points.
[1830847378098508][pid: 10961][tid: 140548152229568][ INFO]: In plugin: fileparser_plugin received for channel 'nfs_write_execute' 0 data points.
[1830847378122680][pid: 10961][tid: 140548152229568][ INFO]: In plugin: fileparser_plugin received for channel 'nfs_write_ops' 0 data points.

Allow accumulated metrics

It should be possible to write accumulated metrics (e.g. ACCUMULATED_LAST) to the trace. Note that this is different from the "d" specifier, which only subtracts the first value as an initial offset.
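
For comparison, the behaviour described for the "d" specifier amounts to the following. This is a sketch of the description above only; the type offset_state and the function apply_initial_offset are hypothetical names, not the plugin's.

```c
#include <stdint.h>

/* Hypothetical per-metric state for the initial-offset behaviour. */
struct offset_state
{
    int has_first;  /* non-zero once the first sample has been seen */
    int64_t first;  /* value of the first sample */
};

/* Returns the raw value minus the first value ever seen for this metric. */
int64_t apply_initial_offset(struct offset_state* state, int64_t raw)
{
    if (!state->has_first)
    {
        state->first = raw;
        state->has_first = 1;
    }
    return raw - state->first;
}
```

The recorded values still grow monotonically for a counter. Only their starting point is shifted to zero, which is why declaring the metric as accumulated in the trace is a separate request.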
