sstsimulator / sst-core

SST Structural Simulation Toolkit Parallel Discrete Event Core and Services

Home Page: http://www.sst-simulator.org

License: Other

Makefile 0.86% C++ 74.12% C 0.08% Python 18.25% Shell 0.30% M4 2.46% Objective-C 0.68% CMake 1.63% Mako 1.62%

simulator mpi parallel discrete-event simulation-core simulation snl-performance-workflow snl-other

sst-core's Issues

Trac: #459 --output-config pyFile option creates invalid Python on embernightly

emberLoad.py is used as input to a number of tests. The "--output-config" option seems to be yielding invalid Python on these.

Here is a sample snippet:

# Automatically generated by SST
import sst
# Define SST Program Options:
sst.setProgramOption("timebase", "1 ps")
sst.setProgramOption("stopAtCycle", "0 ns")
# Define SST Statistics Options:
# Define the SST Components:
comp_rtr_0x0x0 = sst.Component( < < < < < < < Broken line
rtr.0x0x0", "merlin.hr_router")
comp_nic0 = sst.Component(
nic0", "firefly.nic")
comp_nic0.addParams({
"link_bw" : """4GB/s""",
"nic2host_lat" : """150ns""",
"num_vNics" : """1""",
"packetSize" : """2048B""",
"module" : """merlin.linkcontrol""",
"rxMatchDelay_ns" : """100""",
"verboseLevel" : """0""",
"buffer_size" : """14KB""",
"txDelay_ns" : """50""",
"nid" : """0"""
})

Trac: #400 Configure check for PIN should abort if too old of a version is found

When I run ./configure with --with-pin=${INTEL_PIN_DIRECTORY} and the directory I point to contains a version of PIN that is too old to work with the current SST, configure should report an error rather than blindly accept it. It is not nice to have configure succeed but then have make fail with a very non-obvious compilation error (missing header files).

Trac: #450 zoltan partitioner and --disable-mpi

When the user runs ./configure --disable-mpi --with-zoltan=somepath, configure reports an error that Zoltan was not found. This is not accurate: the Zoltan autoconf checks require the MPI compilers (#include <mpi.h>), which are not being used.

Two items:

1. We should give a better error in this case.
2. With future multi-thread support, Zoltan partitioning may still be desired in a non-MPI build.

NotSerializable macro needs namespace for SST::Output

static void
throw_exc(){
    SST::Output ser_abort("", 5, -1, SST::Output::STDERR);
ser_abort.fatal(CALL_INFO_LONG, -1, "ERROR: type %s should not be serialized\n",#obj);
} \

Add feature to the Output class to support start_block_output() & stop_block_output()

The Output class needs an atomic output ability: a start_block_output() and stop_block_output() pair that collects output data between the two calls. When stop_block_output() is called, all collected output data is written at once. This is designed to prevent interleaving of output data.

Possible implementation with a counter that increments on each start and decrements on each stop, and outputs when counter reaches 0.
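The counter idea could be sketched roughly as follows. This is only an illustration under stated assumptions: BlockOutput, output() and depth() are hypothetical names, and the class is not the SST::Output API. It buffers text while the nesting counter is above zero and writes the buffer in one piece when the counter returns to zero.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper, not the SST::Output class: collects text between
// start_block_output() and stop_block_output() and writes it in one piece
// when the outermost block is closed.
class BlockOutput {
public:
    explicit BlockOutput(std::ostream& sink) : sink_(sink) {}

    // Each start increments the nesting counter.
    void start_block_output() { ++depth_; }

    // Each stop decrements the counter; the buffer is flushed at zero.
    void stop_block_output() {
        assert(depth_ > 0 && "stop_block_output() without a matching start");
        if (--depth_ == 0) {
            sink_ << buffer_.str();
            sink_.flush();
            buffer_.str("");  // empty the buffer for the next block
        }
    }

    // Inside a block the text is collected; outside it passes straight through.
    void output(const std::string& text) {
        if (depth_ > 0) {
            buffer_ << text;
        } else {
            sink_ << text;
        }
    }

    int depth() const { return depth_; }

private:
    std::ostream&      sink_;
    std::ostringstream buffer_;
    int                depth_ = 0;
};
```

A real implementation would also need a lock (or a per-thread buffer) around the final write if several threads share one sink; the sketch leaves that out.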

Requested by Scott H.

SST Python Scripts need mechanism to set thread count

Need a setThreadCount in Python for cases where threading shouldn't happen (components are known not to be thread safe).

Trac: #454 Statistics SimTime does not match serial to parallel

When a simulation is run in parallel, one expects the exact same results as when a simulation is run in serial.

The SimTime field, seen in the CSV statistic output, does not match between serial and parallel. This is probably due to the sync intervals between ranks.

Trac: #434 SST Checkpointing no Longer Works

Use case from ISCA tutorial, need checkpointing for long running simulations.

Trac: #438 Old sst/core/stats directory should be removed

The sst/core/stats directory should be removed. Note: this is an older implementation of statistics. The new statistics API is in the sst/core/statapi directory.

Implement SharedRegion::merge(uint8_t target, const uint8_t newData, size_t size)

This version of SharedRegion::merge() is not implemented, which means that the bulk merger fails. You are switched to bulk moves as soon as you call getRawPtr(), so you can't get raw access to the region and have your changes propagate to other ranks.

Trac: #202 Provide nightly tarballs

We should provide, on our download site, a nightly tarball of the current state of SST's trunk repository (the result of running make dist). #152 could then download and use that tarball as the basis for running tests.

Trac: #466 configure creates incorrect config when pin is found in the shell environment but not on the command line

configure creates incorrect config when pin is found in the shell environment but not on the command line

Trac: #436 Simple interfaces in the SST Info output

Feedback from ISCA tutorial, want to see simple interfaces in the SST info output dump.

Trac: #367 --disable-elements=all, --enable-elements=XXX,YYY,ZZZ

It would be nice to be able to select which element libraries are built at ./configure time. While one can put a .ignore file in each element library directory, it would be nice to have configure arguments like:

./configure --disable-elements=all --enable-elements=Merlin,MemHierarchy

This would enable only those two element libraries. It would also be an error if a user specifically requested an element library that was unable to configure itself, e.g., a request for Ariel when PIN wasn't found.

SubComponent doesn't support isPortConnected()

Component has member function isPortConnected() but SubComponent() does not.

SST Statistics should shut down after components

There are summary statistics that should be collected for a component as the component is being shutdown, therefore we need to move the shutdown of the Statistics objects to a time after the component objects have been shut down.

I've made small changes in main and simulation that work in my testing, though someone more familiar with the Statistics class should review them.
core-diffs.tar.gz

Enhance configure script to check compiler version

Arrange for the configure script to check the compiler version, in the case of known compilers (e.g., gcc, Intel cc), to make sure that it is new enough to compile SST successfully. In the case of an older, unsupported compiler version, report an error and halt configuration.

SharedRegionMerger::merge() with change sets does not detect conflicts

The merge() function in SharedRegionMerger copies the change set data without checking for conflicts. I'm not sure if the conflict check is supposed to happen elsewhere, but I was able to write multiple times to the same location in a shared region, and neither the local SharedRegionManager nor the merge detected the conflicts.
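One way such a conflict check could look is sketched below. Everything in it is an assumption made for illustration: the Change record and mergeWithConflictCheck() are hypothetical and do not reflect the actual SharedRegionMerger change-set types. The sketch also assumes that writing the same value twice to one location is acceptable and only differing values count as a conflict.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical change record: a run of bytes written at a given offset.
struct Change {
    std::size_t          offset;
    std::vector<uint8_t> data;
};

// Applies the changes to target in order. Returns false as soon as a change
// falls outside the region or writes a different value to a byte that an
// earlier change in the same merge already wrote. Changes applied before the
// failure stay in target; a caller wanting all-or-nothing should work on a copy.
inline bool mergeWithConflictCheck(std::vector<uint8_t>& target,
                                   const std::vector<Change>& changes) {
    std::map<std::size_t, uint8_t> written;  // byte offset -> merged value
    for (const Change& change : changes) {
        for (std::size_t i = 0; i < change.data.size(); ++i) {
            const std::size_t pos = change.offset + i;
            if (pos >= target.size()) {
                return false;  // write outside the region
            }
            const auto it = written.find(pos);
            if (it != written.end() && it->second != change.data[i]) {
                return false;  // two writers disagree about this byte
            }
            written[pos] = change.data[i];
            target[pos]  = change.data[i];
        }
    }
    return true;
}
```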

Trac: #456 configure doesn't check for zlib.h

./configure should check for the existence of zlib.h

Trac: #236 sst run-mode init option fails on Sirius Zodiac Trace test case.

The symptoms look the same as #213. However, #7087, which fixed the iris test and the portals_sm tests, does not fix the Sirius Zodiac Trace test case. To observe the failure, go to the directory sst/elements/zodiac/test/allreduce and enter:

sst --output-config newPyFile.py --model-options --shape=27 --run-mode init allreduce.py > all.out

If --run-mode init is removed, it executes correctly. A non-gdb output from the seg-fault is attached.

Trac: #383 SST generates boost serialization warnings

This ticket is to document an issue with SST's interaction with boost (version 1.56 at the time of this issue).

SST serialization code in sst/core/serialization/core.h, sst/core/serialization/element.h and others generates a significant number of warnings related to boost serialization. Many of these warnings are in the boost file mpl/print.hpp and are signed/unsigned comparison or divide-by-zero warnings.

According to boost issue #4953, these warnings (such as code causing a divide by zero) are intentional to assist users in tracing the templates used in the code.

These warnings are a significant distraction from real warnings and have had no impact on the operation of SST. Therefore they are being ignored via some gcc and clang pragmas. To find them, search for:

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wsign-compare"
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wdivision-by-zero"

These pragmas are being created for gcc 4.6 and later and current versions of clang. gcc versions earlier than 4.6 will still generate the warnings.
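For readers unfamiliar with the pattern, a minimal sketch of how such pragmas typically wrap noisy code is shown here. The lessThan() function is a hypothetical stand-in for the Boost serialization includes, and the exact placement in the SST headers is not shown in this ticket. Clang is tested first because it also defines __GNUC__.

```cpp
// Save the current diagnostic state and silence the selected warnings.
#if defined(__clang__)
#  pragma clang diagnostic push
#  pragma clang diagnostic ignored "-Wdivision-by-zero"
#  pragma clang diagnostic ignored "-Wsign-compare"
#elif defined(__GNUC__)
#  pragma GCC diagnostic push
#  pragma GCC diagnostic ignored "-Wsign-compare"
#endif

// Hypothetical stand-in for the noisy third-party code: a signed/unsigned
// comparison that would normally be reported by -Wsign-compare.
inline bool lessThan(int a, unsigned b) { return a < b; }

// Restore the previous diagnostic state so later code is checked normally.
#if defined(__clang__)
#  pragma clang diagnostic pop
#elif defined(__GNUC__)
#  pragma GCC diagnostic pop
#endif
```

The push/pop pair keeps the suppression local, so warnings in code after the pop are still reported.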

While the warnings are being ignored, they still exist for a reason. In the future, it is advised that the issues causing serialization warnings be investigated and addressed as necessary.

sst-info does not work after external core/elements split

sst-info does not find the registered external elements

StopAction is not thread safe

When StopAction fires on a rank, all threads are stopped, even if they haven't reached the stopAt time. StopAction needs to be a collective operation and fire once all threads have made it to the stopAt time.

Doxygen build does not work on external core

make html (which kicks off a doxygen build) does not generate any files.

sstinfo should be able to search for subcomponents

You can search sstinfo by component (sstinfo miranda.BaseCPU), but you currently cannot search for subcomponents; e.g., sstinfo miranda.GUPSGenerator does not return anything.

Trac: #389 Adaptation in stats output format

I suggest the following simple change in the .csv format

Divide ComponentName into two columns, ComponentName and ComponentId. In this way, rtr.0x0x0 would become rtr, 0x0x0. Unless the point is the default separator (if so, this ticket can be deleted, but the point-separator concept should perhaps be documented), keeping a single column makes parsing the component name hazardous.

Trac: #275 Upon successful nightly build TITLE test, publish make dist tarball and svn rev num

This ticket collects two related enhancements. Developers are often unsure when to do an svn update in their sandbox. They don't really know whether the head of the trunk produced a good build or not. We need to publish, in some very convenient place, the svn revision number that was used for the last successful build and test of the trunk during the nightly process. In a similar vein, some non-developer users may need a version of the source code from the trunk in order to obtain a new feature or bug fix. They too are at risk if they simply check out the trunk head. Instead, provide a tarball of the last successful build and test of the trunk during the nightly process.

Trac: #433 SST auto connect to debug on signal?

Use case from ISCA: can we automatically connect to a debugger when a terminate signal is received? This would be a good outcome for long-running simulations.

Statistics API bug: Values shown in histogram bin labels can overflow.

In file sst/core/statapi/stathistogram.h, template class HistogramStatistic, private member function registerOutputFields():

When the bin fields are generated in the for loop, the generated field name includes the range of bin values, formatted as lower-upper, where the lower and upper limits are formatted in decimal.

Sometimes, as in the case of collecting histograms over memory address ranges, BinDataType ends up being instantiated as a 64-bit type (uint64_t). Meanwhile, the number of bins (returned by getNumBins()) is always assumed by the code to fit in 32 bits (due to the uint32_t type on the bin index y), and meanwhile getBinWidth() (whose type is HistoBinType) may also return a value declared as 32 bits.

Therefore, in the following line of code which calculates binLL:

            binLL = (y * getBinWidth()) + getBinsMinValue();

the multiplication can be performed as a 32-bit multiply (since both operands are 32 bits), yet the result may not fit in 32 bits. In that case the product overflows and is truncated to 32 bits before being stored in binLL, so a corrupt value may end up in binLL (and also binUL). The bin field labels are then incorrect, which may lead to incorrect display or sorting of data in later processing of the output .csv file.

This can happen, in particular, when running memsieve to collect memory access histograms over address ranges which exceed the 4GB limit for 32-bit addresses. The bin width (page size) and number of bins can both be declared as 32-bit values, yet their product may not fit in 32 bits. E.g., in the example sst configuration for sieve, I encountered the particular case where 4 million bins representing memory pages of size 4 KB each were being generated, so that the address ranges spanned a full 16GB.

However, the overflow is easily avoided by forcing the above expression to always invoke a 64-bit multiply operation, by casting either of the multiplicands to 64 bits before the multiplication occurs, e.g.:

            binLL = (y * (uint64_t)getBinWidth()) + getBinsMinValue();

And in cases when BinDataType happens to be only 32 bits, this change is harmless.
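The effect can be reproduced in isolation with a small sketch. The two functions below are hypothetical stand-ins for the expression above, not the HistogramStatistic members, and they assume a platform where unsigned int is 32 bits wide. With bin index 3,000,000 and a 4096-byte bin width, the true product is 12,288,000,000, which wraps to 3,698,065,408 in 32-bit arithmetic.

```cpp
#include <cstdint>

// Both operands are 32 bits wide, so the product is computed modulo 2^32
// and only afterwards widened for the addition.
inline uint64_t binLowerLimit32(uint32_t y, uint32_t binWidth, uint64_t minValue) {
    return (y * binWidth) + minValue;
}

// Casting one operand first forces a 64-bit multiply, as proposed above.
inline uint64_t binLowerLimit64(uint32_t y, uint32_t binWidth, uint64_t minValue) {
    return (y * static_cast<uint64_t>(binWidth)) + minValue;
}
```

For example, binLowerLimit32(3000000, 4096, 0) yields 3698065408, while binLowerLimit64(3000000, 4096, 0) yields 12288000000. For small indices the two agree.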

The above change has been implemented in a branch labeled "bugFix_binLabelOverflw" which I will push to the sst-core repository.

sst-info broken in new split-core

Reported by a customer:

-> sst-info miranda
ERROR: No such file or directory - When trying to open Directory @SST_ELEMLIB_DIR@
PROCESSED 0 .so (SST ELEMENT) FILES FOUND IN DIRECTORY @SST_ELEMLIB_DIR@

Looks like sst-info doesn't pull the search path from the SST configuration, but rather the hard-coded #define.

Rarely encounter code in core known to NOT be thread safe

The Merlin Dragon-12 test using 2 threads failed with a segmentation fault that is possibly attributable to taking such a path through the core. (This is not a reproducible failure.)
Dragon12-seq-fault.txt

Trac: #467 configure of boost threads fails on sst-devel

checking for boost/filesystem.hpp... yes
configure: Performing linking checks for Boost Filesystem...
configure: Boost Filesystem configuration successful.
checking whether the Boost::Thread library is available... yes
checking for exit in -lboost_thread... no
checking for exit in -lboost_thread... (cached) no
checking for exit in -lboost_thread... (cached) no
configure: error: Could not link against boost_thread !
make: *** [config.status] Error 1
[mjleven@sst-devel build]$

OpenMPI leaves many residual directories in /tmp

SST with OpenMPI leaves many thousands of residual directories in /tmp that accumulate run after run. These residual directories slowly eat up filesystem space. At the time of this issue creation, there are over 20,000 residual directories consuming 81MB of space.

$ ls /tmp/openmpi-sessions-jwilso\@sst-test_0/ | wc -w
20264
$ du -d0 -h /tmp/openmpi-sessions-jwilso\@sst-test_0/
81M /tmp/openmpi-sessions-jwilso@sst-test_0/
$ ls /tmp/openmpi-sessions-jwilso\@sst-test_0/
32768  35968  38862  41576  44514  47469  50489  53566  56683  59753  62665
32769  35969  38863  41577  44516  47470  50497  53567  56686  59754  62666
32770  35970  38864  41578  44518  47471  50498  53568  56687  59756  62667
32775  35971  38866  41579  44520  47472  50500  53569  56688  59758  62668
...
35963  38854  41573  44511  47464  50486  53559  56680  59744  62655  65532
35965  38855  41574  44512  47465  50487  53562  56681  59749  62659  65533
35967  38857  41575  44513  47467  50488  53565  56682  59751  62662  65535
$ ls /tmp/openmpi-sessions-jwilso\@sst-test_0/ | wc -w
20196

The residual directories can be grouped by their disk usage: 4k, 12k, and 16k.

The residual 4k directories are empty.

$ ls -l 34616
total 0
$ ls -ld 34616
drwx------ 2 jwilso jwilso 4096 May 18 12:39 34616
$ du -h 34616
4.0K    34616

The residual 12k directories have the following directory structure:

$ ls -ld 57532
drwx------ 3 jwilso jwilso 4096 May 18 12:35 57532
$ tree 57532
57532
└── 1
    └── 0
$ du -h 57532
4.0K    57532/1/0
8.0K    57532/1
12K     57532

The residual 16k directories have the following directory structure:

$ ls -ld 56572
drwx------ 3 jwilso jwilso 4096 May 18 14:28 56572
$ tree 56572
56572
├── 0
│   ├── 0
│   └── debugger_attach_fifo
└── contact.txt

2 directories, 2 files
$ du -h 56572
4.0K    56572/0/0
8.0K    56572/0
16K     56572
$ file 56572/0/debugger_attach_fifo 
56572/0/debugger_attach_fifo: fifo (named pipe)
$ stat 56572/0/debugger_attach_fifo 
  File: ‘56572/0/debugger_attach_fifo’
  Size: 0           Blocks: 0          IO Block: 4096   fifo
Device: fd03h/64771d    Inode: 917729      Links: 1
Access: (0644/prw-r--r--)  Uid: (17341/  jwilso)   Gid: (17341/  jwilso)
Access: 2016-05-18 14:13:58.414379047 -0600
Modify: 2016-05-18 14:13:58.414379047 -0600
Change: 2016-05-18 14:13:58.414379047 -0600
 Birth: -
$ cat 56572/contact.txt 
3707502592.0;tcp://134.253.243.30:35477
24151

Trac: #415 Add arbitrary group counts as an exit condition

It would be nice for a component to be able to have a new exit condition which is essentially a global count of some event defined by the component. Each component could define a condition by name and their portion of the global count. All components which are part of that count will exit when the global count is reached.

Trac: #452 Something strange about the single SST partitioning option.

Five partitioning options are tested and single results seem unexpectedly different:

test_linear2.out:Simulation is complete, simulated time: 31.949 us
test_linear4.out:Simulation is complete, simulated time: 31.949 us
test_linear8.out:Simulation is complete, simulated time: 31.949 us
test_roundrobin2.out:Simulation is complete, simulated time: 31.949 us
test_roundrobin4.out:Simulation is complete, simulated time: 31.949 us
test_roundrobin8.out:Simulation is complete, simulated time: 31.949 us
test_simple2.out:Simulation is complete, simulated time: 31.949 us
test_simple4.out:Simulation is complete, simulated time: 31.949 us
test_simple8.out:Simulation is complete, simulated time: 31.949 us
test_single2.out:Simulation is complete, simulated time: 18.4467 Ms
test_single4.out:Simulation is complete, simulated time: 18.4467 Ms
test_single8.out:Simulation is complete, simulated time: 18.4467 Ms
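One observation that may help with diagnosis: 18.4467 Ms is what the largest 64-bit unsigned value comes to when interpreted as a count of picoseconds, which would suggest that the single runs ended at the maximum representable simulated time rather than at the normal exit. The sketch below does the arithmetic; the 1 ps tick and the function name are assumptions, not taken from the test configuration.

```cpp
#include <cstdint>
#include <limits>

// Assumption: simulated time is a 64-bit unsigned count of 1 ps ticks.
// Converts the largest such count to megaseconds.
inline double maxSimTimeMegaseconds() {
    const double ticks   = static_cast<double>(std::numeric_limits<uint64_t>::max());
    const double seconds = ticks * 1e-12;  // picoseconds -> seconds
    return seconds / 1e6;                  // seconds -> megaseconds
}
```

The result is about 18.4467, matching the value printed by the three test_single runs.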

Trac: #313 support for stopAtWalltime

Currently sst provides the stopAtCycle program option (set via setProgramOption) to stop a simulation at a certain simulated time. The majority of actual sst simulation jobs are submitted to machines that have queue limits. It would be beneficial to have an option like stopAtWalltime, so that the simulation can exit within the allocated time and users get some results back rather than having the program terminated.
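A wall-clock guard of this kind could be as small as the sketch below. WalltimeLimit is a hypothetical class, not an existing SST facility; it assumes the simulation loop can poll it periodically (for example at each sync point) and start an orderly shutdown once it reports expiry.

```cpp
#include <chrono>

// Hypothetical wall-clock budget check for a requested stopAtWalltime option.
class WalltimeLimit {
public:
    using Clock = std::chrono::steady_clock;

    // The deadline is fixed when the object is created, e.g. at simulation start.
    explicit WalltimeLimit(std::chrono::seconds budget)
        : deadline_(Clock::now() + budget) {}

    // True once the budget is used up. The time point can be passed in
    // explicitly, which keeps the check easy to exercise in tests.
    bool expired(Clock::time_point now = Clock::now()) const {
        return now >= deadline_;
    }

private:
    Clock::time_point deadline_;
};
```

steady_clock is used rather than system_clock so that adjustments to the system time do not move the deadline.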

Implement a sparse array in Statistics Histogram

Implement the Statistics Histogram using a sparse array to help reduce its memory footprint.

Ariel destructor is not being invoked on termination

Apparently this problem is not limited to Ariel.

As a consequence, Ariel leaves files behind in /tmp (until the next system reboot).

Trac: #285 Enable debug print statements only after some specific, user defined simulated time

When dealing with long simulations, memHierarchy's output can be quite big (hundreds of GB) even if the debug level is low.

Ali, from ARM, would like to see a feature that activates --enable-debug output only after some simulated time has passed since the simulation began. Nothing urgent, though.

We have to be careful not to introduce a slowdown (e.g., a comparison on every clock tick).

SST Core Static Build allows Elements to be Bound in during Link

We want to be able to link some elements into SST Core statically so we can remove the need for dynamic linking. The user can choose which elements to link in. Requested by Scott H.

--enable-debug missing

Another feature gone missing in the autoconf of the new split-core: '--enable-debug'. Without this, debug output is not available.

The pymodel infrastructure does not have a way to get number of threads

OpenMPI MacPorts Install isn't in Documentation

@mjleven was performing a new install with MacPorts and OpenMPI is not listed in the how-to. We should probably add this.

boost -mt lib support disappeared

With the new autoconf setup, the support for auto-detecting the '-mt' variants of MacPorts' Boost libraries (e.g., 'libboost_serialization-mt.dylib') has disappeared. ./configure passes, which suggests that our autoconf macro for Boost does not actually attempt a library test, only a header test.

Trac: #408 Simulation runtime parameters are not available to lower level objects

At startup of simulation, main() creates a Config object that decodes the runtime parameters. A pointer to this object is passed to the SimulationBase, but is not stored. When lower level objects want access to runtime parameters, they must go through various methods (either storing a parameter in simulation or passing a ptr to the config object) to get access.

If SimulationBase stored the pointer to the config object (which never goes away in main), and provided a getConfig() method, then objects could easily retrieve the runtime parameters.
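The proposal amounts to something like the following sketch. The two types are heavily simplified, hypothetical stand-ins for the real Config and SimulationBase classes, and the verbose field is invented purely for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for the decoded runtime parameters.
struct Config {
    uint32_t verbose = 0;  // hypothetical field for illustration
};

// Simplified stand-in for SimulationBase that keeps the pointer it is given.
class SimulationBase {
public:
    explicit SimulationBase(const Config* config) : config_(config) {
        assert(config != nullptr);
    }

    // Lower-level objects can reach the runtime parameters through this.
    const Config* getConfig() const { return config_; }

private:
    const Config* config_;  // not owned; main() keeps the Config alive
};
```

Returning a pointer to const keeps lower-level objects from modifying the parameters after startup.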

Trac: #431 Connector for Components/Modules written in Python

Use case from ISCA tutorial: users would like to be able to prototype components in Python and have them connect to SST simulations (which may use non-Python components). This requires a connector from C++ classes to Python and raises issues such as a Python interpreter per component, etc.

Trac: #430 Profile driven partition scheme for SST

Use case from ISCA tutorial: can we provide a way to profile a simulation's links/components, record this information, and load it in a future run so that the partition scheme is optimized for that run?

Trac: #442 SST Problem list with GCC-5.1

The compiler segfaults attempting to build Patterns -- Issue 439.

There is an sst seg-fault in the qsim test.

The Prospero PIN library appears to be incompatible with gcc-5.1.

The SST build gets an ugly number of warnings from Boost (1.56).

With a dot-ignore in pattern, successfully ran 27 test Suites with 98 tests in addition to the 150 tests in the Ember Sweep test.

Add guaranteed event delivery order

It is currently possible for events that arrive at the same simulated time to be delivered in different orders depending on the partitioning. In general, models should be written to be agnostic to event ordering, but there are some models for which this is difficult. Need to add a method for guaranteeing that events are always delivered in the same order for serial and parallel jobs.
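One common approach, sketched here under assumptions, is to break ties between same-time events with keys that do not depend on how the model was partitioned. The Event fields (a globally unique link id and a per-link send counter) and the helper deliveryOrder() are hypothetical and are not taken from the SST event classes.

```cpp
#include <cstdint>
#include <queue>
#include <tuple>
#include <vector>

// Hypothetical event record; the field names are illustrative only.
struct Event {
    uint64_t time;     // simulated delivery time
    uint32_t linkId;   // globally unique id of the delivering link
    uint64_t sendSeq;  // per-link send counter
};

// Orders by time first, then by partition-independent tie-break keys.
struct LaterThan {
    bool operator()(const Event& a, const Event& b) const {
        return std::tie(a.time, a.linkId, a.sendSeq) >
               std::tie(b.time, b.linkId, b.sendSeq);
    }
};

// Returns the link ids in the order the events would be delivered,
// regardless of the order in which they arrived in the queue.
inline std::vector<uint32_t> deliveryOrder(const std::vector<Event>& arrivals) {
    std::priority_queue<Event, std::vector<Event>, LaterThan> queue(
        arrivals.begin(), arrivals.end());
    std::vector<uint32_t> order;
    while (!queue.empty()) {
        order.push_back(queue.top().linkId);
        queue.pop();
    }
    return order;
}
```

Because the comparison never looks at arrival order, two runs that receive the same events in different sequences drain the queue identically.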

Marsaglia does not use enough random data for 64bit results

Marsaglia's RNG system only uses 32 bits of random data to calculate a 64-bit value (double, Int64, UInt64). This leads to cases where the results appear non-random. (For example, most of the time, a call to generateNextUInt64() will return a number that is a multiple of 512.)
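A possible remedy is to draw two 32-bit values and combine them, as in the sketch below. Mwc32 is a stand-in built on Marsaglia's multiply-with-carry recurrence, not the SST MarsagliaRNG class, and next64() is a hypothetical helper name.

```cpp
#include <cstdint>

// Stand-in 32-bit source using Marsaglia's multiply-with-carry recurrence.
class Mwc32 {
public:
    Mwc32(uint32_t z, uint32_t w) : z_(z), w_(w) {}

    uint32_t next32() {
        z_ = 36969u * (z_ & 65535u) + (z_ >> 16);
        w_ = 18000u * (w_ & 65535u) + (w_ >> 16);
        return (z_ << 16) + w_;
    }

private:
    uint32_t z_;
    uint32_t w_;
};

// Builds a 64-bit value from two separate 32-bit draws, so every bit of
// the result comes from generator output.
inline uint64_t next64(Mwc32& rng) {
    const uint64_t hi = rng.next32();
    const uint64_t lo = rng.next32();
    return (hi << 32) | lo;
}
```

The same idea applies to doubles: a 53-bit mantissa needs more than one 32-bit draw to be fully populated.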