MFACT: An MPI Fast Application Classification Tool based on SST DUMPI Trace.

License: Other


If you use the resources available here in your work, please cite the following paper:

@INPROCEEDINGS{7516058,
author={Z. Tong and S. Pakin and M. Lang and X. Yuan},
booktitle={2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS)},
title={Fast Classification of MPI Applications Using Lamport's Logical Clocks},
year={2016},
pages={618-627},
doi={10.1109/IPDPS.2016.40},
ISSN={1530-2075},
month={May}}

Please refer to the SST-DUMPI-README for copyright information.

This README describes the Fast Classification Tool, which is based on the sst-dumpi library from Sandia National Laboratories. The package is part of the Message Disembogulator Suite (MDS), known internally as LA-CC-13-036. It runs multiple copies of a modified dumpi2ascii concurrently to process dumpi traces and to model application performance over an arbitrary number of network configurations in a single trace run.

Before installation, be sure to install Open MPI or MPICH.

Files that have been updated for this tool include:

  • ~/sst-dumpi/dumpi/bin/dumpi2ascii.c
  • ~/sst-dumpi/dumpi/bin/dumpi2ascii-callback.c
  • ~/sst-dumpi/dumpi/bin/dumpi2ascii-callback.h

To bootstrap the build system

./bootstrap.sh

To configure

./configure CC=mpicc CXX=mpicxx --enable-libdumpi --prefix=$HOME/$DUMPI_PATH

To build and install

make

make install

Set LD_LIBRARY_PATH

export LD_LIBRARY_PATH=$HOME/sst-dumpi/dumpi/bin

Running the fast classification tool requires three parameters:

  • -n: number of ranks to launch; this must equal the number of dumpi traces (one per traced rank)
  • -x: number of ranks per compute node (required to differentiate intra-node from inter-node communication)
  • -p: file prefix of the dumpi traces
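The role of the ranks-per-node parameter can be illustrated with a minimal sketch. It assumes ranks are placed on nodes in contiguous blocks (ranks 0 to x-1 on node 0, and so on); the README does not state the mapping the tool actually uses, and the helper names `node_of` and `is_intra_node` are hypothetical.

```python
def node_of(rank: int, ranks_per_node: int) -> int:
    """Return the node index of a rank, assuming contiguous block placement."""
    if ranks_per_node < 1:
        raise ValueError("ranks_per_node must be at least 1")
    return rank // ranks_per_node


def is_intra_node(src: int, dst: int, ranks_per_node: int) -> bool:
    """True when both ranks share a node under the assumed block placement."""
    return node_of(src, ranks_per_node) == node_of(dst, ranks_per_node)
```

With 1 rank per node, every message between two distinct ranks is classified as inter-node under this assumption.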

For instance, to process a 64-rank NPB-BT trace with 1 rank per node:

mpirun -n 64 dumpi2ascii 1 tests/npb-bt-traces/dumpi-2015.07.22.20.22.28-
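Because the number of launched ranks must match the number of traces, it can help to count the trace files before launching. The sketch below assumes each rank's trace is stored as a file named `<prefix>NNNN.bin`, which is not stated in this README; `count_traces` and `build_command` are hypothetical helpers, and the argument order simply mirrors the example command above.

```python
import glob


def count_traces(prefix: str) -> int:
    """Count per-rank trace files, assuming they are named <prefix>NNNN.bin."""
    return len(glob.glob(glob.escape(prefix) + "*.bin"))


def build_command(prefix: str, ranks_per_node: int) -> list:
    """Build an mpirun argument list shaped like the example command above."""
    n = count_traces(prefix)
    if n == 0:
        raise FileNotFoundError("no trace files found for prefix " + prefix)
    return ["mpirun", "-n", str(n), "dumpi2ascii", str(ranks_per_node), prefix]
```

The returned list can be passed to `subprocess.run` once the count has been checked against the expected number of ranks.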

The run generates a file named "summary" that shows the performance predictions over 147 predefined network configurations.

There are three sections in the performance summary:

  1. Summary of time counters by percentage and classification results
  2. Communication sensitivity to various network configurations
  3. Performance summary of prediction results

Simulation environment parameters:

In ~/sst-dumpi/dumpi/bin/dumpi2ascii-callback.h:

  • MAXNUMRECORD: maximum number of MPI records in each trace
  • MAXNUMCONFIGS: maximum number of network configurations
  • MEMORY_BW: memory bandwidth for the target system
  • MEMORY_LAT: memory latency for the target system
  • OVERLAP_CONTROL: overlap options, lat-first (default) or bw-first

The computation scaling factor is defined in ~/sst-dumpi/dumpi/bin/dumpi2ascii.c and can be adjusted there:

  • config.comp_factor = 1.0;
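The README does not say how config.comp_factor is applied. One common use of such a factor is to multiply the computation time measured between MPI calls, so that a value below 1.0 models a faster processor and a value above 1.0 a slower one. The sketch below illustrates only that assumed interpretation; `scale_compute_time` is a hypothetical helper, not a function from dumpi2ascii.c.

```python
def scale_compute_time(measured_seconds: float, comp_factor: float = 1.0) -> float:
    """Scale a measured compute interval by comp_factor (assumed semantics)."""
    if comp_factor < 0:
        raise ValueError("comp_factor must be non-negative")
    return measured_seconds * comp_factor
```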

Notes:

  1. The list of intercepted MPI operations is given in ~/sst-dumpi/dumpi/bin/dumpi2ascii-callback.h
    It covers the majority of the frequently used MPI functions.

  2. User-defined datatypes are sometimes not recorded properly in dumpi traces.
    For more details, check trace_initTypes() in ~/sst-dumpi/dumpi/bin/dumpi2ascii.c

  3. If you process large traces on a single node, check the open-file limits in /etc/security/limits.conf and raise them if needed, for example (replace "user" with the account name):
    user hard nofile 500000
    user soft nofile 500000
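The limits currently in effect can be checked before a large run. The following is a minimal sketch using Python's standard `resource` module on a Unix-like system; the threshold passed to `enough_descriptors` is whatever the user expects to need and is not a value taken from the tool.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def enough_descriptors(needed: int, soft: int) -> bool:
    """True when the soft limit is unlimited or at least `needed`."""
    return soft == resource.RLIM_INFINITY or soft >= needed


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft nofile limit:", soft)
    print("hard nofile limit:", hard)
```

New limits in limits.conf typically take effect only for sessions started after the change, so it is worth running the check in the same shell that will launch the tool.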

Predictive Modeling

  1. See model.R for how the predictive model is built.
