Coder Social home page Coder Social logo

src-scientific-use-cases's Introduction

SRC Scientific Use Cases

The goal of this repository is to collect scientific use cases (e.g. applications, kernels, benchmarks, workflows) that will be expected to run on SKA Regional Centers (SRC). Each of these cases should contain the necessary information to be reproduced. Along with each case, profiling results such as profiling data and profiling analysis reports are stored as well.

Each case should consist of the following information:

  • Input files. Input files should be referenced instead of added to this repo if they are large. (definition of "large" pending, >10MB?)
  • Per-case documentation that includes at least the following:
    • How the case was compiled, including compiler version.
    • How the case is run, especially detailing all run parameters:
      • Application parameters.
      • Slurm/MPI launch parameters.
      • SRC node platform information (OS, hardware, dependent library versions (e.g. MPI, CUDA, other library dependencies)).
  • Performance analysis outputs
    • Profiling, tracing, roofline model outputs.
    • Analysis reports or summaries.
    • Scalability reports/graphs.
  • Outputs
    • Runtime or performance results, if not already included in the output.
    • Optionally power consumption data (aggregated or time series).

Directory structure

In order to be able to structure and scale all of this information, this repository is organised in the following hierarchical structure:

Cases can either be stand-alone, or part of a workflow. A workflow consists of one or more steps, and each step runs one application with certain inputs (a case). Each case can be run multiple times, resulting in individual runs where certain parameters (input stays the same) may change, such as number of nodes/processes, the SRC node, or on the computation device.

├── README.md
├── cases  # One application per case
│   └── SoFia
│       ├── cpu  # In case the application supports accelerators, 
│       │   │    # this structure allows distinguishing between cpu-only
│       │   │    # and accelerator (e.g. gpu) studies.
│       │   └── run_2022-08-17T15:23:21 # One directory per application run/instance.
│       │   └── run_2022-08-17T15:23:21
│       │       ├── build_parameters.txt # How the application was built/compiled.
│       │       ├── out.log              # Application output, often containing runtime/performance metrics.
│       │       ├── profiling            # Directory for storing profiling-specific outputs
│       │       │   ├── instrumentation.txt # Description of how this application was instrumented.
│       │       │   ├── flame.svg
│       │       │   ├── perf.out
│       │       │   └── perf_cmd.sh
│       │       ├── run_parameters.sh    # How the application was run/started, including application parameters,
│       │       │                        # input file used, MPI/Slurm submit script parameters, etc.
│       │       └── site_info.txt        # Reference or information about which site/platform/center this application was run on.
│       ├── doc                          # Directory used to collect general information
│       │   ├── building.md              # How this application can be compiled, including recommended compile flags, options, and dependencies.
│       │   ├── performance_analysis.md   # Optional performance analysis report describing performance-related knowledge about this application.
│       │   │                             # This information may include: describing scalability limits, how many nodes to use according to inputs,
│       │   │                             # power and energy efficiency metric studies on different platforms,
│       │   │                             # and hints as to which low-hanging fruit optimisations could be performed to improve runtime/efficiency.
│       │   └── running.md                # General guidence on how to run this application.
│       ├─ version                   # Each case can contain several software versions
│       │   └── version.def            # Ideally, a ready-to-run container definition file should be provided, along with relevant running information if needed.
│       └── inputs      # Directory containing input files suitable for this case
│           └── SDC2      # In case there are multiple possible inputs for this application (each input corresponds to a separate run/case).
│               └── SDC2.fits.url   # Large inputs are not included in this repository, but referenced.
└── workflows            # Workflows can be organised as either a collection of independent steps (cases),
    │                    # where each step can be a symbolic link to the cases/ directory,
    │                    # or as a standalone pipeline that will run all steps as one single big case.
    │                    # In the latter, the file and directory structure mirrors that of a case.
    │                    # Cases and workflows are kept separate in order to emphasise that a workflow may
    │                    # consist of several distinct applications. (Which has implication on ease of profiling, for example)
    ├── HI-FRIENDS
    │   └── steps
    │       ├── SoFia -> ../../../cases/SoFia
    │       └── spectral-cube
    └── eMerlin
        ├── cpu
        │   └── run_2022-08-19T10:34:16
        │       ├── build_parameters.txt
        │       ├── out.log
        │       ├── profiling
        │       │   ├── flame.svg
        │       │   ├── perf.out
        │       │   └── perf_cmd.sh
        │       └── run_parameters.sh
        ├── eMERLIN_CASA_Pipeline_Ubuntu_CASA58_2022-07-07
        │   └── eMERLIN_CASA_Pipeline_Ubuntu_CASA58_2022-07-07.def
        ├── doc
        │   ├── building.md
        │   ├── performance_analysis.md
        │   └── running.md
        ├── inputs
        │   └── all_avg.ms.tar
        └── steps
            └── casa

Adding a new applications or workflow

Adding a new application or workflow should replicate the example structure detailed above.

src-scientific-use-cases's People

Contributors

manuparra avatar pllopis avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.