
Slurm Code Usage Analysis, SCUA

The SCUA tool queries the Slurm accounting database, extracts data on job steps, and matches the executable names against a known set of applications. It then analyses the data and can report on various facets of usage.

You can use the scua command on a system where Slurm is available to extract the data from the Slurm database and run the analyses. Alternatively, you can use the scua.py command to analyse a data file containing a dump from the Slurm database in the correct format. This option is useful if you want to run multiple, separate analyses: it runs much quicker (extracting data from the Slurm database is typically the most time-consuming step), and you can move the data dump to a different system to perform the analysis.

Note: at the moment, graphical plots are only produced for the analysis broken down by software use. Tables of data (as CSV and/or Markdown) are available for all analysis breakdowns.

Requirements

scua is a bash script and so has no requirements other than that it must be run on a system where the Slurm sacct command is available.

scua.py requires pandas, NumPy, Matplotlib, tabulate and seaborn on top of a standard Python 3 installation.

Environment setup

You must set the environment variable SCUA_BASE to point to the location of the downloaded repository.
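For example, in a bash shell (the repository path below is illustrative; use the location of your own checkout):

```shell
# Point SCUA_BASE at the cloned usage-analysis repository
# (example path; adjust to where you downloaded the repository).
export SCUA_BASE="${HOME}/usage-analysis"
```

Add the line to your shell startup file if you want it set for every session.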

Analyses available

The type of analysis you wish to run on the data is specified by command line options (see Usage, below). The following analysis types are available:

| Analysis | scua argument combination | scua.py argument combination | Description |
|----------|---------------------------|------------------------------|-------------|
| Software + Size | (default, always performed) | (default, always performed) | Analyses usage by parallel job step size and software package. |
| Software + Node Power | `-w` | `--power` | Analyses job step mean node power draw and software package. |
| Motif + Size | `-t` | `--motif` | Analyses usage by parallel job step size and computational motif. |
| Motif + Power | `-t -w` | `--motif --power` | Analyses job step mean node power draw and computational motif. |
| Area + Size | `-a <project list CSV>` | `--projects=<project list CSV>` | Analyses usage by parallel job step size and research area. Requires a CSV file linking account codes to research areas. |
| Area + Power | `-a <project list CSV> -w` | `--projects=<project list CSV> --power` | Analyses job step mean node power draw and research area. Requires a CSV file linking account codes to research areas. |

scua Usage

Usage: scua [options]
Options:
 -a account_csv  Perform analysis by research area. account_csv is a CSV file with a mapping of account codes to research areas
 -A account      Limit to specified account code, e.g. z01
 -c              Save tables of data (as csv)
 -E date/time    End date/time as YYYY-MM-DDTHH:MM, e.g. 2021-02-01T00:00
 -k              Keep the intermediate output from sacct in `scua_sacct.csv`
 -g              Produce graphs of usage data (as png)
 -h              Show this help
 -m              Save tables of data (as markdown)
 -p prefix       Prefix for output file names
 -S date/time    Start date/time as YYYY-MM-DDTHH:MM, e.g. 2021-02-01T00:00
 -t              Perform analysis by computational motif
 -u user         Limit to specific user
 -w              Perform analysis of mean node power draw
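For example, the following invocation (illustrative; it assumes scua is on your PATH on a system with sacct available) analyses usage for account z01 during February 2021, saving CSV tables and PNG graphs with a feb2021 prefix:

```shell
scua -S 2021-02-01T00:00 -E 2021-03-01T00:00 -A z01 -c -g -p feb2021
```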

scua.py Usage

Usage: scua.py [options] input
Options:
 --projects=account_csv  Perform analysis by research area. account_csv is a CSV file with a mapping of account codes to research areas
 -A account             Limit to specified account code, e.g. z01
 --csv                  Save tables of data (as csv)
 --plots                Produce graphs of usage data (as png)
 --help                 Show this help
 --md                   Save tables of data (as markdown)
 --prefix=prefix        Prefix for output file names
 --motif                Perform analysis by computational motif
 --power                Perform analysis of mean node power draw
 --dropnan              Drop all rows that contain NaN. Useful for strict comparisons between usage and energy use as some job steps may be missing energy use data
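For example, to re-run a power analysis against a previously saved dump (the input filename here is illustrative, e.g. the file kept by scua's -k option):

```shell
scua.py --power --csv --prefix=feb2021 scua_sacct.csv
```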

Output

SCUA prints the usage statistics as Markdown-formatted tables on STDOUT. If you specify the -c option, it will also save the same tables as CSV files; if you specify the -m option, it will save them as Markdown files. All files are prefixed with the specified prefix, or scua if no prefix is supplied.

If you specify the -g option (to produce graphs) you will also obtain additional image files:

  • ${prefix}_codes_usage.png: Bar chart of CU usage broken down by code
  • ${prefix}_overall_boxplot.png: Boxplot representing overall job size statistics weighted by CU usage per job
  • ${prefix}_top15_boxplot.png: Boxplot representing job size statistics for the top 15 codes by usage weighted by CU usage per job

Issues

Python ValueError in add_userid.py

When I attempt to run scua with a broader date range than just a few days or a single month, I get the following error.

>> ./bin/scua -S 2024-01-01T00:00 -E 2024-06-31T23:59
Running: sacct -S 2024-01-01T00:00 -E 2024-06-31T23:59  -a -n --format JobIDRaw,JobName%30,User,Account,NNodes,NTasks,ElapsedRaw,State,ConsumedEnergyRaw,MaxRSS,AveRSS,ReqCPUFreq -P --delimiter :: | egrep "^[0-9]+\.[0-9]" | egrep -v "RUNNING|PENDING|REQUEUED"
Running: sacct -S 2024-01-01T00:00 -E 2024-06-31T23:59  -a -n -X --format JobIDRaw,JobName%30,User,Account,NNodes,NCPUS,ElapsedRaw,State,ConsumedEnergyRaw,MaxRSS,AveRSS,ReqCPUFreq -P --delimiter :: | egrep -v "RUNNING|PENDING|REQUEUED"
Running: /home/e813/e813/abrown_e813/usage-analysis/bin/add_userid.py scua_sacct_step.dat scua_sacct_job.dat scua_sacct_users.dat scua_sacct.dat
Traceback (most recent call last):
  File "/home/e813/e813/abrown_e813/usage-analysis/bin/add_userid.py", line 45, in <module>
    df_step[['JobID','SubJobID']] = df_step['JobID'].str.split(pat='.', n=1, expand=True)
  File "/home2/home/e813/e813/abrown_e813/usage-analysis/venv/lib/python3.9/site-packages/pandas/core/frame.py", line 4299, in __setitem__
    self._setitem_array(key, value)
  File "/home2/home/e813/e813/abrown_e813/usage-analysis/venv/lib/python3.9/site-packages/pandas/core/frame.py", line 4341, in _setitem_array
    check_key_length(self.columns, key, value)
  File "/home2/home/e813/e813/abrown_e813/usage-analysis/venv/lib/python3.9/site-packages/pandas/core/indexers/utils.py", line 390, in check_key_length
    raise ValueError("Columns must be same length as key")
ValueError: Columns must be same length as key
No jobs found
rm: cannot remove 'scua_sacct.dat': No such file or directory
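The traceback suggests the failure happens when the sacct query matches no job steps: on an empty JobID column, pandas' str.split(..., expand=True) can return fewer than the two expected columns, so the two-column assignment in add_userid.py raises "Columns must be same length as key". A minimal sketch of the failure mode and one possible guard (hypothetical; this is not the repository's code) is:

```python
import pandas as pd

# Reproduce the failure mode with an empty JobID column, as when
# sacct returns no matching job steps.
df_step = pd.DataFrame({"JobID": pd.Series([], dtype=str)})
parts = df_step["JobID"].str.split(pat=".", n=1, expand=True)

# Possible guard: force exactly two columns (missing ones become NaN)
# before the two-column assignment, so empty results do not crash.
parts = parts.reindex(columns=[0, 1])
df_step[["JobID", "SubJobID"]] = parts
```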
