con / duct Goto Github PK

View Code? Open in Web Editor NEW

1.0 5.0 2.0 388 KB

A helper to run a command, capture stdout/stderr and details about running

Home Page: https://pypi.org/project/con-duct/

License: MIT License

Python 100.00%

duct's Introduction

duct

Installation

pip install con-duct

Quickstart

Try it out!

duct --sample-interval 0.5 --report-interval 1 test/data/test_script.py --duration 3 --memory-size=1000

duct is most useful when the report-interval is less than the duration of the script.

Summary:

A process wrapper script that monitors the execution of a command.

>duct --help

usage: duct [-h] [--version] [-p OUTPUT_PREFIX]
            [--summary-format SUMMARY_FORMAT] [--clobber]
            [-l {NONE,CRITICAL,ERROR,WARNING,INFO,DEBUG}] [-q]
            [--sample-interval SAMPLE_INTERVAL]
            [--report-interval REPORT_INTERVAL] [-c {all,none,stdout,stderr}]
            [-o {all,none,stdout,stderr}]
            [-t {all,system-summary,processes-samples}]
            command [command_args ...] ...

duct is a lightweight wrapper that collects execution data for an arbitrary
command.  Execution data includes execution time, system information, and
resource usage statistics of the command and all its child processes. It is
intended to simplify the problem of recording the resources necessary to
execute a command, particularly in an HPC environment.

Resource usage is determined by polling (at a sample-interval).
During execution, duct produces a JSON lines (see https://jsonlines.org) file
with one data point recorded for each report (at a report-interval).

environment variables:
  Many duct options can be configured by environment variables (which are
  overridden by command line options).

  DUCT_LOG_LEVEL: see --log-level
  DUCT_OUTPUT_PREFIX: see --output-prefix
  DUCT_SUMMARY_FORMAT: see --summary-format
  DUCT_SAMPLE_INTERVAL: see --sample-interval
  DUCT_REPORT_INTERVAL: see --report-interval
  DUCT_CAPTURE_OUTPUTS: see --capture-outputs

positional arguments:
  command [command_args ...]
                        The command to execute, along with its arguments.
  command_args          Arguments for the command.

options:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -p OUTPUT_PREFIX, --output-prefix OUTPUT_PREFIX
                        File string format to be used as a prefix for the
                        files -- the captured stdout and stderr and the
                        resource usage logs. The understood variables are
                        {datetime}, {datetime_filesafe}, and {pid}. Leading
                        directories will be created if they do not exist. You
                        can also provide value via DUCT_OUTPUT_PREFIX env
                        variable. (default:
                        .duct/logs/{datetime_filesafe}-{pid}_)
  --summary-format SUMMARY_FORMAT
                        Output template to use when printing the summary
                        following execution. (default: Exit Code: {exit_code}
                        Command: {command} Log files location: {logs_prefix}
                        Wall Clock Time: {wall_clock_time:.3f} sec Memory Peak
                        Usage (RSS): {peak_rss} bytes Memory Average Usage
                        (RSS): {average_rss} bytes Virtual Memory Peak Usage
                        (VSZ): {peak_vsz} bytes Virtual Memory Average Usage
                        (VSZ): {average_vsz} bytes Memory Peak Percentage:
                        {peak_pmem}% Memory Average Percentage:
                        {average_pmem}% CPU Peak Usage: {peak_pcpu}% Average
                        CPU Usage: {average_pcpu}% Samples Collected:
                        {num_samples} Reports Written: {num_reports} )
  --clobber             Replace log files if they already exist. (default:
                        False)
  -l {NONE,CRITICAL,ERROR,WARNING,INFO,DEBUG}, --log_level {NONE,CRITICAL,ERROR,WARNING,INFO,DEBUG}
                        Level of log output to stderr, use NONE to entirely
                        disable. (default: INFO)
  -q, --quiet           [deprecated, use log level NONE] Disable duct logging
                        output (to stderr) (default: False)
  --sample-interval SAMPLE_INTERVAL, --s-i SAMPLE_INTERVAL
                        Interval in seconds between status checks of the
                        running process. Sample interval must be less than or
                        equal to report interval, and it achieves the best
                        results when sample is significantly less than the
                        runtime of the process. (default: 1.0)
  --report-interval REPORT_INTERVAL, --r-i REPORT_INTERVAL
                        Interval in seconds at which to report aggregated
                        data. (default: 60.0)
  -c {all,none,stdout,stderr}, --capture-outputs {all,none,stdout,stderr}
                        Record stdout, stderr, all, or none to log files. You
                        can also provide value via DUCT_CAPTURE_OUTPUTS env
                        variable. (default: all)
  -o {all,none,stdout,stderr}, --outputs {all,none,stdout,stderr}
                        Print stdout, stderr, all, or none to stdout/stderr
                        respectively. (default: all)
  -t {all,system-summary,processes-samples}, --record-types {all,system-summary,processes-samples}
                        Record system-summary, processes-samples, or all
                        (default: all)

duct's People

Contributors

Stargazers

Watchers

Forkers

asmacdo candleindark

duct's Issues

Support Windows

Duct works by creating a new session to run all processes under.

I didn't look into this too deeply, but it does look possible. Still, I wouldnt feel comfortable just adding this until we test for the abandoning parent issue, blocked by: #44

From ChatGPT:

**For Windows, os.setsid is not available as it is specific to Unix-like systems. Instead, you can use subprocess.CREATE_NEW_PROCESS_GROUP in combination with Popen to achieve similar functionality. This flag creates a new process group, which allows you to manage the process more effectively.

Here is an example of how you can modify your Popen call to work on both Unix-like systems and Windows:

import os
import subprocess
import platform

def start_process(command):
    if platform.system() == "Windows":
        # Windows-specific implementation
        process = subprocess.Popen(command, creationflags=subprocess.CREATE_NEW_PROCESS_GROUP)
    else:
        # Unix-like system implementation
        process = subprocess.Popen(command, preexec_fn=os.setsid)
    return process

# Example usage
command = ["your_command", "arg1", "arg2"]
process = start_process(command)

This code will check the operating system and use the appropriate flag for creating a new process group. On Unix-like systems, it uses os.setsid, and on Windows, it uses subprocess.CREATE_NEW_PROCESS_GROUP. This should provide a functional equivalent to os.setsid for process management on Windows.**

Collect PBS info

Use https://github.com/brainlife/abcd-spec/blob/master/hooks/smon

Disable release workflow until after first release

auto's documentation implies that it only works when the GitHub project already has at least one GitHub Release. The release.yml workflow should thus be either disabled or removed until after the first release is made.

Behavior when runtime < sample-interval

using

version

❯ time ls
CONTRIBUTING.rst  README.md  pyproject.toml  setup.cfg  smoke-tests.sh*  src/  test/  test_logs.py*  test_script.py*  tox.ini  venvs/
LC_COLLATE=POSIX ls -bCF --color=auto --hyperlink=auto  0.00s user 0.00s system 81% cpu 0.003 total

so it virtually takes no time to do ls but with duct:

❯ time duct ls
-----------------------------------------------------
duct is executing ls...

Log files will be written to .duct/logs/2024.05.30T16.15.23-209921_
-----------------------------------------------------
CONTRIBUTING.rst
pyproject.toml
README.md
setup.cfg
smoke-tests.sh
src
test
test_logs.py
test_script.py
tox.ini
venvs

-----------------------------------------------------
                    duct report
-----------------------------------------------------
Exit Code: 0
Command: ls
Wall Clock Time: 1.095590591430664
Number of Processes: 0
Log files location: .duct/logs/2024.05.30T16.15.23-209921_
duct ls  0.12s user 0.08s system 16% cpu 1.190 total

over a second.

Add basic unit-testing and CI

Candidates for unit-testing:

without much of load, just printing to stdout, stderr and exiting with different exit statuses:
- ensure that main portion of the code is running and producing expected files
- ~~parametrize through different values for the options and test if behaves accordingly~~
aggregation (should probably go to separate function)

overall just try following https://github.com/datalad/datalad-installer/blob/master/.github/workflows/test.yml as much as possible as to include type checking etc.

Run precommit checks in CI

From @yarikoptic #57 (review)

Since you have it in pre-commit -- could have a pre-commit CI to check all of them if so was desired.
here is a number of projects calling to pre-commit in their workflows -- didn't check yet for best example

here is a number of projects calling to pre-commit in their workflows -- didn't check yet for best example

❯ grep -l pre-commit */.github/workflows/*
aiohttp_retry/.github/workflows/python-package.yml
asv/.github/workflows/pre-commit.yml
belay/.github/workflows/pre-commit.yaml
black/.github/workflows/lint.yml
boto3/.github/workflows/lint.yml
briefcase/.github/workflows/ci.yml
briefcase/.github/workflows/pre-commit-update.yml
celery/.github/workflows/linter.yml
cibuildwheel/.github/workflows/test.yml
codecarbon/.github/workflows/pre-commit.yml
commitizen/.github/workflows/pythonpackage.yml
django-rest-framework/.github/workflows/pre-commit.yml
filesystem_spec/.github/workflows/main.yaml
gpt-engineer/.github/workflows/pre-commit.yaml
home-assistant/.github/workflows/ci.yaml
lark/.github/workflows/mypy.yml
lazy_loader/.github/workflows/lint.yml
mamba/.github/workflows/linters.yml
mdformat/.github/workflows/tests.yaml
mlflow/.github/workflows/autoformat.yml
mlflow/.github/workflows/lint.yml
mlflow/.github/workflows/recipe-template.yml
mypy/.github/workflows/test.yml
networkx/.github/workflows/lint.yml
nilearn/.github/workflows/update_precommit_hooks.yml
ome-zarr-py/.github/workflows/precommit.yml
pixi/.github/workflows/pre-commit.yml
prefect/.github/workflows/static-analysis.yaml
pydantic/.github/workflows/ci.yml
pydantic/.github/workflows/docs-update.yml
pyparsing/.github/workflows/pre-commit.yml
python-cookiecutter/.github/workflows/test.yml
requests/.github/workflows/lint.yml
ruff/.github/workflows/ci.yaml
ruff/.github/workflows/release.yaml
singularity-hpc/.github/workflows/main.yml
yarl/.github/workflows/reusable-linters.yml
zarr-python/.github/workflows/needs_release_notes.yml

Cannot track usage of podman containers

 duct -- podman run --rm -it progrium/stress --cpu 2 --io 1 --vm 2 --vm-bytes 5000M --timeout 10

duct is executing podman run --rm -it progrium/stress --cpu 2 --io 1 --vm 2 --vm-bytes 5000M --timeout 10...
Log files will be written to .duct/logs/2024.06.07T10.29.23-530520_
stress: info: [1] dispatching hogs: 2 cpu, 1 io, 2 vm, 0 hdd
stress: dbug: [1] using backoff sleep of 15000us
stress: dbug: [1] setting timeout to 10s
stress: dbug: [1] --> hogcpu worker 2 [2] forked
stress: dbug: [1] --> hogio worker 1 [3] forked
stress: dbug: [1] --> hogvm worker 2 [4] forked
stress: dbug: [1] using backoff sleep of 6000us
stress: dbug: [1] setting timeout to 10s
stress: dbug: [1] --> hogcpu worker 1 [5] forked
stress: dbug: [1] --> hogvm worker 1 [6] forked
stress: dbug: [6] allocating 5242880000 bytes ...
stress: dbug: [6] touching bytes in strides of 4096 bytes ...
stress: dbug: [4] allocating 5242880000 bytes ...
stress: dbug: [4] touching bytes in strides of 4096 bytes ...
stress: dbug: [6] freed 5242880000 bytes
stress: dbug: [6] allocating 5242880000 bytes ...
stress: dbug: [6] touching bytes in strides of 4096 bytes ...
stress: dbug: [4] freed 5242880000 bytes
stress: dbug: [4] allocating 5242880000 bytes ...
stress: dbug: [4] touching bytes in strides of 4096 bytes ...
stress: dbug: [4] freed 5242880000 bytes
stress: dbug: [4] allocating 5242880000 bytes ...
stress: dbug: [4] touching bytes in strides of 4096 bytes ...
stress: dbug: [6] freed 5242880000 bytes
stress: dbug: [6] allocating 5242880000 bytes ...
stress: dbug: [6] touching bytes in strides of 4096 bytes ...
stress: dbug: [6] freed 5242880000 bytes
stress: dbug: [6] allocating 5242880000 bytes ...
stress: dbug: [6] touching bytes in strides of 4096 bytes ...
stress: dbug: [4] freed 5242880000 bytes
stress: dbug: [4] allocating 5242880000 bytes ...
stress: dbug: [4] touching bytes in strides of 4096 bytes ...
stress: dbug: [4] freed 5242880000 bytes
stress: dbug: [4] allocating 5242880000 bytes ...
stress: dbug: [4] touching bytes in strides of 4096 bytes ...
stress: dbug: [6] freed 5242880000 bytes
stress: dbug: [6] allocating 5242880000 bytes ...
stress: dbug: [6] touching bytes in strides of 4096 bytes ...
stress: dbug: [1] <-- worker 2 signalled normally
stress: dbug: [1] <-- worker 5 signalled normally
stress: dbug: [1] <-- worker 3 signalled normally
stress: dbug: [1] <-- worker 4 signalled normally
stress: dbug: [1] <-- worker 6 signalled normally
stress: info: [1] successful run completed in 10s

Exit Code: 0
Command: podman run --rm -it progrium/stress --cpu 2 --io 1 --vm 2 --vm-bytes 5000M --timeout 10
Log files location: .duct/logs/2024.06.07T10.29.23-530520_
Wall Clock Time: 10.3531014919281
Memory Peak Usage: 0.1%
CPU Peak Usage: 7.0%

Thats unfortunate, especially if it turns out to be a fundamental limitation of our method of tracking with session id.

Summarize total resource consumption over all PIDs

Since gathering maxes for each sample and then totaling at report generation time will overestimate usage, instead we will collect "totals" for each sample, but only report the spikes.

idea: collate output assisting filing issues (on github) with detailed information about execution

Idea just came up in the scope of repronim/containers and @asmacdo saying that monolythic script does not work on his laptop.

I feel like duct could be used to assist in filing issues! I envision doing smth like (relies on ReproNim/containers#124 in this demo)

duct --file-issue scripts/run-README-example

in that repo and seeing an issue being generated with (hypothetical results; note that formatting is really just YAML similarly to how we do in datalad wtf)

duct details (stdout, stderr, stats) for: scripts/run-README-example

overall stats

%cpu:
- max: 230%
- mean: 50%
memory (MB):
- max: 1024
- mean: 512
wall time (sec): 63
cpu time (sec): 30
exit code: 1
#subprocesses: 122

stdout (x lines, ...)

....

stderr (x lines, ...)

....

where it figures out repository automagically (or could be specified). but although sounds trivial and quick for a prototype additional aspects would make it not so, e.g. figuring out all kinds of details about filing issues etc, including additional info, e.g.

But I think it could be quite easy to implement/do as -D github-style-report -c|--clipboard (somewhat borrowed from datalad wtf) where it would at the end produce such report to screen and/or copy to clipboard so make it ready to be pasted into a report.

Other solutions in the same vein or may be could be integrated with

This could be of great help also to potentially "reproduce" issues/behaviors since it would be capturing desired

Add option `--clobber` and by default error out if there is already report(s) at the location

ATM it causes inconsistent behavior overall by default - appends to .json and overwrites outputs files
note and fix also -- no new line is recorded in that .json after final entry -- should be ended with a newline IMHO)

smaug:/tmp
$> rm duct-out*          
(dev3) 3 12170.....................................:Mon 06 May 2024 08:48:03 AM EDT:.
smaug:/tmp
$> duct -p duct-out_ -- bash -c "echo std output; echo 'err output' >&2"         
err output
std output
{"Command": "bash", "System": {"uid": "yoh", "memory_total": 135060111360, "cpu_total": 16}, "ENV": [{}], "GPU": []}
(dev3) 3 12171.....................................:Mon 06 May 2024 08:48:17 AM EDT:.
smaug:/tmp
$> grep o duct-out_*
duct-out_info.json:{"Command": "bash", "System": {"uid": "yoh", "memory_total": 135060111360, "cpu_total": 16}, "ENV": [{}], "GPU": []}
duct-out_stderr:err output
duct-out_stdout:std output
(dev3) 3 12172.....................................:Mon 06 May 2024 08:48:29 AM EDT:.
smaug:/tmp
$> duct -p duct-out_ -- bash -c "echo std output2; echo 'err output2' >&2"
err output2
std output2
{"Command": "bash", "System": {"uid": "yoh", "memory_total": 135060111360, "cpu_total": 16}, "ENV": [{}], "GPU": []}
(dev3) 3 12173.....................................:Mon 06 May 2024 08:48:36 AM EDT:.
smaug:/tmp
$> grep o duct-out_*                                                      
duct-out_info.json:{"Command": "bash", "System": {"uid": "yoh", "memory_total": 135060111360, "cpu_total": 16}, "ENV": [{}], "GPU": []}{"Command": "bash", "System": {"uid": "yoh", 
"memory_total": 135060111360, "cpu_total": 16}, "ENV": [{}], "GPU": []}
duct-out_stderr:err output2
duct-out_stdout:std output2

Add ability to register external commands/hooks to provide extra metadata

E.g.

query external remotes for versions of dependencies
...

For that we might need to provide some layered mechanism to pick up configurations. E.g. in many datalad or even simpler git repo based cases it could go to .git/config or "datalad's config".

edit1: what I really wondered is even for those external hooks to provide

augmentation of environment, e.g. ENV vars, so we could e.g. provide env var to tune git-annex commit message to git-annex branch, e.g. to include its version to help later troubleshooting odd changes etc.
list env variables to include in the _info.json:env field

Collect slurm info

Use https://github.com/brainlife/abcd-spec/blob/master/hooks/smon for inspiration.

Use https://github.com/PennLINC/babs/tree/main/tests/e2e-slurm for e2e test setup.

What slurm info to collect:

SLURM_ env vars
- test if we do "in effect" on discovery
- add a unittest for collection of such variables so changes to code do not break this ability
(?) slurm version

Test with abandoning_parent

One of the reasons duct was implemented was to poll processes where the parent abandons the child process.

Example test script: https://github.com/con/duct/blob/af67030a40fef7378418d5fa626d6baee7523436/abandoning_parent.sh

At the end report absolute peak memory (may be with % if really like it)

ATM I see

❯ duct -- ./test_script.py --duration 3 --memory-size=1000
duct is executing ./test_script.py --duration 3 --memory-size=1000...
Log files will be written to .duct/logs/2024.06.06T23.20.47-1211148_
this is of test of STDERR
this is of test of STDOUT
Test completed. Consumed 1000 MB for 3 seconds with CPU load factor 10000.

Exit Code: 0
Command: ./test_script.py --duration 3 --memory-size=1000
Log files location: .duct/logs/2024.06.06T23.20.47-1211148_
Wall Clock Time: 3.3275320529937744
Memory Peak Usage: 0.0%
CPU Peak Usage: 104.0%
duct -- ./test_script.py --duration 3 --memory-size=1000  4.18s user 2.58s system 197% cpu 3.428 total

which is odd -- not sure what 0.0% it is reporting given that 1GB was requested.

Add yaml lint to precommit

Fixes #69

Add pytest-cov + codecov upload/report support

again -- see/borrow from datalad-installer . You would need API key in github ci action for codecov/codecov-action@v4

Track Average CPU and Mem usage

From @yarikoptic #39 (comment)
I think it would be useful to have "average cpu%" probably more than "max cpu%" (spike) since that one I think most of them time would be some 100%. Average is a sum/number_of so you can keep adding and also keep track (increment number) on how many times you added to come up with a mean. The same could be done for memory (although there average is less useful but still might be)

Do not obscure/change original stdout/stderr anyhow

Might be related to

Should output as is it was received, no changes! So if there are any ANSI terminal control characters (see https://en.wikipedia.org/wiki/ANSI_escape_code and #24) -- they should be passed as is. I think the fact that output is somehow changed we get output from git annex addurl which likely uses some \r or may be some other ANSI command gets to look like

$> duct scripts/replace_neurodesk_urls
OK: https://d15yxasja65rk8.cloudfront.net/afni_21.2.00_20210714.simg
 INFO: adding https://d15yxasja65rk8.cloudfront.net/afni_21.2.00_20210714.simg to images/neurodesk/neurodesk-afni--21.2.00.simg
^[[0Jaddurl https://d15yxasja65rk8.cloudfront.net/afni_21.2.00_20210714.simg ok
^[[1G(recording state in git...)
 INFO: removing 108 oracle urls
rmurl images/neurodesk/neurodesk-afni--21.2.00.simg ok
rmurl images/neurodesk/neurodesk-afni--21.2.00.simg ok
rmurl images/neurodesk/neurodesk-afni--21.2.00.simg ok
{"Command": "scripts/replace_neurodesk_urls", "System": {"uid": "yoh", "memory_total": 135060111360, "cpu_total": 16}, "ENV": [{}], "GPU": []}

Similarly to #25 the test could be to run duct running a command which outputs all kinds of "stuff" and ensuring that output we get to stdout from running duct is exactly what that command produces!

Bug: user-provided output-prefix is formatted too soon

We need to be able to open files for writing to stdout and stderr prior to executing the inner program. The only PID we have access to at this time is os.getpid(), ie the PID of duct, not the PID of the running processes. So, stdout, stderr, and system stats should go to the duct pid.

However currently we call .format once and use it everywhere. Instead we need to call .format each time we have a different PID.

Prepare release to PyPI

Lets push it to the cheese shop!

Prerequisites:

con-duct on PyPI
con_duct for python package name
keep duct as entrypoint

Do the actual release with automation #7

check what duct is "busy with"

ATM running a singularity container through duct shows 2-3% CPU of duct. I wonder what it is doing. Would be worth checking with py-spy etc some of its runs to see what it spends time on -- even though just a little, given how little it needs to do I would expect it to be even lower

yoh      1622935  0.1  0.0 271356 39816 pts/13   Sl+  10:58   0:00       /usr/bin/python3 /usr/bin/datalad run -m Doing trial run with bleeding edge 0.0.5 duct singularity run -B /mnt/DATA/data/yoh/1076_spacetop:/mnt/DATA/data/yoh/1076_spacetop:ro -B /mnt/DATA/data/yoh/1076_spacetop/derivatives:/mnt/DATA/data/yoh/1076_spacetop/derivatives:rw -B /mnt/DATA/data/yoh/1076_spacetop/scratch:/mnt/DATA/data/yoh/1076_spacetop/scratch:rw -B /mnt/DATA/data/yoh/1076_spacetop/../dsst-defacing-pipeline/src:/opt/dsst-defacing-pipeline:ro --scratch /mnt/DATA/data/yoh/1076_spacetop/scratch --pwd /mnt/DATA/data/yoh/1076_spacetop --net --network none ../dsst-defacing-pipeline/dsst-defacing-pipeline-0.0.5.sif -p 0001 -s 01 -- . derivatives/dss-defacing-0.0.5-1
yoh      1623044  0.0  0.0   2580   904 pts/13   S+   10:58   0:00         /bin/sh -c duct singularity run -B /mnt/DATA/data/yoh/1076_spacetop:/mnt/DATA/data/yoh/1076_spacetop:ro -B /mnt/DATA/data/yoh/1076_spacetop/derivatives:/mnt/DATA/data/yoh/1076_spacetop/derivatives:rw -B /mnt/DATA/data/yoh/1076_spacetop/scratch:/mnt/DATA/data/yoh/1076_spacetop/scratch:rw -B /mnt/DATA/data/yoh/1076_spacetop/../dsst-defacing-pipeline/src:/opt/dsst-defacing-pipeline:ro --scratch /mnt/DATA/data/yoh/1076_spacetop/scratch --pwd /mnt/DATA/data/yoh/1076_spacetop --net --network none ../dsst-defacing-pipeline/dsst-defacing-pipeline-0.0.5.sif -p 0001 -s 01 -- . derivatives/dss-defacing-0.0.5-1
yoh      1623046  2.3  0.0 241476 14000 pts/13   Sl+  10:58   0:15           /mnt/DATA/data/yoh/dsst-defacing-pipeline/venvs/dev3/bin/python3 /mnt/DATA/data/yoh/dsst-defacing-pipeline/venvs/dev3/bin/duct singularity run -B /mnt/DATA/data/yoh/1076_spacetop:/mnt/DATA/data/yoh/1076_spacetop:ro -B /mnt/DATA/data/yoh/1076_spacetop/derivatives:/mnt/DATA/data/yoh/1076_spacetop/derivatives:rw -B /mnt/DATA/data/yoh/1076_spacetop/scratch:/mnt/DATA/data/yoh/1076_spacetop/scratch:rw -B /mnt/DATA/data/yoh/1076_spacetop/../dsst-defacing-pipeline/src:/opt/dsst-defacing-pipeline:ro --scratch /mnt/DATA/data/yoh/1076_spacetop/scratch --pwd /mnt/DATA/data/yoh/1076_spacetop --net --network none ../dsst-defacing-pipeline/dsst-defacing-pipeline-0.0.5.sif -p 0001 -s 01 -- . derivatives/dss-defacing-0.0.5-1
yoh      1623049  0.0  0.0 1247012 16620 ?       Ssl  10:58   0:00             Singularity runtime parent
yoh      1623076  0.0  0.0  19740 16584 ?        S    10:58   0:00               python /opt/dsst-defacing-pipeline/run.py -p 0001 -s 01 -- . derivatives/dss-defacing-0.0.5-1
yoh      1623098  0.0  0.0   4040  3056 ?        S    10:58   0:00                 /bin/tcsh /usr/bin/@afni_refacer_run -input /mnt/DATA/data/yoh/1076_spacetop/sub-0001/ses-01/anat/sub-0001_ses-01_acq-MPRAGEXp3X08mm_T1w.nii.gz -mode_deface -no_clean -prefix /mnt/DATA/data/yoh/1076_spacetop/derivatives/dss-defacing-0.0.5-1/bids_defaced/sub-0001/ses-01/anat/sub-0001_ses-01_acq-MPRAGEXp3X08mm_T1w
yoh      1674377  100  0.0 326992 320464 ?       R    11:10   0:01                   3dcalc -a tmp.00.INPUT.nii -c tmp.05.sh_t2a_thr.nii[1] -expr a*not(bool(c)) -prefix tmp.99.result.deface.nii -datum float -ISOLA

but I could be wrong.

automate update of readme with --help

SSIA

add [--] into --help output

so instead of

usage: duct [-h] [-p OUTPUT_PREFIX] [--sample-interval SAMPLE_INTERVAL] [--report-interval REPORT_INTERVAL] [-c {all,none,stdout,stderr}] [-o {all,none,stdout,stderr}]
            [-t {all,system-summary,processes-samples}]
            command [arguments ...]

looks like

usage: duct [-h] [-p OUTPUT_PREFIX] [--sample-interval SAMPLE_INTERVAL] [--report-interval REPORT_INTERVAL] [-c {all,none,stdout,stderr}] [-o {all,none,stdout,stderr}]
            [-t {all,system-summary,processes-samples}]
            [--] command [arguments ...]

and add description on -- as to be recommended to ensure that arguments are not treated as arguments to duct.

On the other hand, I think we might need to look into how to tune argparser here: ideally we should automagically stop parsing when reaching command. @jwodder , you did some heavy CLI parsing tune up in datalad-installer, may be you know what would be the best way here to ensure that parsing stops and [arguments] for the command do not interfer with our duct arguments?

Remove color from output

Causes underlying process to loose its ANSI coloring
Seems to not doing reset of ANSI coloring after printing its summary
Seems need to make sure that underlying process is done before printing summary?

Running singularity directly -- results in colors for WARNING and FATAL

Running through DUCT produced

where no color for WARNING, and FATAL comes at the end in blue.
Both messages are (without ANSI colors) in stderr

(dev3) yoh@typhon:/mnt/DATA/data/yoh/1076_spacetop$ cat .duct/logs/2024.06.09T12.47.28-2020941_stderr 
WARNING: skipping mount of /mnt/DATA/data/yoh/1076_spacetop/derivatives: stat /mnt/DATA/data/yoh/1076_spacetop/derivatives: no such file or directory
FATAL:   container creation failed: mount /mnt/DATA/data/yoh/1076_spacetop/derivatives->/mnt/DATA/data/yoh/1076_spacetop/derivatives error: while mounting /mnt/DATA/data/yoh/1076_spacetop/derivatives: mount source /mnt/DATA/data/yoh/1076_spacetop/derivatives doesn't exist

Setup automated releases

Just also follow datalad-installer setup.

somehow adds 0d character into output dumps

needs a test on some wild binary output ( may be just sweep through all 256 characters in order and after random permutation) to correspond to original !!

I have noticed that after running

$> duct -p duct-out_ -- bash -c "echo std output; echo 'err output' >&2"         
err output
std output
{"Command": "bash", "System": {"uid": "yoh", "memory_total": 135060111360, "cpu_total": 16}, "ENV": [{}], "GPU": []}

I could not grep . those output files!!!

$> grep . duct-out_*                              
duct-out_info.json:{"Command": "bash", "System": {"uid": "yoh", "memory_total": 135060111360, "cpu_total": 16}, "ENV": [{}], "GPU": []}{"Command": "bash", "System": {"uid": "yoh", "memory_total": 135060111360, "cpu_total": 16}, "ENV": [{}], "GPU": []}

although content is there... the culprit is that there is 0d (13) is added for some reason!

$> hexdump -C duct-out_stdout                                             
00000000  73 74 64 20 6f 75 74 70  75 74 32 0d 0a           |std output2..|
0000000d

here is how file should look like (I just echo'ed into the file)

$> hexdump -C duct-out_stdout-manual 
00000000  73 74 64 20 6f 75 74 70  75 74 32 0a              |std output2.|
0000000c

Add --version

'ATM

❯ duct --version
usage: duct [-h] [-p OUTPUT_PREFIX] [--sample-interval SAMPLE_INTERVAL]
            [--report-interval REPORT_INTERVAL] [-c {all,none,stdout,stderr}]
            [-o {all,none,stdout,stderr}] [-t {all,system-summary,processes-samples}]
            command [arguments ...]
duct: error: the following arguments are required: command, arguments

example

❯ ~datalad/datalad-installer/src/datalad_installer.py --version
datalad-installer 1.0.4

but imho better be just the version without name

Release to conda-forge

Example:
https://github.com/conda-forge/staged-recipes/pull/26537files#diff-53275bd5be2712ba0c13d78aaf896f1e079266db20b7703b40f2c0029a140c6c

(there he submits two at once, we need just one)

Add a test with large (MBs of stdout stderr) and ensure that it doesn't stall

speed up execution tests

They aren't miserably slow, but we can tweak (and standardize) the time values in the test_execution args and probably make this blazing fast. Doesn't seem crucial to block release though.

duct pytest test/test_execution.py
duct is executing pytest test/test_execution.py...
Log files will be written to .duct/logs/2024.06.06T17.49.26-444723_
============================= test session starts ==============================
platform linux -- Python 3.11.5, pytest-7.4.4, pluggy-1.5.0
rootdir: /home/austin/devel/duct
configfile: tox.ini
plugins: cov-5.0.0
collected 8 items

test/test_execution.py ........                                          [100%]

============================== 8 passed in 4.20s ===============================

Exit Code: 0
Command: pytest test/test_execution.py
Log files location: .duct/logs/2024.06.06T17.49.26-444723_
Wall Clock Time: 4.352977275848389
Memory Peak Usage: 0.1%
CPU Peak Usage: 18.0%

 austin@fancy  ~/devel/duct   next ± duct pytest
duct is executing pytest...
Log files will be written to .duct/logs/2024.06.06T17.46.28-441631_
============================= test session starts ==============================
platform linux -- Python 3.11.5, pytest-7.4.4, pluggy-1.5.0
rootdir: /home/austin/devel/duct
configfile: tox.ini
plugins: cov-5.0.0
collected 38 items

test/test_execution.py ........                                          [ 21%]
test/test_helpers.py .....                                               [ 34%]
test/test_prepare_outputs.py ......                                      [ 50%]
test/test_report.py ....                                                 [ 60%]
test/test_tailpipe.py ...............                                    [100%]

============================== 38 passed in 5.71s ==============================

Exit Code: 0
Command: pytest
Log files location: .duct/logs/2024.06.06T17.46.28-441631_
Wall Clock Time: 5.862141370773315
Memory Peak Usage: 0.6%
CPU Peak Usage: 29.0%

austin@fancy  ~/devel/duct   next ± duct tox
duct is executing tox...
Log files will be written to .duct/logs/2024.06.06T17.46.41-442524_
.pkg: _optional_hooks> python /home/austin/miniconda3/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: get_requires_for_build_sdist> python /home/austin/miniconda3/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: get_requires_for_build_wheel> python /home/austin/miniconda3/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: prepare_metadata_for_build_wheel> python /home/austin/miniconda3/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
.pkg: build_sdist> python /home/austin/miniconda3/lib/python3.11/site-packages/pyproject_api/_backend.py True setuptools.build_meta
lint: install_package> python -I -m pip install --force-reinstall --no-deps /home/austin/devel/duct/.tox/.tmp/package/368/duct-0.0.1.tar.gz
lint: commands[0]> flake8 src test
lint: OK ✔ in 2.47 seconds
typing: install_package> python -I -m pip install --force-reinstall --no-deps /home/austin/devel/duct/.tox/.tmp/package/369/duct-0.0.1.tar.gz
typing: commands[0]> mypy src test
src/duct.py:59: error: Need type annotation for "max_values"  [var-annotated]
src/duct.py:61: error: Need type annotation for "_sample"  [var-annotated]
src/duct.py:64: error: Name "command" already defined on line 35  [no-redef]
src/duct.py:333: error: Incompatible types in assignment (expression has type "None", variable has type "TextIO | TailPipe | int")  [assignment]
src/duct.py:343: error: Incompatible types in assignment (expression has type "None", variable has type "TextIO | TailPipe | int")  [assignment]
Found 5 errors in 1 file (checked 8 source files)
typing: exit 1 (0.34 seconds) /home/austin/devel/duct> mypy src test pid=442917
typing: FAIL ✖ in 2.1 seconds
py38: skipped because could not find python interpreter with spec(s): py38
py38: SKIP ⚠ in 0.01 seconds
py39: skipped because could not find python interpreter with spec(s): py39
py39: SKIP ⚠ in 0.01 seconds
py310: skipped because could not find python interpreter with spec(s): py310
py310: SKIP ⚠ in 0.01 seconds
py311: install_package> python -I -m pip install --force-reinstall --no-deps /home/austin/devel/duct/.tox/.tmp/package/370/duct-0.0.1.tar.gz
py311: commands[0]> pytest test
============================= test session starts ==============================
platform linux -- Python 3.11.5, pytest-8.2.0, pluggy-1.5.0
cachedir: .tox/py311/.pytest_cache
rootdir: /home/austin/devel/duct
configfile: tox.ini
plugins: mock-3.14.0, cov-5.0.0
collected 38 items

test/test_execution.py ........                                          [ 21%]
test/test_helpers.py .....                                               [ 34%]
test/test_prepare_outputs.py ......                                      [ 50%]
test/test_report.py ....                                                 [ 60%]
test/test_tailpipe.py ...............                                    [100%]

============================== 38 passed in 5.55s ==============================
py311: OK ✔ in 7.42 seconds
py312: skipped because could not find python interpreter with spec(s): py312
py312: SKIP ⚠ in 0.01 seconds
pypy3: skipped because could not find python interpreter with spec(s): pypy3
  lint: OK (2.47=setup[2.32]+cmd[0.15] seconds)
  typing: FAIL code 1 (2.10=setup[1.76]+cmd[0.34] seconds)
  py38: SKIP (0.01 seconds)
  py39: SKIP (0.01 seconds)
  py310: SKIP (0.01 seconds)
  py311: OK (7.42=setup[1.71]+cmd[5.71] seconds)
  py312: SKIP (0.01 seconds)
  pypy3: SKIP (0.01 seconds)
  evaluation failed :( (12.06 seconds)

Exit Code: 255
Command: tox
Log files location: .duct/logs/2024.06.06T17.46.41-442524_
Wall Clock Time: 12.165053367614746
Memory Peak Usage: 0.7%
CPU Peak Usage: 89.5%

Fancy sugaring: online report of consumption

Eventually might be worth adding an option to enable display of the summary on the screen, similarly to how apt commands do it (IIRC) -- at the bottom of the terminal there is then a line which informs about progress of apt command - % done etc

https://en.wikipedia.org/wiki/ANSI_escape_code has codes for navigation
https://github.com/pyout/pyout is our project which might have some of them used already to see how to do it etc.

fill out type checking

Some of the code is type hinted, lets go for 100% and add a blocking test to CI.

store PID as int not str

ATM

{"1623049": {"pcpu": 0.0, "pmem": 0.0, "rss": 15668, "vsz": 1247012, "timestamp": "2024-06-09T11:36:56.374783-04:00"}, "1623076": {"pcpu": 3.2, "pmem": 0.0, "rss": 17204, "vsz": 20872, "timestamp": "2024-06-09T11:36:56.374816-04:00"}, "1786431": {"pcpu": 0.0, "pmem": 0.0, "rss": 3036, "vsz": 4040, "timestamp": "2024-06-09T11:36:56.374833-04:00"}, "1786646": {"pcpu": 100.0, "pmem": 0.0, "rss": 116912, "vsz": 128380, "timestamp": "2024-06-09T11:36:56.374848-04:00"}, "totals": {"pmem": 0.0, "pcpu": 103.2}}

for PIDs - imho better just to store 1623049 etc

Add pre-commit

also adopt from datalad-installer

Collect first sample right after running the command

So we have some information recorded . ATM there is also might be a bug that we do not aggregate / report anything even if recorded a sample. E.g. here sleeping for 200 ms, with sampling interval of 100 ms we get no final report whatsoever

$> duct --sample-interval 0.1 -p duct-out_ sleep 0.2
{"Command": "sleep", "System": {"uid": "yoh", "memory_total": 135060111360, "cpu_total": 16}, "ENV": [{}], "GPU": []}
(dev3) 3 12140.....................................:Mon 06 May 2024 08:41:31 AM EDT:.
smaug:/tmp
$> cat duct-out_info.json                           
{"Command": "sleep", "System": {"uid": "yoh", "memory_total": 135060111360, "cpu_total": 16}, "ENV": [{}], "GPU": []}%

Provide a test for correct handling of args

In follow up to

should test parsing (no need to execute) of following command lines

duct cmd --help returns composition of options so --help goes for cmd
duct --help cmd returns composition of options so --help goes for duct
duct --unknown cmd fails to find --unknown
duct --unknown cmd --option fails to find --unknown
duct --unknown cmd --option fails
duct --sample-interval SAMPLE_INTERVAL fails (no cmd)

etc, to cover corner cases of possible ways users try to specify smth peculiar. That is where parametric test specification could be quite handy

Filesafe ISO8601 - use periods instead of hyphens

dandi/dandi-cli#1445 (comment)

Record duct's version within produced _info.json record

E.g. within duct-version field.

support starting with 3.8

[asmacdo]

sweep through all python versions
document support in setup config

Add to _info.json a final record + use it to render report

Pretty much a summary of what it took process to run. From that record/information is where from that final message duct outputs should be rendered via f-string formatting pretty much. I.e. for

Command: singularity run -B /mnt/DATA/data/yoh/1076_spacetop:/mnt/DATA/data/yoh/1076_spacetop:ro -B /mnt/DATA/data/yoh/1076_spacetop/derivatives:/mnt/DATA/data/yoh/1076_spacetop/derivatives:rw -B /mnt/DATA/data/yoh/1076_spacetop/scratch:/mnt/DATA/data/yoh/1076_spacetop/scratch:rw -B /mnt/DATA/data/yoh/1076_spacetop/../dsst-defacing-pipeline/src:/opt/dsst-defacing-pipeline:ro --scratch /mnt/DATA/data/yoh/1076_spacetop/scratch --pwd /mnt/DATA/data/yoh/1076_spacetop --net --network none ../dsst-defacing-pipeline/dsst-defacing-pipeline-0.0.5.sif -p 0001 -s 01 -- . derivatives/dss-defacing-0.0.5-1
Log files location: .duct/logs/2024.06.09T10.58.56-1623046_
Wall Clock Time: 2884.758 sec
Memory Peak Usage: 0.0%
CPU Peak Usage: 303.0%

Should be

fmt =f"""\
Exit code: {exit_code}
Command: {command}
Log files location: {logfile_prefix}
Wall Clock Time: {wall_clock} sec
Memory Peak Usage: {stats["memory_peak_perc"]}%
CPU Peak Usage: {stats["cpu_peak_perc"]}%
"""
print(fmt.format(final_record))

and we should add CLI option --report-format (`DUCT_REPORT_FORMAT) or alike so people could tune to their desires.

NB I know that exit code is not changing colors etc, but for that we might want to adopt/provide custom formatting similarly to how we do in pyout... later or add some ad-hoc for now exit_code_ansi_color and ansi_color_reset to use in the format as {exit_code_ansi_color}Exit code: {exit_code}{ansi_color_reset} but exclude them from record dumped to .json

Re-enable osx tests

https://github.com/con/duct/pull/14/files#diff-245392b692a50c38ecab4381b118862db514035c10983f3bd4f4b7f1f4be4692R27-R31

Disabled to keep the matrix small during development.

No point to print system information at the end -- print process aggregated one

relates also to

#22
At least the final record should know how long the process was running for! ATM there is no such information recorded -- only system one which is boring and not informative at all for the purpose of capturing the executed process stats.

Make fields lower-case

to avoid necessity of knowledge on which are Camel which are not. ATM it is a mix:

$ jq . < .duct/logs/2024.06.09T14.34.52-2325510_info.json
{
  "Command": "singularity run -B /mnt/DATA/data/yoh/1076_spacetop:/mnt/DATA/data/yoh/1076_spacetop:ro -B /mnt/DATA/data/yoh/1076_spacetop/derivatives:/mnt/DATA/data/yoh/1076_spacetop/derivatives:rw -B /mnt/DATA/data/yoh/1076_spacetop/scratch:/mnt/DATA/data/yoh/1076_spacetop/scratch:rw -B /mnt/DATA/data/yoh/1076_spacetop/../dsst-defacing-pipeline/src:/opt/dsst-defacing-pipeline:ro --scratch /mnt/DATA/data/yoh/1076_spacetop/scratch --pwd /mnt/DATA/data/yoh/1076_spacetop --net --network none ../dsst-defacing-pipeline/dsst-defacing-pipeline-0.0.5.sif -- . derivatives/dss-defacing-0.0.5-1",
  "System": {
    "uid": "yoh",
    "memory_total": 1081801523200,
    "cpu_total": 32
  },
  "ENV": [
    {}
  ],
  "GPU": []
}

Just make them all lower case, i.e. "command", "system", "env", "gpu"

smon "redesign"

https://github.com/brainlife/abcd-spec/blob/master/hooks/smon is a great start for a desired script but we need to

Sanity Check: does datalad runner also buffer stdout (cannot force flush in wrapper script)

Here is a basic (actually a "fancy" version since a generator)

#!/usr/bin/env python
# emacs: -*- mode: python; py-indent-offset: 4; tab-width: 4; indent-tabs-mode: nil -*-
# ex: set sts=4 ts=4 sw=4 noet:
from datalad.runner.protocol import GeneratorMixIn
from datalad.runner.utils import (
    AssemblingDecoderMixIn,
)

from datalad.cmd import (
    GitWitlessRunner,
    StdOutErrCapture,
)

class GeneratorStdOutErrCapture(GeneratorMixIn,
                                AssemblingDecoderMixIn,
                                StdOutErrCapture):
    """
    Generator-runner protocol that captures and yields stdout and stderr.
    """

    def __init__(self):
        GeneratorMixIn.__init__(self)
        AssemblingDecoderMixIn.__init__(self)
        StdOutErrCapture.__init__(self)

    def pipe_data_received(self, fd, data):
        if fd in (1, 2):
            print(f"processing data: {data} for {fd}")
            self.send_result((fd, self.decode(fd, data, self.encoding)))
        else:
            StdOutErrCapture.pipe_data_received(self, fd, data)

if __name__ == '__main__':
    import sys, os
    git_runner = GitWitlessRunner()
    generator = git_runner.run(
        sys.argv[1:],
        protocol=GeneratorStdOutErrCapture,
        env = os.environ.copy(),
    )
    for out in generator:
        print(f"generator output: {out}")

which on this simple script

#!/usr/bin/env python

from time import sleep
import sys

for i in range(5):
    sys.stdout.write(f"stdout {i}\n")
    sys.stderr.write(f"stderr {i}\n")
    sleep(0.1)

which, if stdout is not piped would just interleave

❯ ./prints.py
stdout 0
stderr 0
stdout 1
stderr 1
stdout 2
stderr 2
stdout 3
stderr 3
stdout 4
stderr 4

and if stdout redirected, wouldn't be flushed I guess so we get

❯ ./prints.py | tee /tmp/out
stderr 0
stderr 1
stderr 2
stderr 3
stderr 4
stdout 0
stdout 1
stdout 2
stdout 3
stdout 4

So datalad runner produces

❯ ./demo_datalad_capture.py ./prints.py
processing data: b'stderr 0\n' for 2
generator output: (2, 'stderr 0\n')
processing data: b'stderr 1\n' for 2
generator output: (2, 'stderr 1\n')
processing data: b'stderr 2\n' for 2
generator output: (2, 'stderr 2\n')
processing data: b'stderr 3\n' for 2
generator output: (2, 'stderr 3\n')
processing data: b'stderr 4\n' for 2
generator output: (2, 'stderr 4\n')
processing data: b'stdout 0\nstdout 1\nstdout 2\nstdout 3\nstdout 4\n' for 1
generator output: (1, 'stdout 0\nstdout 1\nstdout 2\nstdout 3\nstdout 4\n')

research other possible existing solutions

checkout what was done by @soichih in https://github.com/brainlife/amaretti to monitor jobs -- may be it went natively through PBS/slurm to ask for that information
there could be some other monitoring tools.

Adopt pre-commit with black and other goodness

just again follow datalad-installer

report "Memory Peak Usage" in absolute value not in "%"

...
Command: singularity run -B /mnt/DATA/data/yoh/1076_spacetop:/mnt/DATA/data/yoh/1076_spacetop:ro -B /mnt/DATA/data/yoh/1076_spacetop/derivatives:/mnt/DATA/data/yoh/1076_spacetop/derivatives:rw -B /mnt/DATA/data/yoh/1076_spacetop/scratch:/mnt/DATA/data/yoh/1076_spacetop/scratch:rw -B /mnt/DATA/data/yoh/1076_spacetop/../dsst-defacing-pipeline/src:/opt/dsst-defacing-pipeline:ro --scratch /mnt/DATA/data/yoh/1076_spacetop/scratch --pwd /mnt/DATA/data/yoh/1076_spacetop --net --network none ../dsst-defacing-pipeline/dsst-defacing-pipeline-0.0.5.sif -p 0001 -s 01 -- . derivatives/dss-defacing-0.0.5-1
Log files location: .duct/logs/2024.06.09T10.58.56-1623046_
Wall Clock Time: 2884.758 sec
Memory Peak Usage: 0.0%
CPU Peak Usage: 303.0%
[INFO   ] == Command exit (modification check follows) =====

and the same 0 was in _stats.json:

yoh@typhon:/mnt/DATA/data/yoh/1076_spacetop$ jq -r '."totals".pmem' .duct/logs/2024.06.09T10.58.56-1623046_usage.json | uniq
0

so there is smth wrong about it overall but also it is kinda not useful -- we need to know absolute values. Would not hurt reporting both rss (resident memory -- actually what was in the RAM) and vsz (virtual address space allocated, might be less useful but who knows). And may be then % of rss in relation to memory_total

candidate base: perf

part of linux https://perf.wiki.kernel.org/index.php/Main_Page

concerns:

might come with a notable performance hit. Possible "workaround" - disable anything what is not relevant to our desired metrics: memory, cpu,

incomplete output of stdout is stored (when process died)

I had the process running which produced following output on the screen

(git)smaug:~/proj/repronim/containers[master]git
$> datalad run -m "Update neurodesk image urls" duct scripts/replace_neurodesk_urls
[INFO   ] == Command start (output follows) =====
INFO: file images/neurodesk/neurodesk-afni--21.2.00.simg
OK: https://d15yxasja65rk8.cloudfront.net/afni_21.2.00_20210714.simg
 INFO: adding https://d15yxasja65rk8.cloudfront.net/afni_21.2.00_20210714.simg to images/neurodesk/neurodesk-afni--21.2.00.simg
^[[0Jaddurl https://d15yxasja65rk8.cloudfront.net/afni_21.2.00_20210714.simg ok
^[[1G INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-afni--21.2.00.simg ok
rmurl images/neurodesk/neurodesk-afni--21.2.00.simg ok
rmurl images/neurodesk/neurodesk-afni--21.2.00.simg ok
INFO: file images/neurodesk/neurodesk-afni--22.1.14.simg
ERROR: https://d15yxasja65rk8.cloudfront.net/afni_22.1.14_20220713.simg - could not verify presence: 403 . Will not be added
 INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-afni--22.1.14.simg ok
rmurl images/neurodesk/neurodesk-afni--22.1.14.simg ok
rmurl images/neurodesk/neurodesk-afni--22.1.14.simg ok
INFO: file images/neurodesk/neurodesk-afni--22.3.06.simg
 INFO: removing 3 oracle urls
ERROR: https://d15yxasja65rk8.cloudfront.net/afni_22.3.06_20221128.simg - could not verify presence: 403 . Will not be added
rmurl images/neurodesk/neurodesk-afni--22.3.06.simg ok
rmurl images/neurodesk/neurodesk-afni--22.3.06.simg ok
rmurl images/neurodesk/neurodesk-afni--22.3.06.simg ok
INFO: file images/neurodesk/neurodesk-afni--22.3.07.simg
 INFO: adding https://d15yxasja65rk8.cloudfront.net/afni_22.3.OK: https://d15yxasja65rk8.cloudfront.net/afni_22.3.07_20221206.simg
07_20221206.simg to images/neurodesk/neurodesk-afni--22.3.07.simg
^[[0Jaddurl https://d15yxasja65rk8.cloudfront.net/afni_22.3.07_20221206.simg ok
^[[1G INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-afni--22.3.07.simg ok
rmurl images/neurodesk/neurodesk-afni--22.3.07.simg ok
rmurl images/neurodesk/neurodesk-afni--22.3.07.simg ok
INFO: file images/neurodesk/neurodesk-afni--23.0.00.simg
 INFO: removing 3 oracle urls
ERROR: https://d15yxasja65rk8.cloudfront.net/afni_23.0.00_20230118.simg - could not verify presence: 403 . Will not be added
rmurl images/neurodesk/neurodesk-afni--23.0.00.simg ok
rmurl images/neurodesk/neurodesk-afni--23.0.00.simg ok
rmurl images/neurodesk/neurodesk-afni--23.0.00.simg ok
INFO: file images/neurodesk/neurodesk-afni--23.0.04.simg
ERROR: https://d15yxasja65rk8.cloudfront.net/afni_23.0.04_20230215.simg - could not verify presence: 403 . Will not be added
 INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-afni--23.0.04.simg ok
rmurl images/neurodesk/neurodesk-afni--23.0.04.simg ok
rmurl images/neurodesk/neurodesk-afni--23.0.04.simg ok
INFO: file images/neurodesk/neurodesk-afni--23.0.07.simg
OK: https://d15yxasja65rk8.cloudfront.net/afni_23.0.07_20230302.simg
 INFO: adding https://d15yxasja65rk8.cloudfront.net/afni_23.0.07_20230302.simg to images/neurodesk/neurodesk-afni--23.0.07.simg
^[[0Jaddurl https://d15yxasja65rk8.cloudfront.net/afni_23.0.07_20230302.simg ok
^[[1G INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-afni--23.0.07.simg ok
rmurl images/neurodesk/neurodesk-afni--23.0.07.simg ok
rmurl images/neurodesk/neurodesk-afni--23.0.07.simg ok
INFO: file images/neurodesk/neurodesk-afni--23.3.02.simg
OK: https://d15yxasja65rk8.cloudfront.net/afni_23.3.02_20231024.simg
 INFO: adding https://d15yxasja65rk8.cloudfront.net/afni_23.3.02_20231024.simg to images/neurodesk/neurodesk-afni--23.3.02.simg
^[[0Jaddurl https://d15yxasja65rk8.cloudfront.net/afni_23.3.02_20231024.simg ok
^[[1G INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-afni--23.3.02.simg ok
rmurl images/neurodesk/neurodesk-afni--23.3.02.simg ok
rmurl images/neurodesk/neurodesk-afni--23.3.02.simg ok
INFO: file images/neurodesk/neurodesk-afni--24.1.02.simg
OK: https://d15yxasja65rk8.cloudfront.net/afni_24.1.02_20240409.simg
 INFO: adding https://d15yxasja65rk8.cloudfront.net/afni_24.1.02_20240409.simg to images/neurodesk/neurodesk-afni--24.1.02.simg
^[[0Jaddurl https://d15yxasja65rk8.cloudfront.net/afni_24.1.02_20240409.simg ok
^[[1G INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-afni--24.1.02.simg ok
rmurl images/neurodesk/neurodesk-afni--24.1.02.simg ok
rmurl images/neurodesk/neurodesk-afni--24.1.02.simg ok
INFO: file images/neurodesk/neurodesk-aidamri--1.1.simg
OK: https://d15yxasja65rk8.cloudfront.net/aidamri_1.1_20210708.simg
 INFO: adding https://d15yxasja65rk8.cloudfront.net/aidamri_1.1_20210708.simg to images/neurodesk/neurodesk-aidamri--1.1.simg
^[[0Jaddurl https://d15yxasja65rk8.cloudfront.net/aidamri_1.1_20210708.simg ok
^[[1G INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-aidamri--1.1.simg ok
rmurl images/neurodesk/neurodesk-aidamri--1.1.simg ok
rmurl images/neurodesk/neurodesk-aidamri--1.1.simg ok
INFO: file images/neurodesk/neurodesk-ants--2.3.1.simg
OK: https://d15yxasja65rk8.cloudfront.net/ants_2.3.1_20211204.simg
 INFO: adding https://d15yxasja65rk8.cloudfront.net/ants_2.3.1_20211204.simg to images/neurodesk/neurodesk-ants--2.3.1.simg
^[[0Jaddurl https://d15yxasja65rk8.cloudfront.net/ants_2.3.1_20211204.simg ok
^[[1G INFO: removing 3 oracle urls
rmurl images/neurodesk/neuERROR: https://d15yxasja65rk8.cloudfront.net/ants_2.3.4_20211212.simg - could not verify presence: 403 . Will not be added
OK: https://d15yxasja65rk8.cloudfront.net/ants_2.3.5_20211212.simg
OK: https://d15yxasja65rk8.cloudfront.net/ashs_2.0.0_20210105.simg
OK: https://d15yxasja65rk8.cloudfront.net/aslprep_0.2.7_20210323.simg
OK: https://d15yxasja65rk8.cloudfront.net/aslprep_0.4.0_20230728.simg
OK: https://d15yxasja65rk8.cloudfront.net/aslprep_0.5.0_20231116.simg
ERROR: https://d15yxasja65rk8.cloudfront.net/aslprep_0.6.0_20240129.simg - could not verify presence: 403 . Will not be added
OK: https://d15yxasja65rk8.cloudfront.net/bart_0.7.00_20210302.simg
ERROR: https://d15yxasja65rk8.cloudfront.net/bidsappaa_0.2.0_20230612.simg - could not verify presence: 403 . Will not be added
OK: https://d15yxasja65rk8.cloudfront.net/bidsappbaracus_1.1.4_20230612.simg
ERROR: https://d15yxasja65rk8.cloudfront.net/bidsappbrainsuite_21a_20230615.simg - could not verify presence: 403 . Will not be added
OK: https://d15yxasja65rk8.cloudfront.net/bidsapphcppipelines_4.3.0_20230524.simg
OK: https://d15yxasja65rk8.cloudfront.net/bidsappmrtrix3connectome_0.5.3_20230615.simg
OK: https://d15yxasja65rk8.cloudfront.net/bidsapppymvpa_2.0.2_20230629.simg
OK: https://d15yxasja65rk8.cloudfront.net/bidsappspm_0.0.20_20230629.simg
OK: https://d15yxasja65rk8.cloudfront.net/bidscoin_3.7.0_20220329.simg
OK: https://d15yxasja65rk8.cloudfront.net/bidscoin_4.2.0_20231017.simg
OK: https://d15yxasja65rk8.cloudfront.net/bidscoin_4.2.1_20231030.simg
ERROR: https://d15yxasja65rk8.cloudfront.net/bidscoin_4.3.0_20240220.simg - could not verify presence: 403 . Will not be added
OK: https://d15yxasja65rk8.cloudfront.net/bidscoin_4.3.2_20240329.simg
ERROR: https://d15yxasja65rk8.cloudfront.net/bidstools_1.0.0_20201208.simg - could not verify presence: 403 . Will not be added
OK: https://d15yxasja65rk8.cloudfront.net/bidstools_1.0.1_20230905.simg
OK: https://d15yxasja65rk8.cloudfront.net/bidstools_1.0.2_20231017.simg
OK: https://d15yxasja65rk8.cloudfront.net/bidstools_1.0.3_20231030.simg
OK: https://d15yxasja65rk8.cloudfront.net/bidstools_1.0.4_20240221.simg
OK: https://d15yxasja65rk8.cloudfront.net/brainstorm_3.211130_20211207.simg
ERROR: https://d15yxasja65rk8.cloudfront.net/brkraw_0.3.11_20240223.simg - could not verify presence: 403 . Will not be added
OK: https://d15yxasja65rk8.cloudfront.net/cat12_r1933_20220128.simg
OK: https://d15yxasja65rk8.cloudfront.net/cat12_r2166_20230601.simg
OK: https://d15yxasja65rk8.cloudfront.net/clearswi_1.0.0_20211018.simg
ERROR: https://d15yxasja65rk8.cloudfront.net/code_220114_20220121.simg - could not verify presence: 403 . Will not be added
OK: https://d15yxasja65rk8.cloudfront.net/code_230315_20230315.simg
OK: https://d15yxasja65rk8.cloudfront.net/conn_20b_20210109.simg
OK: https://d15yxasja65rk8.cloudfront.net/conn_22a_20231115.simg



error: git-annex died of signal 9
{"Command": "scripts/replace_neurodesk_urls", "System": {"uid": "yoh", "memory_total": 135060111360, "cpu_total": 16}, "ENV": [{}], "GPU": []}
[INFO   ] == Command exit (modification check follows) =====
run(ok): /home/yoh/proj/repronim/containers (dataset) [duct scripts/replace_neurodesk_urls]
add(ok): .duct/logs/2024-05-06T13-29-22-390727_stderr (file)
add(ok): .duct/logs/2024-05-06T13-29-22-390727_stdout (file)
add(ok): .duct/logs/2024-05-06T13-29-22-390727_info.json (file)
save(ok): . (dataset)
datalad run -m "Update neurodesk image urls" duct   26.46s user 16.71s system 0% cpu 1:28:35.37 total

note: I had to kill underlying process since it stalled... not sure yet either due to duct "clogging" outputs or not, but FTR

here is the stack of processes

yoh       390634  0.0  0.0 283344 52372 pts/7    Sl+  13:29   0:01           /usr/bin/python3 /usr/bin/datalad run -m Update neurodesk image urls duct scripts/replace_neurodesk_urls
yoh       390725  0.0  0.0   2580  1536 pts/7    S+   13:29   0:00             /bin/sh -c duct scripts/replace_neurodesk_urls
yoh       390727  0.0  0.0 242704 13952 pts/7    Sl+  13:29   0:00               /home/yoh/venvs/dev3/bin/python3 /home/yoh/venvs/dev3/bin/duct scripts/replace_neurodesk_urls
yoh       390730  0.0  0.0   7084  3328 ?        Ss   13:29   0:00                 /bin/bash scripts/replace_neurodesk_urls
yoh       398627  0.0  0.0   9896  3712 ?        S    13:29   0:00                   git -c annex.alwayscommit=false annex addurl --file=images/neurodesk/neurodesk-conn--22a.simg https://d15yxasja65rk8.cloudfront.net/conn_22a_20231115.simg
yoh       398628  0.0  0.0 1075011040 66704 ?    Sl   13:29   0:00                     /home/yoh/git-annexes/10.20240430+git26-g5f61667f27/usr/lib/git-annex.linux/exe/git-annex --library-path /home/yoh/git-annexes/10.20240430+git26-g5f61667f27/usr/lib/git-annex.linux//lib/x86_64-linux-gnu: /home/yoh/git-annexes/10.20240430+git26-g5f61667f27/usr/lib/git-annex.linux/shimmed/git-annex/git-annex addurl --file=images/neurodesk/neurodesk-conn--22a.simg https://d15yxasja65rk8.cloudfront.net/conn_22a_20231115.simg
yoh       398658  0.0  0.0  11208  5248 ?        S    13:29   0:00                       /home/yoh/git-annexes/10.20240430+git26-g5f61667f27/usr/lib/git-annex.linux/exe/git --library-path /home/yoh/git-annexes/10.20240430+git26-g5f61667f27/usr/lib/git-annex.linux//lib/x86_64-linux-gnu: /home/yoh/git-annexes/10.20240430+git26-g5f61667f27/usr/lib/git-annex.linux/shimmed/git/git --git-dir=.git --work-tree=. --literal-pathspecs cat-file --batch

after I killed that addurl we got the process terminated

but stdout was very incomplete if we compare to above

$> cat .duct/logs/2024-05-06T13-29-22-390727_stdout 
INFO: file images/neurodesk/neurodesk-afni--21.2.00.simg
 INFO: adding https://d15yxasja65rk8.cloudfront.net/afni_21.2.00_20210714.simg to images/neurodesk/neurodesk-afni--21.2.00.simg
^[[0Jaddurl https://d15yxasja65rk8.cloudfront.net/afni_21.2.00_20210714.simg ok
^[[1G INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-afni--21.2.00.simg ok
rmurl images/neurodesk/neurodesk-afni--21.2.00.simg ok
rmurl images/neurodesk/neurodesk-afni--21.2.00.simg ok
INFO: file images/neurodesk/neurodesk-afni--22.1.14.simg
 INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-afni--22.1.14.simg ok
rmurl images/neurodesk/neurodesk-afni--22.1.14.simg ok
rmurl images/neurodesk/neurodesk-afni--22.1.14.simg ok
INFO: file images/neurodesk/neurodesk-afni--22.3.06.simg
 INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-afni--22.3.06.simg ok
rmurl images/neurodesk/neurodesk-afni--22.3.06.simg ok
rmurl images/neurodesk/neurodesk-afni--22.3.06.simg ok
INFO: file images/neurodesk/neurodesk-afni--22.3.07.simg
 INFO: adding https://d15yxasja65rk8.cloudfront.net/afni_22.3.07_20221206.simg to images/neurodesk/neurodesk-afni--22.3.07.simg
^[[0Jaddurl https://d15yxasja65rk8.cloudfront.net/afni_22.3.07_20221206.simg ok
^[[1G INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-afni--22.3.07.simg ok
rmurl images/neurodesk/neurodesk-afni--22.3.07.simg ok
rmurl images/neurodesk/neurodesk-afni--22.3.07.simg ok
INFO: file images/neurodesk/neurodesk-afni--23.0.00.simg
 INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-afni--23.0.00.simg ok
rmurl images/neurodesk/neurodesk-afni--23.0.00.simg ok
rmurl images/neurodesk/neurodesk-afni--23.0.00.simg ok
INFO: file images/neurodesk/neurodesk-afni--23.0.04.simg
 INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-afni--23.0.04.simg ok
rmurl images/neurodesk/neurodesk-afni--23.0.04.simg ok
rmurl images/neurodesk/neurodesk-afni--23.0.04.simg ok
INFO: file images/neurodesk/neurodesk-afni--23.0.07.simg
 INFO: adding https://d15yxasja65rk8.cloudfront.net/afni_23.0.07_20230302.simg to images/neurodesk/neurodesk-afni--23.0.07.simg
^[[0Jaddurl https://d15yxasja65rk8.cloudfront.net/afni_23.0.07_20230302.simg ok
^[[1G INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-afni--23.0.07.simg ok
rmurl images/neurodesk/neurodesk-afni--23.0.07.simg ok
rmurl images/neurodesk/neurodesk-afni--23.0.07.simg ok
INFO: file images/neurodesk/neurodesk-afni--23.3.02.simg
 INFO: adding https://d15yxasja65rk8.cloudfront.net/afni_23.3.02_20231024.simg to images/neurodesk/neurodesk-afni--23.3.02.simg
^[[0Jaddurl https://d15yxasja65rk8.cloudfront.net/afni_23.3.02_20231024.simg ok
^[[1G INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-afni--23.3.02.simg ok
rmurl images/neurodesk/neurodesk-afni--23.3.02.simg ok
rmurl images/neurodesk/neurodesk-afni--23.3.02.simg ok
INFO: file images/neurodesk/neurodesk-afni--24.1.02.simg
 INFO: adding https://d15yxasja65rk8.cloudfront.net/afni_24.1.02_20240409.simg to images/neurodesk/neurodesk-afni--24.1.02.simg
^[[0Jaddurl https://d15yxasja65rk8.cloudfront.net/afni_24.1.02_20240409.simg ok
^[[1G INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-afni--24.1.02.simg ok
rmurl images/neurodesk/neurodesk-afni--24.1.02.simg ok
rmurl images/neurodesk/neurodesk-afni--24.1.02.simg ok
INFO: file images/neurodesk/neurodesk-aidamri--1.1.simg
 INFO: adding https://d15yxasja65rk8.cloudfront.net/aidamri_1.1_20210708.simg to images/neurodesk/neurodesk-aidamri--1.1.simg
^[[0Jaddurl https://d15yxasja65rk8.cloudfront.net/aidamri_1.1_20210708.simg ok
^[[1G INFO: removing 3 oracle urls
rmurl images/neurodesk/neurodesk-aidamri--1.1.simg ok
rmurl images/neurodesk/neurodesk-aidamri--1.1.simg ok
rmurl images/neurodesk/neurodesk-aidamri--1.1.simg ok
INFO: file images/neurodesk/neurodesk-ants--2.3.1.simg
 INFO: adding https://d15yxasja65rk8.cloudfront.net/ants_2.3.1_20211204.simg to images/neurodesk/neurodesk-ants--2.3.1.simg
^[[0Jaddurl https://d15yxasja65rk8.cloudfront.net/ants_2.3.1_20211204.simg ok
^[[1G INFO: removing 3 oracle urls
rmurl images/neurodesk/neu%

so probably buffered and buffer was not flushed etc.