abathur / resholve

a shell resolver? :) (find and resolve shell script dependencies)

License: MIT License

Shell 17.71% Python 66.87% Nix 11.03% Roff 3.47% Makefile 0.92%
bash nix packaging packaging-scripts packaging-tool shell nixpkgs shell-scripting build-tool build-tools

resholve's Introduction

resholve references to external dependencies in shell scripts

resholve ensures shell script dependencies are declared, present, and don't break or shift if PATH changes. It helps turn shell scripts and libraries into reliable, self-contained packages you can use as building blocks.

resholve treats references to external commands (and sourced scripts) as build-blocking errors until you declare the dependency. Once they're all declared, it rewrites the references to absolute paths.
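To make the idea concrete, here is a minimal sketch in plain shell of "declare first, then use an absolute path". It is not resholve's implementation (resholve works on a parsed script, not on runtime lookups), and `resolve_cmd`, `demo_input`, and the `greet` executable are hypothetical names used only for illustration.

```shell
# Hypothetical helper: look up a command only in explicitly declared
# directories (never in the caller's PATH) and print its absolute path.
resolve_cmd() {
    name=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    # Undeclared dependency: fail loudly instead of guessing.
    printf 'error: %s is not provided by any declared input\n' "$name" >&2
    return 1
}

# Demo with a throwaway "input" directory holding one executable.
demo_input=$(mktemp -d)
printf '#!/bin/sh\necho hello\n' > "$demo_input/greet"
chmod +x "$demo_input/greet"

# A declared dependency resolves to an absolute path.
resolved=$(resolve_cmd greet "$demo_input")

# An undeclared one is an error, even if it exists elsewhere on PATH.
if resolve_cmd missing-tool "$demo_input" 2>/dev/null; then
    missing_status=0
else
    missing_status=1
fi
```

The design point the sketch mirrors is that a missing declaration is a hard failure at packaging time rather than a surprise at runtime.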

Comparisons:

  • a linker for bash scripts
  • patchelf for shell scripts

Convinced? Jump to the Quickstart. Otherwise:

  • If you want to understand what problems resholve addresses, read the next section.
  • If you want to see invocations and output, the Demos document is a good place to start.

What problem(s) does this solve?

I built resholve so Nix/Nixpkgs can have great shell packaging.

In the Nix ecosystem, resholve helps us:

  • discover and declare dependencies at package time instead of after runtime failures
    • keep unexpected versions of an executable or script from causing undefined behavior
  • avoid polluting PATH with all of a script's dependencies, which also means
    • no conflicts between tools different scripts need on PATH
    • no conflicts with other packages a user expects on PATH
    • no implicit dependency on the content of fragile rc/profile scripts
  • work directly with "normal" shell scripts
    • no polluting source scripts with template variables/syntax
    • no inlining shell scripts to readily inject absolute paths
    • no fragile .patch files
    • no fragile sed/awk text substitutions that might over-match

Note: resholve is a generic command-line program. Other ecosystems/toolchains could use it for similar benefits.

Quickstart

If you're packaging Shell with Nix, you'll want to use resholve's Nix API.

You can also use resholve's CLI directly.

Note: resholve is only packaged with Nix for now. Whether you use the CLI or the Nix API, you'll need to have Nix installed.

Nix API

Since resholve's Nix API/builders are included in nixpkgs, most Nix users can jump right in. Two good places to start:

Tip: If you're an experienced packager or write a lot of Shell, you may also want to read through resholve's Nix demo. It's terse, but it ~proves that Nix + resholve enable us to build shell packages that are so well-contained we can safely compose them even when they have conflicting dependencies.

CLI

resholve has an explicit-is-better-than-implicit philosophy, so its CLI is pretty verbose. You can use it directly, but it's more or less assumed that you'll use it through scripts or packaging toolchains. (You may not need to learn the CLI unless you're using it on scripts you can't build Nix expressions for, integrating it with other toolchains, packaging it, or contributing to resholve itself.)

If you're new to resholve, start with the demo shell.

If you just want resholve itself (no preconfigured demo environment), use the instructions for building/installing a development version or a stable version.

Note: However you obtain the resholve CLI, check man resholve for CLI usage.

Demo shell

The demo shell pulls in prerequisites for resholve's command-line demo. This demo illustrates resholve's basic features, invocation patterns, output, error messages, exit statuses, and how resholving a script changes it.

The easy way to run the demo is with Nix's experimental nix-command and flakes features enabled. The following command will load the demo shell environment and print more information on how to proceed:

nix develop github:abathur/resholve

Note: There's more on the demo's output format and a plaintext copy of the output in the Demos document.

Traditional `nix-shell` instructions

You can also use the demo via nix-shell if you clone the repository:

git clone https://github.com/abathur/resholve.git
cd resholve
nix-shell

Development versions

resholve's master branch is fairly stable. If you have Nix's experimental nix-command and flakes features enabled, you should be able to use it with any of the below:

# without cloning
nix build github:abathur/resholve
nix shell github:abathur/resholve

# from the root of a resholve checkout
nix build
nix shell

Traditional `nix-build` instructions

You can build resholve from a checkout with the traditional CLI:

git clone https://github.com/abathur/resholve.git
cd resholve
nix-build

Caution: The same isn't quite true of nix-shell, which will load the demo shell. This might be fine for your purposes, but keep in mind that it pre-populates some environment variables just for the demo.

Stable versions

You can get a cached stable version of resholve from Nixpkgs:

# new CLI/flakes
NIXPKGS_ALLOW_INSECURE=1 nix shell --impure nixpkgs#resholve
NIXPKGS_ALLOW_INSECURE=1 nix shell --impure github:nixos/nixpkgs#resholve

# traditional CLI
NIXPKGS_ALLOW_INSECURE=1 nix-shell -p resholve

Note: the high-quality shell parser resholve builds on uses python2. nixpkgs has taken steps to protect users from accidental run-time use of python2. resholve will still work at build-time for use in Nix packages. You only need the NIXPKGS_ALLOW_INSECURE env to use nixpkgs' copy of resholve in a shell. To be safe, don't run resholve on untrusted input.

(This isn't permanent. resholve should eventually be able to move to python3.)

Contributing

If you're looking to improve resholve or the broader ecosystem (resholve + binlore), feel free to open an issue, reach out to me on Matrix, or send an email.

There's much to do. Some of it is simple and straightforward. Some of it's creative and green-field. Some of it's difficult. I've focused on primary work at the expense of documenting an onramp for other contributors--but I'm happy to help you get started and use the opportunity to build the ramp as we go.

You can rebuild resholve by following the instructions for building a development version. resholve's tests don't run during the build, so you should also validate the codebase by running make ci.

Note: Some documentation updates entail updating generated files. I use make update for this, but this will also usually cause some churn in timings.md and demos.md. It's generally fine to skip committing those changes if they aren't meaningful. Feel free to bug me if you aren't comfortable doing this or need feedback.

Acknowledgements

  • resholve leverages the Oil shell's OSH parser and wouldn't be feasible without Andy Chu's excellent work on that project.

Limitations

Documentation

  • The manpage is currently the canonical reference to resholve's options and behavior; the only online format is plaintext. See #19.

Packaging

  • My short-term goal is to support packaging shell projects for the Nix package manager. As such, the current build process depends on Nix.

    If you're interested in using resholve without Nix, I'll appreciate contributions that fill in build support for other environments.

Known Gaps & Edge Cases in the utility itself

Because Shell is a very flexible, tricky language, resholve necessarily focuses on low-hanging-fruit tasks. Some of the remaining gaps will inevitably be closed over time, while others may stay out of reach. Please open an issue if you find a new one.

The main areas I'm currently aware of:

  • In any Nix build, resholve now blocks resolution of some fundamental external utilities (such as su and sudo) that are provided through setuid wrappers (under /run/wrappers) in NixOS. See #29 for more.

  • Because resholve makes assumptions about the behavior of some builtins in order to resolve scripts, it blocks if it looks like one is overridden by a function or alias. (This can likely be relaxed once I have a better sense of who/what/when/where/why/how these are overridden).

  • resholve doesn't have robust handling of variables that get executed like commands (this includes things like eval $variable and "$run_as_command" and $GIT_COMMAND status). There's some room for improvement here, but I also want to manage expectations because some cases are completely intractable without evaluating the script.

    • there's a first-level complication about seeing-through the variables themselves
    • and then a second-level issue with seeing-through double-quoted strings (for example, an eval )
  • fc -s has interesting behavior that makes it hard to account for:

    • If I run ls /tmp and then echo blah and then fc -s 'ls', it'll re-run that previous ls command.
      • If resholve rewrites ls to an absolute path, the fc -s command won't work as expected unless we also expand the ls inside the fc command.
    • If I run ls /tmp and then fc -s tmp=sbin, it'll run ls /sbin; if I then run fc -s ls=stat, it runs stat /sbin.
      • Accounting for this will be hard. There are no strict semantics--it can substitute arbitrary text which could be executable names or arguments or even just parts of them. We'd have to be very explicitly parsing things out, or maybe extracting them into a mock test and running it, to know what to do.

    For now this is unaddressed. It probably makes the most sense to just raise a warning about not handling fc and link to a doc or issue about it, but I'm inclined to put this off until someone asks about it.
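The variable-as-command forms listed above look like this in practice. This is an illustrative sketch only: the variable values are made up, and `GIT_COMMAND` is set to `echo` so the snippet runs without git. In each case the command name only exists once the variable is expanded at runtime, which is what a static tool cannot see in general.

```shell
# Made-up values standing in for whatever a real script would compute.
run_as_command="printf"
GIT_COMMAND="echo"                       # stand-in so the sketch runs without git
variable='printf "%s\n" from-eval'

# "$run_as_command": the whole command name comes from a variable.
out_plain=$("$run_as_command" '%s\n' from-variable)

# $GIT_COMMAND status: the variable supplies the command, literals the arguments.
out_prefix=$($GIT_COMMAND status)

# eval $variable: the variable holds an entire command line as text.
out_eval=$(eval $variable)

printf '%s\n' "$out_plain" "$out_prefix" "$out_eval"
```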

resholve's People

Contributors

abathur, grahamc, jayman2000

resholve's Issues

notes from nixpkgs#173885

I'll give you a step-by-step account of what I did and what I found; maybe this can help you to help me:

I first took a look at Kokada's derivation and saw that it was primarily centered on that "solutions" entry after adding the "mkDerivation". So I searched how to write one of those, which led me to this README that I've used as the source of truth.

My first doubt was, "Should I use the paths relative to input or output?" and well, that was written there, "$out-relative paths".

The second was an observation about Kokada's code. He was using "sh" instead of "bash" as the interpreter, which led me to ask him and then discover that bash includes a POSIX mode.

Then a first attempt, after adding just the "scripts", "interpreter", and the "inputs" I already knew about, led me to that regex parsing issue in oilshell, which I solved with the sed s|^(.+) =~ ([^\$].+) ]]|regexp='\2'; \1 =~ $regexp ]]| that I based on a commit of yours from 2021.
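For readers following along, this is a sketch of what that substitution does to a single line. Assumptions: the expression is run with extended regexes (`sed -E`), the input line is made up, and the leading `[^\$]` presumably exists to skip lines whose right-hand side is already a variable.

```shell
# The substitution from the post, stored verbatim in a sed script file so
# that no shell quoting gets in the way. -E (extended regex) is assumed.
workdir=$(mktemp -d)
cat > "$workdir/regex-to-var.sed" <<'EOF'
s|^(.+) =~ ([^\$].+) ]]|regexp='\2'; \1 =~ $regexp ]]|
EOF

# Made-up input line with an inline regex on the right of =~.
before='[[ "$iface" =~ ^wlan[0-9]+$ ]] && echo ok'
after=$(printf '%s\n' "$before" | sed -E -f "$workdir/regex-to-var.sed")

# The regex now lives in a variable and the test refers to $regexp.
printf '%s\n' "$after"
rm -rf "$workdir"
```

The rewritten line keeps the same behavior in bash, because `[[ x =~ $regexp ]]` treats the unquoted variable's value as the pattern.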

After that I had to deal with a Can't resolve dynamic argument in 'source' -- that's because airgeddon sources its scripts with source "${scriptfolder}${rc_file_name}" and well, I don't know how to correctly solve these even now 🙈 -- Should I replace this with relative-hardcoded-paths, I know that I can't use absolute-hardcoded-paths -- Kokada pointed me to "keep.source" but this didn't help, "fix" somehow did the trick for the "strings" and "known_pins", and for the "plugins" in a little hacky way. But I had to leave ".airgeddonrc" out of this and use "keep" on it. (very :lost: even now)

At this point, I had already taken a look at the resholve docs. And well, I started to add all the deps that weren't documented and that resholve was finding. That led me to these issues along the way:

  • "There is not yet a good way to resolve 'ping' in Nix builds." (This doesn't come with any suggestion; I've added it as an external)

  • "timeout" had a "15" where the lore was expecting an "executable argument" (No clue how to say it to skip only this parameter, so I changed it to "cannot", which I'm pretty sure I should not)

  • you repeat "execer" twice here:

    -- should be "wrapper" right? (This repeats in that README)

  • "tmux" is executing its arguments, even worse, airgeddon uses "send-keys" to throw commands in it. (What should I do here? Again I went with "cannot")

  • I had to come up with a hacky solution to keep optional dependencies optional, using the "external".

In the end, it ran, but the scripts in heredocs inside it were not using absolute references. They were even using the unpatched shebang. Airgeddon starts running one of these inside tmux and that wasn't finding the executables.

Originally posted by @PedroHLC in NixOS/nixpkgs#173885 (comment)

Refactor to more readily support custom handling of individual (mainly external) commands

Up to now, the set of "commands" whose arguments resholved sub-resolves has mainly been builtins (though it does handle a few external commands with similar call patterns), most of which are handled by a small number of patterns.

The slowly-expanding list of cases of external commands (xargs, sudo, tar, find, etc.) likely to have arguments that should ideally be resolved is already making it obvious that resholved needs better support for mapping commands to both re-usable generic handlers, and bespoke single-command handlers.

Parsing error on &!

zsh has a &! syntax for starting a disowned job, e.g.:

sleep 3s &!

Using this in a script causes resholve to fail with a parsing error:

$ resholve --interpreter "$SHELL" --path '' <<< 'sleep 3s &!'
  sleep 3s &!
             ^
[ stdinNone ]:2: error: Invalid word while parsing command

it's possible to cause a directive-parse error with the Nix API

fix = {
        "$TEST_CMP" = [ "'diff -u'" ];
};

Can cause:

There's a bad directive already in this file. You may need to
re-resholve it with the current version? Here's the context:

   parsing:  ../../../nix/store/qm3zqvbscrwzify2ny01189gg4bmm3n0-sharness-1.1.0-dev/share/sharness/sharness.sh
 directive:  '# resholve: fix $TEST_CMP:diff -u'
     error:  valid single-part fix directives: 'aliases', absolute path

This is probably the quoting confusing things, but remember to look at the broader question of whether there's a better way to support these multi-word fixes than having to embed a quote. There are almost none of these in the wild yet, so it's a good time to fix it...

Should(/can?) resholve do anything about paths in `test`/`[ ]` ?

This section in mons is giving resholve a little trouble:

lib='%LIBDIR%/liblist.sh'
[ ! -r "$lib" ] && { "$lib: library not found."; exit 1; }
. "$lib"

The ~intended approach for cases like this is to use a fix directive to replace $lib with the $out-relative path (lib/libshlist/liblist.sh) and then let resholve handle resolving them to absolute paths whenever it makes sense to do so--but in this case the result is roughly:

lib='/nix/store/arzpg3i5zh14yj1s7d20086a9rjmlfvx-mons-20200320/lib/libshlist/liblist.sh'
[ ! -r "lib/libshlist/liblist.sh" ] && { echo "lib/libshlist/liblist.sh: library not found."; exit 1; }
. "/nix/store/4pby883d4wypprmvkzz92c0c62iz4cxs-resholved-mons-20200320/lib/libshlist/liblist.sh"

Then the check fails at runtime.
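The failure comes from the working directory: a `$out`-relative path inside `[ -r ... ]` is checked against wherever the script happens to run. A minimal sketch, using a temporary directory as a stand-in for the package's `$out` (the layout is copied from the example above; everything else is hypothetical):

```shell
# Stand-in for the package's $out directory.
out=$(mktemp -d)
mkdir -p "$out/lib/libshlist"
: > "$out/lib/libshlist/liblist.sh"

# An unrelated working directory, as a user of the script would have.
workdir=$(mktemp -d)

# $out-relative path: only readable if the cwd happens to be $out.
relative_ok=$(cd "$workdir" && if [ -r "lib/libshlist/liblist.sh" ]; then echo yes; else echo no; fi)

# Absolute path: independent of the cwd.
absolute_ok=$(cd "$workdir" && if [ -r "$out/lib/libshlist/liblist.sh" ]; then echo yes; else echo no; fi)

printf 'relative: %s, absolute: %s\n' "$relative_ok" "$absolute_ok"
# prints "relative: no, absolute: yes"
```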

I'm not certain, but I imagine it's common enough to be worth handling if we can. That said, we definitely can't resolve every test -f or test -r etc., so the question is more about how we'd narrow the scope to ensure we're only resolving them when it makes sense.

I guess the two main candidate approaches would be:

  • Treat test with file/dir flags as its own thing that resholve cares about (i.e., require every one of them to either resolve to a known file at resolve-time, or require a directive exempting the path).
  • Automatically resolve paths matching some heuristics (like actually resolved during this invocation, present in inputs/scripts, present in $out?) and ignore the rest

Maybe it's worth digging through Andy's real-world shell collection to see which uses of the various test file/dir flags turn up and how these approaches stack up against them.

investigate/develop a buildInputs setup hook to resholve all outputs of a Nix build

@grahamc pointed out that resholved may also be helpful as a Nix setup hook.

I think this is a natural progression (and will probably be a good stress test). I'm a little unsure what this should look like, so I'm optimistic it'll be a straightforward contribution for someone with a little more perspective than I have on the Nix ecosystem.

A few mixed thoughts:

  1. It might be useful, performance-wise, to avoid eagerly resolving some things? (wrappers, nix-shell scripts, shebangless scripts with a .sh extension, multiple scripts that all symlink to a single file and use $0 to execute differently)

  2. I don't have any sense from a distance how configurable the hook interface needs to be. If these tasks rarely require overrides or special settings, it may be trivial to implement this at any time. If it needs to be able to customize significantly, we'll need to be a little more intentional about the interface (and it would probably be best to do that alongside the QC pass on the existing APIs.)

Document how to get a list of builtins for common compatible shells

Once I add support for overriding the resolved builtins in #10, I think it'd be helpful to document/list how a user might generate an appropriate list.

For now this is mostly a braindump of ways I'm aware of that might yield an automatic list of builtins for some shells for refining later.

  • enable: bash dgsh zsh
  • builtin: ksh
  • builtin --names: fish
  • builtins: csh, tcsh
  • compgen -b or compgen -A builtin: bash
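As a starting point for bash only, the two bash entries above can be turned into a plain list of names like this. It is a sketch rather than a vetted recipe: it assumes `bash` is on PATH and that no builtins have been disabled in the bash that gets invoked.

```shell
# `compgen -b` is itself a bash builtin, so ask bash to run it. This works
# from any POSIX-like shell and prints one builtin name per line.
bash_builtins=$(bash -c 'compgen -b' | sort)

# `enable` with no arguments prints lines like "enable cd"; strip the
# prefix to get bare names.
bash_builtins_via_enable=$(bash -c 'enable' | sed 's/^enable //' | sort)

printf '%s\n' "$bash_builtins"
```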

idiom for errors conditional on other resolutions?

I'll start off by saying that I suspect this fruit is too high up the tree to bother with.

I stumbled on the following in bats-core:

local hostname="${HOST:-}"
[[ -z "$hostname" ]] && hostname="${HOSTNAME:-}"
[[ -z "$hostname" ]] && hostname="$(uname -n)"
[[ -z "$hostname" ]] && hostname="$(hostname -f)"

The most-principled resolution would be to only require one of these (since the idea of needing to try multiple is exactly the kind of problem resholve eats for lunch). resholve can't know about the initial environment; the best it could do is stop after the first that it finds in the script/env.

The simple approach is to just make the user choose whether they want to supply/exempt each--this will inevitably be the only choice for a while.

For resholve to do this on its own, it'd need to evaluate these conditionals--and I'm very sure that's too far up the tree.

Single-source documentation where reasonable?

I don't know much about this and a few casual searches haven't turned up many obvious options. It smells like something I'll spin my wheels a lot on.

resholve is largely a solo effort so far, so I'm keeping documentation support fairly modest to avoid meta-work keeping multiple copies/variants in sync, or spinning my wheels on cobbling together a single-sourcing workflow.

Current docs are:

  • The --help flag provides a terse overview of flags and their purpose. (I'm not super happy with ConfigArgParse's default help formatting, and keeping this reference terse mitigates my complaints with it).
  • The manpage (currently hand-written in mdoc) is the only canonical reference.

If you have experience single-sourcing that directly applies or transfers to --help, manpage, README.md, and possibly an external site at some point, help bootstrapping the workflow/toolchain here would be much appreciated.

It's worth discussing the approach before diving in, but I think my main stipulation is that I don't want to single-source in a way that is crappy/suboptimal for all or most of the outputs (i.e., I don't want to just throw the manpage through a few tools to generate plaintext, html, and markdown).

Exec lacks a required argument

Got the following traceback, command included.

[nix-shell:~/src/nb]$ resholve --interpreter=$(which bash) --inputs='' --keep='source:$NBRC_PATH;$__file $PAGER' nb
    eval "describe \"${_alias}\" \"\$____describe_${_subcommand}\""
         ^
/Users/toonn/src/nb/nb:1116: FEEDBACK WANTED: Letting quoted 'eval' through for now. Not sure if this is 'right' or not. Weigh in @ https://github.com/abathur/resholve/issues/2
    eval "_${_alias}() { _${_subcommand} \"\${@}\"; }"
         ^
/Users/toonn/src/nb/nb:1117: FEEDBACK WANTED: Letting quoted 'eval' through for now. Not sure if this is 'right' or not. Weigh in @ https://github.com/abathur/resholve/issues/2
Traceback (most recent call last):
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1983, in <module>
    sys.exit(punshow())
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 621, in punshow
    epilogue=args.epilogue,
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 485, in resolve_script
    script_path, shebang=shebang, prologue=prologue, epilogue=epilogue
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1394, in __init__
    self.Visit(node)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1958, in Visit
    self.VisitChildren(node)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1973, in VisitChildren
    self.Visit(item)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1958, in Visit
    self.VisitChildren(node)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1977, in VisitChildren
    self.Visit(child)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1958, in Visit
    self.VisitChildren(node)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1973, in VisitChildren
    self.Visit(item)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1958, in Visit
    self.VisitChildren(node)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1973, in VisitChildren
    self.Visit(item)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1958, in Visit
    self.VisitChildren(node)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1973, in VisitChildren
    self.Visit(item)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1958, in Visit
    self.VisitChildren(node)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1977, in VisitChildren
    self.Visit(child)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1958, in Visit
    self.VisitChildren(node)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1973, in VisitChildren
    self.Visit(item)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1957, in Visit
    self._Visit(node)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1951, in _Visit
    self._visit_command_Simple(node)
  File "/nix/store/v5x6bbli7j8mn58mmca54jy3rwcy4rkz-resholve-0.4.2/bin/.resholve-wrapped", line 1661, in _visit_command_Simple
    node,
Exception: ('Trying to handle exec but it lacks a required argument', (command.Simple
  words: [(compound_word parts:[(Token id:Id.Lit_Chars span_id:22274 val:exec)])]
  redirects: [
    (redir
      op: (Token id:Id.Redir_Great span_id:22276 val:'3>')
      loc: (redir_loc.Fd fd:3)
      arg: 
        (compound_word
          parts: [
            (double_quoted
              left: (Token id:Id.Left_DoubleQuote span_id:22278 val:'"')
              parts: [
                (braced_var_sub
                  token: (Token id:Id.VSub_Name span_id:22280 val:_temp_file)
                  spids: [22279 22281]
                )
              ]
              spids: [22278 22282]
            )
          ]
        )
    )
  ]
  do_fork: T
))
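For context, the AST in the exception describes an `exec` with a redirection and no command word (roughly `exec 3>"${_temp_file}"`). That form is valid shell: it opens the file descriptor in the current shell instead of replacing the process. A small sketch of the construct, with a made-up temp file and message:

```shell
_temp_file=$(mktemp)

# No command after exec: the redirection applies to the current shell,
# so fd 3 stays open for later commands.
exec 3>"${_temp_file}"
printf 'written via fd 3\n' >&3

# Close fd 3 again.
exec 3>&-

contents=$(cat "${_temp_file}")
rm -f "${_temp_file}"
printf '%s\n' "$contents"
```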

auto-generate ~graphical version of demo?

Edit: This was initially about setting up automation to ensure the sample of demo output in the repo was kept in sync with the real demos/tests. That work's been done, but it might still be nice to have an animated/graphical demo if it adds value and we can readily automate it.

I'd like to better-automate extracting fresh demos (and sample output code blocks?) from the real test suite (there's the cli demo in tests/demo.bats, and the Nix ~integration tests in ci.nix). Mostly so I don't have to either waste time regenerating them every time I push to master, or constantly worry about whether they're out of date.

There's an underlying question about what the ideal form of the demos is. My thinking at the moment is that this is a more-the-merrier proposition, and anything is better than nothing. Main options:

  • a static output ~log, ideally with some form of highlighting, maybe:
    • In general, this is done. It's not fancy or HTML, but regen of demos.md has been automated for a while now.
    • convert the ansi to html with ansi2html or similar (also want a stylesheet; the default output colors suck)
    • the Makefile handles the broad strokes of "automating" this within my normal dev workflows (but we could probably skip this if all of the right outputs were built in CI and published from there?)
    • add a new outputter for the demos that is designed for this (see tests/demo.bash; could change RESHOLVE_DEMO into a string and use a switch block for handling.)
  • a graphical format, either like: a slide-show with one slide per test, or a rolling ~screencast-style: svg (preferred) / JS-rendered non-image / gif; potential resources:

Other potential resources:
https://www.r-bloggers.com/2020/04/rendering-your-readme-with-github-actions/

AttributeError: 'str' object has no attribute 'first_spid'

When updating Nixpkgs recently, some CI caught this issue with one of our resholved scripts:

    Traceback (most recent call last):
      File "/nix/store/33xkmpzvw86kyv564lp0xinjjxnn471m-resholve-0.6.6/bin/.resholve-wrapped", line 4478, in <module>
        sys.exit(punshow())
      File "/nix/store/33xkmpzvw86kyv564lp0xinjjxnn471m-resholve-0.6.6/bin/.resholve-wrapped", line 903, in punshow
        resolve_cmdlikes()
      File "/nix/store/33xkmpzvw86kyv564lp0xinjjxnn471m-resholve-0.6.6/bin/.resholve-wrapped", line 2902, in resolve_cmdlikes
        cmdlike.resolve()
      File "/nix/store/33xkmpzvw86kyv564lp0xinjjxnn471m-resholve-0.6.6/bin/.resholve-wrapped", line 2874, in resolve
        self._resolve_invocations(solution)
      File "/nix/store/33xkmpzvw86kyv564lp0xinjjxnn471m-resholve-0.6.6/bin/.resholve-wrapped", line 2840, in _resolve_invocations
        source.look_for_external_sub_execution(self.name, invocation)
      File "/nix/store/33xkmpzvw86kyv564lp0xinjjxnn471m-resholve-0.6.6/bin/.resholve-wrapped", line 3775, in look_for_external_sub_execution
        parsed = parser.parse_known_args(invocation.args)
      File "/nix/store/w05yrq8d41vbd7cz3gyq3ysidz2qyfdp-python-2.7.18/lib/python2.7/argparse.py", line 1737, in parse_known_args
        namespace, args = self._parse_known_args(args, namespace)
      File "/nix/store/w05yrq8d41vbd7cz3gyq3ysidz2qyfdp-python-2.7.18/lib/python2.7/argparse.py", line 1943, in _parse_known_args
        start_index = consume_optional(start_index)
      File "/nix/store/w05yrq8d41vbd7cz3gyq3ysidz2qyfdp-python-2.7.18/lib/python2.7/argparse.py", line 1883, in consume_optional
        take_action(action, args, option_string)
      File "/nix/store/w05yrq8d41vbd7cz3gyq3ysidz2qyfdp-python-2.7.18/lib/python2.7/argparse.py", line 1811, in take_action
        action(self, namespace, argument_values, option_string)
      File "/nix/store/33xkmpzvw86kyv564lp0xinjjxnn471m-resholve-0.6.6/bin/.resholve-wrapped", line 339, in __call__
        Invocation(words=values),
      File "/nix/store/33xkmpzvw86kyv564lp0xinjjxnn471m-resholve-0.6.6/bin/.resholve-wrapped", line 3131, in __new__
        self.first_spid = firstword.first_spid
    AttributeError: 'str' object has no attribute 'first_spid'

It only appeared for this one script. I'll gladly send you said script (and the Nix expression), just reach out to me on Matrix when you have time.

Support overriding the builtin list

I don't have any intrinsic objection to supporting shells other than bash, but I have very little interest in significantly complicating resholved to do so (especially in the short run.)

That said, I think the short-term Goldilocks change here is to support completely overriding the builtin list, which should enable resholved to resolve a fair fraction of the script base of a large number of traditional shells.

Since this seems likely to be a common request, I'm interested in finding space in the API for it now.

Support for gnused's `e` command extension?

I'm in the process of implementing general support for resolving of commands that appear in the arguments to other commands, which should be available in the coming weeks.

sed has an instance of this behavior via its e command that isn't as straightforward to support as others.
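A small illustration of the behavior in question, for readers who haven't met it. This assumes GNU sed (the `e` command is a GNU extension), and the command being run is a made-up example; the snippet skips itself on other sed implementations.

```shell
# GNU sed only: `e command` runs the command through the shell and emits
# its output before the current line. The command name (echo here) lives
# inside the sed script text, not in the shell script's own syntax.
if sed --version 2>/dev/null | grep -q 'GNU'; then
    sed_e_output=$(printf 'data\n' | sed '1e echo injected')
else
    sed_e_output='(skipped: not GNU sed)'
fi

printf '%s\n' "$sed_e_output"
```

With GNU sed this prints `injected` followed by `data`, which shows why a resolver would have to parse the sed script itself to find the embedded command.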

I managed to find a way to raise an error when this form is encountered:

I don't know how to handle sed `e` commands yet--sorry :(

Next step: - See the feedback issue for a workaround

https://github.com/abathur/resholve/issues/28

That said, I get the impression this feature is very rarely used, so I plan to leave it as-is until/unless users show up documenting real use/test-cases in this issue.

Ease path to adding new external command parsers

I'd like to see a way to .override resholve from nixpkgs to effectively throw some arbitrary extra code into ExternalCommandParsers. The override should, of course, also affect resholve.writeScript and such. If this could be made part of the "solution" rather than an override, all the better.

It's a comparatively easily implemented relief valve to at least allow a user to do something to handle a situation where a command does execute its arguments and there's no parser for it in resholve. Currently there's no way forward in that case other than:

  • externally resolving the relevant command somehow and tricking resholve into thinking resolution isn't necessary
  • maintaining a patch file to apply to resholve during build
  • maintaining a fork of resholve

Longer-term, it would be good to develop some kind of simple language for users to describe a program's argument structure that covers most cases and is a more stable interface, but at least this works as a near-universal fallback for users willing to put in the effort.

Better support for alias definitions

I'm adding and promptly closing this issue to document progress towards an initial release. This work is already complete and merged to master.

For posterity: Prior to merging a347d9d, resholved recognized alias definitions, but didn't attempt to resolve commands inside them. This commit introduces the ability to resolve them. Because resolving aliases is "wrong" under some conditions (i.e., shell scripts designed to be sourced into a user profile, where they should most-likely resolve from PATH at runtime) this is gated by a flag --resolve-aliases.

Nix API for `keep` breaks on empty attrset

This breaks:

  solutions.default = {
    scripts = [ "${basename}" ];
    interpreter = "${bash}/bin/bash";
    keep = {};
  }
       > no Makefile, doing nothing
       > installing
       > post-installation fixup
       > /nix/store/pphlhsh6qpxsyp8c7j6ja18gp10s41cw-check-0.0.0 /build/check
       > Traceback (most recent call last):
       >   File "/nix/store/x1988xrwidf8nsfls5kd7fjby6dwlnqc-resholve-0.5.1/bin/.resholve-wrapped", line 1990, in <module>
       >     sys.exit(punshow())
       >   File "/nix/store/x1988xrwidf8nsfls5kd7fjby6dwlnqc-resholve-0.5.1/bin/.resholve-wrapped", line 573, in punshow
       >     directives.keep.update(group)
       > TypeError: 'NoneType' object is not iterable

Override setting for letting user specify a command to resolve in invocations of another

I think it's a good idea to develop a generic override option as a relief valve for the intent in #6. I'm not sure what the API will look like, but this override would amount to letting the user express the following intent:

attempt to resolve <to_resolve> if it is found as an argument to <resolve_in>

This will hopefully avoid situations like...

  • Resholved gets strapped to a growth-rocket but loses many chances at a good first impression for users who end up blocked on resolving small numbers of edge-case commands.
  • We get stuck on a treadmill weighing the project down with special rules for handling rare edge-case commands.
  • We get stuck in quicksand trying to reach "good-enough" for a common/important command that ends up being a nightmare to support. (commands with big version/platform differences, very idiomatic syntax, etc.)

Nix: inability to resolve some ~wrapped / impure setuid executables such as `sudo`

Nix has to special-case some setuid executables, and this disrupts resholve's ability to resolve them to absolute paths. (There are a number of interlocking issues here, and I suspect it will take some time--and some willingness to be squeaky wheels--to get this fixed in Nixpkgs. I vaguely plan to document these issues--but for now I'm just outlining.)

I don't have enough Nix(OS)/nixpkgs system-level perspective to have a good handle on all of this. I get the impression there isn't a canonical list, but guessing from the run wrappers on my own NixOS system, this seems like a fair list:

chsh dbus-daemon-launch-helper fusermount3 fusermount kcheckpass kwin_wayland mount newgidmap newgrp newuidmap passwd ping pkexec polkit-agent-helper-1 sg start_kdeinit sudoedit sudo su umount unix_chkpwd

In the near future, I'll update resholve to raise the following error for a cross-platform subset (ping chsh newgrp passwd su sudo mount umount) of these whenever NIX_BUILD_TOP is in the environment:

There is not yet a good way to resolve 'sudo' in Nix builds. Your feedback may help me (and the Nix community) understand what the best course of action is here.

See https://github.com/abathur/resholve/issues/29 for info, feedback, and potential workarounds.

In the short term, your best bets for working around this are:

  1. Add a fake directive via the CLI or the Nix API. Here's an example of what this would look like for sudo:
    • CLI: --fake 'external:sudo'
    • Nix:
      fake = {
        external = [ "sudo" ];
      };
  2. Use resholve's prologue option to inject (at the head of the script) some refinement based on your context:
    • A run-time check that will abort execution if the lookup fails.
    • Add/change the PATH to ensure the lookup will succeed.
    • Define a function or alias that executes any specific absolute path you need.
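As a rough illustration of the three prologue refinements listed above, here is a minimal bash sketch. The helper names (`require_cmd`, `prepend_path`) and the `/run/wrappers/bin` directory are assumptions made for this example; they are not part of resholve, and the right values depend on your context.

```shell
# Hypothetical prologue helpers; names and paths are illustrative only.

# Run-time check: report failure if a command cannot be looked up.
require_cmd() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "error: required command '$1' not found" >&2
    return 1
  fi
}

# Add a directory to the front of PATH unless it is already listed.
prepend_path() {
  case ":$PATH:" in
    *":$1:"*) ;;                          # already present, leave PATH alone
    *) PATH="$1${PATH:+:$PATH}" ;;
  esac
}

# Pin a name to one absolute path (assumed wrapper location, commented out):
# sudo() { /run/wrappers/bin/sudo "$@"; }

# A prologue body built from these might read:
# prepend_path /run/wrappers/bin
# require_cmd sudo || exit 1
```

Which of the three fits best depends on whether you would rather fail early, widen the lookup, or hard-code one location.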

In some more limited cases, you may know that you have access to an executable that doesn't actually need a setuid wrapper and you really just need resholve to get out of your way. If you're really sure, you can tell it to back off by adding a fix directive via the CLI or the Nix API. Here's an example of what this would look like for sudo:

  • CLI: --fix 'sudo'

  • Nix:

    fix = {
        sudo = true;
    };

Executables could be discovered by running the script with PATH pointing at FUSE

Instead of using a parser to parse the text of the shell script, you could point PATH at a FUSE filesystem and run the script to discover the executables the script runs. One might be concerned that one has to run the script, which might have annoying side-effects or be very slow - but that doesn't have to be the case, since you don't have to run the real executables: You can have the FUSE filesystem respond on-demand with stubs that do nothing for every executable that the script runs. This is kind of like what the Tup build system does. https://github.com/gittup/tup

Of course this is not a serious suggestion, parsing the shell script as in your current approach is certainly better (this dynamic way wouldn't even support the most basic functions). I just mention this because you might find it mildly amusing, because I implemented something like resholved using that strategy in this SIGBOVIK paper: https://github.com/catern/rsyscall/tree/master/research/sigbovik2020 https://github.com/catern/rsyscall/blob/master/research/sigbovik2020/paper.pdf
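The stub idea can be approximated without FUSE, at the cost of listing candidate names up front instead of answering on demand. The bash sketch below is only an illustration under that assumption; `discover_with_stubs` is a hypothetical name, and shell builtins and redirections in the script still run for real, so it should only be pointed at scripts you trust.

```shell
# Run a script with PATH pointing only at a directory of do-nothing stubs,
# then print the names of the stubs that were invoked.
# Usage: discover_with_stubs /path/to/script name...
discover_with_stubs() {
  local script=$1; shift
  local stubdir log name
  stubdir=$(mktemp -d) || return 1
  log=$stubdir/.invoked
  : >"$log"
  for name in "$@"; do
    # Each stub appends its own name to the log and exits successfully.
    printf '#!/bin/sh\necho "${0##*/}" >>"%s"\n' "$log" >"$stubdir/$name"
    chmod +x "$stubdir/$name"
  done
  # Source the script in a subshell so the PATH change does not leak out.
  ( PATH=$stubdir; . "$script" ) >/dev/null 2>&1
  sort -u "$log"
  rm -rf "$stubdir"
}
```

Names that are not in the candidate list simply fail to resolve, which is the gap an on-demand filesystem would close.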

variables run as commands (first word)

I'm not sure how resholved should handle variables used as a command (i.e., the first word of a statement). I'm already aware of a number of patterns here and have found real-world examples in common programs where not handling these is "wrong", but I could use some feedback on how to handle it (and perhaps how often people run into it).

A few such patterns include:

  • Storing a frequently-used command in a variable name to make it easier to change or override and replacing uses of the command with the variable.
  • When a script has to do a little work to figure out what the right command is (i.e., it changes depending on user config, command-line options, platform, software versions, sanity checks of command behavior/options/output).
  • Running positional parameters as commands. This is done for multiple distinct reasons, like wrapping an execution with other logic, exposing script functions for external execution, etc. (I think most of these will be okay-ish, so they're currently exempted from the warning. For them to be robustly okay, resholved would probably have to figure out how and where they're being invoked, and then try to resolve the appropriate positional arg of each invocation.)

This pattern, at least with named variables, will now print a warning/request for feedback, such as:

  $GIT_PROGRAM status
  ^~~~~~~~~~~~
[ stdinNone ]:7: FEEDBACK WANTED: Letting dynamic command (first-word variable) through for now. Not sure if this is 'right' or not. ...

As the fact that resholved issues a warning here suggests, it isn't terribly hard to spot this behavior. Fixing it is less straightforward. Continuing the $GIT_PROGRAM example, resholved would need to either replace instances of the variable with a resolved program, or figure out where the variable is set and resolve whatever it's set to (but only if it's not something like a builtin/function/alias?).

Trying to do this automatically probably entails a jump in complexity and greater risk of breaking a resolved script in some way.

It is easy to throw an error here and just make/let the user override it, though I'm not sure how the override API should approach it...

  • It could enable the user to specify a replacement for the variable, but it seems tricky to do this robustly (do we just replace instances used as a command? also instances used inline? does the script build up the variable iteratively in multiple steps to add flags and options?)
  • The API could specify a target variable, and a string/identifier to try and resolve wherever it's defined. This might be a little more robust against the script iteratively constructing the command, but I can imagine it breaking down in other ways (the script uses the variable for flow control, or performs string operations like substitution or positional indexing/slicing).
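To make the stakes concrete, the bash sketch below shows a first-word variable picking up different executables as PATH changes, and then what pinning the assignment to an absolute path would look like. `TOOL_PROGRAM` and `demo-tool` are made-up stand-ins for something like `$GIT_PROGRAM`; this only demonstrates shell behavior and says nothing about how resholved would implement either option.

```shell
# Show that a command stored in a variable is looked up on PATH at run time.
demo_dynamic_command() {
  local a b TOOL_PROGRAM
  a=$(mktemp -d) || return 1
  b=$(mktemp -d) || return 1
  printf '#!/bin/sh\necho "tool from A"\n' >"$a/demo-tool"
  printf '#!/bin/sh\necho "tool from B"\n' >"$b/demo-tool"
  chmod +x "$a/demo-tool" "$b/demo-tool"

  TOOL_PROGRAM=demo-tool          # bare name, resolved when it is run
  ( PATH=$a; $TOOL_PROGRAM )      # runs the copy in directory A
  ( PATH=$b; $TOOL_PROGRAM )      # same script text, now the copy in B

  TOOL_PROGRAM=$a/demo-tool       # assignment rewritten to an absolute path
  ( PATH=$b; $TOOL_PROGRAM )      # PATH no longer affects which copy runs

  rm -rf "$a" "$b"
}
```

Rewriting the assignment corresponds loosely to the second option above; it stays robust only as long as the script does not rebuild or slice the variable afterwards.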

explore using resholve in a Nix bundle/containerization workflow

The ability to bundle or containerize resholved scripts and dependencies for end-use without Nix installed might make resholve more attractive to Shell-based projects that aren't otherwise part of the Nix ecosystem.

Cross-platformable stuff is ideal, but I imagine this will need to percolate for a while regardless--so any starting point is helpful.

variable-name normalization not working in at least one context

I was drafting a quick take at a resolution for nixos-rebuild.sh and noticed that it wasn't recognizing $EDITOR, probably because it's in a mixed dollar-sub form, or something?

$ osh -n -c 'exec ${EDITOR:-nano} "$NIXOS_CONFIG"'
(C {<exec>} 
  {
    (braced_var_sub
      token: <Id.VSub_Name EDITOR>
      suffix_op: (suffix_op.Unary op_id:Id.VTest_ColonHyphen arg_word:{<nano>})
    )
  } {(DQ ($ Id.VSub_DollarName '$NIXOS_CONFIG'))}
)
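For readers less familiar with the `:-` form in the AST above, this small bash sketch shows what the expansion evaluates to. The function name `editor_command` is hypothetical. The point is only that the first word after `exec` is either the value of `EDITOR` or the literal default `nano`, and which one is not known until run time.

```shell
# Print the command name that `exec ${EDITOR:-nano}` would try to run:
# the value of EDITOR, or "nano" when EDITOR is unset or empty.
editor_command() {
  printf '%s\n' "${EDITOR:-nano}"
}
```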

underlying parse failure on `coproc` with group/braced command-list form

nix-repl> pkgs.resholve.version                                                                                                                           
"0.8.0"

nix-repl> :b pkgs.resholve.writeScript "foo" { interpreter = "${pkgs.bash}/bin/bash"; inputs = [];} "coproc TEST { echo foo; }; cat </dev/fd/\${TEST[0]}"
error: builder for '/nix/store/vf27cqizwz0274k8dzlvrczcbzmph14g-foo.drv' failed with exit code 1;
       last 8 log lines:
       > [resholve context] : invoking resholve with PWD=/build
       > [resholve context] RESHOLVE_LORE=/nix/store/a9sjzsny6c1hfz9764h0522cpgzhq4xi-more-binlore
       > [resholve context] RESHOLVE_INPUTS=
       > [resholve context] RESHOLVE_INTERPRETER=/nix/store/30j23057fqnnc1p4jqmq73p0gxgn0frq-bash-5.1-p16/bin/bash
       > [resholve context] /nix/store/illp9406jkdy0yncvh4l8x9qw8r7c8jk-resholve-0.8.0/bin/resholve --overwrite /nix/store/lisy9259fqy5vsa7g7gk863qj6wqsgdr-foo
       >   coproc TEST { echo foo; }; cat </dev/fd/${TEST[0]}
       >                           ^
       > /nix/store/lisy9259fqy5vsa7g7gk863qj6wqsgdr-foo:2: error: Unexpected right brace
       For full logs, run 'nix log /nix/store/vf27cqizwz0274k8dzlvrczcbzmph14g-foo.drv'.
$ bash -c 'coproc TEST { echo foo; }; cat </dev/fd/${TEST[0]}'
foo

Fails on commands named "pass"

When you use a command named pass (eg. from password-store) in a script, it crashes resholve:

$ cat test.sh 
pass
$ resholve ./test.sh --interpreter none --path ""
Traceback (most recent call last):
  File "/nix/store/bw25mwzbxyyxk9djlsvfrk4jz2kf4hyz-resholve-0.4.2/bin/.resholve-wrapped", line 1962, in <module>
    sys.exit(punshow())
  File "/nix/store/bw25mwzbxyyxk9djlsvfrk4jz2kf4hyz-resholve-0.4.2/bin/.resholve-wrapped", line 621, in punshow
    epilogue=args.epilogue,
  File "/nix/store/bw25mwzbxyyxk9djlsvfrk4jz2kf4hyz-resholve-0.4.2/bin/.resholve-wrapped", line 485, in resolve_script
    script_path, shebang=shebang, prologue=prologue, epilogue=epilogue
  File "/nix/store/bw25mwzbxyyxk9djlsvfrk4jz2kf4hyz-resholve-0.4.2/bin/.resholve-wrapped", line 1375, in __init__
    self._make_parser(parse_ctx, script, arena)
  File "/nix/store/9y21a7f4a91l1hjkzrrdjpdhar86jwym-python2.7-oildev-unstable-2020-03-31/lib/python2.7/site-packages/oil/core/main_loop.py", line 188, in ParseWholeFile
    node = c_parser.ParseLogicalLine()  # can raise ParseError
  File "/nix/store/9y21a7f4a91l1hjkzrrdjpdhar86jwym-python2.7-oildev-unstable-2020-03-31/lib/python2.7/site-packages/oil/osh/cmd_parse.py", line 2039, in ParseLogicalLine
    node = self._ParseCommandLine()
  File "/nix/store/9y21a7f4a91l1hjkzrrdjpdhar86jwym-python2.7-oildev-unstable-2020-03-31/lib/python2.7/site-packages/oil/osh/cmd_parse.py", line 1901, in _ParseCommandLine
    child = self.ParseAndOr()
  File "/nix/store/9y21a7f4a91l1hjkzrrdjpdhar86jwym-python2.7-oildev-unstable-2020-03-31/lib/python2.7/site-packages/oil/osh/cmd_parse.py", line 1831, in ParseAndOr
    child = self.ParsePipeline()
  File "/nix/store/9y21a7f4a91l1hjkzrrdjpdhar86jwym-python2.7-oildev-unstable-2020-03-31/lib/python2.7/site-packages/oil/osh/cmd_parse.py", line 1777, in ParsePipeline
    child = self.ParseCommand()
  File "/nix/store/9y21a7f4a91l1hjkzrrdjpdhar86jwym-python2.7-oildev-unstable-2020-03-31/lib/python2.7/site-packages/oil/osh/cmd_parse.py", line 1707, in ParseCommand
    enode = self.w_parser.ParseCommandExpr()
  File "/nix/store/9y21a7f4a91l1hjkzrrdjpdhar86jwym-python2.7-oildev-unstable-2020-03-31/lib/python2.7/site-packages/oil/osh/word_parse.py", line 892, in ParseCommandExpr
    grammar_nt.command_expr)
  File "/nix/store/9y21a7f4a91l1hjkzrrdjpdhar86jwym-python2.7-oildev-unstable-2020-03-31/lib/python2.7/site-packages/oil/frontend/parse_lib.py", line 365, in ParseOilExpr
    pnode, last_token = self.e_parser.Parse(lexer, start_symbol)
  File "/nix/store/9y21a7f4a91l1hjkzrrdjpdhar86jwym-python2.7-oildev-unstable-2020-03-31/lib/python2.7/site-packages/oil/oil_lang/expr_parse.py", line 297, in Parse
    self.push_parser.setup(start_symbol)
  File "/nix/store/9y21a7f4a91l1hjkzrrdjpdhar86jwym-python2.7-oildev-unstable-2020-03-31/lib/python2.7/site-packages/oil/pgen2/parse.py", line 111, in setup
    self.stack = [_StackItem(self.grammar.dfas[start], 0, newnode)]
AttributeError: 'NoneType' object has no attribute 'dfas'

My uninformed guess is that oil's parser likely considers pass a keyword and returns a different object for it.

ecosystem tool: command-variant/lineage/version detector

I haven't really considered specifics, but it would be nice to have a tool that can help figure out what variant/version of a command/utility a given script is using just based on its invocations.

Thoughts:

  • This somewhat overlaps with the command parsers resholve is already accruing (and which it might be feasible to eventually separate out into argfarce).
  • Hand-writing parsers scales about as poorly as you'd expect. This also somewhat overlaps with the ~help/manpage-parsing work that drives explainshell.
    • Though, to be fair, the set of identically-named divergent CLIs relevant for Shell scripting is probably not growing all that quickly...
  • This would probably accept invocations on stdin or in a file and indicate any utilities with this name that it can't rule out. Maybe there's an overall judgement, and then a judgement per invocation?
    • It should reflect awareness that sometimes scripts design around this--their source may contain invocations that are idiomatic to multiple different variants and then detect the system or feature-test the utility to decide which ones it should use.
    • Ideally, this tool would also attempt to account for history as well as the present. It's less-critical than variant detection, but it could still be a big help to know that a script uses a flag that is only supported in some specific versions.
  • Even if the utility can't be accurate enough to drive decisions automatically from within resholve, I think it can at least serve as an automated or manual ~sidecar tool used when resolving new projects to make sure you're plugging in compatible utilities (or noticing if the script is completely incompatible with the variants you have available).
  • The ~first target is things like BSD | GNU versions, but ideally this would be sensitive to drop-in replacements, reimplementations, forks, etc.
  • In the longer run, there might be some synergy with binlore to make collecting high-quality data going forward easier.
    • Currently, resholve's Nix API invokes binlore. In the longer run, I kind of imagine nixpkgs, for example, automatically collecting binlore for everything that outputs executables.
    • I vaguely hope that some of the information resholve needs can be ~formalized in a generic format that projects themselves could carry (whether the format is resholve-specific or not). Version, target shell(s), dependencies, supported arguments, etc.

Should "unused" directives cause blocking errors?

I'm not sure whether it would be simple or painful to implement (so I don't want to frame this as something that will happen, or will happen rapidly), but I've idly wondered a little if resholve should error if an issued directive never does anything.

A more concrete example might be an invocation that says resholve should allow /a/specific/absolute/path--should resholve error out if it never encounters the path?

I can imagine this catching some misunderstandings and helping clean up vestigial directives, but I probably won't prioritize it until/unless there's obviously demand.

awk option parser cannot handle `-vfoo=bar`, only `-v foo=bar`

error: builder for '/nix/store/q1jmr4hvrri48p2d6zqanip92b5ay2gh-sfeedrc.drv' failed with exit code 9;
       last 10 log lines:
       > [resholve context] RESHOLVE_LORE=/nix/store/3ns00nskan71bf01z2912f9h2xmnifqj-more-binlore
       > [resholve context] RESHOLVE_EXECER=
       > [resholve context] RESHOLVE_FAKE='function:'\''feed'\'';'\''_fetch'\'';'\''_convertencoding'\'';'\''_parse'\'';'\''_filter'\'';'\''_merge'\'';'\''_order'\'''
       > [resholve context] RESHOLVE_INPUTS=/nix/store/47n5hzqpahs7yv84ia6cxp3jg9ca8r86-coreutils-9.0/bin:/nix/store/j2ja4qf1hs51n9zvzq1i5sbl1vkxn8wd-lockfile-progs-0.1.19/bin:/nix/store/fr7vrxblkj327ypn3vhjwfhf19lddqqd-gawk-5.1.1/bin
       > [resholve context] RESHOLVE_INTERPRETER=/nix/store/0d3wgx8x6dxdb2cpnq105z23hah07z7l-bash-5.1-p16/bin/bash
       > [resholve context] /nix/store/x7qhhcwch658pj4sqfc3dj5bi6j99rbw-resholve-0.8.0/bin/resholve --overwrite /nix/store/1sgljd7baig9da7vs50wdkx9x0q72gad-sfeedrc
       > WARNING:__main__:CommandParser CommandParser(prog='awk (generic)', usage=None, description=None, version=None, formatter_class=<class 'argparse.HelpFormatter'>, conflict_handler='error', add_help=False) passing instead of "argument -c/--traditional: ignored explicit argument 'ategory=$1'"
       >     awk -F'\t' -vcategory="$1" '{ split($9, cs, "|");for (i in cs) if (cs[i] == category) { print; break; }; }'
       >     ^~~
       > /nix/store/1sgljd7baig9da7vs50wdkx9x0q72gad-sfeedrc:18: 'awk' _might_ be able to execute its arguments, and I don't have any command-specific rules for figuring out if this specific invocation does or not.
       For full logs, run 'nix log /nix/store/q1jmr4hvrri48p2d6zqanip92b5ay2gh-sfeedrc.drv'.
  • This is a pretty bad error message for this situation.
  • Awk (gnu) accepts -vcategory="$1" and -v category="$1" interchangeably, but resholve does not. Changing the source to the form with a space solved the problem for me, but I still thought I'd report it.

This may well be a fundamental limitation of the option parsing library in use, but let's at least make the error message more helpful. After all, you DO have "command-specific rules" for awk, they just failed in this instance.
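Until the parser accepts the attached form, one possible workaround is to rewrite `-vname=value` into `-v name=value` before the arguments are parsed. The bash function below is a sketch of that idea only, not resholve's implementation: `normalize_awk_v` is a hypothetical name, it knows only about `-v`, `-F` and `-f` taking a value, and it emits one word per line, so words containing newlines would be mangled.

```shell
# Split attached awk assignments (-vname=value) into two words (-v name=value).
# Rewriting stops at the first operand, which is normally the awk program text.
normalize_awk_v() {
  local in_options=1 takes_value=0 arg
  for arg in "$@"; do
    if [ "$takes_value" = 1 ]; then
      takes_value=0                    # this word is the value of -v/-F/-f
    elif [ "$in_options" = 1 ]; then
      case $arg in
        --)       in_options=0 ;;
        -v|-F|-f) takes_value=1 ;;     # the value follows as the next word
        -v?*)     printf '%s\n' -v "${arg#-v}"; continue ;;
        -*)       ;;                   # any other option passes through
        *)        in_options=0 ;;      # first operand: stop rewriting
      esac
    fi
    printf '%s\n' "$arg"
  done
}
```

For example, `normalize_awk_v -F: -vcategory=news '{ print }'` emits `-F:`, `-v`, `category=news` and `{ print }` on separate lines.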

exception on ~empty use of command

$ resholve --interpreter none <<EOF
command >/dev/null 2>&1      
EOF

Traceback (most recent call last):
  File "/nix/store/rmvavb9jan6z9d7aqzd42q8hxjbhhgpl-resholve-0.6.1/bin/.resholve-wrapped", line 4442, in <module>
    sys.exit(punshow())
  File "/nix/store/rmvavb9jan6z9d7aqzd42q8hxjbhhgpl-resholve-0.6.1/bin/.resholve-wrapped", line 879, in punshow
    epilogue=args.epilogue,
  File "/nix/store/rmvavb9jan6z9d7aqzd42q8hxjbhhgpl-resholve-0.6.1/bin/.resholve-wrapped", line 3973, in __init__
    self.visit_commands()
  File "/nix/store/rmvavb9jan6z9d7aqzd42q8hxjbhhgpl-resholve-0.6.1/bin/.resholve-wrapped", line 4437, in visit_commands
    self.visit_command_invocation(inv)
  File "/nix/store/rmvavb9jan6z9d7aqzd42q8hxjbhhgpl-resholve-0.6.1/bin/.resholve-wrapped", line 4417, in visit_command_invocation
    return self.look_for_essential_builtin_sub_execution(invocation)
  File "/nix/store/rmvavb9jan6z9d7aqzd42q8hxjbhhgpl-resholve-0.6.1/bin/.resholve-wrapped", line 3372, in look_for_essential_builtin_sub_execution
    parsed = parser.parse_known_args(invocation.args)
  File "/nix/store/i3719gs514i6061s99rv6r9q5adnj8p9-python-2.7.18/lib/python2.7/argparse.py", line 1737, in parse_known_args
    namespace, args = self._parse_known_args(args, namespace)
  File "/nix/store/i3719gs514i6061s99rv6r9q5adnj8p9-python-2.7.18/lib/python2.7/argparse.py", line 1946, in _parse_known_args
    stop_index = consume_positionals(start_index)
  File "/nix/store/i3719gs514i6061s99rv6r9q5adnj8p9-python-2.7.18/lib/python2.7/argparse.py", line 1902, in consume_positionals
    take_action(action, args)
  File "/nix/store/i3719gs514i6061s99rv6r9q5adnj8p9-python-2.7.18/lib/python2.7/argparse.py", line 1811, in take_action
    action(self, namespace, argument_values, option_string)
  File "/nix/store/rmvavb9jan6z9d7aqzd42q8hxjbhhgpl-resholve-0.6.1/bin/.resholve-wrapped", line 332, in __call__
    Invocation(words=values),
  File "/nix/store/rmvavb9jan6z9d7aqzd42q8hxjbhhgpl-resholve-0.6.1/bin/.resholve-wrapped", line 3123, in __new__
    self.firstword = firstword = self[0]
TypeError: 'NoneType' object has no attribute '__getitem__'

The osh AST is something like:

(command.Simple
  words: [(compound_word parts:[(Token id:Id.Lit_Chars span_id:0 val:command)])]
  redirects: [
    (redir
      op: (Token id:Id.Redir_Great span_id:2 val:'>')
      loc: (redir_loc.Fd fd:1)
      arg: (compound_word parts:[(Token id:Id.Lit_Chars span_id:3 val:'/dev/null')])
    )
    (redir
      op: (Token id:Id.Redir_GreatAnd span_id:5 val:'2>&')
      loc: (redir_loc.Fd fd:2)
      arg: (compound_word parts:[(Token id:Id.Lit_Chars span_id:6 val:1)])
    )
  ]
  do_fork: T
)

Possible Future Work?

There are some things I would like for it to do:

1. better support for bash alias definitions, where currently it does not handle them at all. Finished in #5
2. better support for programs which call other programs, like xargs «cmd». This one is not a principled solution, but I think an investment into handling common "higher order" programs like this would pay off well. See #6
3. development of a "strict mode" where the PATH is unset, to improve the guarantee. See #9
4. have a "traced execution" model where executing specially instrumented "resholved" scripts records any programs executed which are not in their Nix closure. This indicates a scope leak and resolution failure. See #9
5. a buildInputs setup hook, which automatically resholves all of the outputs of a Nix build. See #8

what do you think about these things?

brainstorm ways to augment path-resolution, improve runtime isolation

In #4 @grahamc made two more suggestions that, as I currently understand them, both boil down to: more resolution! more isolation! I'd like to try exhausting the solution space a bit, and spend some time hammering on the solutions to see which ones are promising enough to seriously consider. I'll try to keep the first post up-to-date as a summary...

I'm not exactly sure what is and isn't (behaviorally?) in-scope for resholved, yet; some of these will almost certainly be out-of-scope. Don't see these as either-or propositions; a good approach might combine a few.

More aggressive resolution modes

  1. Mode that treats all "words" as potential commands, tries to resolve every single one, and requires the user to triage any that resolve.

Inject shell at the head of the script to unset PATH

(or, more likely, set it to the empty string).

  1. The ideologue's approach is probably something like declare -xr PATH= (export & readonly).

    This is a very Nix approach to the problem, but it also entails a commitment to patching anything it breaks. I imagine a great many edge cases around scripts that are sourced, or get cat-ed together.

  2. A "gentle" approach would just empty PATH.

    Some scripts have sanity-checking routines of various quality that may discover the empty/weird path and helpfully re-set it to something more normal.
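A small bash sketch of the "gentle" variant, to show what an emptied PATH does and does not break. `with_empty_path` is a hypothetical helper. It uses a subshell so the change stays contained, and it does not model the readonly `declare -xr PATH=` approach.

```shell
# Run a command line in a subshell whose PATH has been emptied.
# Bare command names should no longer resolve through PATH;
# absolute paths are unaffected.
with_empty_path() {
  ( PATH=; "$@" )
}

# Usage (from a directory that does not itself contain a file named "sh"):
# with_empty_path sh -c 'echo hi'        # expected to fail: nothing to look up
# with_empty_path /bin/sh -c 'echo hi'   # works if /bin/sh exists
```

This is why emptying PATH pairs naturally with rewriting references to absolute paths: whatever was resolved keeps working, and whatever was missed fails loudly.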

Modify the script to instrument it for data gathering

  1. Trace the execution?

    • Not sure how much we can meaningfully do at the bash level. There's a hook for fielding unresolved commands, command_not_found_handle, but resholved hopefully gets most of the low-hanging fruit here and I assume many edge-cases will be external command behavior (which this wouldn't catch), so I'm not sure how much it would help in practice. Probably more tractable at the system level, but not sure it's very cross-platform? IIRC trace is very locked-down on macOS?

    • As @catern mentions in #13, it may be possible to point PATH at a fuse filesystem and run the script to get some extra leverage here. This sounds promising, since it would be cross-compatible. I'm not sure if this would be in resholved's wheelhouse, or be an additional tool?

    • I had a separate thought late last night about an isolated runtime environment that uses some sort of fuzzer approach to try and exercise as many logic paths as feasible: rewrite the script to replace all of the identifiable tokens with aliases or functions that randomly:

      • return true or false
      • echo from a deliberately chosen list of formats or run the true underlying executable

      And then, of course, run the script many times. Maybe tune the number of repetitions by trying it out on some places where we've affirmatively found issues in the past. Will still be a little helpless with conditional logic on exact variable/argument contents.

    At some point, we're inevitably going to start bumping up against things that are unknowable/unprovable due to how malleable shell is in the first place or won't fall out of the tree unless we say the magic words in the right order. My own heuristic for how resholved should approach cases it can't directly solve for is to focus on finding a reliable heuristic for spotting the problem(s), raise an error, and make the user tell resholved what to do about it.
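As a sketch of the bash-level tracing mentioned above, `command_not_found_handle` can log every name bash fails to look up while a script runs with an unusable PATH. This is bash-specific, `log_missing_commands` and the `/nonexistent-demo-dir` path are assumptions made for the example, and, as noted above, it sees nothing that external commands execute on their own.

```shell
# Log the commands bash cannot find while sourcing a script.
# Bash runs the handler in a separate execution environment, so results
# go to a file rather than a shell variable.
# Usage: log_missing_commands /path/to/script
log_missing_commands() {
  local script=$1 log
  log=$(mktemp) || return 1
  (
    command_not_found_handle() {
      printf '%s\n' "$1" >>"$log"
      return 127                    # keep reporting "not found" to the script
    }
    PATH=/nonexistent-demo-dir      # assumed not to exist, so lookups fail
    . "$script"
  ) >/dev/null 2>&1
  sort -u "$log"
  rm -f "$log"
}
```

Only the branches that actually execute are observed, which is the same limitation the fuzzing idea above tries to work around.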

eval with quoted string argument

I'm not sure how resholved should handle eval (and possibly other commands) with a quoted string argument. I could use some input and examples.

These now print a warning/request for feedback, but the previous handling resulted in something like:

  eval 'echo $HOME'
       ^
[ stdinNone ]:3: Can't resolve command 'echo $HOME' to a known function or executable

With some work, I can re-parse the content inside the string and validate it as well, but I haven't run into a situation yet where I wouldn't just exempt it anyways.

An obvious case that might change this is a script with hard-coded absolute paths hidden in strings.

Should resholve complain about overridden essential builtins?

resholve special-cases a number of builtins (alias, builtin, command, coproc, eval, and .|source) to resolve executables that appear in their arguments.

In theory, resholve might do the "wrong" thing if these builtins are obscured by a function or alias that changes their behavior.

I considered making these an error--but I'll start with a warning because I suspect that syntax/semantic-incompatible replacements are rare in the real world (and compatible replacements would have to triage).

If you ran into this warning, I'm curious how/why the builtin was overridden in the script you tried to resolve.

Issue parsing zsh nested expansion.

root_usage="Size: 222151426, NumBlocks: 20587"
root_usage=${${root_usage##Size: }%,*}
echo $root_usage # Prints 222151426
resholve --overwrite my-script
                                                root_usage=${${root_usage##Size: }%,*}
                                                              ^
my-script:57: error: Expected } to close ${

Note that this syntax isn't supported by bash so that may be why it isn't supported.

$ root_usage=${${root_usage##Size: }%,*}
bash: ${${root_usage##Size: }%,*}: bad substitution

Workaround

eval 'root_usage=${${root_usage##Size: }%,*}'

This just gives a warning.

                                                eval 'root_usage=${${root_usage##Size: }%,*}'
                                                     ^
my-script:57: FEEDBACK WANTED: Letting quoted 'eval' through for now.

Support for (gnu?) dc's ! commands?

dc has an instance of sub-exec via its ! command that isn't as straightforward to support as some others.

I managed to find a way to raise an error when this form is encountered:

I don't know how to handle dc `!` commands yet--sorry :(

Next step: - See the feedback issue for a workaround

https://github.com/abathur/resholve/issues/40

That said, I get the impression this feature is very rarely used, so I currently plan to leave it here for now until/unless users show up documenting real use/test-cases in this issue.

Should resholve do anything about code that sets well-known execed envs?

Currently, resholve doesn't really care if you do something like SHELL=/bin/bash, so long as you don't then try to invoke $SHELL as a command (for a real-world case, see the end of arch-chroot in arch-install-scripts).

I haven't chewed on this enough yet to know what to do about them.

Question - what's the proper way to ignore an executable

Let's say you have a 3rd party script that tries to support multiple different binaries (vlc and mpv as an example) and uses command -v mpv to check for it on PATH.

We probably just want to pick one and ignore the other.

What is the proper way to handle that?
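For context, the pattern being asked about usually looks something like the bash sketch below. `pick_player` is a hypothetical reconstruction of such a script, not code from any real project. It only illustrates the question and does not answer it.

```shell
# Print the first media player that can be found, preferring mpv over vlc.
pick_player() {
  if command -v mpv >/dev/null 2>&1; then
    echo mpv
  elif command -v vlc >/dev/null 2>&1; then
    echo vlc
  else
    return 1
  fi
}
```

Both names appear as literal arguments to `command`, so both references are visible to a static pass even though only one branch runs on a given system.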

identifying executables that are likely to run other executables with static analysis

I'm sketching out how resholve can/should handle the long tail of external commands that accept a command/path as an argument and execute it (like sudo, find, xargs, etc.). I'm happy for the functionality described here to fit into resholve, but research so far is making me suspect something that can meet resholve's objective here could make a nice standalone tool, too.

It may be expedient for resholve to accumulate granular "hints" to help it handle common commands, but it isn't practical to treat long-tail commands like this, so I'm focused on the best "safe" fallback for them.

My current thinking is that resholve should make the user triage arguments to any externals that appear to be capable of running arbitrary executables from user input.

I'm focusing on the binary part of this for now (I imagine it is the harder problem, but know little about executable formats or analysis). My values here are something like: it's better to make the user triage than to have false negatives, as long as the number of false positives isn't enraging/exhausting.

I imagine a few levels of functionality here:

  1. Census executables in the path/inputs with nm to flag any that include an external execution call for user triage (and make the user triage arguments to them if they're actually found). I am OK with this having many false positives, as long as it isn't generating many false negatives.
  2. Analyze the binary to determine whether or not any ~variable/argument is passed to such a function. (Aside: for related reasons, I'm also interested in identifying any literal strings passed to such functions--especially if both can be done at once.) From poking around at binary analysis tools I'm not sure this can be performant enough to be worthwhile.
  3. If it can be performant and eliminate false positives without trading them for false negatives, improve on the above (walking more of the callgraph, taint analysis on argv?)
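The first level could be prototyped as a filter over `nm` output. The sketch below is one assumed way to do it: `exec_symbols` is a hypothetical name, the symbol list is illustrative rather than exhaustive, and the commented usage assumes GNU nm from binutils.

```shell
# Read `nm` output on stdin and print the undefined symbols that can
# launch another program, one per line, without duplicates.
exec_symbols() {
  awk '$1 == "U" { sub(/@.*/, "", $2); print $2 }' |
    grep -E -x 'exec(l|le|lp|v|ve|vp|vpe)|fexecve|posix_spawnp?|system|popen' |
    sort -u
}

# Usage (assumes GNU nm):
# nm -D --undefined-only /path/to/binary | exec_symbols
```

Any output flags the binary for triage. An empty result only means none of the listed symbols is imported dynamically, so statically linked or otherwise unusual binaries would still need a different check.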

If this was a standalone tool, it would ideally:

  • accept batches of directories or explicit executables via parameter or file
  • Be able to do one or more smart things with unified binaries like coreutils that behave differently depending on $0.
    • At minimum, recognize they all point to the same file and save the time over handling/reporting each one naively.
    • A lot of user suffering (triage) can be spared if it's possible to rule out even a fraction of really common commands like this as not possibly leading to another execution.
  • Given cases like the above, it may be prudent to have some idiom for manually hinting that a given command is known not to be able to execute other commands. (resholve will likely have to mirror this feature for ~syntax/arguments)

Support for awk's sub-command execution?

I'm in the process of implementing general support for resolving commands that appear in the arguments to other commands, which should be available in the coming weeks.

awk can do this via both the system() function and its piping feature. Actually "resholving" commands in here probably entails having a parser of some sort. I'm undecided, but leaning towards treating this as out-of-scope for resholve and saying that awk (and maybe sed, expect, and any others with this problem) will need their own tools. (Mostly because, if we could define an interchange format, additional tools wouldn't be tied to resholve's own dependence on Oil/OSH/python2 and would be more likely to be able to tackle these tasks with the help of whatever parsers exist in whatever language they may be written in...).
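To make the two forms concrete, here is a deliberately naive sketch that flags awk program text using either one. It is a regex heuristic, not a parser and not how resholve detects these; `awk_exec_forms` is a hypothetical name. It will over-report (a `|` inside a regex literal or string counts as a pipe) and it cannot say which command would run.

```python
import re

# Assumption: crude textual patterns, chosen only for illustration.
SYSTEM_CALL = re.compile(r"\bsystem\s*\(")
# A single "|" in awk is a pipe; "||" is logical OR, so exclude doubled bars.
SINGLE_PIPE = re.compile(r"(?<!\|)\|(?!\|)")


def awk_exec_forms(program):
    """Return which command-running forms appear in awk program text."""
    forms = []
    if SYSTEM_CALL.search(program):
        forms.append("system()")
    if SINGLE_PIPE.search(program):
        forms.append("pipe")
    return forms
```

For example, `awk_exec_forms('{ print $1 | "sort" }')` reports the pipe form, while a program that only uses `||` reports nothing.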

I managed to find a way to raise an error when this form is encountered:

I don't know how to handle awk sub commands yet--sorry :(

Next step: - See the feedback issue for a workaround

https://github.com/abathur/resholve/issues/31

I don't really understand how often this feature is used, so I don't know if this is going to be a big stumbling block or an occasional hurdle.

"OSH eval error while looking for sub-exec"

I encountered a case of resholve falling over when OSH hit an eval error during resholve's new nested-resolution routine. resholve should now ~handle this and just print a warning, but I don't have a very good sense of

  • how common this warning will be
  • what kind of syntax will trigger it
  • whether there will be anything smarter resholve should do

I'm hoping to collect some more real reports of this to help figure out if we can just silently ignore these or if it's going to need additional handling.
