facebook / dotslash
Simplified executable deployment
Home Page: https://dotslash-cli.com
License: Apache License 2.0
Take for example https://github.com/astral-sh/ruff/releases/download/v0.4.2/ruff-0.4.2-aarch64-apple-darwin.tar.gz
When decompressing with tar (tar -xvf ruff-0.4.2-aarch64-apple-darwin.tar.gz), it outputs ruff
But with dotslash v0.41.0, it outputs DOTSLASH_CACHE/23/d6c44ba88ccffacf9de63082a597f3d2ad77f7/GNUSparseFile.0/ruff
It seems this entry type variant might need special handling:
https://docs.rs/tar/latest/tar/enum.EntryType.html#variant.GNUSparse
Relevant config for dotslash
"macos-aarch64": {
  "size": 8144818,
  "hash": "blake3",
  "digest": "02b131cb0da1e157ddf1ab96cc100c25b519d052a4cf6c55602614425b2fa37d",
  "format": "tar.gz",
  "path": "GNUSparseFile.0/ruff",
  "providers": [
    {
      "url": "https://github.com/astral-sh/ruff/releases/download/v0.4.2/ruff-0.4.2-aarch64-apple-darwin.tar.gz"
    }
  ]
}
Interestingly, running GNUSparseFile.0/ruff via dotslash fails, but running the directly downloaded binary works as expected.
I believe the Homebrew release is not being done correctly.
For macOS, we release a universal binary:
dotslash/.github/workflows/release.yml
Lines 22 to 42 in 92fe071
As explained here:
dotslash/website/docs/installation.md
Lines 9 to 18 in 92fe071
But based on my read of how the Brew formula was created:
Homebrew/homebrew-core@8e53cba
and the subsequent changes to that file:
https://github.com/Homebrew/homebrew-core/commits/master/Formula/d/dotslash.rb
It looks like it is doing a simple cargo install, so it is not creating a universal binary.
Can we work with the Homebrew maintainers to fix this?
Currently, DotSlash does not support any sort of configuration file. The only thing that is really "configurable" is the location of the DotSlash cache folder, which defaults to the dotslash subfolder under the operating system's cache dir, but can be overridden via the DOTSLASH_CACHE environment variable.
An important advantage of this approach is that the "fast path" in the DotSlash execution flow (i.e., a cache hit):
https://dotslash-cli.com/docs/execution/
does not have to read any files other than the DotSlash file itself. Ideally, in adding support for a DotSlash config file, we would maintain this invariant. That is, the "fast path" should not have to read a config file, but the "slow path" is allowed to.
Ideas we have kicked around in the past include:
curl configuration (which curl executable to use, proxy info, etc.)
We probably want some sort of "cascading set" of config files where one can override another, such as:
.dotslashconfig in a parent folder / repo root?
<CONFIG DIR>/dotslash
/etc/dotslash/config
To avoid an undue increase on the size of the DotSlash binary, we should use "jsonrc" as the config file format.
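To make the "cascading set" idea concrete, here is a minimal sketch of how the candidate config files might be enumerated on the slow path only. The function name, the highest-priority-first ordering, and the `user_config_dir` parameter (standing in for <CONFIG DIR>) are assumptions for illustration; none of this is settled design.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: lists config files to consult, highest priority first.
/// `start_dir` is the folder containing the DotSlash file being run and
/// `user_config_dir` stands in for <CONFIG DIR>. The ordering is an assumption.
fn candidate_config_paths(start_dir: &Path, user_config_dir: &Path) -> Vec<PathBuf> {
    let mut candidates = Vec::new();
    // Walk from start_dir up to the filesystem root, so a .dotslashconfig
    // in a parent folder or repo root would be picked up.
    for dir in start_dir.ancestors() {
        candidates.push(dir.join(".dotslashconfig"));
    }
    // Per-user location, then the system-wide location.
    candidates.push(user_config_dir.join("dotslash"));
    candidates.push(PathBuf::from("/etc/dotslash/config"));
    candidates
}
```

A caller would only invoke something like this after a cache miss, which keeps the invariant that the fast path reads nothing but the DotSlash file itself.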
Today, the docs state:
At the time of this writing, there is no way to add custom providers without forking DotSlash.
But we have already seen interest in custom providers, so it seems like we should start discussing possible solutions. Note this will likely require some sort of configuration file, which, as a reminder, we would like to avoid having to read in the case of a cache hit.
While the design for the configuration file is still under discussion, let's assume for the moment that at least two locations for provider-specific data are supported:
$XDG_STATE_HOME/dotslash/provider/ where providers installed by the user live
/etc/dotslash or some location for system-wide configuration. In practice, an entity might push enterprise-wide providers to this folder with the expectation that end-users should not write to this folder directly.
Today, the things a provider needs to know are:
One option would be to pass everything the provider needs to the executable via a single JSON argument and then stream the stdout from the provider invocation directly to the path where the artifact should be written. This way, the provider does not get any direct knowledge about the layout of the $DOTSLASH_CACHE.
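As a rough sketch of that option, the snippet below runs a provider executable with one JSON argument and redirects its stdout straight into the destination file. The function name `fetch_with_provider` is hypothetical, and the JSON payload is treated as an opaque string because its shape has not been decided.

```rust
use std::fs::File;
use std::io;
use std::path::Path;
use std::process::{Command, ExitStatus, Stdio};

/// Hypothetical helper: invokes `provider_exe` with a single JSON argument and
/// writes whatever it prints on stdout directly to `dest`. The provider never
/// learns where `dest` is, so it gets no knowledge of the cache layout.
fn fetch_with_provider(
    provider_exe: &Path,
    json_arg: &str,
    dest: &Path,
) -> io::Result<ExitStatus> {
    let out = File::create(dest)?;
    Command::new(provider_exe)
        .arg(json_arg)
        .stdin(Stdio::null())
        // The child's stdout is the file itself; nothing is buffered in memory.
        .stdout(Stdio::from(out))
        .stderr(Stdio::inherit())
        .status()
}
```

The caller would still need to verify the size and digest of `dest` afterwards and discard the file when the exit status is not successful.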
Because the provider can be an executable, it makes sense for the provider to be a DotSlash file. For example, we could have:
$XDG_STATE_HOME/dotslash/provider/<provider-name>
where <provider-name> is the name of the DotSlash file, which must also match the "type" used in the "providers" section of a DotSlash file. (The "name" in the DotSlash file should probably also be required to match.) Note that this file will always be executed by DotSlash itself, so there is no need for any special Windows stuff.
A simple option is to support a subcommand like dotslash -- install-provider URL_TO_PROVIDER that would fetch the specified URL, verify it contains a DotSlash file, and then write it to $XDG_STATE_HOME/dotslash/provider/<provider-name>, as appropriate.
Another option (we'll call this the "DotSlash Inception" option) would be to enable a DotSlash file to include metadata about how to obtain a provider referenced in the file. Example:
{
  "name": "example-cli",
  "providers": {
    "my-custom-cas": {
      "size": 40660307,
      "hash": "blake3",
      "digest": "6e2ca33951e586e7670016dd9e503d028454bf9249d5ff556347c3d98c347c34",
      // Must be a single DotSlash file?
      "format": "gz",
      // No need to specify path because it must be my-custom-cas?
      "providers": [
        {
          "url": "https://example.com/my-custom-cas"
        }
      ]
    }
  },
  "platforms": {
    "linux-x86_64": {
      /* size, hash, digest, format, path */
      "providers": [
        {
          "type": "my-custom-cas",
          "id": "72b81fc3a30b7bedc1a09a3fafc4478a1b02e5ebf0ad04ea15d23b3e9dc89212"
        }
      ]
    }
  }
}
The idea is that when example-cli is run for the first time, DotSlash sees that it should use the my-custom-cas provider. If the user does not have it installed, DotSlash can use the information in the providers section to install the provider first and then use it to fetch example-cli.
There are a lot of questions on how strict we might be on the requirements for a provider. There are also questions around how to know when to install a new version of a provider, or what to do if multiple DotSlash files try to provide different implementations of a provider (particularly with respect to defending against attackers).
It would be nice to be able to manage dotslash using Homebrew on macOS.
Most of Meta's other devtools are released on Homebrew.
For my use case, I have internal artifacts that are in S3 and need auth to fetch. Would y'all accept a contribution to add S3 as a provider?
@ashleygwilliams was interested in seeing this written up.
The straightforward way is for a single dotslash artifact to package both the toolchain binaries and all the sysroots:
When this dotslash-based rustc is run, dotslash would download and unpack all of those sysroots, both of which are slow operations. But it works. Rustc knows to look in ../lib/rustlib for sysroots.
A better way avoids having to download and unpack unused sysroots, while still having sysroots downloaded/cached/synchronized through dotslash, and available for a large range of target platforms.
In this approach, the native sysroot is always available by default (for proc macros), but sysroots for cross-compilation are managed as follows. When the dotslash-based rustc is run, instead of the real rustc being the entry point ("path" in the JSON), there is a tiny program for which the source is below. It parses the rustc command line to find what --target you are building for, spawns a subprocess corresponding to the dotslash executable responsible for the sysroot for that target, waits for the subprocess to print a directory path on its stdout, and uses that directory path as the --sysroot argument for invoking the real rustc. The sysroot subprocess behaves as follows: when dotslash has downloaded and unpacked and executed it, it just prints out its own location, which is some path within the dotslash cache, then blocks until stdin is closed by the parent process to indicate the sysroot is done being accessed. The sysroot subprocess must remain running the entire time the parent process is accessing the sysroot, so that the sysroot does not get evicted by dotslash gc.
// Main.rs for `sysroot-multiplexer`. Cross-compiling with `-Zbuild-std` and Zig's
// linker produces Linux, macOS, and Windows binaries that are 77K-156K.

use std::env;
use std::env::consts::{ARCH, EXE_EXTENSION, OS};
use std::ffi::{OsStr, OsString};
use std::fs;
use std::io;
use std::io::{BufRead, BufReader, ErrorKind, Read, Write};
use std::path::Path;
use std::process::{self, Command, Stdio};

fn main() {
    if let Err(err) = try_main() {
        let _ = writeln!(io::stderr(), "sysroot-multiplexer error: {err}");
        process::exit(1);
    }
}

fn try_main() -> io::Result<()> {
    let current_exe = env::current_exe()?;
    let dir = current_exe.parent().unwrap();

    let bin_rustc = dir.join("bin").join("rustc").with_extension(EXE_EXTENSION);
    if bin_rustc.exists() {
        return invoke_rustc_with_sysroot(dir, &bin_rustc);
    }

    let rustlib = dir.join("lib").join("rustlib");
    if rustlib.exists() {
        return print_sysroot_and_wait(dir);
    }

    let msg = format!(
        "neither bin/rustc nor lib/rustlib exist in {}",
        dir.display()
    );
    Err(io::Error::new(ErrorKind::Other, msg))
}

// Guess a sysroot based on what *this* sysroot-multiplexer executable was
// compiled for.
fn guess_host_for_target_triple() -> &'static str {
    match (OS, ARCH) {
        ("linux", "x86_64") => "x86_64-unknown-linux-gnu",
        ("linux", "aarch64") => "aarch64-unknown-linux-gnu",
        ("macos", "x86_64") => "x86_64-apple-darwin",
        ("macos", "aarch64") => "aarch64-apple-darwin",
        ("windows", "x86_64") => "x86_64-pc-windows-msvc",
        _ => panic!("what kind of computer is this... {OS}/{ARCH}"),
    }
}

// Check if something looks like a path.
//
// We blindly use target triples as path components to the sysroot executable.
// This check is to avoid executing something that would give a cryptic error
// message.
//
// Rust does not define what characters are valid for target triples, so we just
// concern ourselves with basic path parts. See
// <https://rust-lang.github.io/rfcs/0131-target-specification.html>
fn is_path_like(triple: &OsStr) -> bool {
    let mut comps = Path::new(triple).components();
    // A "Normal" component is not a ".", ".." or "/".
    !(matches!(comps.next(), Some(std::path::Component::Normal(_))) && comps.next().is_none())
}

// A more sophisticated parser would be more correct. There's code in
// <https://crates.io/crates/rustflags> to fully parse a rustc command line.
//
// Rustc does not do recursive argsfile expansion, despite the original PR
// (rust-lang/rust#63175) implying so. Also, the [code][] as it exists today
// doesn't look like it does that, and experimentation confirms it:
//
// $ rustc --version
// rustc 1.65.0 (897e37553 2022-11-02)
//
// $ rustc @<(echo '--version')
// rustc 1.65.0 (897e37553 2022-11-02)
//
// $ rustc @<(echo @<(echo '--version'))
// error: couldn't read @/dev/fd/10: No such file or directory (os error 2)
//
// error: aborting due to previous error
//
// [code]: https://github.com/rust-lang/rust/blob/19423b59440f/compiler/rustc_driver/src/args.rs
fn parse_rustc_args<I>(args: I, cmd: &mut Command) -> (bool, Option<OsString>)
where
    I: IntoIterator,
    I::Item: Into<OsString>,
{
    let mut expect_target = false;
    let mut has_sysroot = false;
    let mut target = None;

    let mut parse_arg = |arg: &OsStr, arg_str: Option<&str>| {
        if expect_target {
            target = Some(arg.to_owned());
        }
        if let Some(arg) = arg_str {
            if arg == "--sysroot" || arg.starts_with("--sysroot=") {
                has_sysroot = true;
            }
            if let Some(found_target) = arg.strip_prefix("--target=") {
                target = Some(OsString::from(found_target));
            }
            expect_target = arg == "--target";
        }
    };

    for arg in args {
        let arg = arg.into();
        let arg_str = arg.to_str();
        if let Some(argsfile) = arg_str.and_then(|x| x.strip_prefix('@')) {
            // Let rustc itself complain that an argsfile can't be read.
            if let Ok(content) = fs::read_to_string(argsfile) {
                for line in content.lines() {
                    parse_arg(OsStr::new(line), Some(line));
                }
            }
        } else {
            parse_arg(&arg, arg_str);
        }
        cmd.arg(arg);
    }

    (has_sysroot, target)
}

fn invoke_rustc_with_sysroot(dir: &Path, rustc: &Path) -> io::Result<()> {
    let mut cmd = Command::new(rustc);
    let (has_sysroot, target) = parse_rustc_args(env::args_os().skip(1), &mut cmd);

    // TODO: Are there other cases we can skip a --sysroot flag? Stuff like
    // `rustc --version` and `rustc --help` do not need it. Various of the
    // `rustc --print` options probably also do not.
    let mut sysroot_child = None;
    if !has_sysroot {
        let target = match target.as_deref() {
            Some(target) if !is_path_like(target) => target,
            Some(_) | None => OsStr::new(guess_host_for_target_triple()),
        };

        // Some sysroots are included by default (the host one for example),
        // check if it exists first before invoking `sysroot_exe`.
        let target_lib_dir = dir.join("lib").join("rustlib").join(target).join("lib");
        if !target_lib_dir.is_dir() {
            let sysroot_exe = dir.join("lib").join(target).with_extension(EXE_EXTENSION);
            let mut child = Command::new(sysroot_exe)
                .stdin(Stdio::piped())
                .stdout(Stdio::piped())
                .stderr(Stdio::inherit())
                .spawn()
                .map_err(|spawn_error| {
                    if spawn_error.kind() == ErrorKind::NotFound {
                        let msg = format!("no sysroot found for target {target:?}");
                        io::Error::new(ErrorKind::NotFound, msg)
                    } else {
                        spawn_error
                    }
                })?;

            let mut buf_read = BufReader::new(child.stdout.as_mut().unwrap());
            let mut line = String::new();
            buf_read.read_line(&mut line)?;
            if line.is_empty() {
                child.wait()?;
                process::exit(1);
            }

            let sysroot_value = line.trim();
            cmd.arg("--sysroot");
            cmd.arg(sysroot_value);
            sysroot_child = Some(child);
        }
    }

    let exit_status = cmd.spawn()?.wait()?;
    if let Some(mut sysroot_child) = sysroot_child {
        drop(sysroot_child.stdin.take()); // close it
    }
    process::exit(exit_status.code().unwrap_or(1));
}

fn print_sysroot_and_wait(dir: &Path) -> io::Result<()> {
    #[cfg(unix)]
    let dir = std::os::unix::ffi::OsStrExt::as_bytes(dir.as_os_str());
    #[cfg(not(unix))]
    let dir = dir.to_str().expect("non-utf8 path :(").as_bytes();

    let mut stdout = io::stdout().lock();
    stdout.write_all(dir)?;
    stdout.write_all(b"\n")?;
    stdout.flush()?;

    // Block until someone closes our stdin.
    let mut stdin = io::stdin().lock();
    let mut buf = [0u8; 1024];
    while stdin.read(&mut buf)? > 0 {}
    Ok(())
}
👋 It looks like the docs match main, but not the latest version.
This means that the docs show that dotslash supports .tar.xz, however 0.2.0 does not. If I build against main, I'm able to use tar.xz.
Or could main be released so I can have people pull dotslash and use my tar.xz packages?
Would you please consider adding an aarch64 binary release for Linux VMs running on Apple Silicon without Rosetta 2? Thanks
It would be nice to have a “check” command that verifies the download, hash etc for every defined platform. Ideally I’d have some CI that checks this, but for interactive dev tools that’s not always the case and regardless would be nice to have a quick check before waiting for CI.
Official rustup components are available only in tar.gz and tar.xz. The tar.xz are significantly better compressed. rustup exclusively uses the tar.xz artifacts, and the plan is to stop providing tar.gz altogether (rust-lang/infra-team#89).
https://static.rust-lang.org/dist/2023-12-28
They may or may not provide zstd in the future (rust-lang/infra-team#97).
It looks like new providers could be added in the following way:
Provider trait (see GitHubReleaseProvider)
get_provider impl for DefaultProviderFactory in main.rs.
It would be nice to have this reflected in the documentation!
If y'all are open to contributions, I'd be happy to take a crack at it. Looks like the docs themselves are built with Docusaurus, which shouldn't be difficult to work with.
My only question to validate at the moment is the proper handling of the provider config. It looks like the intent is that the provider config is given as a valid "loosely parsed" JSON value, which the provider itself is then able to deserialize into any specific structure it likes. Do I have that right?
It looks like there is a v0.3.0 release on GitHub:
https://github.com/facebook/dotslash/releases/tag/v0.3.0
Though crates.io is still v0.2.0:
https://crates.io/crates/dotslash
The additional support for .xz in v0.3.0 seems like a win, though I expect .zip support is more impactful for users, so it would be helpful to drive the associated PR to some sort of resolution:
Downstream packager here. Every time an upstream project retags something that already went out we have to do a lot of extra work to verify what changed and why and that it isn't malicious. Versioned tags are meant to be immutable. Please just bump the patch version and try again if/when something goes wrong with the release process or a bug is found early on.
First of all, love the project! It has become my first install for non-nix projects. Here is some feedback from my experience so far.
docker cli, node, ruff, shfmt, uv. Still waiting for .tar.xz for shellcheck and .zip for dprint, rain
#!/usr/bin/env node
require(process.execPath.replace("/bin/node", "/lib/node_modules/corepack/dist/lib/corepack.cjs"))
.runMain([require("path").basename(process.argv[1]), ...process.argv.slice(2)])
Tested dotslash in WSL, macOS, a container (Debian 12), GitHub Actions, AWS CloudShell (Fedora), and Google Cloud Shell (Debian 11). It only failed on Debian 11, which has the older glibc 2.31. If you are open to publishing dotslash to pip later, building for manylinux or with cross will lower the glibc version requirement and make dotslash more portable.
Support one line install in dockerfile or cloudshell without curl | sh
curl -LSfs https://github.com/facebook/dotslash/releases/latest/download/dotslash-$(uname | tr DL dl)-$(uname -m).tar.gz | tar fxz - -C ~/.local/bin/
or in powershell
cmd /c "curl.exe -LSfs https://github.com/facebook/dotslash/releases/download/latest/dotslash-windows.tar.gz | tar fxz - -C .local\bin"
To support an up-to-date one-liner, the release assets would need to drop the version and replace arm64 with aarch64 in the filename. Rename ubuntu-22.04 to linux, since it works on Fedora/Debian with newer glibc too.
https://github.com/facebook/dotslash/releases/latest/download/dotslash-ubuntu-22.04.arm64.v0.2.0.tar.gz
becomes https://github.com/facebook/dotslash/releases/latest/download/dotslash-linux-aarch64.tar.gz
If a pinned version is desired, it is still available by URL: https://github.com/facebook/dotslash/releases/download/v0.2.0/dotslash-linux-aarch64.tar.gz
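The naming scheme suggested here can be summarized in a few lines. This is only a sketch of the proposal; the function name `proposed_asset_url` is made up, and the URLs it produces follow the suggested names, which are not necessarily the names of the assets actually published.

```rust
/// Hypothetical helper: builds a release asset URL using the proposed
/// `dotslash-<os>-<arch>.tar.gz` naming. `version: None` means "latest".
fn proposed_asset_url(os: &str, arch: &str, version: Option<&str>) -> String {
    let base = "https://github.com/facebook/dotslash/releases";
    let file = format!("dotslash-{os}-{arch}.tar.gz");
    match version {
        // Pinned: .../releases/download/<tag>/<file>
        Some(tag) => format!("{base}/download/{tag}/{file}"),
        // Unpinned: .../releases/latest/download/<file>
        None => format!("{base}/latest/download/{file}"),
    }
}
```

Because the file name no longer embeds the version, the same name works for both the latest and the pinned form, which is what makes a stable one-liner possible.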
update-version which replaces version, updates metadata, and performs json format
dotslash -- update-version node 18.19.0 20.11.1
Congrats on the release!
Any interest in contributions supporting wildcard/latest matching for github releases? e.g. specify a pattern like 7.0.* for bazel, call the github api to see what is available, then download the metadata (hash etc) and content for the latest one matching (currently 7.0.2)
This would allow using dotslash to replace "get and run latest matching X" usages of bazelisk and similar downloader/caching tools.
Essentially would be adding a "find url and other metadata" step for cases where people trust the release stream to do the right thing for them given the pattern
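A minimal sketch of just the selection step, assuming the list of available versions has already been fetched from somewhere (the GitHub API call and the metadata download are left out). The function names and the rule that `*` matches exactly one numeric component are assumptions for illustration.

```rust
/// Parses "7.0.2" into [7, 0, 2]; returns None if any component is not a number.
fn parse_version(v: &str) -> Option<Vec<u64>> {
    v.split('.').map(|part| part.parse::<u64>().ok()).collect()
}

/// Hypothetical helper: returns the highest version in `available` that
/// matches `pattern`, where a `*` component matches any number.
fn latest_matching<'a>(pattern: &str, available: &[&'a str]) -> Option<&'a str> {
    let pat: Vec<&str> = pattern.split('.').collect();
    available
        .iter()
        .copied()
        // Skip tags that are not plain dotted numbers (e.g. release candidates).
        .filter_map(|v| parse_version(v).map(|nums| (nums, v)))
        .filter(|(nums, _)| {
            nums.len() == pat.len()
                && pat
                    .iter()
                    .zip(nums)
                    .all(|(p, n)| *p == "*" || p.parse::<u64>().ok() == Some(*n))
        })
        // Compare numerically, component by component, so 7.0.10 > 7.0.2.
        .max_by(|a, b| a.0.cmp(&b.0))
        .map(|(_, v)| v)
}
```

With a made-up list such as 7.0.0, 7.0.2, and 7.1.0, the pattern 7.0.* selects 7.0.2. Comparing parsed numbers rather than strings matters once a component reaches two digits.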
CI appears to be failing since this commit titled "Rust 1.77 clippy fixes:"
I accidentally put the "format" config entry in the provider section, instead of the next level up inside "platforms". This was not detected as a config error, it just failed at runtime when the file wasn't unpacked.
Sometimes you need to hand-craft a dotslash config file and sometimes it's nice to add some commentary about said file.
For example I was authoring a config for gofmt and wanted to document how I got all the values to make it easy for the next person to update it in future.
Currently when I added a # line immediately after the hashbang, dotslash emitted the following:
dotslash error: problem with `/path/to/gofmt`
caused by: failed to parse DotSlash file
caused by: expected value at line 1 column 1
Hello! Congrats on open sourcing the project.
I just discovered it, and I have a question around best practices with bootstrapping.
From my point of view, dotslash helps make a monorepo (or even a regular repo) more self-contained: commit dependency executable scripts to the repository, and let dotslash manage them.
However, there's still one issue: you have to have dotslash installed to get the rest.
I'm starting this issue to discuss that:
How could a repository bootstrap dotslash so that it can use it, without having every contributor install it.
"That's a minimum requirement" is a completely valid position here.
Feel free to close the issue if that's the case.
One alternative I can see working is to use a shell script to bootstrap (similar to pantsw, buckw, bazelisk):
Provide a script that will download or build a specific version of dotslash, and cache it somewhere.
Basically a lightweight version of dotslash's own functionality.
(This is also close to the approach used by a similar tool, Hermit FWIW.)
I'm curious about the maintainers' thoughts about what their preferred approach would be here.
Would you be open to a PR supporting the zip format? For example https://github.com/dprint/dprint/releases. Thanks!
Given this dotslash file:
#!/usr/bin/env dotslash
{
  "name": "python-standalone",
  "platforms": {
    "macos-aarch64": {
      "size": 26705084,
      "hash": "blake3",
      "digest": "03555c515b0b59c9a8bc15386343228767f3c452c474cddc4cd8949473c30c27",
      "format": "tar.zst",
      "path": "python/install/bin/python",
      "providers": [
        {
          "url": "https://github.com/indygreg/python-build-standalone/releases/download/20240224/cpython-3.11.8+20240224-aarch64-apple-darwin-pgo-full.tar.zst"
        }
      ]
    },
    "macos-x86_64": {
      "size": 26292710,
      "hash": "blake3",
      "digest": "e7a824fdba50916674045b4d64dc07c1d172ec84d438f4cc6ba3c01e39992f56",
      "format": "tar.zst",
      "path": "python/install/bin/python",
      "providers": [
        {
          "url": "https://github.com/indygreg/python-build-standalone/releases/download/20240224/cpython-3.11.8+20240224-x86_64-apple-darwin-pgo-full.tar.zst"
        }
      ]
    },
    "linux-x86_64": {
      "size": 35135207,
      "hash": "blake3",
      "digest": "1edbb8cbde2be264dda8c531c928ff3740a377d8398584dcac7cfeac3b5e190e",
      "format": "tar.zst",
      "path": "python/install/bin/python",
      "providers": [
        {
          "url": "https://github.com/indygreg/python-build-standalone/releases/download/20240224/cpython-3.11.8+20240224-x86_64-unknown-linux-gnu-pgo-full.tar.zst"
        }
      ]
    }
  }
}
I would expect executing it with no arguments to drop me in to a Python REPL. Instead I get this error:
$ ./scripts/bin/python
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Python path configuration:
PYTHONHOME = (not set)
PYTHONPATH = (not set)
program name = './scripts/bin/python'
isolated = 0
environment = 1
user site = 1
safe_path = 0
import site = 1
is in build tree = 0
stdlib dir = '/install/lib/python3.11'
sys._base_executable = '/Users/dan/devel/backend2/scripts/bin/python'
sys.base_prefix = '/install'
sys.base_exec_prefix = '/install'
sys.platlibdir = 'lib'
sys.executable = '/Users/dan/devel/backend2/scripts/bin/python'
sys.prefix = '/install'
sys.exec_prefix = '/install'
sys.path = [
'/install/lib/python311.zip',
'/install/lib/python3.11',
'/install/lib/python3.11/lib-dynload',
]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'
Current thread 0x00000001e8f13ac0 (most recent call first):
<no Python frame>
I think this means that python can't find the various libraries that it wants to link against.
If I run this same binary directly from the dotslash cache it works:
~/Library/Caches/dotslash/f0/d51d6feaa418f63e844885ba229db6c8815c74/python/install/bin/python3
Python 3.11.8 (main, Feb 25 2024, 03:37:49) [Clang 17.0.6 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
For toolchains like this is it necessary to modify them to work with dotslash? I read through #6 but as far as I can tell this python archive should be the first, straightforward case described there. Curious to learn how to handle this. :)
Thanks for open sourcing dotslash! It has made my life a lot easier recently.