sharkdp / hyperfine
A command-line benchmarking tool
License: Apache License 2.0
It looks like `0.4.0` doesn't compile on Windows. `master` does, though. Could a release be cut?
I am trying to measure an incremental compilation benchmark (to reproduce something coming up on perf.rust-lang.org). Hyperfine consistently reports "Range (min … max): 11.0 ms … 765.1 ms" and "Time (mean ± σ): 590.1 ms ± 304.8 ms", which is effectively a useless result (it doesn't print a warning, though, but still). However, when I time the same command myself (`cargo` prints the time, and I also tried `time`), I never see anything below 700 ms.
So my suspicion is that the 11 ms result hyperfine is seeing is somehow caused by not calling the benchmark correctly, but I do not know how that could happen either.
To reproduce, clone https://github.com/rust-lang-nursery/rustc-perf/ and go to `collector/benchmarks/coercions`. Do an initial `cargo +nightly build` to fill the incremental cache. Now run `touch src/main.rs && cargo +nightly build` many times; for me it is pretty stable between 730 ms and 770 ms.
Now run
hyperfine -w 2 -p "touch src/main.rs" "cargo +nightly build"
This shows a range from 10 ms to 790 ms. Something is clearly odd -- but it's not `-p`, because
hyperfine -w 2 "touch src/main.rs && cargo +nightly build"
has all the same problems.
This is due to an upstream bug in console, which I have a pending PR against here: console-rs/console#5
On FreeBSD, `ioctl()` accepts a `u64`. However, `TIOCGWINSZ` is defined as a `u32`.
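A widening cast at the call site is one way to reconcile the two types; here is a minimal sketch using the `libc` crate (illustrative only — the actual upstream fix in `console` may differ):

```rust
// Query the terminal size. On FreeBSD, ioctl()'s request argument is
// u64 while the TIOCGWINSZ constant is u32, so the constant needs a
// widening cast at the call site.
fn terminal_size() -> Option<(u16, u16)> {
    let mut size: libc::winsize = unsafe { std::mem::zeroed() };
    let ret = unsafe { libc::ioctl(libc::STDOUT_FILENO, libc::TIOCGWINSZ as u64, &mut size) };
    if ret == 0 {
        Some((size.ws_col, size.ws_row))
    } else {
        None
    }
}
```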
hyperfine can't be used to benchmark shell-specific shell functions without launching a new instance of a shell within the shell itself, breaking some of the underlying assumptions. As most shells support the `-c` argument, it would be useful if it were possible to pass a `--shell SHELL` option to override hyperfine's default.
e.g. I'm using `hyperfine` to reassess some assumptions made in the development of the `fish` shell, and would like to be able to benchmark one version of a shell builtin against another, or benchmark the time a completion script takes to execute (which uses fish-specific language, so it would return an error under `sh`).
This would be a straightforward replacement of `sh` with whatever the user provided, but some might even find it useful to evaluate the performance of command1 executed under shell foo and command2 executed under shell bar (without losing the benefit of the startup timing analysis that hyperfine provides).
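For example, an invocation with the requested option might look like this (hypothetical syntax; `my_completion_script` is a placeholder):
hyperfine --shell fish 'my_completion_script'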
λ cargo install hyperfine
Updating registry `https://github.com/rust-lang/crates.io-index`
Downloading hyperfine v0.4.0
Installing hyperfine v0.4.0
Downloading colored v1.6.0
Downloading [...]
Downloading rustc-serialize v0.3.24
Compiling strsim v0.6.0
Compiling [...]
Compiling hyperfine v0.4.0
error[E0432]: unresolved import `libc::getrusage`
--> .cargo\registry\src\github.com-1ecc6299db9ec823\hyperfine-0.4.0\src\hyperfine\cputime.rs:1:12
|
1 | use libc::{getrusage, rusage, RUSAGE_CHILDREN};
| ^^^^^^^^^ no `getrusage` in the root
error[E0432]: unresolved import `libc::rusage`
--> .cargo\registry\src\github.com-1ecc6299db9ec823\hyperfine-0.4.0\src\hyperfine\cputime.rs:1:23
|
1 | use libc::{getrusage, rusage, RUSAGE_CHILDREN};
| ^^^^^^ no `rusage` in the root
error[E0432]: unresolved import `libc::RUSAGE_CHILDREN`
--> .cargo\registry\src\github.com-1ecc6299db9ec823\hyperfine-0.4.0\src\hyperfine\cputime.rs:1:31
|
1 | use libc::{getrusage, rusage, RUSAGE_CHILDREN};
| ^^^^^^^^^^^^^^^ no `RUSAGE_CHILDREN` in the root
error: aborting due to 3 previous errors
error: failed to compile `hyperfine v0.4.0`, intermediate artifacts can be found at `C:\Users\dkter\AppData\Local\Temp\cargo-install.2sl6wvAhi3wL`
Caused by:
Could not compile `hyperfine`.
To learn more, run the command again with --verbose.
Running with --verbose gives me this information:
Compiling hyperfine v0.4.0
Running `rustc --crate-name hyperfine .cargo\registry\src\github.com-1ecc6299db9ec823\hyperfine-0.4.0\src\main.rs --crate-type bin --emit=dep-info,link -C opt-level=3 -C metadata=98f67aef2c923775 -C extra-filename=-98f67aef2c923775 --out-dir C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps -L dependency=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps --extern indicatif=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libindicatif-59764fc82c6811ce.rlib --extern libc=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\liblibc-f66ba3832bd58510.rlib --extern statistical=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libstatistical-50c68fb634eb9a96.rlib --extern colored=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libcolored-9098ef94b466db7a.rlib --extern clap=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libclap-04c95c98d9faa158.rlib --extern atty=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libatty-a371b164006500d6.rlib --cap-lints allow`
error[E0432]: unresolved import `libc::getrusage`
--> .cargo\registry\src\github.com-1ecc6299db9ec823\hyperfine-0.4.0\src\hyperfine\cputime.rs:1:12
|
1 | use libc::{getrusage, rusage, RUSAGE_CHILDREN};
| ^^^^^^^^^ no `getrusage` in the root
error[E0432]: unresolved import `libc::rusage`
--> .cargo\registry\src\github.com-1ecc6299db9ec823\hyperfine-0.4.0\src\hyperfine\cputime.rs:1:23
|
1 | use libc::{getrusage, rusage, RUSAGE_CHILDREN};
| ^^^^^^ no `rusage` in the root
error[E0432]: unresolved import `libc::RUSAGE_CHILDREN`
--> .cargo\registry\src\github.com-1ecc6299db9ec823\hyperfine-0.4.0\src\hyperfine\cputime.rs:1:31
|
1 | use libc::{getrusage, rusage, RUSAGE_CHILDREN};
| ^^^^^^^^^^^^^^^ no `RUSAGE_CHILDREN` in the root
error: aborting due to 3 previous errors
error: failed to compile `hyperfine v0.4.0`, intermediate artifacts can be found at `C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi`
Caused by:
Could not compile `hyperfine`.
Caused by:
process didn't exit successfully: `rustc --crate-name hyperfine .cargo\registry\src\github.com-1ecc6299db9ec823\hyperfine-0.4.0\src\main.rs --crate-type bin --emit=dep-info,link -C opt-level=3 -C metadata=98f67aef2c923775 -C extra-filename=-98f67aef2c923775 --out-dir C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps -L dependency=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps --extern indicatif=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libindicatif-59764fc82c6811ce.rlib --extern libc=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\liblibc-f66ba3832bd58510.rlib --extern statistical=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libstatistical-50c68fb634eb9a96.rlib --extern colored=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libcolored-9098ef94b466db7a.rlib --extern clap=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libclap-04c95c98d9faa158.rlib --extern atty=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libatty-a371b164006500d6.rlib --cap-lints allow` (exit code: 101)
Rust version is 1.20.0 and I'm on Windows 10.
It seems like parameters are only substituted in the actual command, not in the preparation command:
$ hyperfine -P X 1 10 -p 'python -c "print 2**{X}"' --show-output -- ls
Benchmark #1: ls
Traceback (most recent call last):
File "<string>", line 1, in <module>
NameError: name 'X' is not defined
Error: The preparation command terminated with a non-zero exit code. Append ' || true' to the command if you are sure that this can be ignored.
I looked over the documentation, but, unless I'm missing something, I don't see an option for running commands sequentially versus concurrently (i.e., for load testing). If there is no option for specifying this, I'm assuming that commands run sequentially; is this correct?
Thanks.
I have a script that generates some cryptographic keys in the initial phase and only then starts the actual benchmark. The more keys I generate, the slower that initial phase will obviously be, but I only want to benchmark the second phase. Could it be possible to ignore the time spent until a specific string is printed to STDOUT?
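This is not a hyperfine feature today; a minimal sketch of the idea in Rust (the marker string, the `sh -c` wrapping, and the function name are all assumptions):

```rust
// Spawn the command, discard output until a marker line appears on
// stdout, then start the clock and time the rest of the run.
use std::io::{BufRead, BufReader};
use std::process::{Command, Stdio};
use std::time::{Duration, Instant};

fn time_after_marker(cmd: &str, marker: &str) -> std::io::Result<Duration> {
    let mut child = Command::new("sh").args(["-c", cmd]).stdout(Stdio::piped()).spawn()?;
    let stdout = child.stdout.take().expect("stdout is piped");
    let mut lines = BufReader::new(stdout).lines();
    // Phase 1: key generation; wait for the marker before timing.
    while let Some(line) = lines.next() {
        if line?.contains(marker) {
            break;
        }
    }
    // Phase 2: the part we actually want to benchmark. A real
    // implementation would keep draining stdout here so the child
    // doesn't block on a full pipe.
    let start = Instant::now();
    child.wait()?;
    Ok(start.elapsed())
}
```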
It'd be nice to be able to show other statistics such as the 95th percentile or median runtime. HdrHistogram can record this with relatively little overhead, and there's a pretty good official Rust implementation here (I'm one of the maintainers).
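For reference, a minimal sketch with the `hdrhistogram` crate mentioned above (the millisecond input and three significant figures are assumptions):

```rust
use hdrhistogram::Histogram;

// Record per-run times and print a few percentiles.
fn report_percentiles(run_times_ms: &[u64]) -> Result<(), Box<dyn std::error::Error>> {
    let mut hist = Histogram::<u64>::new(3)?; // 3 significant figures
    for &t in run_times_ms {
        hist.record(t)?;
    }
    for q in [0.50, 0.90, 0.95, 0.99] {
        println!("p{}: {} ms", q * 100.0, hist.value_at_quantile(q));
    }
    Ok(())
}
```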
It would be nice, by default, to know how many runs hyperfine did for a benchmark.
Automatically detect if the first timing run was significantly slower than the remaining runs and suggest usage of `--warmup`.
Possible implementation:
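A sketch of one such heuristic (the 3-sigma threshold and the minimum-run guard are assumptions):

```rust
// Flag the first run if it lies far outside the distribution of the
// remaining runs, suggesting a cold cache that --warmup would fix.
fn should_suggest_warmup(times: &[f64]) -> bool {
    if times.len() < 3 {
        return false;
    }
    let rest = &times[1..];
    let mean = rest.iter().sum::<f64>() / rest.len() as f64;
    let var = rest.iter().map(|t| (t - mean).powi(2)).sum::<f64>() / rest.len() as f64;
    // "Significantly slower" here means more than 3 sigma above the mean.
    times[0] > mean + 3.0 * var.sqrt()
}
```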
In the summary output, test whether or not the speedup of the "winning" command over the other(s) is statistically significant. This should be fairly easy to do, given that we already compute the standard deviations.
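Since the means, standard deviations, and run counts are already available, one way to do this is Welch's t-test; a rough sketch (the fixed threshold stands in for a proper lookup of the t distribution):

```rust
// Welch's t statistic for two benchmarks with unequal variances.
fn speedup_is_significant(m1: f64, s1: f64, n1: usize, m2: f64, s2: f64, n2: usize) -> bool {
    let se = (s1 * s1 / n1 as f64 + s2 * s2 / n2 as f64).sqrt();
    let t = (m1 - m2).abs() / se;
    // A full implementation would use the Welch-Satterthwaite degrees of
    // freedom and the t distribution; t > 2 is a crude stand-in.
    t > 2.0
}
```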
It could be interesting to run benchmarks where a (numerical) parameter is systematically changed, for example:
> hyperfine 'make -j{}' --parameter-range 1..8
(modulo syntax)
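Internally, this boils down to expanding the `{}` placeholder once per value in the range; a minimal sketch (function name and signature are made up):

```rust
// Expand a command template once for each value in the parameter range.
fn expand(template: &str, range: std::ops::RangeInclusive<u32>) -> Vec<String> {
    range.map(|i| template.replace("{}", &i.to_string())).collect()
}

// expand("make -j{}", 1..=8) yields ["make -j1", "make -j2", ..., "make -j8"]
```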
(a follow-up to #71)
Currently the CLI automatically selects the units (seconds, milliseconds) based on the size of the mean value, but the results export is always in seconds.
Choosing the units explicitly would force the CLI and results-export units to always match, and would allow users to specify the units they are most familiar with or that integrate with their reporting systems.
The option could be `-u, --units <Seconds|Milliseconds>`, and maybe be extended to also include minutes (and hours)?
The units option value would be passed through to both `format::format_duration_units` for the CLI and `ExportManager::write_results` for the results export.
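A minimal sketch of what that shared unit type might look like (the enum and function below are illustrative, not hyperfine's actual code):

```rust
// A single unit value that both the CLI formatter and the exporters
// would consume, so their output always agrees.
enum Unit {
    Second,
    MilliSecond,
}

fn format_duration(duration_s: f64, unit: &Unit) -> String {
    match unit {
        Unit::Second => format!("{:.3} s", duration_s),
        Unit::MilliSecond => format!("{:.1} ms", duration_s * 1e3),
    }
}
```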
After 5 GB and 2 hours' worth of dependency-chasing, I finally managed to install `hyperfine` via `cargo`. After manually adding the cargo install folder to the Windows PATH environment variable, I'm now able to run `hyperfine` in Cygwin.
However, when I do the following as a quick preliminary test:
hyperfine -m 10 --show-output --export-csv hyperfinetest.csv 'sleep 1'
I get the following output:
Benchmark #1: sleep 1
Time (mean ± σ): 1.036 s ± 0.006 s [User: 4.1 ms, System: 17.4 ms]
Range (min … max): 1.021 s … 1.040 s
I presume this means that only one benchmark was run, and checking hyperfinetest.csv seems to confirm this. What's going on here?
The user should be able to specify the maximum number of runs or the exact number of runs.
CyberShadow on HN:
This would be useful when comparing two similar commands, as interleaving them makes it less likely that e.g. a load spike will unfavorably affect only one of them, or that e.g. thermal throttling will negatively affect the last command.
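A sketch of the scheduling change this implies (illustrative; indices stand for commands):

```rust
// Interleaved schedule: ABAB... instead of running all of A's runs,
// then all of B's. schedule(2, 3) == [0, 1, 0, 1, 0, 1].
fn schedule(n_commands: usize, runs: usize) -> Vec<usize> {
    (0..runs).flat_map(|_| 0..n_commands).collect()
}
```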
It would be useful to output the time percentiles of the run, for instance with a flag like `--percentiles '50, 90, 95, 99, 99.9'`.
cf.:
- … hyperfine read … (`cmd.exe` hides the real exit code from us)
- … `--style basic` by default.

Is there any chance of porting the utility as a Cygwin package? I was really looking forward to using the program to benchmark a few commands on Cygwin running on top of Windows, but it's unfortunately not yet in the Cygwin package repository, which means it would require installing a whole bunch of other, heavy dependencies the usual way, rather than what would usually be a much simpler `apt-cyg install hyperfine`.
Thanks in advance.
When passing `--export-markdown`, the resulting Markdown file reports `Mean` and `Min…Max` in `ms` even when the original result report printed to stdout is presented in seconds. I think it would be better if the produced report used the same units, since very long `ms` numbers are hard to read.
Instructions for reproducing the behavior:
hyperfine 'sleep 5' --export-markdown results.md
Command line output:
kbobyrev@kbobyrev ~/d/m/profile> hyperfine 'sleep 5' --export-markdown results.md
Benchmark #1: sleep 5
Time (mean ± σ): 5.002 s ± 0.000 s [User: 1.1 ms, System: 2.0 ms]
Range (min … max): 5.002 s … 5.003 s
results.md:
| Command | Mean [ms] | Min…Max [ms] |
|---|---|---|
| `sleep 5` | 5002.3 ± 0.2 | 5002.1…5002.7 |
Add a `--export-markdown` option in analogy to #41 for CSV and #42 for JSON.
The output could look like this:
| Benchmark | Mean [ms] | Min. [ms] | Max. [ms] |
|-----------|-------------|-----------|-----------|
| command 1 | 205.1 ± 1.5 | 201.1 | 207.6 |
| command 2 | 403.5 ± 2.4 | 400.3 | 407.4 |
Rendered:
| Command | Mean [ms] | Min. [ms] | Max. [ms] |
|---|---|---|---|
| command 1 | 205.1 ± 1.5 | 201.1 | 207.6 |
| command 2 | 403.5 ± 2.4 | 400.3 | 407.4 |
Just like time(1):
$ /bin/time -p fd -HI -e jpg '' ~ > /dev/null
real 1.41
user 4.23
sys 2.89
It would be great if we could export benchmark results (mean, stddev, min, max) in different formats. Each of those could be a new command-line option (`hyperfine --export-csv my-benchmark.csv`).
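For illustration, a CSV export with those fields might look like this (the column names are a guess; the values echo the `sleep 1` output quoted elsewhere on this page):

command,mean,stddev,min,max
sleep 1,1.036,0.006,1.021,1.040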
If there are multiple benchmarks, show a short summary of the results, like "command1 is a factor of 1.5 times faster than command2".
I was running some coverage using cargo-kcov and noted that currently coverage on the application is somewhat low. It may be good to identify points that should be placed under test and ensure that they are. For instance, with more options being added, extracting the processing of Clap's matches to a testable function may be prudent to ensure future changes do not alter the current correct state.
I am running hyperfine with many different commands at once. In order to see not only the factor between the different commands at the end but also (some of) the means, I suggest one, some, or all of these changes to the output (switchable by a command-line option):
- A more compact output; currently I strip the empty lines with `hyperfine ... | sed '/^ *$/d'`.
- `hyperfine ... | sed '/^ *$/d'` is without color, and when I run `hyperfine -s full ... | sed '/^ *$/d'` instead, I also get the interactive graph, which doesn't play nicely with the pipe and sometimes messes up a line. So why not add a `color` style to `-s` which has color but is not interactive?
- `-s` could be a comma-separated list, so that things like this would make sense: `hyperfine -s compact,color`, `hyperfine -s interactive,compact,nocolor`, `hyperfine -s full,compact`, `hyperfine -s basic,compact`, ...

… it could be affected/limited by the intermediate shell
This covers exporting results to JSON as outlined in #38 and extends on the work done on #41.
The same `--export-xxx` format should be used (in this case `--export-json`), with a filename parameter following. It should coexist peacefully with simultaneous usage of `--export-csv` and with multiple instances of both.
Add an option to run a certain command in between the different timing runs (e.g. a cache-clearing command)
This is definitely not in scope if hyperfine's goal is to be just a replacement for `time`, but the thing I most wish for when using time to benchmark programs is for it to measure other relevant program-execution statistics, like memory usage. After all, optimization (which is what benchmarking is driving) is always about tradeoffs in some way. But it's kinda hard to know if you're making the right tradeoffs for your project if you're measuring only one thing (here, execution speed).
Would something measuring memory usage, etc. be in scope for hyperfine as a benchmarking tool?
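On Unix, peak memory of child processes is available through the same `getrusage(2)` interface that appears in the build errors above; a minimal sketch with the `libc` crate (assumed as a dependency; not hyperfine's actual code):

```rust
// Peak resident set size across all waited-for children, via
// getrusage(RUSAGE_CHILDREN). On Linux, ru_maxrss is in kibibytes.
fn peak_child_rss() -> libc::c_long {
    let mut ru: libc::rusage = unsafe { std::mem::zeroed() };
    unsafe { libc::getrusage(libc::RUSAGE_CHILDREN, &mut ru) };
    ru.ru_maxrss
}
```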
A few things that I would like to do before releasing version 1.0:
- … the `plot_benchmark_results.py` script.

Usual release checklist:
- `cargo update`
- `rustup default nightly` and `cargo clippy`
- `cargo fmt`
- `cargo test`
- `cargo install -f`
- Bump the version in `Cargo.toml`. Run `cargo build` to update `Cargo.lock`.
- `cargo publish --dry-run --allow-dirty`
- `git tag vX.Y.Z; git push --tags`
- `cargo publish`
- Flag the `hyperfine` AUR package as "out of date": https://aur.archlinux.org/packages/hyperfine/

The estimated execution time should include the time to run the preparation command, e.g.:
hyperfine --prepare 'sleep 10' 'sleep 0.1'
csv 1.0.0 is out, but hyperfine still depends on the beta version. Please bump the dependency. This will help downstream (Debian) packaging.
Alternatively, add an option to pipe the output to a file
@dywedir It looks like you maintain(?) the hyperfine package for nix?
I would like to add install instructions to the README. Would this be:
nix-env -i hyperfine