
hyperfine's Issues

hyperfine consistently measures short outlier that I cannot reproduce in the wild

I am trying to measure an incremental compilation benchmark (to reproduce something coming up on perf.rust-lang.org). Hyperfine consistently reports "Range (min … max): 11.0 ms … 765.1 ms" and "Time (mean ± σ): 590.1 ms ± 304.8 ms", which is effectively a useless result (and it doesn't even print a warning). However, when I time the same command myself (cargo prints the time, and I also tried time), I never see anything below 700 ms.

So my suspicion is that this 11 ms result hyperfine is seeing is somehow caused by the benchmark not being invoked correctly, but I do not know how that could happen either.

To reproduce, clone https://github.com/rust-lang-nursery/rustc-perf/ and go to collector/benchmarks/coercions. Do an initial cargo +nightly build to fill the incremental cache. Now run touch src/main.rs && cargo +nightly build many times; for me it is pretty stable between 730ms and 770ms.

Now run

hyperfine -w 2 -p "touch src/main.rs" "cargo +nightly build"

This shows a range from 10ms to 790ms. Something is clearly odd -- but it's not -p, because

hyperfine -w 2 "touch src/main.rs && cargo +nightly build"

has all the same problems.

Allow override of shell

hyperfine can't be used to benchmark shell-specific functions without launching a new instance of that shell within hyperfine's own shell, breaking some of the underlying assumptions. As most shells support the -c argument, it would be useful if a --shell SHELL option could be passed to override hyperfine's default.

e.g. I'm using hyperfine to reassess some assumptions made in the development of fish shell, and would like to be able to benchmark one version of a shell builtin against another, or to benchmark the time a completion script takes to execute (which uses fish-specific syntax and would therefore fail under sh).

This would be a straightforward replacement of sh with whatever the user provides, but some might even find it useful to evaluate the performance of command1 executed under shell foo against command2 executed under shell bar (without losing the benefit of the shell-startup timing analysis that hyperfine provides).
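
A usage sketch of the proposed option (the --shell flag and the fish invocation below illustrate the proposal, not an existing interface):

hyperfine --shell fish 'my_fish_function foo bar'
hyperfine --shell bash 'some_bash_function'

Internally, hyperfine would then execute fish -c '...' (or bash -c '...') instead of sh -c '...', and the shell-startup calibration would need to be measured against the chosen shell as well.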

Unresolved import `libc::rusage`

λ cargo install hyperfine
    Updating registry `https://github.com/rust-lang/crates.io-index`
 Downloading hyperfine v0.4.0
  Installing hyperfine v0.4.0
 Downloading colored v1.6.0
 Downloading [...]
 Downloading rustc-serialize v0.3.24
   Compiling strsim v0.6.0
   Compiling [...]
   Compiling hyperfine v0.4.0
error[E0432]: unresolved import `libc::getrusage`
 --> .cargo\registry\src\github.com-1ecc6299db9ec823\hyperfine-0.4.0\src\hyperfine\cputime.rs:1:12
  |
1 | use libc::{getrusage, rusage, RUSAGE_CHILDREN};
  |            ^^^^^^^^^ no `getrusage` in the root

error[E0432]: unresolved import `libc::rusage`
 --> .cargo\registry\src\github.com-1ecc6299db9ec823\hyperfine-0.4.0\src\hyperfine\cputime.rs:1:23
  |
1 | use libc::{getrusage, rusage, RUSAGE_CHILDREN};
  |                       ^^^^^^ no `rusage` in the root

error[E0432]: unresolved import `libc::RUSAGE_CHILDREN`
 --> .cargo\registry\src\github.com-1ecc6299db9ec823\hyperfine-0.4.0\src\hyperfine\cputime.rs:1:31
  |
1 | use libc::{getrusage, rusage, RUSAGE_CHILDREN};
  |                               ^^^^^^^^^^^^^^^ no `RUSAGE_CHILDREN` in the root

error: aborting due to 3 previous errors

error: failed to compile `hyperfine v0.4.0`, intermediate artifacts can be found at `C:\Users\dkter\AppData\Local\Temp\cargo-install.2sl6wvAhi3wL`

Caused by:
  Could not compile `hyperfine`.

To learn more, run the command again with --verbose.

Running with --verbose gives me this information:

   Compiling hyperfine v0.4.0
     Running `rustc --crate-name hyperfine .cargo\registry\src\github.com-1ecc6299db9ec823\hyperfine-0.4.0\src\main.rs --crate-type bin --emit=dep-info,link -C opt-level=3 -C metadata=98f67aef2c923775 -C extra-filename=-98f67aef2c923775 --out-dir C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps -L dependency=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps --extern indicatif=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libindicatif-59764fc82c6811ce.rlib --extern libc=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\liblibc-f66ba3832bd58510.rlib --extern statistical=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libstatistical-50c68fb634eb9a96.rlib --extern colored=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libcolored-9098ef94b466db7a.rlib --extern clap=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libclap-04c95c98d9faa158.rlib --extern atty=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libatty-a371b164006500d6.rlib --cap-lints allow`
error[E0432]: unresolved import `libc::getrusage`
 --> .cargo\registry\src\github.com-1ecc6299db9ec823\hyperfine-0.4.0\src\hyperfine\cputime.rs:1:12
  |
1 | use libc::{getrusage, rusage, RUSAGE_CHILDREN};
  |            ^^^^^^^^^ no `getrusage` in the root

error[E0432]: unresolved import `libc::rusage`
 --> .cargo\registry\src\github.com-1ecc6299db9ec823\hyperfine-0.4.0\src\hyperfine\cputime.rs:1:23
  |
1 | use libc::{getrusage, rusage, RUSAGE_CHILDREN};
  |                       ^^^^^^ no `rusage` in the root

error[E0432]: unresolved import `libc::RUSAGE_CHILDREN`
 --> .cargo\registry\src\github.com-1ecc6299db9ec823\hyperfine-0.4.0\src\hyperfine\cputime.rs:1:31
  |
1 | use libc::{getrusage, rusage, RUSAGE_CHILDREN};
  |                               ^^^^^^^^^^^^^^^ no `RUSAGE_CHILDREN` in the root

error: aborting due to 3 previous errors

error: failed to compile `hyperfine v0.4.0`, intermediate artifacts can be found at `C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi`

Caused by:
  Could not compile `hyperfine`.

Caused by:
  process didn't exit successfully: `rustc --crate-name hyperfine .cargo\registry\src\github.com-1ecc6299db9ec823\hyperfine-0.4.0\src\main.rs --crate-type bin --emit=dep-info,link -C opt-level=3 -C metadata=98f67aef2c923775 -C extra-filename=-98f67aef2c923775 --out-dir C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps -L dependency=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps --extern indicatif=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libindicatif-59764fc82c6811ce.rlib --extern libc=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\liblibc-f66ba3832bd58510.rlib --extern statistical=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libstatistical-50c68fb634eb9a96.rlib --extern colored=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libcolored-9098ef94b466db7a.rlib --extern clap=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libclap-04c95c98d9faa158.rlib --extern atty=C:\Users\dkter\AppData\Local\Temp\cargo-install.tjcNwOyjKYFi\release\deps\libatty-a371b164006500d6.rlib --cap-lints allow` (exit code: 101)

Rust version is 1.20.0 and I'm on Windows 10.
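
For what it's worth, the getrusage-based CPU-time code could probably be gated on the target platform; a minimal sketch (the function name and the Windows stub are assumptions, not hyperfine's actual code):

    // cputime.rs (sketch): only compile the rusage-based path on Unix-like targets.
    #[cfg(unix)]
    pub fn cpu_times() -> (f64, f64) {
        use libc::{getrusage, rusage, RUSAGE_CHILDREN};
        use std::mem;

        // getrusage fills the struct in place; zero-initialize it first.
        let usage: rusage = unsafe {
            let mut usage: rusage = mem::zeroed();
            getrusage(RUSAGE_CHILDREN, &mut usage);
            usage
        };

        let user = usage.ru_utime.tv_sec as f64 + usage.ru_utime.tv_usec as f64 * 1e-6;
        let system = usage.ru_stime.tv_sec as f64 + usage.ru_stime.tv_usec as f64 * 1e-6;
        (user, system)
    }

    // There is no rusage on Windows; report zeros until a proper implementation exists.
    #[cfg(windows)]
    pub fn cpu_times() -> (f64, f64) {
        (0.0, 0.0)
    }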

Parameters don't get expanded in preparation commands

It seems like parameters are only substituted for the actual command, not the preparation command:

$ hyperfine -P X 1 10 -p 'python -c "print 2**{X}"' --show-output -- ls
Benchmark #1: ls

Traceback (most recent call last):
  File "<string>", line 1, in <module>
NameError: name 'X' is not defined
Error: The preparation command terminated with a non-zero exit code. Append ' || true' to the command if you are sure that this can be ignored.
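
To illustrate the expected behavior: for X = 3, the preparation command should be executed as

python -c "print 2**3"

with {X} expanded, exactly as it is in the benchmarked command itself.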

Sequential vs Concurrent execution

I looked over the documentation, but, unless I'm missing something, I don't see an option for running commands sequentially versus concurrently (i.e. for load testing). If there is no option for specifying this, I'm assuming that commands run sequentially; is this correct?

Thanks.

Add possibility to ignore setup phase

I have a script that generates some cryptographic keys in an initial phase and only then starts the actual benchmark. The more keys I generate, the slower that initial phase will obviously be, but I only want to benchmark the second phase. Would it be possible to ignore the time spent until a specific string is printed to STDOUT?

Provide finer-grained statistics

It'd be nice to be able to show other statistics such as the 95th percentile or median runtime. HdrHistogram can record this with relatively little overhead, and there's a pretty good official Rust implementation here (I'm one of the maintainers).
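
As a rough sketch of how recorded run times could be fed into that crate to get median and tail percentiles (the run_times_us values below are placeholder data, not hyperfine internals):

    use hdrhistogram::Histogram;

    fn main() {
        // Placeholder data: individual run times in microseconds.
        let run_times_us: Vec<u64> = vec![731_000, 742_000, 758_000, 760_000, 771_000];

        // Three significant figures is plenty of resolution for reporting.
        let mut hist = Histogram::<u64>::new(3).expect("failed to create histogram");
        for &t in &run_times_us {
            hist.record(t).expect("value out of range");
        }

        println!("median: {} µs", hist.value_at_quantile(0.50));
        println!("p95:    {} µs", hist.value_at_quantile(0.95));
        println!("p99:    {} µs", hist.value_at_quantile(0.99));
    }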

Output number of runs

It would be nice, by default, to know how many runs hyperfine did for a benchmark.

Detect if first timing-run was significantly slower

Automatically detect if the first timing-run was significantly slower than the remaining runs and suggest usage of --warmup.

Possible implementation (see the sketch after this list):

  • Let [t_1, t_2, ..., t_n] be the benchmarking results.
  • Let t_mean and t_stddev be the mean and standard-deviation for [t_2, ..., t_n]
  • Show the warning if t_1 > t_mean + 5 * t_stddev (for example)
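
A minimal sketch of that check in Rust (the function name and the factor of 5 are placeholders taken from the example above):

    /// Returns true if the first timing result lies far outside the
    /// distribution of the remaining results.
    fn first_run_is_outlier(times: &[f64]) -> bool {
        if times.len() < 3 {
            return false; // not enough data for a meaningful estimate
        }
        let rest = &times[1..];
        let mean = rest.iter().sum::<f64>() / rest.len() as f64;
        let variance = rest.iter().map(|t| (t - mean).powi(2)).sum::<f64>() / rest.len() as f64;
        let stddev = variance.sqrt();
        times[0] > mean + 5.0 * stddev
    }

If the check fires, hyperfine could print a hint suggesting --warmup.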

Parametrized benchmarks

It could be interesting to run benchmarks where a (numerical) parameter is systematically changed, for example:

> hyperfine 'make -j{}' --parameter-range 1..8

(modulo syntax)
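
One concrete shape for this could be a named placeholder plus a range option, consistent with the -P / {X} usage shown in the preparation-command issue above (a sketch, not a committed syntax):

hyperfine --parameter-scan threads 1 8 'make -j{threads}'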

Add option to choose the units used for CLI & results output

(a follow up to #71)

Currently the CLI automatically selects the units (seconds, milliseconds) based on the size of the mean value, but the results export is always in seconds.

Choosing the units would force the CLI and results-export units to always match, and would allow users to specify the units they are most familiar with or that best integrate with their reporting systems.

The option could be -u --units <Seconds|Milliseconds>, and maybe extended to also include minutes (and hours)?


The units option value would be passed through to both format::format_duration_units for the CLI and ExportManager::write_results for the results export.
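
A usage sketch with the flag as proposed (the value spelling follows the proposal above and is not an existing interface):

hyperfine --units Milliseconds 'sleep 0.3' --export-csv results.csv
hyperfine -u Seconds 'sleep 5' --export-markdown results.md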

Hyperfine only runs 1 benchmark, irrespective of --min-runs

After 5GB and 2 hours' worth of dependency-chasing, I finally managed to install hyperfine via cargo. After then manually adding the cargo install folder to the Windows PATH environment variable, I'm finally able to run hyperfine in Cygwin.

However, when I do the following as a quick preliminary test:

hyperfine -m 10 --show-output --export-csv hyperfinetest.csv 'sleep 1'

I get the following output:

Benchmark #1: sleep 1

Time (mean ± σ): 1.036 s ± 0.006 s [User: 4.1 ms, System: 17.4 ms]

Range (min … max): 1.021 s … 1.040 s

I presume this means that only one benchmark was run, and checking hyperfinetest.csv seems to confirm this. What's going on here?

Option to interleave benchmarks for multiple commands

CyberShadow on HN:

This would be useful when comparing two similar commands, as interleaving them makes it less likely that e.g. a load spike will unfavorably affect only one of them, or that e.g. thermal throttling will negatively affect only the last command.

Option to output the percentiles

It would be useful to output the time percentiles of the run. For instance with a flag like --percentiles '50, 90, 95, 99, 99.9'.

Proper Windows support

  • Properly spawn shell commands (this doesn't seem to work at all at the moment)
  • Detect when a process fails with exit code != 0 (I'm guessing that cmd.exe hides the real exit code from us)
  • Compute user/system time (or hide the message)
  • Colors and even progress bars actually work fine in PowerShell. We should not set --style basic by default.

Cygwin package?

Is there any chance of porting the utility as a Cygwin package? I was really looking forward to using the program to benchmark a few commands on Cygwin running on top of Windows, but it's unfortunately not yet in the Cygwin package repository, which means installing it the usual way requires a whole bunch of other, heavy dependencies rather than what would usually be a much simpler apt-cyg install hyperfine.

Thanks in advance.

Compliant time units in CLI report and results export

When passing --export-markdown, the resulting Markdown file reports Mean and Min…Max in ms even when the original result report printed to stdout is presented in seconds. I think it would be better if the produced report used the same units, since very long ms numbers are hard to read.

Steps to reproduce:

hyperfine 'sleep 5' --export-markdown results.md

Command line output:

kbobyrev@kbobyrev ~/d/m/profile> hyperfine 'sleep 5' --export-markdown results.md
Benchmark #1: sleep 5
  Time (mean ± σ):      5.002 s ±  0.000 s    [User: 1.1 ms, System: 2.0 ms]
  Range (min … max):    5.002 s …  5.003 s

results.md:

| Command | Mean [ms]    | Min…Max [ms]  |
|---------|--------------|---------------|
| sleep 5 | 5002.3 ± 0.2 | 5002.1…5002.7 |

Export to Markdown

Add a --export-markdown option in analogy to #41 for CSV and #42 for JSON.

The output could look like this:

| Benchmark | Mean [ms]   | Min. [ms] | Max. [ms] |
|-----------|-------------|-----------|-----------|
| command 1 | 205.1 ± 1.5 | 201.1     | 207.6     |
| command 2 | 403.5 ± 2.4 | 400.3     | 407.4     |


Add export options

It would be great if we could export benchmark results in different formats:

  • Markdown for easy integration into README files
  • JSON for further processing
  • CSV for simple plotting (gnuplot) -- header + one line per benchmark with columns mean, stddev, min, max

Each of those could be new command line options (hyperfine --export-csv my-benchmark.csv).

Show summary / comparison

If there are multiple benchmarks, show a short summary of the results, like command1 is 1.5 times faster than command2.
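
For example, the summary line could look something like this (a sketch of one possible format, not a fixed wording):

'command1' ran 1.50 ± 0.05 times faster than 'command2'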

Increase unit test coverage

I was running some coverage using cargo-kcov and noted that currently coverage on the application is somewhat low. It may be good to identify points that should be placed under test and ensure that they are. For instance, with more options being added, extracting the processing of Clap's matches to a testable function may be prudent to ensure future changes do not alter the current correct state.

(screenshot of the cargo-kcov coverage report, 2018-03-24)

Compact output format, finer control over the output

I am running hyperfine with many different commands at once. In order to see not only the factor between the different commands at the end, but also (some of) the means, I suggest one, some, or all of these changes to the output (switchable by a command-line option):

  • Do not print empty lines between mean and min-max and the headlines and so on. I can currently get quite close to what I want when I run hyperfine ... | sed '/^ *$/d'.
  • The above-mentioned hyperfine ... | sed '/^ *$/d' is without color, and when I run hyperfine -s full ... | sed '/^ *$/d' instead, I also get the interactive graph, which doesn't play nicely with the pipe and sometimes messes up a line. So why not add a color style to -s which has color but is not interactive?
  • All of these could be additional values for -s, and -s could be a comma-separated list so that things like this would make sense: hyperfine -s compact,color, hyperfine -s interactive,compact,nocolor, hyperfine -s full,compact, hyperfine -s basic,compact...

Export to JSON

This covers exporting results to JSON as outlined in #38 and extends on the work done on #41.

The same --export-xxx format should be used (in this case --export-json), with a filename parameter following. It should coexist peacefully with simultaneous usage of --export-csv and multiple instances of both.
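
One possible shape for the exported JSON (a sketch only; this issue does not prescribe exact field names):

    {
      "results": [
        {
          "command": "sleep 1",
          "mean": 1.0036,
          "stddev": 0.0061,
          "min": 1.0021,
          "max": 1.0402,
          "times": [1.0021, 1.0038, 1.0402]
        }
      ]
    }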

Support for benchmarking memory usage, etc.

This is definitely out of scope if hyperfine's goal is to be just a replacement for time, but the thing I most wish for when using time to benchmark programs is that it could measure other relevant program execution statistics, like memory usage. After all, optimization (which is what benchmarking drives) is always about tradeoffs in some way. But it's kinda hard to know if you're making the right tradeoffs for your project if you're measuring only one thing (here, execution speed).

Would measuring memory usage, etc. be in scope for hyperfine as a benchmarking tool?
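
On Unix-like systems, one low-overhead way to get at peak memory would be the same getrusage call hyperfine already uses for CPU time; a sketch (the function name is an assumption):

    use libc::{getrusage, rusage, RUSAGE_CHILDREN};
    use std::mem;

    /// Peak resident set size of terminated child processes, as reported by
    /// getrusage (kilobytes on Linux, bytes on macOS).
    fn peak_child_memory() -> libc::c_long {
        let usage: rusage = unsafe {
            let mut usage: rusage = mem::zeroed();
            getrusage(RUSAGE_CHILDREN, &mut usage);
            usage
        };
        usage.ru_maxrss
    }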

Checklist for hyperfine 1.0

A few things that I would like to do before releasing version 1.0:

  • Merge #53
  • Merge #51
  • Add a chapter about the export options in the "Usage" section in the README. Possibly show a Markdown table example. Possibly refer to the plot_benchmark_results.py script.
  • Add a chapter about parameterized benchmarks.

Usual release checklist

  • Update README (features, usage, ..).
  • Optional: update dependencies with cargo update.
  • rustup default nightly and cargo clippy
  • cargo fmt
  • cargo test
  • cargo install -f
  • Update version in Cargo.toml. Run cargo build to update Cargo.lock
  • cargo publish --dry-run --allow-dirty.
  • check if Travis & AppVeyor succeed
  • git tag vX.Y.Z; git push --tags
  • write GitHub release notes
  • check binaries (that were uploaded via Travis/AppVeyor)
  • publish to crates.io by cloning a fresh repo and calling cargo publish.
  • Flag hyperfine package as "out of date": https://aur.archlinux.org/packages/hyperfine/
  • Optional: updated demo video
