Comments (5)
Is the increase of 100µs consistent across multiple cargo bench invocations?
Also, I noticed the spike of downloads yesterday and found your AoC video. Thanks for trying it out! Currently in the growing pains phase. 😄
from divan.
The increase is consistent, but I wasn't tracking the exact amount closely enough to say whether it was, for example, 100µs vs 130µs. It was not wildly fluctuating: I ran the benchmarks multiple times in each configuration, back to back, and the results looked very consistent to me across many runs.
> Also, I noticed the spike of downloads yesterday and found your AoC video. Thanks for trying it out! Currently in the growing pains phase. 😄
oh fun! glad I could point some people in your direction.
This might be a bit of work on your end, but could you please check if the same happens with Criterion?
> check if the same happens with Criterion?
I've encountered a similar experience and have provided a reproduction here, with both divan and criterion consistently demonstrating a perf regression.
I've had some other users clone that and observe the same.
My reproduction is fairly simple: merely adding a copy of a function that is never called introduces a consistent overhead in the timings measured on the actually-benched function. I suspect it's due to compiler optimizations, as this Criterion FAQ entry explains.
The reproduction I've linked identifies several changes that can be made to avoid the regression, so this sort of issue is unlikely to have anything to do with Divan itself (Criterion measures the same increase in timing).
I've confirmed that the behavior does exist in divan and criterion. It is only for this one benchmark (part2_nom, from ~375 µs to ~497 µs). The other 3 benchmarks appear stable.
The relevant commit is: https://github.com/ChristopherBiscardi/advent-of-code/tree/26ae09c876602e405dc7943ef61764afd70d8838
Runs without part2_subject
run 1
Timer precision: 41 ns
day_01 fastest │ slowest │ median │ mean │ samples │ iters
├─ part1 39.95 µs │ 98.29 µs │ 40.33 µs │ 41.24 µs │ 100 │ 100
├─ part2 138.4 µs │ 174.5 µs │ 143.2 µs │ 143.6 µs │ 100 │ 100
├─ part2_aho_corasick 27.87 ms │ 30.93 ms │ 28.52 ms │ 28.49 ms │ 100 │ 100
╰─ part2_nom 365.5 µs │ 522.1 µs │ 376.2 µs │ 380.4 µs │ 100 │ 100
Running benches/benchmarks-criterion.rs (target/release/deps/day_01_bench_criterion-c9c1d8429cb6331d)
part1/part1 time: [28.212 µs 28.595 µs 28.999 µs]
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
part2/part2 time: [142.92 µs 143.11 µs 143.31 µs]
Found 11 outliers among 100 measurements (11.00%)
2 (2.00%) low severe
5 (5.00%) low mild
4 (4.00%) high mild
part2/part2_nom time: [380.46 µs 380.83 µs 381.27 µs]
Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) low mild
2 (2.00%) high mild
part2/part2_aho_corasick
time: [28.350 ms 28.382 ms 28.413 ms]
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) low mild
run 2
Timer precision: 41 ns
day_01 fastest │ slowest │ median │ mean │ samples │ iters
├─ part1 38.16 µs │ 61.91 µs │ 45.79 µs │ 44.69 µs │ 100 │ 100
├─ part2 138.2 µs │ 178 µs │ 144.7 µs │ 144.4 µs │ 100 │ 100
├─ part2_aho_corasick 27.86 ms │ 29.82 ms │ 28.43 ms │ 28.4 ms │ 100 │ 100
╰─ part2_nom 359.7 µs │ 507.4 µs │ 374.2 µs │ 377.3 µs │ 100 │ 100
Running benches/benchmarks-criterion.rs (target/release/deps/day_01_bench_criterion-c9c1d8429cb6331d)
part1/part1 time: [28.309 µs 28.927 µs 29.598 µs]
change: [-3.7815% -1.2474% +1.3855%] (p = 0.35 > 0.05)
No change in performance detected.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
part2/part2 time: [143.46 µs 143.75 µs 144.06 µs]
change: [-0.0243% +0.2199% +0.4712%] (p = 0.11 > 0.05)
No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) low mild
1 (1.00%) high mild
3 (3.00%) high severe
part2/part2_nom time: [380.39 µs 383.26 µs 389.43 µs]
change: [-0.3633% +1.0289% +4.5633%] (p = 0.63 > 0.05)
No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
9 (9.00%) low mild
1 (1.00%) high severe
part2/part2_aho_corasick
time: [28.401 ms 28.442 ms 28.483 ms]
change: [+0.0227% +0.2104% +0.3871%] (p = 0.02 < 0.05)
Change within noise threshold.
run 3
Timer precision: 41 ns
day_01 fastest │ slowest │ median │ mean │ samples │ iters
├─ part1 40.83 µs │ 86.7 µs │ 44.54 µs │ 44.97 µs │ 100 │ 100
├─ part2 138.7 µs │ 173.4 µs │ 144.3 µs │ 144.2 µs │ 100 │ 100
├─ part2_aho_corasick 27.6 ms │ 28.76 ms │ 28.43 ms │ 28.37 ms │ 100 │ 100
╰─ part2_nom 352.9 µs │ 466.9 µs │ 375.9 µs │ 377.4 µs │ 100 │ 100
Running benches/benchmarks-criterion.rs (target/release/deps/day_01_bench_criterion-c9c1d8429cb6331d)
part1/part1 time: [27.161 µs 27.608 µs 28.093 µs]
change: [-3.5336% -1.2345% +1.0798%] (p = 0.32 > 0.05)
No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
part2/part2 time: [143.06 µs 143.35 µs 143.69 µs]
change: [-0.3984% -0.0981% +0.2096%] (p = 0.54 > 0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
1 (1.00%) low mild
2 (2.00%) high mild
1 (1.00%) high severe
part2/part2_nom time: [380.71 µs 382.71 µs 385.95 µs]
change: [-3.1649% -0.2636% +1.6587%] (p = 0.88 > 0.05)
No change in performance detected.
Found 16 outliers among 100 measurements (16.00%)
3 (3.00%) low severe
4 (4.00%) low mild
2 (2.00%) high mild
7 (7.00%) high severe
part2/part2_aho_corasick
time: [28.366 ms 28.411 ms 28.465 ms]
change: [-0.3222% -0.1083% +0.1199%] (p = 0.36 > 0.05)
No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
Runs with part2_subject module
run 1 (immediately after the above runs)
Timer precision: 41 ns
day_01 fastest │ slowest │ median │ mean │ samples │ iters
├─ part1 27.79 µs │ 107.9 µs │ 30.87 µs │ 35.63 µs │ 100 │ 100
├─ part2 144.7 µs │ 188 µs │ 151.7 µs │ 153.8 µs │ 100 │ 100
├─ part2_aho_corasick 27.58 ms │ 35.72 ms │ 28.48 ms │ 28.71 ms │ 100 │ 100
╰─ part2_nom 482.2 µs │ 583.9 µs │ 498.6 µs │ 500.6 µs │ 100 │ 100
Running benches/benchmarks-criterion.rs (target/release/deps/day_01_bench_criterion-c9c1d8429cb6331d)
part1/part1 time: [28.547 µs 29.003 µs 29.489 µs]
change: [+0.9037% +2.8745% +4.8163%] (p = 0.01 < 0.05)
Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
part2/part2 time: [142.86 µs 143.10 µs 143.35 µs]
change: [-0.3534% -0.0867% +0.1954%] (p = 0.54 > 0.05)
No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe
part2/part2_nom time: [499.69 µs 500.21 µs 500.70 µs]
change: [+28.932% +30.226% +31.214%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) low severe
1 (1.00%) low mild
part2/part2_aho_corasick
time: [28.407 ms 28.752 ms 29.277 ms]
change: [-0.0030% +1.1999% +3.2000%] (p = 0.13 > 0.05)
No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) low mild
4 (4.00%) high severe
run 2
Timer precision: 41 ns
day_01 fastest │ slowest │ median │ mean │ samples │ iters
├─ part1 30.16 µs │ 62.62 µs │ 35.16 µs │ 34.57 µs │ 100 │ 100
├─ part2 136.1 µs │ 169.4 µs │ 144.7 µs │ 143 µs │ 100 │ 100
├─ part2_aho_corasick 27.24 ms │ 28.73 ms │ 28.41 ms │ 28.3 ms │ 100 │ 100
╰─ part2_nom 476.4 µs │ 590.2 µs │ 497.7 µs │ 497.1 µs │ 100 │ 100
Running benches/benchmarks-criterion.rs (target/release/deps/day_01_bench_criterion-c9c1d8429cb6331d)
part1/part1 time: [28.270 µs 28.877 µs 29.482 µs]
change: [-0.4674% +1.7851% +4.1144%] (p = 0.15 > 0.05)
No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
part2/part2 time: [142.71 µs 143.41 µs 144.72 µs]
change: [-0.4805% +0.4224% +1.7963%] (p = 0.57 > 0.05)
No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
4 (4.00%) low mild
1 (1.00%) high mild
3 (3.00%) high severe
part2/part2_nom time: [499.59 µs 500.15 µs 500.68 µs]
change: [-0.2440% +0.0456% +0.3362%] (p = 0.76 > 0.05)
No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
3 (3.00%) low severe
3 (3.00%) low mild
1 (1.00%) high mild
part2/part2_aho_corasick
time: [28.454 ms 28.580 ms 28.727 ms]
change: [-2.4332% -0.5968% +0.7186%] (p = 0.55 > 0.05)
No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
2 (2.00%) low mild
3 (3.00%) high mild
7 (7.00%) high severe
run 3
Timer precision: 41 ns
day_01 fastest │ slowest │ median │ mean │ samples │ iters
├─ part1 28.12 µs │ 62.58 µs │ 28.43 µs │ 33.28 µs │ 100 │ 100
├─ part2 139.3 µs │ 184.6 µs │ 145 µs │ 145.7 µs │ 100 │ 100
├─ part2_aho_corasick 27.9 ms │ 28.93 ms │ 28.44 ms │ 28.39 ms │ 100 │ 100
╰─ part2_nom 477 µs │ 574.4 µs │ 497.4 µs │ 494.9 µs │ 100 │ 100
Running benches/benchmarks-criterion.rs (target/release/deps/day_01_bench_criterion-c9c1d8429cb6331d)
part1/part1 time: [28.404 µs 28.897 µs 29.428 µs]
change: [-3.9832% -1.6333% +0.8383%] (p = 0.20 > 0.05)
No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
part2/part2 time: [143.02 µs 143.89 µs 144.99 µs]
change: [-1.6549% -0.2855% +0.7242%] (p = 0.70 > 0.05)
No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) low mild
3 (3.00%) high severe
part2/part2_nom time: [499.47 µs 500.23 µs 501.16 µs]
change: [-0.0951% +0.2081% +0.5882%] (p = 0.22 > 0.05)
No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
1 (1.00%) low severe
1 (1.00%) low mild
3 (3.00%) high mild
1 (1.00%) high severe
part2/part2_aho_corasick
time: [28.380 ms 28.436 ms 28.498 ms]
change: [-1.0464% -0.5065% -0.0087%] (p = 0.06 > 0.05)
No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) low mild
2 (2.00%) high mild
4 (4.00%) high severe