For example, constructing a buffer that will be consumed by the iteration. This needs

Support per-iteration setup about criterion.rs HOT 6 CLOSED

bheisler commented on August 19, 2024

Support per-iteration setup

from criterion.rs.

Comments (6)

japaric commented on August 19, 2024

I think this use case can be handled by adding a new method to Bencher that uses a different timing loop. The signature would look like this:

fn iter_with_setup<T, U>(&mut self, setup: || -> T, routine: |T| -> U);

A possible use case would be benchmarking a destructor like Vec.drop:

#[bench]
fn dealloc(b: &mut Bencher) {
    b.iter_with_setup(|| {
        Vec::from_elem(1024, 0u8)
    }, |buffer| {
        drop(buffer);
    });
}

As for the details, the timing loop would look like this:

fn iter_with_setup<T, U>(&mut self, setup: || -> T, routine: |T| -> U) {
    let niters = self.niters;
    let inputs = range(0, niters).map(|_| setup()).collect::<Vec<_>>();

    self.start_ns = time::precise_time_ns();
    for input in inputs.into_iter() {
        black_box(routine(input));
    }
    self.stop_ns = time::precise_time_ns();
}

And, the time model would be:

elapsed = precise_time_ns + niters * (routine + drop(U))

Criterion will analyze/report the execution time of (routine + drop(U)). For the example of Vec.drop, U is (), so there is no drop(U), and Criterion will directly measure Vec.drop.
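For readers on current Rust, the batched timing loop above might be sketched roughly as follows. This is only an illustration: `time_with_batched_setup` is a hypothetical free function, not Criterion's actual API, and it uses `std::time::Instant` and `std::hint::black_box` in place of the older `time::precise_time_ns` and `black_box` from the thread.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the proposed `iter_with_setup`:
/// builds all inputs up front, then times only the routine calls.
fn time_with_batched_setup<T, U>(
    niters: usize,
    mut setup: impl FnMut() -> T,
    mut routine: impl FnMut(T) -> U,
) -> Duration {
    // Untimed: construct every input before the clock starts.
    // This is where the memory cost of niters live inputs comes from.
    let inputs: Vec<T> = (0..niters).map(|_| setup()).collect();

    let start = Instant::now();
    for input in inputs {
        // black_box keeps the optimizer from discarding the result.
        // The returned U is dropped at the end of this statement,
        // so drop(U) falls inside the timed region.
        black_box(routine(input));
    }
    start.elapsed()
}
```

The `dealloc` example would then read `time_with_batched_setup(1024, || vec![0u8; 1024], |buffer| drop(buffer))`, with the iteration count chosen arbitrarily here.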

@kmcallister Do you think this could cover your use cases?

kmcallister commented on August 19, 2024

Sounds pretty good. I'm mostly worried about memory consumption, e.g. if we do 1k iterations on 8 MB (the size of the HTML spec) then we need 8 GB to run the test suite.

I had another idea which is to benchmark setup + drop(T) separate from setup + routine + drop(U), and then compute statistics about the difference of these two random variables.
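A very rough sketch of the difference idea, assuming the two benchmarks have already produced per-sample timings in nanoseconds. The function names are hypothetical, and this yields only a point estimate (the difference of the sample means), not the fuller statistics on the difference that the comment asks for.

```rust
/// Arithmetic mean of a non-empty sample.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Hypothetical estimator: `baseline` holds timings of setup + drop(T),
/// `full` holds timings of setup + routine + drop(U). The difference of
/// the means approximates routine alone only if drop(T) and drop(U)
/// cost about the same.
fn estimated_routine_ns(baseline: &[f64], full: &[f64]) -> f64 {
    mean(full) - mean(baseline)
}
```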

japaric commented on August 19, 2024

> Sounds pretty good. I'm mostly worried about memory consumption, e.g. if we do 1k iterations on 8 MB (the size of the HTML spec) then we need 8 GB to run the test suite.

Yes, high memory usage is one drawback of that approach.

> I had another idea which is to benchmark setup + drop(T) separate from setup + routine + drop(U), and then compute statistics about the difference of these two random variables.

This approach also comes with its own set of drawbacks:

- The difference won't be meaningful unless drop(T) is the same operation as drop(U).
- Subtracting two distributions will always yield a distribution with a higher variance than either of the "parents". In general, the higher the variance, the less useful the measurement is. In particular, if the variance of setup happens to be very large, you may end up with measurements like mean = 200 ns, std_dev = 500 ns.
- And more importantly, we can't reuse Criterion's analysis tools if we follow this approach. We'll have to generate a PDF from the "difference" of two samples (data sets), which I don't know how to do correctly; and then compute statistics from that "continuous" PDF (there is no library support for that, although it's not hard to code).
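The variance point can be made concrete with the standard identity for a difference of two random variables. Here $A$ stands for the setup + routine + drop(U) measurement and $B$ for the setup + drop(T) measurement; treating the two as independent is an assumption made for this sketch, not something stated in the thread.

```latex
\operatorname{Var}(A - B)
  = \operatorname{Var}(A) + \operatorname{Var}(B) - 2\operatorname{Cov}(A, B)
  = \sigma_A^2 + \sigma_B^2
  \quad \text{when } \operatorname{Cov}(A, B) = 0,
\qquad
\sigma_{A-B} = \sqrt{\sigma_A^2 + \sigma_B^2} \ge \max(\sigma_A, \sigma_B).
```

For instance, if both measurements had a standard deviation of 350 ns, the difference would have a standard deviation of about $\sqrt{2} \cdot 350 \approx 495$ ns, even when the difference of the means is small.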

Yet another option is the following timing loop:

fn iter_with_setup<T, U>(&mut self, setup: || -> T, routine: |T| -> U) {
    self.start_ns = 0;
    self.stop_ns = 0;

    for _ in range(0, self.niters) {
        let input = setup();
        let start = time::precise_time_ns();
        let output = routine(input);
        let stop = time::precise_time_ns();
        self.stop_ns += stop - start;
        // drop(output);
    }
}

Which has the following time model:

elapsed = niters * (precise_time_ns + routine)

And comes with its ups and downs:

- Not memory hungry
- Can use existing Criterion analysis infrastructure
- Shouldn't incur the high-variance issue
- Criterion analyzes/reports (precise_time_ns + routine) instead of just routine. (Not that bad if routine is way bigger than precise_time_ns)
- The for loop is not as tight as it could be. (Unsure how much this can affect the measurement)
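In current Rust, the per-iteration timing loop above could look something like the sketch below. As before, `time_with_per_iter_setup` is a hypothetical free function rather than Criterion's API, and `Instant` replaces `time::precise_time_ns`.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical per-iteration variant: only one input is alive at a
/// time, at the cost of reading the clock twice per iteration.
fn time_with_per_iter_setup<T, U>(
    niters: usize,
    mut setup: impl FnMut() -> T,
    mut routine: impl FnMut(T) -> U,
) -> Duration {
    let mut elapsed = Duration::ZERO;
    for _ in 0..niters {
        let input = setup(); // untimed
        let start = Instant::now();
        let output = black_box(routine(input));
        // Accumulate only the time spent between the two clock reads.
        elapsed += start.elapsed();
        drop(output); // untimed: the clock has already been read
    }
    elapsed
}
```

The accumulated duration still includes the clock-read overhead once per iteration, which matches the time model given above.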

Let's first check if either of these two timing loops can fulfill your use case before attempting to do something more elaborate. I'll ping you when I get a PR up.

kmcallister commented on August 19, 2024

> The difference won't be meaningful unless drop(T) is the same operation as drop(U).

Yeah, as a user of the library I'm basically fine with accounting for that myself. In particular it's no problem when assessing whether a change to routine improves overall performance or not.

> Subtracting two distributions will always yield a distribution with a higher variance than either of the "parents"

We can deal with this by increasing the sample size. Central Limit Theorem is on your side. I'm fine waiting longer for benchmark results if they are extremely sophisticated and useful :)
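The sample-size argument can be written out with the usual standard-error formula for a sample mean. This assumes $n$ independent samples with per-sample standard deviation $\sigma$; the 10 ns target below is an arbitrary illustration, not a figure from the thread.

```latex
\mathrm{SE}(\bar{x}) = \frac{\sigma}{\sqrt{n}}
\qquad\Longrightarrow\qquad
n = \left( \frac{\sigma}{\mathrm{SE}} \right)^{2}
  = \left( \frac{500\ \text{ns}}{10\ \text{ns}} \right)^{2}
  = 2500 .
```

So a per-sample standard deviation of 500 ns would need about 2500 samples to bring the uncertainty of the mean down to 10 ns, and halving that uncertainty again would take four times as many.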

> And more importantly, we can't reuse Criterion's analysis tools if we follow this approach. We'll have to generate a PDF from the "difference" of two samples (data sets), which I don't know how to do correctly

It can be done. If we need to dig deep in the literature, I know some people :)

> Let's first check if either of these two timing loops can fulfill your use case before attempting to do something more elaborate. I'll ping you when I get a PR up.

Sounds great! Any kind of solution in this space, I'm very interested in trying it out!

apasel422 commented on August 19, 2024

Was any progress made on this?

japaric commented on August 19, 2024

Implemented in #62
