Currently parallel iterators assume cheap operations. I am thinking that should be cha

Change default weight of parallel iterators to assume expensive ops (rayon, closed)

rayon-rs commented on July 18, 2024

Change default weight of parallel iterators to assume expensive ops

from rayon.

Comments (11)

dirvine commented on July 18, 2024

Intel et al.'s parallel_for etc. do not seem to assume cheap operations, and in C++ at least they force a refactor of some code to allow parallelism. I wonder if there is a link between being completely automatic and specific. I see this lib and think, wow, imagine if it were just a drop-in for iter() or similar, then think a bit and realise folk really need to code algorithms to take advantage.

I know this is not a direct correlation to the question, but I feel it is linked. When refactoring for parallelism, code often gets cleaner. It forces/allows the programmer to reason. This is my quandary at the moment, to be honest. I love the idea of automating parallelism, but I have found it handy in the past to be told by the compiler that I need to refactor.

In essence I feel an opt-in to be the way forward here, possibly later giving hints at optimisation of algorithms for parallelism at compile time. Hope that helps a little anyway.


nikomatsakis commented on July 18, 2024

@dirvine thanks for the input. A few thoughts:

I'm not trying to make anything completely automatic -- you have to choose to write par_iter. But I do want to make it easy.
Rust's type system has the nice benefit of basically encouraging you (though not forcing you) to write parallel-safe code from the get-go, so I do hope that only minimal refactoring is needed.
Depending on your loop and what it does, though, you may still need to do some manual refactoring. The more that you are relying on the iterator adapters, vs writing ad-hoc code in your for loop body, the better off you are, basically (though some adapters, like fold, are not very parallelizable, but hopefully you can make do with reduce).
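To illustrate the fold-versus-reduce point, here is a minimal sketch using only the standard library (`std::thread::scope`), not rayon itself. The function names `sum_squares_fold` and `sum_squares_split` are hypothetical, chosen for this example. A sequential fold threads one accumulator through every element, so each step depends on the previous one. When the combining operation is associative, the input can instead be split into independent halves whose partial results are combined afterwards, which is the shape a parallel reduce relies on.

```rust
use std::thread;

/// Sequential fold: every step needs the accumulator from the step before,
/// so the work forms one long dependency chain.
fn sum_squares_fold(data: &[u64]) -> u64 {
    data.iter().fold(0, |acc, &x| acc + x * x)
}

/// Reduce-style version: the two halves are computed independently
/// (one on a scoped thread, one on the current thread) and then combined.
/// This is only valid because `+` is associative.
fn sum_squares_split(data: &[u64]) -> u64 {
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        let left_handle = s.spawn(|| left.iter().map(|&x| x * x).sum::<u64>());
        let right_sum: u64 = right.iter().map(|&x| x * x).sum();
        left_handle.join().unwrap() + right_sum
    })
}
```

Both functions return the same value for the same input; only the second one exposes work that can run concurrently.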


nikomatsakis commented on July 18, 2024

Hmm. I have been experimenting with this in a branch. One interesting result was that, when I ported the nbody demo, the par-reduce variant (which can generate quite a lot of inexpensive tasks...) ran ridiculously slowly until I raised the sequential threshold. This isn't really surprising I guess -- the defaults are very wrong for this case -- but it did point out, of course, the danger of changing our weights.
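A rough sketch of what a sequential threshold does in a divide-and-conquer reduce, written against the standard library rather than rayon. The names `sum_with_threshold` and `count_tasks` are hypothetical, and spawning an OS thread per split is far heavier than a rayon task, so this only shows the shape of the trade-off: below the threshold the work runs as a plain loop, and above it the input is split in two.

```rust
use std::thread;

/// Recursively split `data` until a piece has at most `threshold` elements,
/// then sum that piece sequentially. A threshold of 0 is treated as 1.
fn sum_with_threshold(data: &[u64], threshold: usize) -> u64 {
    if data.len() <= threshold.max(1) {
        return data.iter().sum();
    }
    let (left, right) = data.split_at(data.len() / 2);
    thread::scope(|s| {
        // One half runs on a scoped thread, the other on the current thread.
        let left_handle = s.spawn(|| sum_with_threshold(left, threshold));
        let right_sum = sum_with_threshold(right, threshold);
        left_handle.join().unwrap() + right_sum
    })
}

/// Number of leaf tasks the same splitting rule produces for `len` elements.
fn count_tasks(len: usize, threshold: usize) -> usize {
    if len <= threshold.max(1) {
        1
    } else {
        let half = len / 2;
        count_tasks(half, threshold) + count_tasks(len - half, threshold)
    }
}
```

For 1024 elements, `count_tasks(1024, 1)` gives 1024 leaf tasks while `count_tasks(1024, 256)` gives 4. When each element is cheap, the first setting spends most of its time on task overhead, which matches the slowdown described above.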


nikomatsakis commented on July 18, 2024

I guess if we did more work on making task spawning cheap (work that would be very profitable in any case) that might help out here. (For that matter, par-reduce is still always slower than the more coarse-grained version.)


nikomatsakis commented on July 18, 2024

The branch (for the record) is no-more-weight.


nikomatsakis commented on July 18, 2024

See https://github.com/nikomatsakis/rayon/pull/81


nikomatsakis commented on July 18, 2024

Definitely significant progress here with @cuviper's https://github.com/nikomatsakis/rayon/pull/106. I still think we want to remove the existing weight stuff before 1.0 -- and maybe add back with some other APIs.


edre commented on July 18, 2024

Maybe rayon could sample how long leaf nodes take to run and dynamically adjust? Of course some elements may require much more processing than others, but starting with fine grained splitting and dynamically increasing splits may get the best of both worlds.


nikomatsakis commented on July 18, 2024

@edre we already do this, effectively, via the mechanism of work stealing as well as the adaptive changes. What we are talking about is tuning that mechanism.


nikomatsakis commented on July 18, 2024

In particular I think the current mechanism should work pretty well except for when things are both highly variable between tasks and bigger tasks are clumped together.
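The "variable and clumped" case can be made concrete with a small cost model. This is a simplified simulation, not rayon's scheduler: the hypothetical `static_makespan` gives each worker one contiguous slice up front, while the hypothetical `dynamic_makespan` hands the next small chunk to whichever worker is least loaded, as a crude stand-in for work stealing. Costs are abstract units and no threads are actually run.

```rust
/// Finish time when `costs` is cut into `workers` contiguous, equal-length
/// slices and each worker processes exactly one slice.
fn static_makespan(costs: &[u64], workers: usize) -> u64 {
    let workers = workers.max(1);
    let per_worker = ((costs.len() + workers - 1) / workers).max(1);
    costs
        .chunks(per_worker)
        .map(|slice| slice.iter().sum::<u64>())
        .max()
        .unwrap_or(0)
}

/// Finish time when chunks of `chunk` items are handed out in order,
/// each one going to the worker with the smallest load so far.
fn dynamic_makespan(costs: &[u64], workers: usize, chunk: usize) -> u64 {
    let mut load = vec![0u64; workers.max(1)];
    for piece in costs.chunks(chunk.max(1)) {
        let least_loaded = load.iter_mut().min().unwrap();
        *least_loaded += piece.iter().sum::<u64>();
    }
    load.into_iter().max().unwrap_or(0)
}
```

With costs `[1, 1, 1, 1, 1, 1, 10, 10]` and 4 workers, the static split finishes at 20 because both expensive items land in the same slice, while handing out single items finishes at 11. With uniform costs the two approaches finish at the same time, so finer splitting only pays off when the expensive work is clumped.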


nikomatsakis commented on July 18, 2024

Now that @cuviper added dynamic balancing, I think this is basically all done. Or done enough. Closing in favor of #111.

