Comments (4)
I do think yielding is the appropriate default behavior, and documentation PRs are fine to clarify this. I think we could also add a variant or two of `install` with different behavior. Here's the current `install` behavior, and some thoughts:
- If you're not in any pool, queue the work in the target pool and then block.
  - This should be uncontroversial.
- If you're in a different pool, queue the work in the target pool and then yield.
  - We could add a mode that blocks here instead. The potential downside is that this pool could get completely blocked, and even cause deadlocks if any of that inner-pool work tries to queue and wait for any work back in the outer pool. But as long as you know there's no "loopback", it may be fine.
- If you're already in the target pool, execute it directly.
  - Note that this may also lead to work-stealing of outer work if it ends up waiting for other threads to complete its inner work.
  - We could have an even stronger mode that blocks here as well, but that would have an even higher risk of deadlocking if all its threads are tied up this way.
BTW, your `Mutex` scenario is also discussed in #592, which may or may not involve multiple pools.
---
@cuviper, I appreciate your explanation of the `install()` behavior. I do want to make an important distinction: yielding is not the opposite of blocking. For example, consider the behavior of `tokio::spawn()`:
```rust
statement1;
let handle = spawn(async { some_work });
statement2;
handle.await.unwrap();
```
`statement1` is executed on what we'll call thread1. The call to `spawn()` returns immediately without blocking. `statement2` is executed on thread1. Then the call to `await` blocks, potentially yielding execution to another task on thread1. Here, the non-blocking spawn and the yield occur in different statements, ensuring that `statement2` will always execute after `statement1` on thread1. Furthermore, the explicit `await` syntax makes it obvious to the programmer where the yield point is.
> If you're in a different pool, queue the work in the target pool and then yield.
> - We could add a mode that blocks here instead. The potential downside is that this pool could get completely blocked, and even cause deadlocks if any of that inner pool work tries to queue and wait for any work back in the outer pool. But as long as you know there's no "loopback", it may be fine.
In the simplest case:
```rust
some_par_iter.par_for_each(|| {
    statement1;
    other_pool.install(|| { some_work; });
    statement2;
});
```
There is no need to make `install()` block here. It could queue the work on the other thread pool and then return immediately, just like `spawn()`. What's confusing is that `install()` doesn't just not block -- it also yields, which means there may be several calls to `statement1` before `statement2` ever gets a chance to execute. Rayon's syntax doesn't make yield points explicit, and the implicit yield is not documented (except in this GitHub issue). This leads to surprising results not only with mutexes as in #592, but also with unbounded memory growth and stack overflows as mentioned in my first comment. Even more surprisingly, `install()` may or may not implicitly `yield()` depending on the context!
IMHO, if we were writing Rayon from scratch, I would suggest making all yields explicit. The default behavior would then be expressed as follows. Making the yield explicit makes the behavior obvious to the programmer, and it also gives the programmer the choice of moving `statement2` up between `install()` and `yield_now()`.
```rust
some_par_iter.par_for_each(|| {
    statement1;
    other_pool.install(|| { some_work; });
    rayon::yield_now();
    statement2;
});
```
> I do think yielding is the appropriate default behavior, and documentation PRs are fine to clarify this. I think we could also add a variant or two of `install` with different behavior.
I have to agree, but only because changing the behavior of `install()` could cause deadlocks in existing code that depends on the implicit yield. I would love to see clearer documentation and yield-free versions of `install()` and similar methods (not sure if `spawn()` can yield in rayon). It looks like this would require some intimate work on `unsafe` parts of rayon. Do you envision this as something a maintainer might work on, or something you would like an outside PR for?
---
> In the simplest case:
>
> ```rust
> some_par_iter.par_for_each(|| { statement1; other_pool.install(|| { some_work; }); statement2; });
> ```
>
> There is no need to make `install()` block here.
Since `install` returns a value, that is inherently blocking with respect to following statements. It also allows borrowing, since its closure is not constrained to `'static` lifetimes -- from the compiler borrowck perspective, those borrows only last for the duration of this call, so we cannot safely return until we're done with that work. If you don't want that imperative control flow, that's what `ThreadPool::spawn` and `Scope::spawn` are for.
So when I talk about yielding vs. blocking here, I'm only talking about what the thread should do while it waits, but we do have to wait for `install` completion regardless.
We've also had discussions about actual `await`able futures, like my old WIP #679, but that didn't address the executor part, which you would need if you wanted to `await` that within a parallel `for_each` -- or at least a `block_on` kind of thing.
> I would love to see clearer documentation and yield-free versions of `install()` and similar methods (not sure if `spawn()` can yield in rayon). It looks like this would require some intimate work on `unsafe` parts of rayon. Do you envision this as something a maintainer might work on, or something you would like an outside PR for?
The design is the more interesting part that needs discussion. Regarding `spawn`, that only queues the work and returns right away. The other main primitives are `join` and `scope`, but I think it will get a lot hairier if we try to add variations of those, and I haven't fully thought about those implications. Higher-level stuff like the parallel `for_each` also runs on `join`, and I definitely don't want to fork the entire `ParallelIterator` API.
Another idea that I've raised before is that we could have a kind of "critical section" call, inhibiting all work-stealing on that thread until that call returns. (Possibly with some nuance like allowing local work-stealing of work queued while in that call.) This might be a better way to have broad impact without forking a bunch of APIs.
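The "critical section" call might look something like the following sketch. To be clear, `no_steal` is an invented name (no such API exists in rayon today), stubbed out here so the shape type-checks:

```rust
// Purely hypothetical sketch of the "critical section" idea: a call that
// would inhibit work-stealing on the current thread until it returns.
// The name `no_steal` and this module are invented for illustration.
mod hypothetical {
    // Stub: a real implementation would mark the current rayon worker as
    // non-stealing for the duration of `f`; here it just runs the closure.
    pub fn no_steal<R>(f: impl FnOnce() -> R) -> R {
        f()
    }
}

fn main() {
    let x = hypothetical::no_steal(|| {
        // In a real implementation, no other rayon task's work could be
        // interleaved on this thread between the statements in here.
        1 + 1
    });
    assert_eq!(x, 2);
}
```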
If we decide on just adding `install` variations, I can mentor the implementation if you're keen, and as a first step take a look at `Registry::in_worker`. The variations I described before are pretty much contained in the three different branches in that function, where the "cold" part is the fully blocking case.
`rayon/rayon-core/src/registry.rs`, lines 486 to 504 in 7449d7d
The cold path currently `debug_assert!`s that it's not in any pool though, so we'd have to re-evaluate that at least. In a way, that represents the philosophy of the current design, that pool threads should always be trying to find work -- but I do think there are reasonable scenarios, like yours, where it makes sense to opt out of that work-stealing behavior.
---
> Since `install` returns a value, that is inherently blocking with respect to following statements. It also allows borrowing since its closure is not constrained to `'static` lifetimes -- from the compiler borrowck perspective, those borrows only last for the duration of this call, so we cannot safely return until we're done with that work. If you don't want that imperative control flow, that's what `ThreadPool::spawn` and `Scope::spawn` are for.
Ah, I stand corrected. Since `install()` borrows and returns a value, it must either block or yield before returning.
> If you don't want that imperative control flow, that's what `ThreadPool::spawn` and `Scope::spawn` are for.
I'll refactor my code with `spawn()` and `scope()`.
**EDIT** It looks like calling `ThreadPool::scope()` will internally call `ThreadPool::install()`, triggering the yield behavior.
> Another idea that I've raised before is that we could have a kind of "critical section" call, inhibiting all work-stealing on that thread until that call returns. (Possibly with some nuance like allowing local work-stealing of work queued while in that call.) This might be a better way to have broad impact without forking a bunch of APIs.
This sounds difficult to implement, but it would likely be the most ergonomic solution in the long term.
**EDIT** We might still need a non-yielding `install()` variant to prevent `install()` from work-stealing on the parent thread pool while still allowing work-stealing on the inner thread pool. Otherwise, it's not clear how you would specify the critical section to cover one but not the other.