Comments (5)
So I am running on an Intel Core i5-5200U @ 2.2GHz 2.19 GHz.
I isolated the first stage. The stages are run sequentially, but each one calls into_par_iter.
The first stage actually looks good:
with weight_max: 61s, 56s, 51s
without: 46s, 58s, 59s
So let me try to isolate the second stage, as so far the new algorithm seems comparable.
Stage 2 is 1-2s either way, much of which is the linear overhead before the into_par_iter call.
so going back to doing all stages together:
with: 43s, 36s, 35s
without: 37s, 37s, 39s
And if I replace into_par_iter with into_iter: 69s, 69s, 72s. So rayon is giving me a 2x improvement in speed, which is good for a 2 core machine. Is there a programmatic way to determine the number of cores as opposed to the number of logical processors?
I don't know why all stages together are faster than just the first stage. But it looks like running without weight_max is basically comparable. :-) Well done!
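On the question of cores versus logical processors, here is a minimal std-only sketch. `std::thread::available_parallelism` reports logical processors. The physical count below is derived by counting distinct (physical id, core id) pairs in text laid out like Linux's /proc/cpuinfo on x86, which is an assumption about the platform and not something the thread specifies. The function names are hypothetical. As far as I know, the `num_cpus` crate's `get_physical()` is a more portable route.

```rust
use std::collections::HashSet;
use std::thread;

/// Logical processors available to this process (hyperthreads count separately).
fn logical_processors() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}

/// Count distinct (physical id, core id) pairs in text formatted like
/// Linux's /proc/cpuinfo on x86. Returns None when no such pairs are found.
fn physical_cores_from_cpuinfo(cpuinfo: &str) -> Option<usize> {
    let mut cores = HashSet::new();
    let mut package: Option<u32> = None;
    for line in cpuinfo.lines() {
        // Lines look like "core id\t\t: 1"; blank lines separate processors.
        let Some((key, value)) = line.split_once(':') else { continue };
        match key.trim() {
            "physical id" => package = value.trim().parse().ok(),
            "core id" => {
                if let (Some(p), Ok(c)) = (package, value.trim().parse::<u32>()) {
                    cores.insert((p, c));
                }
            }
            _ => {}
        }
    }
    if cores.is_empty() { None } else { Some(cores.len()) }
}
```

On Linux one could pass the result of `std::fs::read_to_string("/proc/cpuinfo")` to `physical_cores_from_cpuinfo` and fall back to `logical_processors()` when it returns `None`.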
from rayon.
As a quick test I took a work project and ran cargo update to 0.4.3. With weight_max it runs in 32s. Then I removed the weight_max and it runs in 48s. (Each is the best of 3 runs, wall clock time.) Did I test this correctly? Is there any info I can get that would be helpful? Unfortunately I can't share the example. :-(
@Eh2406 can you characterize the workload somehow? Is the work well balanced for each item, or skewed heavily? Maybe you can synthesize something that performs similarly?
For possible tunables related to length, I think we'll want both a minimum (don't bother splitting below this length) and a maximum (always split at least down to this length), where weight_max() corresponds to asking for a maximum length of 1.
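To make the minimum/maximum idea concrete, here is a rough std-only sketch of such a splitting policy. It is not Rayon's implementation: `split_lengths` and its `adaptive` callback are hypothetical stand-ins for whatever heuristic the scheduler would apply between the two bounds, and letting the minimum win when the two bounds conflict is my own assumption.

```rust
/// Recursively halve `len` items and return the sizes of the leaf pieces.
/// - A piece longer than `max_len` is always split.
/// - A piece is never split if a half would be shorter than `min_len`.
/// - In between, `adaptive` decides (stand-in for a scheduler heuristic).
/// When the bounds conflict, the minimum wins (an assumption of this sketch).
fn split_lengths(
    len: usize,
    min_len: usize,
    max_len: usize,
    adaptive: &dyn Fn(usize) -> bool,
) -> Vec<usize> {
    let left = len / 2; // the smaller half, so checking it covers both halves
    let right = len - left;
    let can_split = len >= 2 && left >= min_len.max(1);
    let must_split = len > max_len.max(1);
    if can_split && (must_split || adaptive(len)) {
        let mut out = split_lengths(left, min_len, max_len, adaptive);
        out.extend(split_lengths(right, min_len, max_len, adaptive));
        out
    } else {
        vec![len]
    }
}
```

With `max_len = 1` every piece is split down to single items regardless of the heuristic, which is the behaviour weight_max() is described as requesting. With `min_len = 4` and an always-split heuristic, 8 items stop at two pieces of 4.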
I really dislike the weights mechanism, since it relies on an arbitrary threshold. That makes it meaningless as a measure, since the expense of a task can also vary arbitrarily. So I'm in favor of removing it.
However, I'd like to retain some control over how the work is split. So I'd propose simply introducing an optional way to specify the number of threads used.
Something like .par_iter().jobs(42).whatever().
It is minimally intrusive, but offers precise manual control if needed.
EDIT: Just saw that this was considered through sequential_threshold(N) in #81. I agree this is the right way to do this, falling back to the adaptive scheme if not specified.
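For illustration only, this is roughly what an explicit job count amounts to, sketched with plain scoped threads from std and no Rayon. `sum_with_jobs` is a hypothetical helper, not a proposed API: it cuts the input into at most `jobs` contiguous chunks and runs one thread per chunk, with no adaptive splitting or work stealing. In Rayon itself, the pool size can be set through `ThreadPoolBuilder::num_threads`.

```rust
use std::thread;

/// Apply `f` to every item and sum the results, using at most `jobs` threads.
/// The split is static: one contiguous chunk per thread, no work stealing.
fn sum_with_jobs<T, F>(items: &[T], jobs: usize, f: F) -> u64
where
    T: Sync,
    F: Fn(&T) -> u64 + Sync,
{
    if items.is_empty() {
        return 0;
    }
    // Never use more threads than there are items, and at least one.
    let jobs = jobs.clamp(1, items.len());
    let chunk_len = (items.len() + jobs - 1) / jobs; // ceiling division
    let f = &f; // share the closure by reference across threads
    thread::scope(|s| {
        // Spawn every worker first so the chunks run concurrently.
        let handles: Vec<_> = items
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().map(f).sum::<u64>()))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .sum::<u64>()
    })
}
```

A call such as `sum_with_jobs(&data, 2, |x| *x)` would use two threads. The drawback relative to an adaptive scheme is that a skewed workload leaves some threads idle once their chunk is finished.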
I think @cuviper has some thoughts here and wanted to take a crack at this. I'm going to assign to them and remove the Help-Wanted label for now.