Comments (8)
Probably because the `addprocs` version is already compiled; https://github.com/JuliaLang/julia/blob/0c284839fef6c8c153edc01fddfa37a9f5ac6752/contrib/generate_precompile.jl#L44-L45.
from distributed.jl.
@fredrikekre did you close because there's no way to get similar speed for `-p4`?
It doesn't seem like this should have been closed. It should be as fast, and `-p` is needed so that the choice is in the hands of the user, not the programmer. See also: JuliaLang/julia#35830 (comment)
Should that issue be closed and this one opened then?
No, keep both open. Mine (about scalability) is not a dup; while it is slightly different, the cause may or may not be the same.
First, for this issue I saw no difference on Julia 1.0 using defaults, nor on the most recent build, ASSUMING these settings only:
$ hyperfine -w1 "~/julia-1.6.0-DEV-8f512f3f6d/bin/julia --compile=min -O0 --startup-file=no -E 'using Distributed; addprocs(4);'"
Benchmark #1: ~/julia-1.6.0-DEV-8f512f3f6d/bin/julia --compile=min -O0 --startup-file=no -E 'using Distributed; addprocs(4);'
Time (mean ± σ): 1.320 s ± 0.011 s [User: 3.226 s, System: 2.114 s]
Range (min … max): 1.304 s … 1.333 s 10 runs
$ hyperfine -w1 "~/julia-1.6.0-DEV-8f512f3f6d/bin/julia -p4 --compile=min --startup-file=no -O0 -E ''"
Benchmark #1: ~/julia-1.6.0-DEV-8f512f3f6d/bin/julia -p4 --compile=min --startup-file=no -O0 -E ''
Time (mean ± σ): 1.323 s ± 0.008 s [User: 3.259 s, System: 2.020 s]
Range (min … max): 1.309 s … 1.335 s 10 runs
For default settings, there is a difference, and even with `-O0` the min..max ranges do not overlap. Since I've seen that setting eliminate invalidations, I would say those are implicated?
Now performance is switched, so problem solved!
vtjnash@deepsea4:~/julia$ hyperfine -w1 "./julia -p4 -E 'using Distributed; nprocs()'" "./julia -E 'using Distributed; addprocs(); nprocs()'"
Benchmark 1: ./julia -p4 -E 'using Distributed; nprocs()'
Time (mean ± σ): 8.952 s ± 1.129 s [User: 26.344 s, System: 0.740 s]
Range (min … max): 8.058 s … 10.398 s 10 runs
Warning: The first benchmarking run for this command was significantly slower than the rest (10.222 s). This could be caused by (filesystem) caches that were not filled until after the first run. You should consider using the '--warmup' option to fill those caches before the actual benchmark. Alternatively, use the '--prepare' option to clear the caches before each timing run.
Benchmark 2: ./julia -E 'using Distributed; addprocs(); nprocs()'
Time (mean ± σ): 14.585 s ± 0.315 s [User: 62.846 s, System: 2.424 s]
Range (min … max): 14.057 s … 14.948 s 10 runs
Summary
'./julia -p4 -E 'using Distributed; nprocs()'' ran
1.63 ± 0.21 times faster than './julia -E 'using Distributed; addprocs(); nprocs()''
Clearly needs more `precompile` statements; now that `Distributed` is a separate stdlib, that is much more reasonable than when it was included in the default image.
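For readers unfamiliar with the mechanism, here is a minimal sketch of what explicit `precompile` statements look like in general. The two signatures are illustrative assumptions chosen to match the calls benchmarked above; they are not taken from the linked code and may not be the ones that matter for startup time.

```julia
using Distributed

# Illustrative signatures only (assumed for this sketch, not from any PR):
# each entry pairs a function with the argument types to compile it for.
signatures = [
    (addprocs, (Int,)),  # corresponds to a call like addprocs(4)
    (nprocs, ()),        # corresponds to nprocs()
]

# `precompile(f, argtypes)` asks Julia to compile `f` for those argument
# types ahead of the first call and returns a Bool indicating success.
results = [precompile(f, argtypes) for (f, argtypes) in signatures]
```

Placed at the top level of a package module, directives like these would presumably be cached when the package is precompiled, so the first call in a fresh session has less left to compile.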
Code at JuliaLang/julia#42156
@KristofferC Should we go ahead and enable precompile?