Comments (7)
Hello Rui,
The likelihood is automatically vectorised in createBayesianSetup, so if you set parallel = T, your likelihood should be automatically parallelised. In that case you should see several R sessions open on your system.
If your likelihood is very cheap to compute, this will not show up as much additional CPU load, because the CPUs are idle most of the time and the time is spent communicating within the socket cluster. I have found in practice that likelihood parallelisation makes sense for likelihoods with > 50 ms evaluation time or so. For faster likelihoods, it makes more sense to parallelise the MCMC chains.
Best
Florian
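One way to check on which side of the ~50 ms rule of thumb a likelihood falls is to time a few evaluations in plain R. This is only a minimal sketch: `slow_likelihood` is a hypothetical stand-in for a real model, and the 50 ms cut-off is the rough figure quoted above, not a hard limit.

```r
# Hypothetical stand-in for an expensive likelihood; replace with your own.
slow_likelihood <- function(par) {
  Sys.sleep(0.1)                      # pretend the model takes ~100 ms
  sum(dnorm(par, log = TRUE))
}

# Average wall-clock time per evaluation over a few repeats.
n_rep <- 5
elapsed <- system.time(
  for (i in seq_len(n_rep)) slow_likelihood(c(0, 0))
)[["elapsed"]]
ms_per_eval <- 1000 * elapsed / n_rep

# Rough rule of thumb from this thread: above ~50 ms per evaluation,
# parallelising the likelihood tends to pay off.
if (ms_per_eval > 50) {
  message("Likelihood parallelisation is probably worthwhile")
} else {
  message("Parallelising the chains is probably the better option")
}
```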
from bayesiantools.
Hi Florian,
Many thanks for helping me address my question!
It takes ~1 min to compute my likelihood function, in which ODEs are solved multiple times against different data sets. I started an MCMC run of 10000 iterations 2 days ago, when I posted this question, and it had only finished 2000 iterations this morning. The package returned the message "parallel function execution created with39cores", and as you mentioned, R opened multiple sessions after that. However, the total CPU usage is less than 10%, so it seems that the computation is not taking advantage of parallelization. I am wondering whether I coded the likelihood function incorrectly, such that the computation somehow has to be done sequentially. I remember that when I used the package "DEoptim" for parallel computation, it required, as an argument, a list of the names of the packages and functions used in my likelihood calculation. Does BayesianTools have a similar requirement?
Rui
Hello Rui,
You can control package export by hand, but by default in BT your entire environment (data + packages) is exported, so be careful about what you have in your environment, or control the export by hand.
What algorithm are you using? Note that parallelisation can only work up to the number of internal chains in your algorithm - so if you run a DEzs with 3 internal chains, it doesn't help if you have 39 cores, it will still only use 3 cores at a time.
If you have a computer with 40 cores, I would think the best use is to run three independent MCMC chains (has to be done by hand) and then set up the DEzs with parallel chains to make best use of your hardware.
Best
F
Hi Florian,
Please forgive my naive MCMC questions. I am using a Metropolis / AM sampler. Does this mean that I can use at most 3 cores if I only have 3 MCMC chains? And if I switch to the DEzs sampler, can I maximize CPU use by increasing the number of internal chains of the DEzs algorithm? Is this understanding correct?
Could you give me an example of how to set the number of internal chains vs. independent MCMC chains? I believe that I can change the number of MCMC chains by specifying "nrChains" in the runMCMC settings list. How do I change the number of DEzs internal chains?
In addition, is there any general guidance on which sampling algorithm to use when?
Thank you!
Rui
Hi Rui,
MCMCs are usually not parallelizable, because the next step depends on the previous step.
There are only a few specific things that can be parallelised, e.g. you can do parallel proposals for all samplers that apply rejection (basically all samplers in BT, although this is not implemented in BT), or you can calculate the chains in population MCMCs such as DEzs in parallel.
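To make the sequential dependency concrete, here is a minimal random-walk Metropolis sketch in base R. It is purely illustrative and not the BT implementation; `log_target`, `metropolis` and the standard-normal target are made up for this example.

```r
# Illustrative target: log density of a standard normal.
log_target <- function(x) dnorm(x, mean = 0, sd = 1, log = TRUE)

metropolis <- function(n_iter, start = 0, proposal_sd = 1) {
  chain <- numeric(n_iter)
  current <- start
  for (i in seq_len(n_iter)) {
    # The proposal is centred on the current state, so iteration i
    # cannot start before iteration i - 1 has finished.
    proposal <- rnorm(1, mean = current, sd = proposal_sd)
    log_ratio <- log_target(proposal) - log_target(current)
    if (log(runif(1)) < log_ratio) current <- proposal
    chain[i] <- current
  }
  chain
}

set.seed(1)
samples <- metropolis(5000)
```

Within one chain only a single target evaluation is pending at any time, which is why extra cores do not help here; a population sampler has one such evaluation per internal chain, and those can be computed side by side.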
If you want to use a large number of cores, you should probably go for an SMC, see our recent paper: Speich, M., Dormann, C. F., & Hartig, F. (2021). Sequential Monte-Carlo algorithms for Bayesian model calibration–A review and method comparison. Ecological Modelling, 455, 109608. https://doi.org/10.1016/j.ecolmodel.2021.109608. The code for this is in a branch of the BT GitHub repo; I haven't yet managed to merge it into the main branch.
As a default for most users with complicated models and runtime problems, I would recommend to:
- Use DEzs and possibly increase the number of internal chains (this is set by the z-matrix, see help of DEzs)
- Turn on parallelisation
- If you want to run several independent MCMC chains for convergence checks (recommended), run this in parallel as well, see https://cran.r-project.org/web/packages/BayesianTools/vignettes/InterfacingAModel.html#parallelization
There is an open issue #181 to improve the documentation on parallelisation, and I'll take this as a nudge to bump it up the priority list.
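The recommendations above could be combined roughly as follows. Treat this as a sketch under assumptions rather than the official recipe: the toy likelihood and the parameter bounds are hypothetical, and passing a single number as `startValue` to set the number of internal DEzs chains is my reading of the DEzs help page, so please check `?DEzs` before relying on it.

```r
library(BayesianTools)

# Toy likelihood standing in for an expensive model (hypothetical).
likelihood <- function(par) sum(dnorm(par, mean = 0, sd = 1, log = TRUE))

setup <- createBayesianSetup(
  likelihood = likelihood,
  lower = rep(-10, 3),
  upper = rep(10, 3),
  parallel = TRUE   # evaluate the likelihood on a socket cluster
)

settings <- list(
  iterations = 3000,
  startValue = 8,   # assumption: a single number is read as the number of internal chains
  nrChains = 3      # independent runs for convergence checks
)

out <- runMCMC(bayesianSetup = setup, sampler = "DEzs", settings = settings)
gelmanDiagnostics(out)

# Shut down the worker sessions once sampling is done.
stopParallel(setup)
```

With these settings at most 8 likelihood evaluations can run at the same time, so cores beyond that stay idle. As far as I know, the `nrChains` runs are executed one after another; running the independent chains side by side is what the linked vignette section covers.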
Many thanks for the detailed explanation, Florian!
Out of curiosity, is there a plan to include additional "popular" samplers in BayesianTools, for example the Gibbs and NUTS samplers offered in BUGS and Stan? In your opinion, what are the advantages and disadvantages of these samplers compared with the SMC and DE samplers you've included in BayesianTools?
No, currently my idea is that BT will only include "black box samplers" that require neither derivatives nor the structure of the likelihood. For Gibbs or NUTS, you would need a metalanguage such as in JAGS or Stan that allows the sampler to understand the mathematical structure of the likelihood.