harvestr's Introduction

The harvestr Parallel Simulation Framework.

The harvestr package is a framework for conducting replicable parallel simulations in R. It builds on the popular plyr package for the split-apply-combine framework, and on the parallel combined multiple-recursive generator from L'Ecuyer (1999).

Because the replicable simulations are based on seed values, this package takes a theme of seeds and farming. The principal functions are as follows:

  • gather - Creates a list of parallel rng seeds.
  • farm - Uses seeds from gather to evaluate expressions after each seed has been set. This is useful for generating data.
  • harvest - Takes the results from farm and continues evaluation with the random number generation where farm left off. This is useful for evaluating data generated with farm through stochastic methods such as Markov chain Monte Carlo.
  • reap - the single-element version of harvest, for one element that has appropriately structured seed attributes.
  • plant - takes a list of objects, assumed to be of the same class, and gives each element a parallel seed value to use with harvest for evaluation.
  • graft - splits RNG sub-streams from a main object.
  • sprout - gets the seeds for use in graft.
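
The division of labor among gather, farm, and harvest can be sketched in base R using the L'Ecuyer-CMRG streams from the parallel package. This is only a minimal illustration of the idea, not harvestr's implementation: make_seeds and run_from are hypothetical helpers, and the "ending.seed" attribute name is an assumption made for this example.

```r
library(parallel)          # part of base R; provides nextRNGStream()
RNGkind("L'Ecuyer-CMRG")   # generator that supports independent streams

# Hypothetical helper (plays the role of gather): n independent stream seeds.
make_seeds <- function(n, seed) {
  set.seed(seed)
  s <- get(".Random.seed", envir = globalenv())
  seeds <- vector("list", n)
  for (i in seq_len(n)) {
    seeds[[i]] <- s
    s <- nextRNGStream(s)
  }
  seeds
}

# Hypothetical helper: run fun() from a given RNG state and record where
# the stream ended, so that a later step can pick up from there.
run_from <- function(seed, fun) {
  assign(".Random.seed", seed, envir = globalenv())
  result <- fun()
  attr(result, "ending.seed") <- get(".Random.seed", envir = globalenv())
  result
}

simulate <- function() {
  seeds <- make_seeds(3, seed = 12345)
  # "farm" step: generate one dataset per seed
  datasets <- lapply(seeds, function(s) run_from(s, function() rnorm(10)))
  # "harvest" step: continue each stream where data generation left off
  lapply(datasets, function(x) {
    run_from(attr(x, "ending.seed"),
             function() mean(sample(x, replace = TRUE)))
  })
}

estimates       <- simulate()
estimates_again <- simulate()
identical(estimates, estimates_again)  # the whole pipeline is replicable
```

Because every element carries its own stream, rerunning the pipeline reproduces each element's result regardless of what happens to the other elements.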

Lists

All of the functions work off lists: they expect and return lists, which can easily be converted to data frames. I would do this with ldply(list, I).
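
For readers without plyr at hand, the same kind of conversion can be sketched in base R. This is an alternative to the ldply call above, not a description of what ldply does internally, and it assumes each list element is a one-row data frame with the same columns. The values below are made up for illustration.

```r
# Made-up results, standing in for the list returned by a simulation step.
results <- list(
  data.frame(run = 1, estimate = 0.52),
  data.frame(run = 2, estimate = 0.47),
  data.frame(run = 3, estimate = 0.55)
)

# Stack the one-row data frames into a single data frame.
combined <- do.call(rbind, results)
combined
```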

Parallel

The advantage of setting the seeds like this is that parallelization is seamless and transparent. As in the plyr framework, each function has a .parallel argument, which defaults to FALSE; when set to TRUE, the function evaluates in parallel. An appropriate parallel backend must be registered first. For example, with a multicore backend you would run the following code.

    library(doMC)
    registerDoMC()

See the plyr and foreach packages documentation for what backends are currently supported.
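
Why pre-assigned seeds make parallel evaluation safe can be illustrated in base R: when each task carries its own stream, the result does not depend on the order in which tasks are evaluated. The sketch below is not harvestr code. It simulates an arbitrary scheduling order by running the tasks in reverse, and draw is a hypothetical helper.

```r
library(parallel)          # part of base R; provides nextRNGStream()
RNGkind("L'Ecuyer-CMRG")
set.seed(2024)

# One independent stream per task.
seeds <- vector("list", 4)
seeds[[1]] <- get(".Random.seed", envir = globalenv())
for (i in 2:4) seeds[[i]] <- nextRNGStream(seeds[[i - 1]])

# Hypothetical task: set the task's own stream, then draw random numbers.
draw <- function(seed) {
  assign(".Random.seed", seed, envir = globalenv())
  runif(3)
}

in_order <- lapply(seeds, draw)
reversed <- rev(lapply(rev(seeds), draw))  # same tasks, opposite order

identical(in_order, reversed)
```

With a single shared stream, the second run would hand different random numbers to each task, which is the problem that per-task seeds avoid.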

Operating Systems

harvestr is limited in its capabilities by the packages that it depends on, mainly foreach and plyr. The parallel backends are platform limited; read the individual packages' documentation for details.

Notes

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

harvestr's People

Contributors

halpo

harvestr's Issues

Function to apply a list of functions to a generated dataset

We need a function that applies a list of functions to a generated dataset, where the dataset holds the seed but the functions are themselves stochastic (for example, they might use cross-validation).

The effect would be:

    methods <- .T( lasso, lar, scad, enet, relaxo, penalized.pls, penalized )
    FUNS <- paste0("analyze.", methods) %>% sapply(match.fun)
    seeds <- sprout(data, length(FUNS))
    F2 <- plant(FUNS, seeds=seeds)
    harvest(F2, function(f, ...){f(...)}, data=data, beta=beta)

Options:

  • could be implemented in graft as an
  • could be implemented as a new function, possibly splice?

Note:
This could also be implemented for arguments, such as:

    call.list <-
    mapply( call
          , "analyze"
          , c(substitute(data))
          , method = methods
          , SIMPLIFY=FALSE
          )
    seeded.calls <- plant(call.list, seeds=sprout(data, length(call.list)))
    harvest(seeded.calls, eval, envir=environment())

This does seem to have issues though.
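
The behavior this issue asks for can be sketched in base R with RNG sub-streams: each stochastic function gets its own sub-stream derived from the seed attached to the dataset. This is a sketch under stated assumptions, not a proposed implementation for harvestr: sub_seeds and apply_all are hypothetical helpers, data_seed stands in for the seed stored on the dataset, and the two analysis functions and the data values are placeholders.

```r
library(parallel)          # part of base R; provides nextRNGSubStream()
RNGkind("L'Ecuyer-CMRG")
set.seed(42)
data_seed <- get(".Random.seed", envir = globalenv())  # stand-in for the dataset's seed

# Hypothetical helper: n sub-streams derived from one stream seed.
sub_seeds <- function(seed, n) {
  out <- vector("list", n)
  for (i in seq_len(n)) {
    seed <- nextRNGSubStream(seed)
    out[[i]] <- seed
  }
  out
}

# Placeholder stochastic analyses (each resamples the data).
analyses <- list(
  boot_mean   = function(x) mean(sample(x, replace = TRUE)),
  boot_median = function(x) median(sample(x, replace = TRUE))
)

x <- c(2.1, 3.4, 1.9, 4.8, 3.3)  # made-up data

# Hypothetical helper: run every analysis on x from its own sub-stream.
apply_all <- function() {
  seeds <- sub_seeds(data_seed, length(analyses))
  Map(function(f, s) {
    assign(".Random.seed", s, envir = globalenv())
    f(x)
  }, analyses, seeds)
}

res1 <- apply_all()
res2 <- apply_all()
identical(res1, res2)  # each analysis is reproducible on its own
```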

Release Version 1.0.0

Complete all the following prior to releasing to CRAN:

  • Documentation
    • DESCRIPTION is up to date
      • Title case for package title.
      • Description includes complete sentences with proper grammar.
    • Man pages up to date.
      • All functions include examples
      • All functions include \value/@return
    • NEWS.md is up to date.
  • devtools pre-release checks
    • spell_check()
    • check(cran=TRUE)
    • check_rhub()
    • check_win_release()
    • check_win_devel()
  • create/update cran-comments.md (create with usethis::use_cran_comments())
  • release() to CRAN.
  • tag release in git.

Convert caching to rds format

  • Use saveRDS() and readRDS() for writing and reading cache files.
  • Add an option for harvest's cache compression.
  • Add cache.file attribute to harvestr-fruit objects.
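
A minimal sketch of what RDS-based caching with a compression option could look like, using only base R. The cached helper and its arguments are hypothetical and do not describe harvestr's actual cache code; only the "cache.file" attribute name is taken from the list above.

```r
# Hypothetical helper: compute once, then reuse the cached .rds file.
cached <- function(file, expr, compress = TRUE) {
  if (file.exists(file)) {
    result <- readRDS(file)
  } else {
    result <- expr   # lazy evaluation: expr is only evaluated on a cache miss
    saveRDS(result, file, compress = compress)
  }
  attr(result, "cache.file") <- file   # remember where the result is stored
  result
}

path   <- tempfile(fileext = ".rds")
first  <- cached(path, rnorm(5), compress = "xz")  # computes and writes the file
second <- cached(path, rnorm(5), compress = "xz")  # reads the file, rnorm() is not run

identical(first, second)
```

saveRDS() accepts TRUE, FALSE, or one of "gzip", "bzip2", and "xz" for compress, so a compression option could be passed straight through.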
