split/apply/combine paradigm (biocparallel issue, open, 6 comments)

mllg commented on July 20, 2024
split/apply/combine paradigm

from biocparallel.

Comments (6)

lawremi commented on July 20, 2024

split/apply/combine is a nice mental model, but maybe it does not need
explicit representation in code. Another direction is thinking about faster
ways to iterate, i.e., can we form partitions of data more efficiently? The
data.table package has some interesting approaches.

Michael

On Thu, Dec 12, 2013 at 2:42 AM, Michel [email protected] wrote:

I'd like to get started on this one and use this tracker to collect and
discuss ideas.

AFAIR @lawremi (https://github.com/lawremi) suggested back in September
to use split/by (split), bp*apply (apply) and stack (combine).

I'm rather unsure what functionality is needed. Usually I'm fine with
split, bplapply and l*ply/Reduce.
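
A minimal sketch of the split, apply, combine sequence described above, using only base R so it runs without extra packages. The helper name `split_apply_combine` and the example data are hypothetical and are not part of BiocParallel.

```r
# Hypothetical helper (not a BiocParallel function): split x by f,
# apply fun to each piece, then row-bind the per-piece results.
split_apply_combine <- function(x, f, fun, apply_fun = lapply) {
  pieces  <- split(x, f)             # split
  results <- apply_fun(pieces, fun)  # apply (serial lapply by default)
  do.call(rbind, results)            # combine
}

# Made-up example data.
df <- data.frame(group = c("a", "b", "a", "b", "c"),
                 value = c(1, 2, 3, 4, 5))

# One summary row per group.
totals <- split_apply_combine(
  df, df$group,
  function(piece) data.frame(group = piece$group[1],
                             total = sum(piece$value))
)
```

Because bplapply takes its first two arguments in the same order as lapply, passing `apply_fun = BiocParallel::bplapply` should move the apply step onto the registered backend while leaving the split and combine steps unchanged.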


Reply to this email directly or view it on GitHub: https://github.com//issues/29

DarwinAwardWinner commented on July 20, 2024

When I need to do a split-apply-combine type of operation, I usually turn to plyr::ddply.

lawremi commented on July 20, 2024

Yes, that's a useful tool. Would be nice to have a similar API on top of
BiocParallel (and thus BatchJobs). We worked toward making aggregate()
behave that way through omission of the LHS, but I think we ended up
punting due to release deadlines. Also, we'd want it to be more generic,
with support for e.g. GRanges. I rarely use a data.frame.

On Fri, Dec 13, 2013 at 5:39 PM, Ryan Thompson [email protected] wrote:

When I need to do a split-apply-combine type of operation, I usually turn
to plyr::ddply.



vobencha commented on July 20, 2024

Any opposition to closing this issue?

lawremi commented on July 20, 2024

This issue sort of depends on having a clean API in base R for aggregation. Currently, aggregate and friends fall a bit short. Once we have that, then BiocParallel will need a corresponding frontend. Perhaps there is no need for a specific issue.

It would seem that BiocParallel needs a bp analog to every member of the apply family. In ddR, we instead define data structures that represent partitioned, distributed data that is managed by some computational engine, so we are able to use existing generics, with implicit parallelism.
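
To illustrate the second idea (a data structure that represents partitioned data, so that existing generics work on it), here is a rough base R sketch. The class name `partitioned`, its constructor, and the `sum` method are hypothetical; this is not ddR's API, and the per-partition step is a plain lapply standing in for whatever engine would actually do the work.

```r
# Hypothetical S3 class holding a vector as a list of partitions.
partitioned <- function(x, n_parts) {
  idx <- cut(seq_along(x), n_parts, labels = FALSE)  # partition index per element
  structure(list(parts = split(x, idx)), class = "partitioned")
}

# Method for the existing generic sum(): reduce each partition, then
# reduce the partial results. The lapply() is where a parallel backend
# could be substituted.
sum.partitioned <- function(x, ..., na.rm = FALSE) {
  partial <- lapply(x$parts, sum, na.rm = na.rm)
  sum(unlist(partial))
}

p <- partitioned(1:10, 3)
total <- sum(p)  # caller uses the ordinary generic; partitioning is hidden
```

The caller writes `sum(p)` as for any vector, so no bp-prefixed analog is needed for that operation; the cost is that each generic needs a method for the partitioned class.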

vobencha commented on July 20, 2024

OK. I've marked this as an enhancement.
