Comments (6)
split/apply/combine is a nice mental model, but maybe it does not need
explicit representation in code. Another direction is thinking about faster
ways to iterate, i.e., can we form partitions of data more efficiently? The
data.table package has some interesting approaches.
Michael
On Thu, Dec 12, 2013 at 2:42 AM, Michel wrote:
I'd like to get started on this one and use this tracker to collect and
discuss ideas.
AFAIR @lawremi (https://github.com/lawremi) suggested back in September
to use split/by (split), bp*apply (apply) and stack (combine).
I'm rather unsure what functionality is needed. Usually I'm fine with
split, bplapply and l*ply/Reduce.
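For reference, the split / bplapply / Reduce pattern mentioned in the quoted message might look roughly like the sketch below. The data frame and its column names are made up for illustration, and SerialParam() is used only so the snippet runs anywhere; a MulticoreParam() or SnowParam() could presumably be swapped in.

```r
library(BiocParallel)

# Toy data (hypothetical): a grouping column and a numeric column.
df <- data.frame(
  group = c("a", "b", "a", "b", "c"),
  value = c(1, 2, 3, 4, 5)
)

# Split: partition the rows by group.
pieces <- split(df, df$group)

# Apply: summarize each partition. SerialParam() evaluates in the
# current process; other BiocParallelParam objects run in parallel.
summaries <- bplapply(pieces, function(piece) {
  data.frame(group = piece$group[1], total = sum(piece$value))
}, BPPARAM = SerialParam())

# Combine: stack the per-group results into one data.frame.
result <- Reduce(rbind, summaries)
```

Here the three steps stay explicit in user code, which is the part a higher-level API would be expected to hide.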
from biocparallel.
When I need to do a split-apply-combine type of operation, I usually turn to plyr::ddply.
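As a small illustration of what that looks like (the data frame and column names are invented for the example, not taken from this thread), ddply does the split, apply and combine steps in a single call:

```r
library(plyr)

# Toy data (hypothetical): a grouping column and a numeric column.
df <- data.frame(
  group = c("a", "b", "a", "b", "c"),
  value = c(1, 2, 3, 4, 5)
)

# Split df by "group", apply summarise() to each piece, and
# combine the pieces back into one data.frame.
result <- ddply(df, "group", summarise, total = sum(value))
```

The grouping variable and the summary expression are both declared in the call, so there is no intermediate list of pieces to manage by hand.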
from biocparallel.
Yes, that's a useful tool. Would be nice to have a similar API on top of
BiocParallel (and thus BatchJobs). We worked toward making aggregate()
behave that way through omission of the LHS, but I think we ended up
punting due to release deadlines. Also, we'd want it to be more generic,
with support for e.g. GRanges. I rarely use a data.frame.
On Fri, Dec 13, 2013 at 5:39 PM, Ryan Thompson wrote:
When I need to do a split-apply-combine type of operation, I usually turn
to plyr::ddply.
from biocparallel.
Any opposition to closing this issue?
from biocparallel.
This issue sort of depends on having a clean API in base R for aggregation. Currently, aggregate and friends fall a bit short. Once we have that, then BiocParallel will need a corresponding frontend. Perhaps there is no need for a specific issue.
It would seem that BiocParallel needs a bp analog to every member of the apply family. In ddR, we instead define data structures that represent partitioned, distributed data that is managed by some computational engine, so we are able to use existing generics, with implicit parallelism.
from biocparallel.
OK. I've marked this as an enhancement.
from biocparallel.