Hi Pete,
This one has been patiently sitting in a corner for a while ;-)
blockApply() and family has matured a bit in the last 2 years, and now we have viewportApply() and viewportReduce() in addition to blockApply() and blockReduce().
viewportReduce() would be a better choice in general than blockApply(x, FUN, sink) for block-processing with on-the-fly writing to a realization sink. For your use case, it would look something like this:
library(DelayedArray)
## Normalization function:
my_powerful_normalization_algo <- function(m, shift=0) { m + shift }
## Block-processed version of the normalization function (we want to be as much
## backend-agnostic as we can, so no parallelization):
BLOCK_my_powerful_normalization_algo <- function(sink, m, shift=0, verbose=NA)
{
    stopifnot(identical(dim(sink), dim(m)))
    ## By setting 'block.shape' to "first-dim-grows-first" we're guaranteed to be
    ## compatible with realization sinks that only support linear writing (e.g.
    ## TENxRealizationSink objects).
    grid <- defaultAutoGrid(sink, block.shape="first-dim-grows-first")
    ## Define callback function to pass to viewportReduce().
    FUN <- function(viewport, sink, shift) {
        block <- read_block(m, viewport)
        block <- my_powerful_normalization_algo(block, shift)
        write_block(sink, viewport, block)
    }
    viewportReduce(FUN, grid, sink, shift=shift, verbose=verbose)
}
Let's try it:
library(TileDBArray)
## Matrix to normalize:
M <- writeTileDBArray(matrix(1:6000, nrow=50))
Create HDF5 realization sink:
library(HDF5Array)
sink <- HDF5RealizationSink(dim(M), chunkdim=c(20, 20))
Block process:
setAutoBlockSize(8000)
sink <- BLOCK_my_powerful_normalization_algo(sink, M, 0.1, verbose=TRUE)
# \ Processing viewport 1/12 ... OK
# \ Processing viewport 2/12 ... OK
# \ Processing viewport 3/12 ... OK
# \ Processing viewport 4/12 ... OK
# \ Processing viewport 5/12 ... OK
# \ Processing viewport 6/12 ... OK
# \ Processing viewport 7/12 ... OK
# \ Processing viewport 8/12 ... OK
# \ Processing viewport 9/12 ... OK
# \ Processing viewport 10/12 ... OK
# \ Processing viewport 11/12 ... OK
# \ Processing viewport 12/12 ... OK
Close sink and coerce:
close(sink)
as(sink, "DelayedArray")
# <50 x 120> matrix of class HDF5Matrix and type "double":
#       [,1]  [,2]  [,3] ... [,119] [,120]
#  [1,]  1.1  51.1 101.1   . 5901.1 5951.1
#  [2,]  2.1  52.1 102.1   . 5902.1 5952.1
#  [3,]  3.1  53.1 103.1   . 5903.1 5953.1
#  [4,]  4.1  54.1 104.1   . 5904.1 5954.1
#  [5,]  5.1  55.1 105.1   . 5905.1 5955.1
#   ...    .     .     .   .      .      .
# [46,] 46.1  96.1 146.1   . 5946.1 5996.1
# [47,] 47.1  97.1 147.1   . 5947.1 5997.1
# [48,] 48.1  98.1 148.1   . 5948.1 5998.1
# [49,] 49.1  99.1 149.1   . 5949.1 5999.1
# [50,] 50.1 100.1 150.1   . 5950.1 6000.1
This is with DelayedArray 0.15.14.
The major difference with the blockApply(x, FUN, sink) approach is that here we walk on a grid defined on the sink instead of on x. Turns out that for the particular normalization use case the transformation is isometric so x and the sink have the same geometry but this is not the case in general. For the general case, we want to walk on a grid defined on the sink. blockApply(x, FUN, sink) doesn't allow this.
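To illustrate the non-isometric case, here is a minimal sketch in which the sink (1 x ncol(x)) does not have the same geometry as x. It is not taken from the package documentation: BLOCK_colsums is a hypothetical helper, an ordinary matrix stands in for x, and the sketch assumes that start() and end() on a viewport return one value per dimension and that write_block() returns the modified sink, as in the example above.

```r
library(DelayedArray)
library(HDF5Array)

## Hypothetical helper: fill a 1 x ncol(x) sink with the column sums of 'x'.
BLOCK_colsums <- function(sink, x, verbose=NA)
{
    stopifnot(identical(dim(sink), c(1L, ncol(x))))
    ## The grid is defined on the sink, not on 'x'.
    grid <- defaultAutoGrid(sink, block.shape="first-dim-grows-first")
    FUN <- function(viewport, sink) {
        ## Columns of the sink covered by the current viewport.
        cols <- seq(start(viewport)[2], end(viewport)[2])
        ## Load the matching columns of 'x' (all rows) in memory.
        x_block <- as.matrix(x[ , cols, drop=FALSE])
        block <- matrix(colSums(x_block), nrow=1)
        write_block(sink, viewport, block)
    }
    viewportReduce(FUN, grid, sink, verbose=verbose)
}

x <- matrix(1:6000, nrow=50)
sink <- HDF5RealizationSink(c(1L, ncol(x)))
sink <- BLOCK_colsums(sink, x)
close(sink)
res <- as(sink, "DelayedArray")
```

Each viewport on the sink is mapped back to the region of x it depends on (here, whole columns), which is the part that a walk on x alone cannot express.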
Hope this makes sense,
H.
from delayedarray.
Hi @PeteHaitch,
Was never entirely happy with viewportReduce() as the primary tool for walking on a realization sink and filling it with data. Today I added sinkApply() in DelayedArray 0.17.11 as a slightly better tool for that. Name and interface are a little bit more intuitive, I hope. It's documented in ?sinkApply.
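For reference, a minimal sketch of how the earlier normalization example might look with sinkApply(). It assumes, per ?sinkApply, that the callback takes (sink, viewport, ...) and returns the modified sink, and that the default grid is used when none is supplied. SINK_my_powerful_normalization_algo is a hypothetical name, and an ordinary matrix stands in for the TileDB-backed one.

```r
library(DelayedArray)
library(HDF5Array)

## Normalization function (same as in the earlier example):
my_powerful_normalization_algo <- function(m, shift=0) { m + shift }

## Hypothetical sinkApply()-based version of the block-processed function:
SINK_my_powerful_normalization_algo <- function(sink, m, shift=0, verbose=NA)
{
    stopifnot(identical(dim(sink), dim(m)))
    ## Note the argument order: the sink comes first, then the viewport.
    FUN <- function(sink, viewport, shift) {
        block <- read_block(m, viewport)
        block <- my_powerful_normalization_algo(block, shift)
        write_block(sink, viewport, block)
    }
    ## No grid supplied, so sinkApply() picks a default grid for the sink.
    sinkApply(sink, FUN, shift=shift, verbose=verbose)
}

M <- matrix(1:6000, nrow=50)
sink <- HDF5RealizationSink(dim(M), type="double")
sink <- SINK_my_powerful_normalization_algo(sink, M, 0.1)
close(sink)
res <- as(sink, "DelayedArray")
```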
H.
FYI, this is what I'm currently using, but I'd prefer to retire this for an 'official' solution. It requires (and doesn't check) that the sink has the appropriate type and dimensions.
library(BiocParallel)  # provides bplapply() and bpparam()

blockApplyWithRealization <- function(x, FUN, ..., grid = NULL, sink = NULL,
                                      BPREDO = list(), BPPARAM = bpparam()) {
    FUN <- match.fun(FUN)
    grid <- DelayedArray:::.normarg_grid(grid, x)
    nblock <- length(grid)
    bplapply(seq_len(nblock), function(b) {
        if (DelayedArray:::get_verbose_block_processing()) {
            message("Processing block ", b, "/", nblock, " ... ",
                    appendLF = FALSE)
        }
        viewport <- grid[[b]]
        block <- DelayedArray:::extract_block(x, viewport)
        if (!is.array(block)) {
            block <- DelayedArray:::.as_array_or_matrix(block)
        }
        attr(block, "from_grid") <- grid
        attr(block, "block_id") <- b
        block_ans <- FUN(block, ...)
        # NOTE: This is the only part different from DelayedArray::blockApply()
        if (!is.null(sink)) {
            write_block_to_sink(block_ans, sink, viewport)
            block_ans <- NULL
        }
        if (DelayedArray:::get_verbose_block_processing()) {
            message("OK")
        }
    },
    BPREDO = BPREDO,
    BPPARAM = BPPARAM)
}
Thanks, Herve. I'm not currently working on anything that requires this but it's good to have it there.
Yeah I realize this arrives kind of late. Oh well, maybe at some point when I've nothing else to do I'll replace blockApplyWithRealization() with sinkApply() in minfi and bsseq ;-)