Comments (5)

hpages commented on July 24, 2024

Hi Pete,

This one has been patiently sitting in a corner for a while ;-)

blockApply() and family have matured a bit over the last 2 years, and we now have viewportApply() and viewportReduce() in addition to blockApply() and blockReduce().

For block processing with on-the-fly writing to a realization sink, viewportReduce() would generally be a better choice than blockApply(x, FUN, sink). For your use case, it would look something like this:

library(DelayedArray)

## Normalization function:
my_powerful_normalization_algo <- function(m, shift=0) { m + shift }

## Block-processed version of the normalization function (we want to be as
## backend-agnostic as possible, so no parallelization):
BLOCK_my_powerful_normalization_algo <- function(sink, m, shift=0, verbose=NA)
{
    stopifnot(identical(dim(sink), dim(m)))

    ## By setting 'block.shape' to "first-dim-grows-first" we're guaranteed to be
    ## compatible with realization sinks that only support linear writing (e.g.
    ## TENxRealizationSink objects).
    grid <- defaultAutoGrid(sink, block.shape="first-dim-grows-first")

    ## Define callback function to pass to viewportReduce().
    FUN <- function(viewport, sink, shift) {
        block <- read_block(m, viewport)
        block <- my_powerful_normalization_algo(block, shift)
        write_block(sink, viewport, block)
    }

    viewportReduce(FUN, grid, sink, shift=shift, verbose=verbose)
}

Let's try it:

library(TileDBArray)
## Matrix to normalize:
M <- writeTileDBArray(matrix(1:6000, nrow=50))

Create HDF5 realization sink:

library(HDF5Array)
sink <- HDF5RealizationSink(dim(M), chunkdim=c(20, 20))

Block process:

setAutoBlockSize(8000)
sink <- BLOCK_my_powerful_normalization_algo(sink, M, 0.1, verbose=TRUE)
# \ Processing viewport 1/12 ... OK
# \ Processing viewport 2/12 ... OK
# \ Processing viewport 3/12 ... OK
# \ Processing viewport 4/12 ... OK
# \ Processing viewport 5/12 ... OK
# \ Processing viewport 6/12 ... OK
# \ Processing viewport 7/12 ... OK
# \ Processing viewport 8/12 ... OK
# \ Processing viewport 9/12 ... OK
# \ Processing viewport 10/12 ... OK
# \ Processing viewport 11/12 ... OK
# \ Processing viewport 12/12 ... OK

Close sink and coerce:

close(sink)
as(sink, "DelayedArray")
# <50 x 120> matrix of class HDF5Matrix and type "double":
#         [,1]   [,2]   [,3] ... [,119] [,120]
#  [1,]    1.1   51.1  101.1   . 5901.1 5951.1
#  [2,]    2.1   52.1  102.1   . 5902.1 5952.1
#  [3,]    3.1   53.1  103.1   . 5903.1 5953.1
#  [4,]    4.1   54.1  104.1   . 5904.1 5954.1
#  [5,]    5.1   55.1  105.1   . 5905.1 5955.1
#   ...      .      .      .   .      .      .
# [46,]   46.1   96.1  146.1   . 5946.1 5996.1
# [47,]   47.1   97.1  147.1   . 5947.1 5997.1
# [48,]   48.1   98.1  148.1   . 5948.1 5998.1
# [49,]   49.1   99.1  149.1   . 5949.1 5999.1
# [50,]   50.1  100.1  150.1   . 5950.1 6000.1

This is with DelayedArray 0.15.14.

The major difference from the blockApply(x, FUN, sink) approach is that here we walk on a grid defined on the sink instead of on x. It turns out that in the normalization use case the transformation is isometric (the output has the same dimensions as the input), so x and the sink have the same geometry, but this is not the case in general. In the general case we want to walk on a grid defined on the sink, which blockApply(x, FUN, sink) doesn't allow.
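
To illustrate the case where the geometries differ, here is a minimal sketch that fills a sink with the transpose of x, so the sink is 120 x 50 while x is 50 x 120. It walks a grid defined on the sink with a plain loop and, for each sink viewport, pulls the matching region out of x. The transposition example and the variable names are illustrative assumptions, not part of the use case above. The sketch also assumes that write_block() returns the modified sink and that start()/end() on a viewport return one value per dimension.

```r
library(DelayedArray)
library(HDF5Array)

## Input (an ordinary matrix here, for simplicity): 50 x 120.
x <- matrix(as.numeric(1:6000), nrow=50)

## The sink has the geometry of the OUTPUT, i.e. of t(x): 120 x 50.
sink <- HDF5RealizationSink(rev(dim(x)))

## Grid defined on the sink, not on x.
grid <- defaultAutoGrid(sink, block.shape="first-dim-grows-first")

for (b in seq_along(grid)) {
    viewport <- grid[[b]]
    ## Rows/columns of the sink covered by the current viewport.
    sink_rows <- start(viewport)[1]:end(viewport)[1]
    sink_cols <- start(viewport)[2]:end(viewport)[2]
    ## The matching region of x has rows and columns swapped.
    block <- t(x[sink_cols, sink_rows, drop=FALSE])
    sink <- write_block(sink, viewport, block)
}

close(sink)
res <- as(sink, "DelayedArray")
```

Walking on x instead would not work as directly here, because a viewport on x is not a valid viewport on the sink.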

Hope this makes sense,

H.

from delayedarray.

hpages commented on July 24, 2024

Hi @PeteHaitch ,

Was never entirely happy with viewportReduce() as the primary tool for walking on a realization sink and filling it with data. Today I added sinkApply() in DelayedArray 0.17.11 as a slightly better tool for that. Name and interface are a little bit more intuitive, I hope. It's documented in ?sinkApply.
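
For reference, a minimal sketch of what the normalization example from earlier in this thread might look like with sinkApply(). It assumes the interface described in ?sinkApply: a callback of the form FUN(sink, viewport) that returns the modified sink, with the grid chosen by sinkApply() when none is supplied. The function name BLOCK_normalize and the use of an ordinary matrix as input are assumptions made for the illustration.

```r
library(DelayedArray)   # assumed >= 0.17.11, where sinkApply() was added
library(HDF5Array)

my_powerful_normalization_algo <- function(m, shift=0) { m + shift }

## Hypothetical wrapper: fills 'sink' with the normalized version of 'm'.
BLOCK_normalize <- function(sink, m, shift=0, verbose=NA)
{
    stopifnot(identical(dim(sink), dim(m)))

    ## Callback: receives the sink and the current viewport, and must return
    ## the modified sink. 'm' and 'shift' are captured from the enclosing
    ## function rather than passed via '...'.
    FUN <- function(sink, viewport) {
        block <- read_block(m, viewport)
        block <- my_powerful_normalization_algo(block, shift)
        write_block(sink, viewport, block)
    }

    sinkApply(sink, FUN, verbose=verbose)
}

m <- matrix(1:6000, nrow=50)
sink <- HDF5RealizationSink(dim(m), chunkdim=c(20, 20))
sink <- BLOCK_normalize(sink, m, shift=0.1, verbose=FALSE)
close(sink)
normalized <- as(sink, "DelayedArray")
```

Compared with the viewportReduce() version, the sink is the first argument both to sinkApply() and to the callback, which makes it clearer what is being walked on.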

H.

PeteHaitch commented on July 24, 2024

FYI, this is what I'm currently using, but I'd prefer to retire it in favour of an 'official' solution. It requires (and doesn't check) that the sink has the appropriate type and dimensions.

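library(BiocParallel)  # provides bplapply() and bpparam()

## NOTE: relies on unexported DelayedArray internals (accessed with ':::'),
## which can change between versions.
blockApplyWithRealization <- function(x, FUN, ..., grid = NULL, sink = NULL,
                                      BPREDO = list(), BPPARAM = bpparam()) {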
    FUN <- match.fun(FUN)
    grid <- DelayedArray:::.normarg_grid(grid, x)
    nblock <- length(grid)
    bplapply(seq_len(nblock), function(b) {
        if (DelayedArray:::get_verbose_block_processing()) {
            message("Processing block ", b, "/", nblock, " ... ",
                    appendLF = FALSE)
        }
        viewport <- grid[[b]]
        block <- DelayedArray:::extract_block(x, viewport)
        if (!is.array(block)) {
            block <- DelayedArray:::.as_array_or_matrix(block)
        }
        attr(block, "from_grid") <- grid
        attr(block, "block_id") <- b
        block_ans <- FUN(block, ...)
        # NOTE: This is the only part different from DelayedArray::blockApply()
        if (!is.null(sink)) {
            write_block_to_sink(block_ans, sink, viewport)
            block_ans <- NULL
        }
        if (DelayedArray:::get_verbose_block_processing()) {
            message("OK")
        }
    },
    BPREDO = BPREDO,
    BPPARAM = BPPARAM)
}

PeteHaitch commented on July 24, 2024

Thanks, Herve. I'm not currently working on anything that requires this but it's good to have it there.

hpages commented on July 24, 2024

Yeah, I realize this arrives kind of late. Oh well, maybe at some point when I've nothing else to do I'll replace blockApplyWithRealization() with sinkApply() in minfi and bsseq ;-)
