Comments (2)

AdeelH avatar AdeelH commented on June 5, 2024

SemanticSegmentationLabels does not provide a mechanism for subsetting the extent. If you want to do that, you will need something like the following:

import numpy as np
from rastervision.core.data.label import SemanticSegmentationDiscreteLabels
from rastervision.core.box import Box

# Imagine my raster is 100 x 100
# I have a prediction for the Box from (10,10) to (20,20)
window = Box(ymin=10, xmin=10, ymax=20, xmax=20)
prediction = np.ones((10, 10), dtype=np.uint8)

# define bbox to spatially subset the full raster
bbox = Box(ymin=10, xmin=10, ymax=20, xmax=20)
# the RV convention is to use "extent" to mean a Box(0, 0, H, W) box
extent = bbox.extent

# the window coords as coords within the bbox
# i.e. Box(10, 10, 20, 20) --> Box(0, 0, 10, 10)
window_bbox_coords = window.to_local_coords(bbox)

ssdl_test = SemanticSegmentationDiscreteLabels.from_predictions(
    windows=[window_bbox_coords],
    predictions=[prediction],
    extent=extent,
    num_classes=2
)

ssdl_test.hit_mask.sum()
#> 100
ssdl_test.pixel_counts.sum()
#> 100
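
For intuition, the coordinate shift above can be sketched with plain NumPy and no Raster Vision install. This is only an illustration under assumptions: boxes are plain (ymin, xmin, ymax, xmax) tuples, and the helper to_local_coords below is a hypothetical stand-in, not the library's Box.to_local_coords implementation.

```python
import numpy as np


def to_local_coords(box, origin_box):
    """Hypothetical helper: shift `box` so that the top-left corner of
    `origin_box` becomes (0, 0). Boxes are (ymin, xmin, ymax, xmax)."""
    ymin, xmin, ymax, xmax = box
    oy, ox = origin_box[0], origin_box[1]
    return (ymin - oy, xmin - ox, ymax - oy, xmax - ox)


window = (10, 10, 20, 20)
bbox = (10, 10, 20, 20)

# window expressed relative to the bbox origin
local = to_local_coords(window, bbox)

# a mask the size of the bbox, marked wherever the window lands
bbox_h, bbox_w = bbox[2] - bbox[0], bbox[3] - bbox[1]
hit_mask = np.zeros((bbox_h, bbox_w), dtype=bool)
ymin, xmin, ymax, xmax = local
hit_mask[ymin:ymax, xmin:xmax] = True

print(local)
print(hit_mask.sum())
```

Because the window and the bbox coincide here, the window covers the whole subsetted extent, which matches the hit_mask count of 100 reported above.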

> I think the issue is here. I'm a little confused about the indexing here - what's src, what's dst, and which arrays correspond to which coordinate system. The conversion to_global_coords and back to_local_coords seems to be where my mental model of what should happen differs from what's actually happening.

Yeah, that bit of code is not the clearest.

Imagine self.extent is Box(0, 0, 100, 100), window is Box(80, 80, 120, 120), and pixel_class_ids is a 40x40 array corresponding to the window. What we want to do is read from pixel_class_ids[:20, :20] and write to self.pixel_counts[:, 80:100, 80:100].

So window_dst will be Box(80, 80, 100, 100) and window_src will be Box(0, 0, 20, 20), i.e. the coords within the pixel_class_ids array.

Here's the code with some more annotation (and some lines moved around):

    def add_window(self, window: Box, pixel_class_ids: np.ndarray) -> None:
        # The window might overflow the extent, so we subset the window to
        # the portion that intersects with the extent.
        # dst := the self.pixel_counts array
        window_dst = window.intersection(self.extent)

        # Map subsetted window to the appropriate coords within the input
        # pixel_class_ids array.
        # src := the input pixel_class_ids array
        window_src = window_dst.to_global_coords(
            self.extent).to_local_coords(window)

        # read sub-window from the source array
        src_yslice, src_xslice = window_src.to_slices()
        pixel_class_ids = pixel_class_ids.astype(self.dtype)
        pixel_class_ids = pixel_class_ids[..., src_yslice, src_xslice]

        # write sub-window in the destination array
        dst_yslice, dst_xslice = window_dst.to_slices()
        window_pixel_counts = self.pixel_counts[:, dst_yslice, dst_xslice]
        for ch_class_id, ch in enumerate(window_pixel_counts):
            ch[pixel_class_ids == ch_class_id] += 1
        self.hit_mask[dst_yslice, dst_xslice] = True
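
The worked example above (extent Box(0, 0, 100, 100), window Box(80, 80, 120, 120)) can be reproduced with plain NumPy to check the src/dst indexing. This is a rough sketch, not the library code: boxes are plain (ymin, xmin, ymax, xmax) tuples, and intersection and to_local_coords are hypothetical helpers standing in for the Box methods. The to_global_coords(self.extent) step is omitted on the assumption that the extent starts at (0, 0), so it would shift by zero.

```python
import numpy as np


def intersection(a, b):
    """Hypothetical helper: overlap of two (ymin, xmin, ymax, xmax) boxes."""
    return (max(a[0], b[0]), max(a[1], b[1]),
            min(a[2], b[2]), min(a[3], b[3]))


def to_local_coords(box, origin_box):
    """Hypothetical helper: shift `box` by the top-left of `origin_box`."""
    oy, ox = origin_box[0], origin_box[1]
    return (box[0] - oy, box[1] - ox, box[2] - oy, box[3] - ox)


extent = (0, 0, 100, 100)
window = (80, 80, 120, 120)  # overflows the extent by 20 px on each axis
num_classes = 2

pixel_counts = np.zeros((num_classes, 100, 100), dtype=np.uint8)
hit_mask = np.zeros((100, 100), dtype=bool)
pixel_class_ids = np.ones((40, 40), dtype=np.uint8)  # all class 1

# dst: where to write in pixel_counts (extent coords)
window_dst = intersection(window, extent)
# src: where to read in pixel_class_ids (coords relative to the window)
window_src = to_local_coords(window_dst, window)

sy0, sx0, sy1, sx1 = window_src
dy0, dx0, dy1, dx1 = window_dst

src = pixel_class_ids[sy0:sy1, sx0:sx1]
dst = pixel_counts[:, dy0:dy1, dx0:dx1]  # a view, so writes hit pixel_counts
for class_id, ch in enumerate(dst):
    ch[src == class_id] += 1
hit_mask[dy0:dy1, dx0:dx1] = True

print(window_dst, window_src)
print(pixel_counts[1].sum(), hit_mask.sum())
```

Only the 20x20 corner of the prediction that falls inside the extent is counted; the other rows and columns of pixel_class_ids are never read.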

from raster-vision.

AdeelH avatar AdeelH commented on June 5, 2024

Closing this now. Feel free to reopen if there are more questions.
