Comments (6)

praeclarumjj3 avatar praeclarumjj3 commented on May 20, 2024

Hi @rockywind, thanks for your interest in our work.

If I understand your question correctly, you want to generate a single-channel segmentation mask for the instance segmentation predictions. Please provide more description about your issue if this is not the case.

You can loop through the pred_masks stored in the instance predictions, assign an ID to each mask, and aggregate those into a single channel mask.

result.pred_masks = (mask_pred > 0).float()
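A small sketch of what that line does, assuming mask_pred holds raw per-instance mask logits (an assumption; the toy values below are made up). Thresholding logits at 0 gives a binary mask per instance, which is the same as thresholding sigmoid probabilities at 0.5, so every value in pred_masks ends up as exactly 0.0 or 1.0:

```python
import torch

# Toy mask logits for 2 instances on a 2x3 image (made-up values).
mask_pred = torch.tensor([
    [[2.0, -1.0, 0.5],
     [-3.0, 0.1, -0.2]],
    [[-0.5, 4.0, -2.0],
     [1.5, -0.1, 0.0]],
])

# Thresholding logits at 0 ...
binary_from_logits = (mask_pred > 0).float()
# ... matches thresholding sigmoid probabilities at 0.5.
binary_from_probs = (torch.sigmoid(mask_pred) > 0.5).float()

print(tuple(binary_from_logits.shape))  # (2, 2, 3): one binary mask per instance
print(torch.equal(binary_from_logits, binary_from_probs))
```

The result keeps one channel per instance, which is why it still has to be merged to get a single-channel mask.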

from oneformer.

rockywind avatar rockywind commented on May 20, 2024

Hi,
Thank you for your help.
In the mask I want, each pixel value identifies an instance: 1, 2, 3, and so on, with 0 representing the background.
However, I found that the values of result.pred_masks are between 0 and 1 and its shape is [7, 1114, 2191], while the image size is [1114, 2191].


praeclarumjj3 avatar praeclarumjj3 commented on May 20, 2024

I believe you are talking about the semantic segmentation result, where each pixel value is the category of the object that pixel belongs to.

You need to do an argmax operation on the semantic predictions to obtain those.

predictions["sem_seg"].argmax(dim=0).to(self.cpu_device)  # (H, W) map with one category ID per pixel
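As a toy illustration of that argmax step, assuming the semantic output is a tensor of per-class scores with shape (num_classes, H, W) (the scores below are made up):

```python
import torch

# Hypothetical per-class scores with shape (num_classes, H, W) = (3, 2, 2).
sem_seg = torch.tensor([
    [[0.7, 0.2], [0.1, 0.3]],  # class 0
    [[0.2, 0.5], [0.1, 0.6]],  # class 1
    [[0.1, 0.3], [0.8, 0.1]],  # class 2
])

# Pick the highest-scoring class at every pixel.
class_map = sem_seg.argmax(dim=0)

print(tuple(class_map.shape))  # (2, 2)
print(class_map.tolist())      # [[0, 1], [2, 1]]
```

Note that this yields category IDs, so two cars would share the same value; it does not separate individual instances.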


rockywind avatar rockywind commented on May 20, 2024

Hi,
Sorry for not being clear before. The following is sample data.
There are 3 cars in the picture: the first car's pixel value is 1, the second car's pixel value is 2, and the third car's pixel value is 3.
[Image: 0151]


praeclarumjj3 avatar praeclarumjj3 commented on May 20, 2024

> Each pixel value represents an instance category, the value is 1,2,3, and so on. The 0 is the representation's background.
> But, I found that the value of result.pred_masks is between 0 and 1, the shape of result.pred_masks is [7, 1114, 2191], the image's size is [1114, 2191] .

Right, that's what I thought you wanted to do. You can loop through the result.pred_masks, assign an ID (starting from 1) to each mask, and aggregate them on an all-zeros mask. Please find the pseudo-code below:

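import torch

# create an all-zeros mask with the image's spatial size, e.g. (1114, 2191)
masks = result.pred_masks  # shape: (num_instances, H, W), values 0 or 1
single_channel_mask = torch.zeros(masks.shape[-2:], dtype=torch.long, device=masks.device)

# loop through all instance masks, assigning IDs 1, 2, 3, ...
for instance_id, mask in enumerate(masks, start=1):
    # scale a copy (not in place) so result.pred_masks stays unchanged
    labeled = mask.long() * instance_id
    # where masks overlap, the higher ID wins
    single_channel_mask = torch.max(single_channel_mask, labeled)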
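The same idea can also be written without the Python loop. This is only a sketch on toy data: merge_instance_masks is a hypothetical helper name, and it assumes pred_masks is a (num_instances, H, W) tensor of 0/1 values with at least one instance.

```python
import torch

def merge_instance_masks(pred_masks: torch.Tensor) -> torch.Tensor:
    """Collapse (N, H, W) binary masks into one (H, W) map of instance IDs."""
    num_instances = pred_masks.shape[0]
    # IDs 1..N, shaped (N, 1, 1) so they broadcast over each mask.
    ids = torch.arange(1, num_instances + 1, device=pred_masks.device).view(-1, 1, 1)
    labeled = (pred_masks > 0).long() * ids
    # Background stays 0; where masks overlap, the higher ID wins.
    return labeled.max(dim=0).values

# Toy example: 3 instances on a 2x4 image (instances 2 and 3 overlap at one pixel).
pred_masks = torch.tensor([
    [[1, 1, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 1, 0]],
    [[0, 0, 0, 0], [1, 1, 1, 0]],
], dtype=torch.float32)

merged = merge_instance_masks(pred_masks)
print(merged.tolist())  # [[1, 1, 2, 0], [3, 3, 3, 0]]
```

If a different overlap rule is needed (for example, letting the highest-scoring instance win), the masks could be sorted before merging; that choice is not specified in this thread.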

Let me know if you have any more issues.


rockywind avatar rockywind commented on May 20, 2024

Thank you very much.
I'll give it a try!
