Comments (3)
I re-ran the original OneFormer model on the ADE20K dataset with the Swin backbone, using the config file oneformer_swin_large_bs16_160k.yaml, and I see similar inconsistencies.
I printed the image shape with print(images[0].shape) after this line:
OneFormer/oneformer/oneformer_model.py
Line 274 in 4962ef6
and the mask class and mask prediction shapes with print(mask_cls_results.shape) and print(mask_pred_results.shape) after this line:
OneFormer/oneformer/oneformer_model.py
Line 309 in 4962ef6
My point is that the prediction shapes are not guaranteed to match my input image resolution. At what point in the model do these shapes match, and where should I begin evaluation?
Also, although mask_pred_results has a different height and width from the original input images, the model does produce outputs that are close to the original image size (when run on the ADE20K dataset and config). With my custom dataset, however, the outputs are 64 x 64 when I need them to be 256 x 256. Would you expect the model to behave this way, or is it more likely that my configuration or custom dataset is wrong?
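For reference, a minimal sketch of what resizing low-resolution mask predictions to a target resolution looks like. This is an illustration under assumptions, not OneFormer's actual post-processing: the helper resize_nearest and the shape (150, 64, 64) are hypothetical, and it uses nearest-neighbour resizing in NumPy for simplicity, where a real pipeline would more likely use bilinear interpolation on tensors.

```python
import numpy as np

def resize_nearest(mask_logits, out_h, out_w):
    """Nearest-neighbour resize of a [Q, h, w] array to [Q, out_h, out_w].

    Hypothetical helper for illustration only.
    """
    _, in_h, in_w = mask_logits.shape
    # Map every output row/column back to the source row/column it covers.
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    # Broadcasting the two index arrays yields an [out_h, out_w] grid per query.
    return mask_logits[:, rows[:, None], cols[None, :]]

# Assumed shapes: 150 mask queries predicted at 64 x 64, wanted at 256 x 256.
low_res = np.random.default_rng(0).normal(size=(150, 64, 64))
full_res = resize_nearest(low_res, 256, 256)
print(low_res.shape, "->", full_res.shape)
```

If the model only resizes to a size stored with each input (for example a per-image height and width), then a 64 x 64 output could simply mean the dataset mapper recorded 64 x 64 as the target size; that is a guess worth checking, not a confirmed cause.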
I'm still scratching my head trying to figure this out. In a previous issue, you suggested extracting the correct masks as follows: #82 (comment)
specifically with specific_category_mask = (sem_seg == category_id).float()
but using the ADE20K config file, I noticed the output of sem_seg is:
Looking at the unique values in this matrix, how could (sem_seg == category_id) possibly be valid? Not only are the category_id values ints while the values in sem_seg are floats, but the range of category_ids for ADE20K also extends beyond 3.
To reiterate, I am genuinely confused about this model, its outputs, and what its expected behavior is.
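One possible explanation, sketched below under assumptions: detectron2 documents the sem_seg output as a (num_categories, H, W) tensor of per-class scores, so its raw values are floats rather than class ids. Taking the argmax over the class axis gives an integer label map, and the comparison with category_id is meaningful on that map. The scores here are random stand-ins, and the names label_map and category_mask are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, height, width = 150, 8, 8  # assumed sizes for illustration

# Stand-in for a [C, H, W] array of per-class scores (floats).
sem_seg = rng.random(size=(num_classes, height, width))

# Per-pixel class id: index of the highest-scoring class at each pixel.
label_map = sem_seg.argmax(axis=0)  # shape [H, W], integer class ids

category_id = int(label_map[0, 0])  # pick a class known to be present
category_mask = (label_map == category_id).astype(np.float32)

print(label_map.dtype, label_map.shape, category_mask.sum())
```

Comparing the raw float scores against an integer id would almost never match, which would be consistent with the confusion described above.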
After comparing the detectron2 model input format documentation with the provided example custom dataset mappers for semantic segmentation, I noticed a mismatch.
detectron2 expects sem_seg to hold class labels at the ground-truth resolution, with shape [H, W]. Your custom mapper class, however, uses instance-like labels, so that gt_masks has shape [N, H, W].
I have not tested this, but if this is the issue, I would strongly recommend documenting it.
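To make the two formats concrete, here is a small sketch of converting an [H, W] class-label map into [N, H, W] binary masks, one per class present. It is not the repository's mapper: label_map_to_masks is a hypothetical helper, and treating 255 as the ignore label is an assumption that depends on the dataset config.

```python
import numpy as np

def label_map_to_masks(sem_seg_gt, ignore_label=255):
    """Split an [H, W] class-label map into per-class binary masks.

    Returns (classes, masks) where masks has shape [N, H, W] and
    masks[i] marks the pixels whose label equals classes[i].
    ignore_label=255 is an assumed convention, not taken from the repo.
    """
    classes = np.unique(sem_seg_gt)
    classes = classes[classes != ignore_label]
    if len(classes) == 0:
        return classes, np.zeros((0, *sem_seg_gt.shape), dtype=bool)
    masks = np.stack([sem_seg_gt == c for c in classes])
    return classes, masks

sem_seg_gt = np.array([[0, 0, 2],
                       [2, 255, 5]])
classes, gt_masks = label_map_to_masks(sem_seg_gt)
print(classes, gt_masks.shape)
```

In this toy example each non-ignored pixel belongs to exactly one mask, so the [H, W] label map can be recovered from the masks and the class list.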