Comments (7)
Hi @javadan ,
As you already noticed, this topic has been discussed a few times now, and to my knowledge there are 2 options:
- a separate model for each class, each with its own output mask (256x256x1 with values in the 0..1 range)
- a separate binary mask for each class (256x256x1 with values in the 0..1 range), stacked together (256x256xNUM_CLASS) and used as the target mask for a single model with a modified `num_classes` param
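A minimal sketch of the second option (toy shapes and class count, not from the thread): starting from an integer-encoded mask, the stacked 256x256xNUM_CLASS binary target can be built in NumPy by one-hot encoding each pixel.

```python
import numpy as np

# Sketch of option 2 (assumed class count): turn one integer-encoded mask
# (256 x 256, values 0..NUM_CLASS-1) into stacked per-class binary masks
# (256 x 256 x NUM_CLASS, values 0/1) as the target for a single model.
NUM_CLASS = 5
mask = np.random.randint(0, NUM_CLASS, size=(256, 256))

# Indexing the identity matrix by class id one-hot encodes every pixel
stacked = np.eye(NUM_CLASS, dtype=np.float32)[mask]

assert stacked.shape == (256, 256, NUM_CLASS)
# each pixel has exactly one channel set to 1
assert np.array_equal(stacked.argmax(axis=-1), mask)
```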
> Otherwise, there was an option to +1 to num_classes and change the output shape to (n, w, h, num_classes). Then each class gets a full (w x h) binary mask of its own. (You are still using sigmoid and binary_crossentropy for this?)
You can use either sigmoid (if one pixel can belong to more than one class) or softmax (if one pixel can only belong to a single class), but for the loss function I'm still using `binary_crossentropy`, yes.
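To illustrate the difference with toy numbers (not from the thread): sigmoid squashes each class score independently, so the per-pixel probabilities need not sum to 1, while softmax makes the classes compete for a single label.

```python
import numpy as np

# Toy per-pixel class scores (assumed values, 3 classes)
logits = np.array([2.0, -1.0, 0.5])

sigmoid = 1.0 / (1.0 + np.exp(-logits))          # independent 0..1 per class
softmax = np.exp(logits) / np.exp(logits).sum()  # mutually exclusive classes

assert np.isclose(softmax.sum(), 1.0)  # softmax: exactly one class per pixel
assert sigmoid.sum() > 1.0             # sigmoid: overlapping classes allowed
```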
> Then there's what I think I'm interested in, where images are (n, 256, 256, 1) (i.e. gray-scale) and masks are also (n, 256, 256, 1), because the pixel values are just integers from 0 to 255. That's what I want as output too: I'll read the prediction mask pixel values to get the class numbers. I see Keras recommends `sparse_categorical_crossentropy` as the loss function for this use case, and then it apparently doesn't matter whether you use sigmoid or softmax.
I'm sorry, but I don't understand how this would work. Both sigmoid and softmax activation functions output values between 0 and 1, so I don't see how they would transform that output into what you're looking for (0..255). The closest thing to what you're trying to achieve would be linear regression, but I don't see how that could work either.
Again, from what I know, multi-class image segmentation comes down to the 2 methods I pointed out at the top of this answer. Could you read those again and let me know what holds you back from using them?
Happy to discuss this further if you need.
from keras-unet.
Hi @karolzak
Ok, I will let you know if I work out how to do it the way I'm describing.
Otherwise, I'll use one of the multiple binary-segmentation methods.
The multiple binary-segmentation methods should work fine for me, once I've turned my 5-class mask into 5 x 1-class masks. I'm just thinking ahead, in case I decide to add more classes later.
If `num_classes` increases, then with an integer-encoded single-layer output no changes would need to be made to the architecture or code, and the size of the network doesn't increase. With the layer-per-class methods, the architecture, code, and size of the network grow with every new class.
I imagine it's possible, as the occasional answer here and there seems to suggest that softmax and `sparse_categorical_crossentropy` could allow for integer encoding. (Perhaps the class ids are divided by 255 to get them between 0 and 1 for training, and then multiplied by 255 to get back to 0..255 for the final PNG output.)
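For what it's worth, a toy sketch of how `sparse_categorical_crossentropy` pairs integer targets with a per-class probability output (assumed shapes; no division by 255 is involved, since the target values are class ids rather than normalized pixels):

```python
import numpy as np

# Toy sketch (assumed shapes): the target is an integer-encoded mask,
# but the prediction still carries one probability per class, per pixel.
num_classes = 5
probs = np.full((2, 2, num_classes), 1.0 / num_classes)  # uniform predictions
target = np.array([[0, 1], [2, 4]])                      # integer class ids

# sparse categorical cross-entropy: -log(prob of the true class), per pixel
picked = np.take_along_axis(probs, target[..., None], axis=-1)[..., 0]
loss = -np.log(picked).mean()

# uniform predictions over 5 classes give a loss of log(5)
assert np.isclose(loss, np.log(num_classes))
```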
But anyway, I was just finding out whether you were familiar with integer-encoded multi-class segmentation. I'll give it a try, and will probably end up using one of your suggested methods in the end, when it doesn't work.
Thanks for your time
> If num_classes increases, then with an integer-encoded single-layer output, no changes would need to be made to the architecture or code, and the size of the network doesn't increase. With the layer-per-class methods, the architecture, code, and size of the network increase with every new class.
Well, I partially agree with this statement, although the change is tiny and it's only the output tensor size that changes, so it would never become a concern in terms of network size. On top of that, if you write your training logic well, there's no need for code changes when retraining: `num_classes` can easily be derived automatically from your masks.
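For instance, a hypothetical helper (assuming the masks use dense integer ids starting at 0):

```python
import numpy as np

# Hypothetical helper: derive num_classes from the training masks, so
# retraining with extra classes needs no code change.
def infer_num_classes(masks):
    return int(masks.max()) + 1  # assumes ids are dense and start at 0

masks = np.array([[[0, 1], [3, 2]]])  # toy batch of integer-encoded masks
assert infer_num_classes(masks) == 4
```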
In fact, in terms of trainable params it barely changes at all: compared to the overall network size, the difference is negligible.
> I imagine it's possible, as the occasional answer here and there seems to suggest that softmax and sparse_categorical_crossentropy could allow for integer encoding. (Perhaps the class ids are divided by 255 to get them between 0 and 1 for training, and then multiplied by 255 to get back to 0..255 for the final PNG output.)
I read through these suggestions and ran some experiments with `sparse_categorical_crossentropy`, but it doesn't change much, to be honest. Yes, you can pass a 256x256x1 tensor of integers as Y to calculate the loss function, but that does not change the fact that the output tensor of the network still needs to be of shape 256x256xNUM_CLASS (same as with `binary_crossentropy`), where NUM_CLASS == max_class_ID. If you use a mask with values like [0 1 2 3 256], your NUM_CLASS needs to be 256, so it would be best to encode 256 as 4 to avoid artificially blowing up the size of the output tensor.
So in fact the network size using `sparse_categorical_crossentropy` is the same as when using `binary_crossentropy`, because for both of these the output tensor of the network has the same size/shape. The only difference is that for sparse you need a target of shape 256x256x1, whereas for binary you need 256x256xNUM_CLASS.
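The remapping mentioned above (encoding a sparse id like 256 down to 4) can be sketched with `np.unique`; the array of unique values doubles as the lookup table for decoding predictions back to the original class values.

```python
import numpy as np

# Sketch: remap sparse class ids [0 1 2 3 256] to dense ids [0 1 2 3 4],
# so the output tensor needs only 5 channels instead of 256.
mask = np.array([[0, 1, 256],
                 [2, 3, 256]])

ids, inverse = np.unique(mask, return_inverse=True)
dense_mask = inverse.reshape(mask.shape)  # values 0..4

assert dense_mask.max() == 4
# ids maps dense predictions back to the original class values
assert np.array_equal(ids[dense_mask], mask)
```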
Good luck and do let me know how it went!
Hello,
Can you please help me understand how to address overlapping masks? I have 15 class masks as binary masks. I could use categorical cross-entropy loss by one-hot encoding the targets, but I manually removed the overlapping pixels from some of the classes, and the output is not very accurate. Can I make use of the overlapping binary masks for 15 classes with binary cross-entropy loss?
Thank you
Hi @soans1994,
How big of an overlap are we talking about here? I suspect your output not being very accurate might be caused by something other than just overlapping pixels. When it comes to image segmentation for multiple classes, what I found works best is training a separate binary classification model for each class.
Hi @karolzak,
Can you help me understand this line:
> If you use a mask with values like [0 1 2 3 256] your NUM_CLASS needs to be 256 so it would be best to encode 256 into 4 to avoid artificially blowing up the size of the output tensor.

Why does NUM_CLASS need to be 256 instead of 5?
@akashsindhu96 he meant I should just use 4 to represent the value 256. (The output layer would need to be 256x256x256 if I needed it to output values up to 256, but only needs to be 256x256x5 if I need it to output 0..4.)