I have a question about the interpretability of the quantization error.

quantization error (theoretical question) about minisom HOT 13 CLOSED

justglowing commented on May 17, 2024

quantization error (theoretical question)

from minisom.

Comments (13)

lachhebo commented on May 17, 2024

I can, but I'm more interested in the internal validity of my clusters.

My plan is to use the clustering produced by the SOM to assess the number of clusters, and possibly to use this unsupervised clustering in a supervised model.

JustGlowing commented on May 17, 2024

Of course, thanks for using Minisom. Leave a star if you like it!

JustGlowing commented on May 17, 2024

Hi @lachhebo, the quantization error simply tells you how much information you lose when you quantize your data with the SOM. To give you an idea: if the quantization error is 0, every sample coincides exactly with the weights of the node it maps to. To know whether the SOM is reliable, you have to test it on your specific application.
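
To make the definition concrete: MiniSom documents the quantization error as the average distance between each input sample and its best matching unit. The sketch below reimplements that idea in plain Python, with the node weights flattened into a list. The function name and the toy data are illustrative assumptions, not MiniSom's own code.

```python
import math

def quantization_error(data, weights):
    """Average Euclidean distance from each sample to its nearest weight vector."""
    total = 0.0
    for x in data:
        # The best matching unit (BMU) is the node whose weights are closest to x.
        total += min(math.dist(x, w) for w in weights)
    return total / len(data)

# Toy data (hypothetical): two pairs of points, far apart from each other.
data = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# One node per sample, with weights equal to the samples: nothing is lost.
exact = quantization_error(data, data)

# Two nodes, each placed halfway between a pair of samples.
coarse = quantization_error(data, [(0.0, 0.5), (4.0, 0.5)])

print(exact, coarse)
```

With a trained MiniSom instance, the corresponding call is `som.quantization_error(data)`.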

lachhebo commented on May 17, 2024

In my case, I'm trying to assess the number of clusters in a dataset.

What I'm thinking of doing is splitting my dataset in two: train and test.
Then I would train my SOM on the training set, optimising the quantization error.
Finally, I would compare the distance map of my SOM with the activation frequencies of the test set.

Do you think this is the way to get a SOM that is as reliable as possible?
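
A minimal sketch of the held-out check described above, assuming the trained SOM is reduced to a flat list of node weight vectors. The weights, the two splits, and the helper names are hypothetical values chosen for illustration; no training happens here.

```python
import math
from collections import Counter

def bmu_index(x, weights):
    """Index of the node whose weight vector is closest to sample x."""
    return min(range(len(weights)), key=lambda i: math.dist(x, weights[i]))

def quantization_error(data, weights):
    """Average distance from each sample to its best matching unit."""
    return sum(math.dist(x, weights[bmu_index(x, weights)]) for x in data) / len(data)

def activation_frequencies(data, weights):
    """Number of samples that land on each node."""
    hits = Counter(bmu_index(x, weights) for x in data)
    return [hits.get(i, 0) for i in range(len(weights))]

# Hypothetical node weights, standing in for a SOM already trained on `train`.
weights = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
train = [(0.1, 0.0), (0.9, 0.1), (5.0, 4.9), (0.0, 0.1)]
test = [(0.2, 0.1), (4.8, 5.1), (5.1, 5.0)]

train_qe = quantization_error(train, weights)
test_qe = quantization_error(test, weights)
test_hits = activation_frequencies(test, weights)

# A test error much larger than the train error would suggest the map
# does not generalise to unseen samples.
print(train_qe, test_qe, test_hits)
```

With MiniSom itself, the equivalent quantities come from `som.quantization_error(test)`, `som.activation_response(test)` and `som.distance_map()`.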

JustGlowing commented on May 17, 2024

Is your data labeled?

lachhebo commented on May 17, 2024

Yes, it is

JustGlowing commented on May 17, 2024

Then you can compare the clusters you obtain with your labels.

JustGlowing commented on May 17, 2024

Then you can use a cluster quality measure. There are many; this is one example: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.silhouette_score.html

lachhebo commented on May 17, 2024

IMHO, directly applying the silhouette score to the clustering produced by the SOM is not pertinent: many nodes are next to each other, so the silhouette score will be low. The correct number of clusters is probably lower than the number of nodes.

JustGlowing commented on May 17, 2024

It depends on how you derive your clusters. I usually recommend using small maps and assuming that each position in the map gives you a cluster. For example, a 2-by-2 map will give you 4 clusters. This way the silhouette score is suitable.
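
As a rough sketch of the "one node, one cluster" approach: the code below labels each sample with the index of its closest node on a flattened 2-by-2 map and then computes the mean silhouette coefficient by hand. The node weights and samples are hypothetical, and the `silhouette` helper is a plain-Python stand-in for the scikit-learn function, written from the standard definition.

```python
import math

def assign_clusters(data, weights):
    """Label each sample with the index of its best matching unit."""
    return [min(range(len(weights)), key=lambda i: math.dist(x, weights[i]))
            for x in data]

def silhouette(data, labels):
    """Mean silhouette coefficient; needs at least two clusters."""
    clusters = {}
    for x, lab in zip(data, labels):
        clusters.setdefault(lab, []).append(x)
    scores = []
    for x, lab in zip(data, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from x to itself is 0, so it does not affect the sum)
        a = sum(math.dist(x, y) for y in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(sum(math.dist(x, y) for y in pts) / len(pts)
                for other, pts in clusters.items() if other != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical weights of a 2-by-2 map, flattened to 4 nodes.
weights = [(0.0, 0.0), (0.0, 10.0), (10.0, 0.0), (10.0, 10.0)]
data = [(0.0, 0.0), (1.0, 0.0), (0.0, 10.0), (1.0, 10.0),
        (10.0, 0.0), (9.0, 0.0), (10.0, 10.0), (9.0, 10.0)]

labels = assign_clusters(data, weights)
score = silhouette(data, labels)
print(labels, score)
```

With MiniSom, the labels can be derived from `som.winner(x)` for each sample, and `sklearn.metrics.silhouette_score(data, labels)` can replace the hand-written helper.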

lachhebo commented on May 17, 2024

It will work, but I will get a higher quantization error, and a simpler algorithm like affinity propagation will probably do as well in this case.

I think it's better to use a bigger map with a lower quantization error and then try to interpret the distance map and see whether it is reliable.
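
To illustrate what interpreting the distance map means here, the sketch below computes a simplified distance map: for each node, the mean distance between its weights and those of its 4-connected neighbours. This is an assumption-laden stand-in, not MiniSom's `distance_map()` implementation, and the 2-by-4 grid of weights is hypothetical. High values mark a border between groups of similar nodes, which is how a larger map can hint at the number of clusters.

```python
import math

def distance_map(grid):
    """Mean distance from each node's weights to its 4-connected neighbours."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            dists = [math.dist(grid[r][c], grid[nr][nc])
                     for nr, nc in neighbours
                     if 0 <= nr < rows and 0 <= nc < cols]
            out[r][c] = sum(dists) / len(dists)
    return out

# Hypothetical 2-by-4 map: the two left columns sit near (0, 0),
# the two right columns sit near (10, 10).
grid = [
    [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0), (11.0, 10.0)],
    [(0.0, 1.0), (1.0, 1.0), (10.0, 11.0), (11.0, 11.0)],
]

dm = distance_map(grid)
for row in dm:
    print([round(v, 2) for v in row])
```

In this toy grid the two middle columns get the largest values, so the map separates into two regions.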

lachhebo commented on May 17, 2024

Thanks for your time and your work. It is a great package and I have already starred it!

JustGlowing commented on May 17, 2024

Anyway, to go back to your initial question: you need to tune the SOM to reach the quantization error that you want. More clusters means a lower quantization error. The best solution depends only on how many clusters there are in your data.
