Comments (46)
This activation matrix is calculated explicitly only for the GPU kernel and it is not directly accessible to the user. The solution you are suggestion is flawless, perhaps it is worth merging back to the Python code, although it is bound to be sluggish for large maps and very high dimensional spaces (unless your numpy was compiled with a parallel BLAS/LAPACK).
I don't necessary see what you mean by using D as the input to another SOM. This D is basically nothing but a special Gram matrix.
from somoclu.
I added a new method view_activation_map
to study what could possible go wrong (commit aa4d017). As yet, I have no idea.
from somoclu.
I keep forgetting that the codebook entries are initialized randomly between zero and one. This is fine as long as your data is also such. I played with random data initialized somewhere in [-1,1]^2, so the map itself was garbage, the activation map should have been my least problem. So once I work with data in [0,1]^n, the map is fine.
The next issue is that the dot product and the Euclidean distance should generate the same topology. Actually, this is only true if the vectors are normalized (both codebook and data vectors), as I myself pointed out in issue #30 and then duly forgot.
If you factor in these two observations, I think the results of the activation map should be fine. Without normalization, you still have many identical values for the argmax, but otherwise this bit of code generates nice maps and matching activation maps (only one point is labelled to avoid mess):
import somoclu
import numpy as np
c1 = np.random.rand(50, 2)/5
c2 = (0.2, 0.5) + np.random.rand(50, 2)/5
c3 = (0.4, 0.1) + np.random.rand(50, 2)/5
data = np.float32(np.concatenate((c1, c2, c3)))
colors = ["red"] * 50
colors.extend(["green"] * 50)
colors.extend(["blue"] * 50)
n_rows, n_columns = 30, 50
som = somoclu.Somoclu(n_columns, n_rows, data=data, maptype="toroid",
gridtype="rectangular")
som.train(epochs=10)
labels = ["" for _ in range(150)]
labels[0] = 0
som.view_umatrix(bestmatches=True, bestmatchcolors=colors, labels=labels)
map_state = som.get_surface_state()
som.view_activation_map(map_state, 0, labels=labels)
from somoclu.
To be on the safe side, I added the CITATION file. Thanks for the hint!
from somoclu.
Oh, and with respect to my results, the github repo I have written will be made available as will a pre-print and an OSF repo.
from somoclu.
JSS looked like a nice open access journal... Awesome about the repo! I wish everybody else was doing it.
from somoclu.
Many thanks. I made some cosmetic changes and added a bit of documentation. I believe we are following the best practices of open source software development ;-)
from somoclu.
I have to admit I have never heard of a Gramian matrix before, so I just looked it up on Wikipedia. I obviously don't fully understand it yet, however maybe reposing my question might help you understand my idea.
So I was wondering if I could create a 2nd SOM, which took the D
(the special Gram matrix - in what way is it special by the way?) as input in order to self-organise itself. Is it a meaningful thing to attempt? This would be a hierarchical SOM, similar to, e.g., Hierarchical SOM-Based Detection of Novel Behavior for 3D Human Tracking, by German Ignacio Parisi and Stefan Wermter.
Am I making any sense? 😄
from somoclu.
PS: If you want to merge it I can set up a pull request, which .py
shall I put the function in and what name would you like it to have? 😉
from somoclu.
I did not know about this hierarchical variant, but I think you need something different than D. So you have some n/10 nodes, where n is the number of data instances, and you initialize the map with randomly selected vectors of your data instances. Then, if I got it right, they identify the non-outliers, and either the BMU's weight vector, or a non-outlier data instance is used to construct the feature vectors described in Eqs. (2-4) -- I am not sure which. In any case, D does not seem to play a role.
The helper functions go into train.py, preferably as part of the Somoclu object. Many thanks!
from somoclu.
OK, sorry that might have confused things more.
I am not trying to build on or replicate their work. I just wanted to use the activation on the map (as the function I will add to somoclu now does). Is your intuition that that will not work?
from somoclu.
Let me try to understand it. You get an #rows x #columns matrix as the activation map. Then what would be the training data for the next SOM?
from somoclu.
It would be trained on the activation for each input on the surface of the map below it.
from somoclu.
PR for the above function: #44 (hope that is consistent with the way you have been doing things as much as possible) 😀
from somoclu.
Thanks, I merged it.
from somoclu.
One last silly question: When does np.argmax(map_state, axis=1)
(where map_state
is the output of get_surface_state
) become equal to the BMUs? Sometimes never?
from somoclu.
I am looking into now. They should always match. An obvious difference is that your get_surface_state
calculates a dot product, and not a distance, so that argmax looks suspicious.
from somoclu.
Oooh, how else do you locate the most active unit? I must have something conceptually wrong in my head.
from somoclu.
Okay, the BMUs are fine. If I calculate the activation map with the Euclidean distance, the argmin coincides with the BMU. I forgot why, but in the BMUs, the second index is the row.
from somoclu.
That makes absolute sense. Thanks.
from somoclu.
I still don't get it, though. The argmax on the dot product often gives the same value, even when the BMUs are different. I have to leave it here now, but I'll try to get back to this
from somoclu.
Yes, that bit doesn't make sense to me either. Have you managed to get that on a map on your end too?
from somoclu.
Thanks so much. 😄
from somoclu.
Honestly, I quite like this activation map. I just made the plotting a bit smarter, so you can pass an arbitrary vector and see the activation (commit b718f01). For instance, you can do this:
new_vector = np.array([0.2, 0.3])
som.view_activation_map(data_vector=new_vector)
If you have time, it might be worth changing the calculation of the activation to the Euclidean distance.
from somoclu.
Yes, that's great. I would really like to explore this more, I am hitting up against a deadline (tomorrow and next week) but will when I have time.
from somoclu.
Done! See: #46
This also fixes everything with respect to my initial question!
from somoclu.
Many thanks. I pushed commit debac27 just now to achieve the same thing for new vectors when viewing the activation map. Well, I still did not catch your original question, but if it fixes things for you, maybe we can close this.
from somoclu.
Yeah sure - the reason I/we (on my project) came up with the idea for the surface map will become clear hopefully when I can more freely discuss the model I am working on. But I do agree with you that it is nonetheless useful more generally too.
When the time comes how would you like me to cite you/Somoclu? Tangentially, have you heard of CITATION files?
from somoclu.
Thanks for asking, this is a bit of a sad story. The paper on Somoclu was accepted for publication ages ago by the Journal of Statistical Software, after a two-year review process. As of today, the editor still refuses to tell anything about the publication date. This is one reason why there is no CITATION file, the other being my lack of awareness.
The manuscript is on arXiv, so as for now, I would like to ask you to cite that version, as referenced in the Readme:
Peter Wittek, Shi Chao Gao, Ik Soo Lim, Li Zhao (2015). Somoclu: An Efficient Parallel Library for Self-Organizing Maps. arXiv:1305.1422.
Then Depsy theoretically scans manuscripts for links to PyPI or CRAN packages, so a link to Somoclu's PyPI page would also be welcome. Unfortunately, Depsy is not doing a great job, at least not yet.
You might want to consider making your result available as a notebook or as a repository. I am ardent supporter of open science, and I believe that making the results of a paper easily reproducible is hugely important. Well, anyhow, no pressure, and thanks again for asking.
from somoclu.
That's so sad, can't you un-submit it and send it to a more welcoming journal - perhaps?
from somoclu.
I have been using seaborn a lot. And I wonder if this is in any way as useful to you as it is to me. I have been visualising the surface states using (dis)similalrity matrices:
import seaborn as sns
import matplotlib.pylab as plt
def create_similarity_matrix(X, labels, save_as):
# Calculate the pairwise correlations as a metric for similarity
corrmat = 1-pairwise_distances(X, metric="correlation")
# Set up the matplotlib figure
f, ax = plt.subplots(figsize=(12,9))
# Y axis has inverted labels (seaborn default, no idea why)
yticklabels = np.atleast_2d(labels)
yticklabels = np.fliplr(yticklabels)[0]
# Draw the heatmap using seaborn
sns.heatmap(corrmat, vmax=1, vmin=-1, square=True,
xticklabels = labels, yticklabels = labels, cmap = "RdBu_r", center = 0)
f.tight_layout()
# This sets the ticks to a readable angle
plt.yticks(rotation=0)
plt.xticks(rotation=90)
# This sets the labels for the two axes
ax.set_yticklabels(yticklabels, ha = 'right', va='center', size= 8)
ax.set_xticklabels(labels, ha = 'center', va = 'top', size= 8)
# Save and close the figure
plt.savefig(save_as+'_heatmap.png', bbox_inches='tight')
plt.close()
create_similarity_matrix(som.get_surface_state(), labels, 'state')
from somoclu.
I love seaborn -- just importing it does magic to figures. I wish it was merged with matplotlib.
I am trying to grasp what this heatmap implies. This would be another way of looking at clustering structures between data points. Is this correct? We struggle a lot with evaluating cluster consistency and shifts between the clusters over evolving maps, so it is indeed useful, thanks. Perhaps you can add it to train.py
as a helper function -- not as a method of the Somoclu class -- with minor changes to allow both saving to a file and inline visualization. No pressure, though, only if you have a bit of time.
Risking of being overly pedantic, allow me one tiny comment: scoping in Python is very simplistic. It deletes variables in exactly one specific case: when it returns from a function. So the del
at the end of the function definition is unnecessary, at least in principle.
from somoclu.
Yeah, sorry about the del
was copy pasting in a hurry from a non-function/script to make this into a function for you.
I will have to ask you to look at a paper using (dis)similarity matrices. It's a very Googleable term. You could look at my recent preprint which explains them here. Hopefully that will help. Then please feel free to ask me questions if it's still unclear how that could be applied to SOMs, but it should be easier to understand. Perhaps running it on the example you have in the docs for somoclu might also help clarify the usefulness or indeed lack thereof. I have no idea if you will find it as useful as I do!
Tangent, but I actually presented SOMs connected in a hierarchy last week at a conference and I used these matrices to show their representational spaces.
from somoclu.
PS: Clusters in the heatmap would be detected by calculating the pairwise correlations between each pattern's activation on the surface of the map.
from somoclu.
Wow, and I thought I knew SOMs well. Thanks for the paper. I've seen some work on hierarchical SOMs, but I never made the connection to deep learning. Normally I think of SOMs as the shallowest possible learners, but this hierarchy changes the game. This heatmap is handy indeed.
from somoclu.
No, sorry, I'm confusing you because I was being brief.
So that preprint uses DSMs, but not SOMs. It uses convolutional artificial neural networks.
My hierarchical SOMs are not written up yet. Hopefully at some point.
Even though they are a little similar they are not overlapping other than by my presence. Completely different PIs. Two different projects.
from somoclu.
I think I got it -- the question is more about the representation. It is still very interesting, though.
from somoclu.
I really hope it's useful and clear. Otherwise please get back to me. I don't want you spending too much time on it. And thanks for nice comments.
from somoclu.
Also I'd just like to say it's been really nice interacting with you as an open source developer who's so positive and welcoming to new ideas and contributions. Definitely get back to me if you want me to incorporate this idea into somoclu.
from somoclu.
PS: Tidied up the code above even more. 😄
from somoclu.
It looks great. Please add it to train.py
and send a pull request when you have a moment. I want to make sure you get the credit for it. Then I'll do some minor changes to integrate it nicely with the rest of the module.
from somoclu.
My PR is very rough, apologies: #47
from somoclu.
I did become interested in the activation patterns, I was especially interested if I would be able to find any activation pattern that has two distinct 'hotzones', this doesn't appear to be possible.
Here are a few collages of 100 activation patterns each from a 44x44 codebook I trained with a toroidal hexagonal lattice:
Brighter=greater distance between sample and that point in the codemap.
After some consideration I decided that isn't informative enough, here is the inverse (1/distance):
The two are not related, I used random sampling.
Code:
def collage(arr):
def unblockshaped(arr, h, w):
n, nrows, ncols = arr.shape
return (arr.reshape(h//nrows, -1, nrows, ncols).swapaxes(1,2).reshape(h, w))
sqrt_lastdim = math.sqrt(arr.shape[0])
sqrt_ceil = math.ceil(sqrt_lastdim)
if not sqrt_lastdim.is_integer():
diff = sqrt_ceil ** 2 - arr.shape[0]
arr = np.concatenate((arr, np.zeros((diff, arr.shape[1],arr.shape[2]))), axis=0)
grid = unblockshaped(arr, arr.shape[1]*sqrt_ceil, arr.shape[2]*sqrt_ceil)
return grid
X = data
W = codebook.reshape((codebook.shape[0] * codebook.shape[1], codebook.shape[2]))
D = -2*np.dot(W, X.T) + (W**2).sum(1)[:, None] + (X**2).sum(1)[:, None].T
D = D.swapaxes(0,1).reshape(-1,codebook.shape[0],codebook.shape[1])
grid = collage(codebook.swapaxes(0,2).swapaxes(1,2))
plt.imshow(grid)
plt.show()
for i in range(10):
idx = np.random.choice(D.shape[0], 100)
grid = collage(D[idx])
plt.close()
plt.gcf().set_size_inches(4, 4)
plt.gca().axes.get_xaxis().set_visible(False)
plt.gca().axes.get_yaxis().set_visible(False)
plt.gca().set_aspect('equal', adjustable='box')
plt.imshow(grid)
plt.savefig(str(i)+'.png', bbox_inches='tight', pad_inches=0, dpi=256)
I fairly sure I confused the heck out of the x,y axis somewhere, so any non square lattice may not work.
from somoclu.
This is interesting. I am not all that surprised that you have hot zones all over the place, since a plain, random-initialized SOM can only be expected to preserve the local topology.
Perhaps you can get fewer and more localized hot spots. Initialize the map with PCA, then train it for one epoch with a low radius and a low learning rate. This map will not look great, since in one epoch, the training phase does not have a chance to pull the points apart that the PCA collapses to the same spot. The only purpose of this is to see if only one region lights up in the activation map. If this is the case, from here it is a matter of tuning the training parameters to get what you want.
from somoclu.
I actually feel somewhat stupid, all I had to do was make the values negative to make the images look more intuitive, now brighter is actually better without having to use inverse:
The second album of the previous post does however show the absolute centers of influence though. So I guess inverse has it's uses nevertheless.
I'm actually not sure how to interpret these results, what significance does the activation pattern hold? Because it's of higher dimensionality than the data was to begin with - it is however very constrained, much like the pixels of images. Quite interesting, still have to think about it and how it may be of use. It might have potential when you need to project data into a higher dimensional space, although I'm not sure of how much use it would be because it adopts the same problems that images have - being spatial.
It certainly ought to be far more useful than a BMU, considering that the BMU is a tiny tiny fraction of the whole, lots of information gets discarded. I'm pretty sure the activation map each sample makes is significant when comparing samples.
from somoclu.
The activation pattern allows us to calculate the similarity matrix. This is my only use really. The similarity matrix depicts a representational space that could not be captured by the BMUs since they are by definition orthogonal.
from somoclu.
Related Issues (20)
- /home/docker/R/Rsomoclu/libs/Rsomoclu.so: undefined symbol: _ZTI8Snapshot
- Licensing and GPL HOT 6
- MATLAB interfece Batch algorithm HOT 4
- (core dumped) HOT 3
- Attempting to use an MPI routine before initializing MPI HOT 4
- single dimensional clustering
- upgradation problem HOT 1
- errors importing native C library (python 3.8.6 & swig 4.0.1) HOT 4
- update official repos (pypi and conda) HOT 3
- TypeError: train expected 23 arguments, got 22 HOT 2
- Can I set the random seed? HOT 4
- Get bmu of testing data HOT 2
- Numpy requirement in setup.py HOT 5
- conda-forge PackagesNotFoundError HOT 6
- How can I assign a cluster to new data? HOT 1
- Batch mode and learning rate HOT 3
- Can't build wheel with somoclu and pip 23.1 HOT 2
- Warning: the binary library cannot be imported. You cannot train maps, but you can load and analyze ones that you have already saved. If you installed Somoclu with pip on Windows, this typically means missing DLLs. Please refer to the documentation. HOT 4
- linux/arm64 for conda-forge
- About UMatrix visualization (question)
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from somoclu.