Comments (14)
Is the kernelType different? Have you tried using kernelType=0 for both?
from somoclu.
The starting learning rate is also different.
Copied the wrong CLI command. Both runs were done with l = 1. (BTW, I was doing parameter optimization and found that, with the Gaussian neighborhood, the starting learning rate has little effect on quantization error, which I had seen previously.) The CLI run was on the GPU, so the kernel is 1. I had previously run the CPU version on the CLI with kernel = 0 and got results identical to the corresponding GPU run.
Actually, if it is compiled without CUDA support, the GPU kernel (=1) falls back to the CPU kernel without saying a word.
In any case, the problem is odd. To comply with CRAN, the random number generator of the R version is the one from <R.h>: (RAND_MAX * unif_rand()), which should be identical in effect to the rand() function in <cstdlib>. The generated integer random number is then transformed to the [0, 1] interval in both cases. I saw major discrepancies if and only if the data coordinates were not normalized to [0, 1]. So let's go through a couple of basic points:
- Is your data normalized?
- Did you set the environment variable OMP_NUM_THREADS? It should not have any impact on the actual result, but it is good to know how many cores you are using.
- How do you evaluate quantization error?
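For the first point, here is a minimal sketch of per-column min-max scaling to [0, 1], written in plain Python. The function name min_max_normalize and the sample data are made up for illustration; this is not part of Somoclu or its interfaces, just one common way to normalize before training.

```python
def min_max_normalize(rows):
    """Rescale every column of `rows` (a list of equal-length lists) to [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [max(col) - low for col, low in zip(columns, lows)]
    # A constant column has zero span; map it to 0.0 instead of dividing by zero.
    return [
        [(value - low) / span if span else 0.0
         for value, low, span in zip(row, lows, spans)]
        for row in rows
    ]


# Hypothetical data: three input vectors with two features on different scales.
data = [[2.0, 100.0], [4.0, 300.0], [6.0, 200.0]]
normalized = min_max_normalize(data)
print(normalized)
```

If an initial codebook is supplied, it would need the same scaling (same per-column minimum and span) as the data, otherwise the two no longer live in the same space.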
The CLI version was compiled with CUDA support.
Randomization also should not matter as I am using an initial codebook.
The data is not normalized and is identical in both cases.
I have not made any changes to OMP_NUM_THREADS.
I evaluate quantization error by averaging the Euclidean distance between each input vector and its BMU, using the rdist function from the fields package in R:
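library(fields)  # provides rdist()
weights <- res$codebook
inputs <- dataSource
# Euclidean distance from every input vector (rows) to every codebook vector (columns)
distMatrix <- rdist(inputs, weights)
# For each input vector, keep the distance to its BMU (the nearest codebook vector)
result <- sapply(seq_len(nrow(distMatrix)), function(i) {
  j <- which.min(distMatrix[i, ])
  distMatrix[i, j]
})
MinM <- mean(result)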
For me, the above commit gives a codebook that is more similar to the CLI version's than before. So the problem might be related to incorrect handling of column-major matrices when converting arrays between C and R. Please check whether this fixes the issue.
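To illustrate the kind of mismatch meant here (a generic sketch in plain Python, not the actual Somoclu code or the actual fix): R stores matrices column by column, while C code usually indexes a flat buffer row by row. Reading a column-major buffer with row-major indexing keeps the same numbers but scrambles which value belongs to which node and dimension. All function names below are hypothetical.

```python
def flatten_column_major(matrix):
    """Flatten a list-of-rows matrix the way R lays it out: column by column."""
    n_cols = len(matrix[0])
    return [row[c] for c in range(n_cols) for row in matrix]


def read_row_major(flat, n_rows, n_cols):
    """Interpret a flat buffer as row-major (the usual C convention)."""
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


def read_column_major(flat, n_rows, n_cols):
    """Interpret a flat buffer as column-major (the R convention)."""
    return [[flat[c * n_rows + r] for c in range(n_cols)] for r in range(n_rows)]


# Hypothetical 2 x 3 codebook: two nodes, three dimensions each.
codebook = [[1, 2, 3], [4, 5, 6]]
flat = flatten_column_major(codebook)   # the buffer as R would pass it
wrong = read_row_major(flat, 2, 3)      # misread: values end up in the wrong cells
right = read_column_major(flat, 2, 3)   # matches the original codebook
print(flat, wrong, right)
```

A scrambled initial codebook would still train, just from a different starting point, which would be consistent with results that differ from the CLI without any obvious error.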
I deleted my previous comment with the images. Although they are accurate, I decided I am not ready to have my work in a public forum yet. I hope that is OK.
I'll give that commit a try and let you know. Does that commit include the neighborhood function parameter?
Yes, that commit includes the neighborhood function parameter.
Tried #26. Gaussian gave a quantization error of 5.78, very much in line with the CLI. Bubble gave an improved quantization error of 5.34. With Matlab and the kohonen R package I was getting < 5. I may need to optimize parameters.
Thanks
The R interface is a bit of a mistreated foster child, as we are inexperienced with it. Thanks for pointing out this bug.
The fix is on CRAN now.
Thanks very much. I did some overdue cleanup and tagged version 1.5.1. The update is released on MLOSS and GitHub. Please update PyPI.
OK. Just uploaded the source to PyPI; will build the binaries later.
Related Issues (20)
- /home/docker/R/Rsomoclu/libs/Rsomoclu.so: undefined symbol: _ZTI8Snapshot
- Licensing and GPL HOT 6
- MATLAB interfece Batch algorithm HOT 4
- (core dumped) HOT 3
- Attempting to use an MPI routine before initializing MPI HOT 4
- single dimensional clustering
- upgradation problem HOT 1
- errors importing native C library (python 3.8.6 & swig 4.0.1) HOT 4
- update official repos (pypi and conda) HOT 3
- TypeError: train expected 23 arguments, got 22 HOT 2
- Can I set the random seed? HOT 4
- Get bmu of testing data HOT 2
- Numpy requirement in setup.py HOT 5
- conda-forge PackagesNotFoundError HOT 6
- How can I assign a cluster to new data? HOT 1
- Batch mode and learning rate HOT 3
- Can't build wheel with somoclu and pip 23.1 HOT 2
- Warning: the binary library cannot be imported. You cannot train maps, but you can load and analyze ones that you have already saved. If you installed Somoclu with pip on Windows, this typically means missing DLLs. Please refer to the documentation. HOT 4
- linux/arm64 for conda-forge
- About UMatrix visualization (question)