
Comments (14)

xgdgsc commented on July 18, 2024

Is the kernelType different? Have you tried using kernelType=0 for both?

from somoclu.

peterwittek commented on July 18, 2024

The starting learning rate is also different.

brogie62 commented on July 18, 2024

I copied the wrong CLI command. Both runs were done with l = 1. (BTW, I was doing parameter optimization and found that, with the Gaussian neighborhood, the starting learning rate has little effect on quantization error, which I had seen previously.) The CLI run was on the GPU, so the kernel is 1. I had previously run the CPU kernel on the CLI with kernel = 0 and got results identical to the corresponding GPU run.

peterwittek commented on July 18, 2024

Actually, if it is compiled without CUDA support, the GPU kernel (=1) falls back to the CPU kernel without saying a word.

In any case, the problem is odd. To comply with CRAN, the random number generator of the R version is the one from <R.h>:

(RAND_MAX * unif_rand())

which should be identical in effect to the rand() function in <cstdlib>. The generated integer random number is then transformed to the [0, 1] interval in both cases. I saw major discrepancies if and only if the data coordinates were not normalized to [0, 1]. So let's go through a couple of basic points:

  • Is your data normalized?
  • Did you set the environment variable OMP_NUM_THREADS? It should not have any impact on the actual result, but it is good to know how many cores you are using.
  • How do you evaluate quantization error?
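For the first point, a minimal sketch of per-column min-max scaling to [0, 1] is below. It is plain Python for illustration only; the function name `min_max_normalize` and the list-of-rows layout are assumptions, not part of somoclu or its R interface.

```python
def min_max_normalize(rows):
    """Rescale each column of a list-of-rows matrix to the [0, 1] interval."""
    if not rows:
        return []
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    normalized = []
    for row in rows:
        new_row = []
        for value, lo, hi in zip(row, lows, highs):
            span = hi - lo
            # A constant column has no spread; map it to 0.0 to avoid dividing by zero.
            new_row.append((value - lo) / span if span else 0.0)
        normalized.append(new_row)
    return normalized


# Hypothetical data: two features on very different scales.
data = [[2.0, 100.0], [4.0, 300.0], [6.0, 200.0]]
print(min_max_normalize(data))  # [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
```

If an initial codebook is supplied, it would presumably need the same scaling (using the minima and maxima of the data) so that both live in the same coordinate range.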

brogie62 commented on July 18, 2024

The CLI version was compiled with CUDA support.

Randomization also should not matter as I am using an initial codebook.

The data is not normalized and is identical in both cases.

I have not made any changes to OMP_NUM_THREADS.

I evaluate quant error by averaging the euclidean distance between each input vector and its BMU using the rdist function in the fields package in R:

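library(fields)  # provides rdist()

weights <- res$codebook
inputs <- dataSource
# Pairwise Euclidean distances: rows are input vectors, columns are codebook vectors
distMatrix <- rdist(inputs, weights)
# Distance from each input vector to its BMU (the nearest codebook vector)
result <- t(sapply(seq(nrow(distMatrix)), function(i) {
  j <- which.min(distMatrix[i, ])
  c(distMatrix[i, j])
}))

# Quantization error: mean BMU distance
MinM <- mean(result)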
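
For anyone who wants to cross-check the metric outside R, here is a rough equivalent in plain Python: the mean Euclidean distance from each input vector to its nearest codebook vector. The function name `quantization_error` and the list-of-vectors layout are assumptions made for this sketch; it uses only `math.dist` from the standard library (Python 3.8+).

```python
import math


def quantization_error(inputs, codebook):
    """Mean Euclidean distance from each input vector to its best matching unit."""
    total = 0.0
    for vector in inputs:
        # The BMU is the codebook vector closest to this input vector.
        total += min(math.dist(vector, node) for node in codebook)
    return total / len(inputs)


# Tiny hypothetical example: two inputs, two codebook vectors.
inputs = [[0.0, 0.0], [3.0, 4.0]]
codebook = [[0.0, 0.0], [0.0, 4.0]]
print(quantization_error(inputs, codebook))  # 1.5
```

This is a brute-force comparison of every input against every codebook vector, so it is only meant for checking small cases.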

xgdgsc commented on July 18, 2024

For me, the above commit gives a codebook that is more similar to the CLI version's than before. So the problem might be related to wrong handling of column-major matrices when converting arrays between C and R. Please check whether this fixes the issue.
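
To illustrate the kind of mismatch meant here (this is not the code from the commit, just a sketch with hypothetical helper names): R stores a matrix column by column, so a flat buffer coming from R is scrambled if it is indexed as row-major.

```python
def to_column_major(matrix):
    """Flatten a list-of-rows matrix column by column, the way R stores matrices."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    return [matrix[r][c] for c in range(n_cols) for r in range(n_rows)]


def read_row_major(flat, n_rows, n_cols):
    """Interpret a flat buffer as row-major (wrong for a buffer that came from R)."""
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


def read_column_major(flat, n_rows, n_cols):
    """Interpret a flat buffer as column-major (matches R's layout)."""
    return [[flat[c * n_rows + r] for c in range(n_cols)] for r in range(n_rows)]


codebook = [[1, 2, 3], [4, 5, 6]]        # 2 nodes, 3 dimensions
flat = to_column_major(codebook)         # [1, 4, 2, 5, 3, 6]
print(read_row_major(flat, 2, 3))        # [[1, 4, 2], [5, 3, 6]]  scrambled
print(read_column_major(flat, 2, 3))     # [[1, 2, 3], [4, 5, 6]]  original
```

A layout mix-up like this would leave the values intact but assign them to the wrong node and dimension, which would be consistent with a codebook that trains but gives a worse quantization error.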

brogie62 commented on July 18, 2024

I deleted my previous comment with the images. Although they are accurate, I decided I am not ready to have my work in a public forum yet. I hope that is OK.

brogie62 commented on July 18, 2024

I'll give that commit a try and let you know. Does that commit include the neighborhood function parameter?

xgdgsc commented on July 18, 2024

Yes, that commit includes the neighborhood function parameter.

brogie62 commented on July 18, 2024

Tried #26. Gaussian gave a quantization error of 5.78, very much in line with the CLI. Bubble gave an improved quantization error of 5.34. With Matlab and the kohonen R package I was getting < 5. I may need to optimize parameters.

Thanks

peterwittek commented on July 18, 2024

The R interface is a bit of a mistreated foster child, as we are inexperienced with it. Thanks for pointing out this bug.

xgdgsc commented on July 18, 2024

The fix is on CRAN now.

peterwittek commented on July 18, 2024

Thanks very much. I did some overdue cleanup and tagged version 1.5.1. The update is released on MLOSS and GitHub. Please update PyPI.

xgdgsc commented on July 18, 2024

OK. Just uploaded source to PyPI, will build the binaries later.
