Comments (10)

peterwittek commented on July 18, 2024

Could you tell me a bit more? I need the size of the map and the kernel you are using.

from somoclu.

ajallooeian commented on July 18, 2024

I tried 256x256 and 128x128; both lead to a segmentation fault.
I use the sparse kernel (-k 2). This is the full command:
nice -n 11 somoclu -k 2 -m toroid -r 1.0 -x 128 -y 128 somefile somefile.som

ajallooeian commented on July 18, 2024

One more thing: I think my sparse file (libsvm format) uses 1-based indexing and never contains index zero. Would that cause a problem?

peterwittek commented on July 18, 2024

You need a fair bit of memory to run this. Assuming that the file is correctly parsed (0- or 1-based indexing makes no difference), you will need at least:

  • 4 * 128 * 128 * 173326/(1024 * 1024) MByte or about 10.8 GByte for calculating the distances to the nodes.
  • 4 * 128 * 128 * 374998/(1024 * 1024) MByte or about 23 GByte for the codebook. The codebook is always dense, even if your data is sparse, because it is unlikely that any of its elements would be exactly zero.

We normally run such workloads on a cluster. Then the codebook memory requirement is split evenly, but the matrix of the distance calculation is not.

Try it on a much smaller map first, say, 50x50. If that works, then the segmentation fault on the larger maps is most likely caused by insufficient memory.
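
The arithmetic in the two bullets can be reproduced with a short Python sketch. It assumes 4-byte floats, and it reads 173326 as the number of data vectors and 374998 as the number of dimensions, which the thread does not state explicitly. The function name is made up for illustration and is not part of Somoclu. Note that a later comment in this thread says the distance array is not needed for the CPU kernels.

```python
def dense_array_mbyte(map_x, map_y, n_columns, bytes_per_value=4):
    """Size in MByte of a dense array with map_x * map_y rows and n_columns columns."""
    return bytes_per_value * map_x * map_y * n_columns / (1024 * 1024)


# Assumed meaning of the constants in the estimate (not confirmed in the thread).
N_VECTORS = 173326
N_DIMENSIONS = 374998

for side in (128, 50):
    distances = dense_array_mbyte(side, side, N_VECTORS)
    codebook = dense_array_mbyte(side, side, N_DIMENSIONS)
    print(f"{side}x{side}: distances {distances:.0f} MByte, codebook {codebook:.0f} MByte")
```

Running the same function for a 50x50 map shows how much smaller the codebook becomes, which is why the smaller map is a useful first test.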

peterwittek commented on July 18, 2024

I take that back. In the sparse and dense CPU kernels, you do not need the first array, the one with the distances. The only constraint is the codebook. So you can split total memory use by running the sparse kernel on a cluster.

ajallooeian commented on July 18, 2024

Thanks for the response. I am running it on a 32-core Linux server with 32 GB of memory, in multicore mode. Do you mean anything else by a cluster?
I will also run it on a smaller map and report back here.

peterwittek commented on July 18, 2024

If you compile Somoclu with MPI, you will get more verbose output, which could be useful for debugging. You could also paste the first two or three lines of your data file here to double-check that parsing works correctly.

ajallooeian commented on July 18, 2024

The first five lines look like this:
220126:0.707107 245393:0.707107
220126:0.707107 245393:0.707107
1428:0.169031 34812:0.169031 58140:0.338062 64671:0.169031 75771:0.169031 89153:0.169031 120368:0.169031 162833:0.169031 167461:0.169031 212353:0.169031 215390:0.169031 216221:0.169031 225341:0.169031 229670:0.169031 250778:0.169031 289375:0.169031 293240:0.169031 295938:0.169031 301211:0.169031 305249:0.169031 321932:0.169031 328294:0.169031 334946:0.169031 341730:0.169031 343771:0.338062 356616:0.338062
36118:0.408248 209819:0.408248 284562:0.816497
104515:0.223607 196090:0.223607 197477:0.223607 197687:0.223607 198165:0.223607 219615:0.447214 224369:0.223607 229163:0.223607 231257:0.447214 243357:0.223607 255648:0.447214

and the last 5:

68905:0.229416 75482:0.229416 137863:0.458831 169314:0.229416 173709:0.229416 183970:0.458831 240233:0.229416 245747:0.229416 261398:0.229416 262862:0.229416 284689:0.229416 306297:0.229416 309908:0.229416
58280:0.408248 112103:0.408248 121273:0.408248 185691:0.408248 217166:0.408248 264256:0.408248
142786:0.707107 337011:0.707107
10678:0.301511 36839:0.301511 46730:0.301511 63762:0.301511 104113:0.301511 106364:0.301511 127196:0.301511 137058:0.301511 298432:0.301511 306478:0.301511 319849:0.301511
95851:0.5 159101:0.5 183673:0.5 199851:0.5
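
One way to sanity-check the indexing question on a file like this is to scan for the smallest and largest feature index. The sketch below is plain Python, not Somoclu's parser, and the function name is hypothetical. It assumes every whitespace-separated token is an `index:value` pair with no leading label, as in the lines pasted above.

```python
def index_range(lines):
    """Return (smallest, largest) feature index found in 'index:value' sparse lines."""
    lo, hi = None, None
    for line in lines:
        for token in line.split():
            # Everything before the first colon is the feature index.
            index = int(token.split(":", 1)[0])
            lo = index if lo is None else min(lo, index)
            hi = index if hi is None else max(hi, index)
    return lo, hi


# A few of the shorter lines from the sample above.
sample = [
    "220126:0.707107 245393:0.707107",
    "36118:0.408248 209819:0.408248 284562:0.816497",
    "142786:0.707107 337011:0.707107",
    "95851:0.5 159101:0.5 183673:0.5 199851:0.5",
]
lo, hi = index_range(sample)
print("smallest index:", lo, "largest index:", hi)
```

The largest index across all ten pasted lines is 356616, and the run reported later in this thread prints nDimensions: 356617. If that run used these ten lines, the dimension count would be the largest index plus one, though the thread does not confirm this.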

peterwittek commented on July 18, 2024

This works just fine:

$ somoclu -k 2 -m toroid -r 1.0 -x 50 -y 50 data/test.svm data/test
nVectors: 10 nVectorsPerRank: 10 nDimensions: 356617 
Epoch Time: 21.7777
     10% [======                                            ]
Epoch Time: 22.9673
     20% [===========                                       ]
Epoch Time: 22.3178
     30% [================                                  ]
Epoch Time: 22.9555
     40% [=====================                             ]
Epoch Time: 21.6739
     50% [==========================                        ]
Epoch Time: 23.0707
     60% [===============================                   ]
Epoch Time: 22.6494
     70% [====================================              ]
Epoch Time: 23.1652
     80% [=========================================         ]
Epoch Time: 22.1516
     90% [==============================================    ]
Epoch Time: 21.6896
    100% [===================================================]
Total training Time: 224.426
    Done!
    Saving best matching units data/test.bm
    Saving Codebook data/test.wts

peterwittek commented on July 18, 2024

I noticed that after finishing an epoch, the memory use jumps by up to about 50%, then goes back to the original level. With the 50x50 map and 356617 dimensions, it went from 6.6G to 9.2G, then back to 6.6G. This is due to the temporary structures that are used to update the codebook. With a larger map, it is possible that you are exceeding the available memory.
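
As a rough illustration of why a larger map may not fit, the observed figures can be extrapolated to 128x128. This sketch assumes memory use grows linearly with the number of map nodes and ignores everything else (data size, dimension count, fixed overhead). That is a simplification for illustration, not a measured property of Somoclu.

```python
def scale_by_nodes(observed_gbyte, observed_side, target_side):
    """Scale an observed memory figure linearly with the node count (assumed model)."""
    return observed_gbyte * (target_side ** 2) / (observed_side ** 2)


# Figures observed on the 50x50 map with 356617 dimensions.
steady = scale_by_nodes(6.6, 50, 128)
peak = scale_by_nodes(9.2, 50, 128)
print(f"128x128 estimate: steady {steady:.1f}G, peak {peak:.1f}G")
```

Under that assumption, both estimates are well above the 32 GB available on the server mentioned earlier in the thread.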
