multicube's People

Contributors

jpinedaf, keflavich, vlas-sokolov

multicube's Issues

speed-up the `best_guess` method!

The `best_guess` method is the bottleneck of the best-guess estimation. To process moderate dimensionalities efficiently (read: realistic data cubes with several line-of-sight (LoS) components), the method needs to be refactored!

Problem summary: at every pixel, we need to find the synthetic spectrum with the smallest residuals. Why is this challenging? Assuming 1000 channels and 8-byte floats, a 4x4x4 parameter grid for every LoS component means that just storing the grid of spectral models for three components takes `(np.prod([4, 4, 4]*3) * 1000 * 8) / (2**30)` ~ 2 GB of RAM!

Current approach: try to broadcast as large an array as possible, then look for the best residuals on a single CPU. In a world with infinite RAM, `best_guess` would grab a slice of the memory pie the size of (Nmodels, Nchan, X, Y) - for a modest map size of 300x200 and the example above, that's more than 100 TB of memory.
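The two memory figures quoted above can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch; the variable names are illustrative and not taken from the multicube code.

```python
import numpy as np

# Numbers from the example above; variable names are illustrative only.
n_chan = 1000                    # spectral channels
bytes_per_float = 8              # float64
grid_per_component = [4, 4, 4]   # 4x4x4 parameter grid per LoS component
n_components = 3
nx, ny = 300, 200                # map size in pixels

# List repetition: [4, 4, 4] * 3 has nine entries, so the product is 4**9.
n_models = int(np.prod(grid_per_component * n_components))

# Memory for the grid of model spectra alone: (Nmodels, Nchan).
grid_bytes = n_models * n_chan * bytes_per_float
# Memory if that grid were broadcast against every pixel: (Nmodels, Nchan, X, Y).
full_bytes = grid_bytes * nx * ny

grid_gib = grid_bytes / 2**30
full_tib = full_bytes / 2**40
print(f"{n_models} models, grid: {grid_gib:.2f} GiB, "
      f"full broadcast: {full_tib:.1f} TiB")
```

The model grid alone comes to a little under 2 GiB, and the full broadcast to roughly 114 TiB, consistent with the "~ 2 GB" and "more than 100 TB" estimates.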

What tends to happen? In most applications, the method only has enough memory to iterate over individual pixels, and - since it was designed to bite off more than it can chew - it runs on one thread only.

What should be done? The memory slice with the synthetic models should be shared with other jobs that iterate over individual pixels, and the best-matching model results collected back into the main thread. Additionally (this is simpler), some parts of the current residual calculations can be sped up - dropping the numpy nan-methods and replacing the residual rms calculation by the sum of squares should give a significant speedup already.

  • refactor residual calculations using simpler, faster numpy functions
  • parallelize the loop over all the pixels
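One possible shape for the first item is sketched below. It is not the current `best_guess` implementation: the function name `best_model_indices`, its arguments, and the `chunk_size` parameter are hypothetical, and the cube is assumed to be NaN-free (which is what makes dropping the nan-methods possible). The sketch ranks models by the sum of squared residuals and expands the square so that the ranking reduces to a matrix product, which avoids ever building the (Nmodels, Nchan, X, Y) array.

```python
import numpy as np

def best_model_indices(models, cube, chunk_size=1024):
    """Hypothetical helper: for every pixel, return the index of the model
    with the smallest sum of squared residuals.

    models : (n_models, n_chan) array of synthetic spectra
    cube   : (n_chan, ny, nx) data cube, assumed to contain no NaNs
    """
    n_models, n_chan = models.shape
    _, ny, nx = cube.shape
    n_pix = ny * nx
    spectra = cube.reshape(n_chan, n_pix)

    # sum(m**2) for every model, computed once.
    model_norms = np.einsum("ij,ij->i", models, models)

    best = np.empty(n_pix, dtype=np.intp)
    for start in range(0, n_pix, chunk_size):
        block = spectra[:, start:start + chunk_size]   # (n_chan, n_block)
        # sum((m - d)**2) = sum(m**2) - 2 * m.d + sum(d**2).
        # The last term is identical for all models at a given pixel,
        # so it cannot change which model is the minimum and is skipped.
        score = model_norms[:, None] - 2.0 * (models @ block)
        best[start:start + chunk_size] = np.argmin(score, axis=0)
    return best.reshape(ny, nx)

# Small synthetic check: build a cube from known models plus weak noise.
rng = np.random.default_rng(0)
models = rng.normal(size=(50, 30))
truth = rng.integers(0, 50, size=(6, 7))
cube = models[truth].transpose(2, 0, 1) + 0.01 * rng.normal(size=(30, 6, 7))
best = best_model_indices(models, cube, chunk_size=5)
```

Peak extra memory here is one (n_models, chunk_size) score array, so `chunk_size` can be tuned to the available RAM. Each chunk of pixels is independent of the others, which makes this loop a natural candidate for the second item, parallelizing over pixels while sharing the single `models` array.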

(... an alternative possibility, one that scales up into cluster computing, is to look into how the big data folks handle this kind of processing)

Eating CPU immoderately

Dear @keflavich,
Thanks very much for your many efforts on this program. It is very useful for me. I am using it to fit a large amount of data (~1 GB). However, after the code has been running for some time, the warning message below appears, and then the program uses so much CPU that none is left for other work. Do you have any idea what causes this problem? Thanks a lot in advance.
Best regards,
@hongliliu
PS: warning message: WARNING: Selected model is best only for less than %5 of the cube, consider using the map of guesses. [multicube.subcube]
INFO: Overall best model: selected #600 [ 1.91 7.09 0.27] [multicube.subcube]
INFO: Best model @ highest SNR: #442 [ 7.15 5.64 0.62] [multicube.subcube]
