Comments (5)

micheles commented on August 17, 2024

Actually, the only solution is to reduce the number of sites, since memory and disk usage grow quadratically with the number of sites.
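
To make the quadratic growth concrete, here is a back-of-the-envelope sketch. It assumes one dense N x N float64 matrix per intensity measure type; the actual layout of the stored data may differ, and `cov_bytes` is a hypothetical helper, not an engine function.

```python
def cov_bytes(num_sites, num_imts=1, itemsize=8):
    """Bytes needed for num_imts dense (num_sites x num_sites) matrices.

    Assumption: float64 entries (8 bytes each) and no compression.
    """
    return num_imts * num_sites ** 2 * itemsize


# Ten times more sites costs a hundred times more storage.
for n in (1_000, 10_000, 100_000):
    gib = cov_bytes(n) / 1024 ** 3
    print(f"{n:>7} sites -> {gib:10.2f} GiB per matrix")
```

Under these assumptions, a single matrix for 100,000 sites already needs tens of GiB, which is consistent with the 100+ GB figures mentioned in this thread.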

from oq-engine.

raoanirudh commented on August 17, 2024

Why doesn't storing them in the .tmp.hdf5 file work? This data is needed only during the calculation and doesn't need to be stored in the final calc.hdf5 file.

micheles commented on August 17, 2024

Because you will soon run out of disk space; this is how Cata discovered the issue. Also, once you start storing 100+ GB, reading the data back will kill your calculation (it will either run out of memory or be too slow to be practical). No matter how big your machine is, a quadratic calculation will run out of resources pretty soon. You would need an algorithm that is not quadratic in the number of sites.

raoanirudh commented on August 17, 2024

Reopening this issue, as the problem persists.

The issue is not related to having too many sites in the calculation. The problem is that the conditioned/mean_covs data, which is now stored in the calc_xxx.hdf5 file, is useful only while the calculation is running and can safely be deleted from the datastore once the calculation is completed. The other option would be to store it in the calc_xxx.tmp.hdf5 file instead, which gets deleted at the end of the calculation, since this interim data is of no use to the user after the calculation is over. If the conditioned/mean_covs data is deleted from the datastore, the hdf5 file sizes in oqdata should go back to the regular sizes seen for scenario calculations that do not involve conditioning.
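
For illustration, removing an interim entry from an HDF5 file can be sketched with plain h5py as follows. This is a minimal sketch on a throwaway file, not the engine's datastore API; `drop_interim` and the demo file name are hypothetical.

```python
import os
import tempfile

import h5py
import numpy as np


def drop_interim(path, name="conditioned/mean_covs"):
    """Unlink an interim dataset or group from an HDF5 file, if present.

    Returns True if something was removed, False otherwise.
    """
    with h5py.File(path, "r+") as f:
        if name in f:
            del f[name]
            return True
    return False


# Demo on a throwaway file (assumption: not a real oqdata datastore).
tmpdir = tempfile.mkdtemp()
demo = os.path.join(tmpdir, "calc_demo.hdf5")
with h5py.File(demo, "w") as f:
    f.create_dataset("conditioned/mean_covs", data=np.zeros((200, 200)))
    f.create_dataset("gmf_data/sid", data=np.arange(10))

removed = drop_interim(demo)
print("removed:", removed)
```

One caveat: in HDF5, deleting an object only unlinks it, and the file on disk does not normally shrink afterwards unless it is rewritten, for instance with the `h5repack` tool. Writing the interim data to a separate temporary file that is removed at the end avoids that extra step.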

micheles commented on August 17, 2024

You are partially right, @raoanirudh, but my point still stands: calculations with too many points will be impossible. The only solution I see for Aristotle calculations is to use a region_grid_spacing large enough that the calculations can run. Then, to avoid wasting too much disk space, we can store the temporary data in _tmp.hdf5 or, even better, keep it only in memory as it was originally, before #9094 (in retrospect, that was a bad idea, trading a decent but not impressive speedup for too much disk space).
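
A rough sketch of why a coarser region_grid_spacing helps so much: the number of grid points falls roughly with the square of the spacing, and the quadratic storage falls with the square of that. The 200 km x 200 km rectangle, the helper names, and the dense float64 N x N assumption are all illustrative; the engine builds its grid from the actual region and may differ.

```python
import math


def approx_sites(width_km, height_km, spacing_km):
    """Approximate number of grid points on a rectangular region."""
    nx = math.floor(width_km / spacing_km) + 1
    ny = math.floor(height_km / spacing_km) + 1
    return nx * ny


def quadratic_gib(num_sites, itemsize=8):
    """GiB for one dense num_sites x num_sites matrix (assumed float64)."""
    return num_sites ** 2 * itemsize / 1024 ** 3


for spacing in (1, 2, 5, 10):
    n = approx_sites(200, 200, spacing)
    print(f"spacing {spacing:>2} km: {n:>6} sites, "
          f"{quadratic_gib(n):8.3f} GiB per matrix")
```

Doubling the spacing cuts the site count by about a factor of 4 and the quadratic storage by about a factor of 16, whether that storage ends up on disk or in memory.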
