
Repex large file sizes (brokenyank, open issue)

choderalab commented on June 12, 2024
Repex large file sizes

from brokenyank.

Comments (5)

jchodera commented on June 12, 2024

Excellent point. This is certainly undesirable.

I like the idea of a multi-resolution output file format, with the large, full restart snapshots being written less frequently.

I felt an important feature of these NetCDF files and repex.py was robustness to early termination. Resuming from an existing NetCDF file should ideally be painless: repex.py finds the last "good" snapshot and resumes from there.

I propose we allow the user to tune the checkpoint frequency separately from the frequency at which other properties are written. When a run is resumed, we should ERASE the data following the last checkpoint and resume from that point. This may cause the odd behavior where resuming after a termination and running for only a short time actually leaves fewer samples than before, but I think it gives the most robust overall behavior.
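To make the resume rule concrete, here is a minimal sketch, assuming checkpoints are written at iterations 0, k, 2k, ... and that the per-iteration records can be treated as a simple list. The function names and the `checkpoint_interval` parameter are hypothetical and are not part of repex.py; truncating a real NetCDF file would need more care than slicing a list.

```python
def is_checkpoint(iteration, checkpoint_interval):
    """Return True if a full restart snapshot should be written at this iteration."""
    return iteration % checkpoint_interval == 0


def last_checkpoint(n_written, checkpoint_interval):
    """Return the most recent checkpoint iteration among iterations 0..n_written-1."""
    if n_written <= 0:
        raise ValueError("no iterations were written, so there is nothing to resume from")
    return ((n_written - 1) // checkpoint_interval) * checkpoint_interval


def truncate_for_resume(records, checkpoint_interval):
    """Drop every record after the last checkpoint.

    Returns (kept_records, resume_iteration). Hypothetical helper: 'records'
    stands in for whatever is stored per iteration in the output file.
    """
    resume_from = last_checkpoint(len(records), checkpoint_interval)
    return records[: resume_from + 1], resume_from
```

With 13 iterations written and a checkpoint every 5 iterations, the run would resume from iteration 10 and the samples from iterations 11 and 12 would be discarded, which is the "removes samples" behavior described above.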

repex.py should also predict the final file sizes for you so you can tell if you are going to run out of storage.
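A rough estimate could look like the sketch below. It assumes the file is dominated by two things: per-replica positions written at each checkpoint, and a replicas-by-states energy matrix written every iteration. The function name, the parameters, and the byte widths are assumptions for illustration, not the actual repex.py file layout, and header and metadata overhead is ignored.

```python
def predict_file_size_bytes(n_iterations, n_replicas, n_atoms, n_states,
                            checkpoint_interval=1,
                            position_bytes=4, energy_bytes=8):
    """Estimate the final output size in bytes (hypothetical layout).

    Positions (3 coordinates per atom) are written only at checkpoint
    iterations; the energy matrix is written at every iteration.
    """
    if checkpoint_interval < 1:
        raise ValueError("checkpoint_interval must be at least 1")
    # Checkpoints fall on iterations 0, k, 2k, ...; this is a ceiling division.
    n_checkpoints = -(-n_iterations // checkpoint_interval)
    checkpoint_size = n_replicas * n_atoms * 3 * position_bytes
    energy_size = n_replicas * n_states * energy_bytes
    return n_checkpoints * checkpoint_size + n_iterations * energy_size
```

Under these assumptions, 1000 iterations of 10 replicas with 1000 atoms and 10 states come to about 121 MB when positions are written every iteration, and about 2 MB when they are written every 100 iterations.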

Finally, I like the idea of a "plug-in" system for computing different properties to be included in the NetCDF file. Structuring in a manner similar to the Reporter objects is a good idea.
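One possible shape for such a plug-in interface is sketched below. It is only loosely modeled on the reporter pattern (each object declares how often it wants to run and is then called with the current data); the class names, the `interval` attribute, and the `collect` function are all hypothetical, and plain lists of energies stand in for real simulation state.

```python
class PropertyPlugin:
    """Base class for a computed property (hypothetical interface)."""

    name = "property"   # key under which the value would be stored
    interval = 1        # compute every 'interval' iterations

    def compute(self, iteration, energies):
        raise NotImplementedError


class MeanEnergy(PropertyPlugin):
    """Example plug-in: mean energy over all replicas."""

    name = "mean_energy"

    def __init__(self, interval=1):
        self.interval = interval

    def compute(self, iteration, energies):
        return sum(energies) / len(energies)


def collect(plugins, iteration, energies):
    """Run every plug-in that is due at this iteration; return {name: value}."""
    return {
        plugin.name: plugin.compute(iteration, energies)
        for plugin in plugins
        if iteration % plugin.interval == 0
    }
```

The dictionary returned by `collect` is what the writer would append to the output file for that iteration, so adding a new property would not require touching the file-writing code.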

This feature will be very important as we start doing explicit solvent free energy calculations. This has recently been enabled by Peter's addition of a dispersion correction to the CustomNonbondedForce class.

I'll mark this high priority.


kyleabeauchamp commented on June 12, 2024

This might be something to punt to the 2.0 release. I think we could fix all outstanding minor issues (e.g. installation cleanup, GPU issues, MPI issues) for the 1.0 release and keep a list of "harder" changes that will wait until 2.0.


jchodera commented on June 12, 2024

Good point. It's not essential for explicit solvent free energy calculations if you have sufficient storage, so we can punt this.


kyleabeauchamp commented on June 12, 2024

The other idea I had for reducing file sizes is to only output results for specific thermodynamic states. For example, when I do repex for conformational change, I don't actually care about what goes on at 500 K. I just want those states to be exchanging with ambient temperatures. Thus, it would make sense to simulate 300-500 K, but output only 300-320 K.

I think this idea might be less useful for alchemical simulations, however.
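As a sketch of the state-filtering idea for temperature repex, assuming each replica carries the index of the thermodynamic state it currently occupies, the writer could pick out only the replicas sitting in the temperature window of interest. The function and argument names here are hypothetical.

```python
def select_replicas_to_write(replica_state_indices, temperatures, t_min, t_max):
    """Return the replicas whose current state has a temperature in [t_min, t_max].

    replica_state_indices[i] is the state index currently occupied by replica i;
    temperatures[s] is the temperature (in K) of state s.
    """
    return [
        replica
        for replica, state in enumerate(replica_state_indices)
        if t_min <= temperatures[state] <= t_max
    ]
```

For a ladder of 300, 310, 320, 400 and 500 K with an output window of 300-320 K, only the replicas currently at the three lowest temperatures would be written at a given iteration; the high-temperature states still take part in exchanges but produce no output.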


jchodera commented on June 12, 2024

We can certainly allow more flexibility in exactly which data are written every iteration. I'd still want to write out full-precision "checkpoints" every so often, so the simulation can resume cleanly from them.

