For explicit solvent, the size of our NCFiles is probably going to be problematic to

Repex large file sizes about brokenyank HOT 5 OPEN

choderalab commented on June 12, 2024

Repex large file sizes

from brokenyank.

Comments (5)

jchodera commented on June 12, 2024

Excellent point. This is certainly undesirable.

I like the idea of a multi-resolution output file format, with the large, full restart snapshots being written less frequently.

I felt an important feature of these NetCDF files and repex.py was robustness to early termination. Resuming from an existing NetCDF file should ideally be painless, where it finds the last "good" snapshot and resumes from there.

I propose we allow the user to tune the checkpoint frequency separately from the frequency at which other properties are written. When a run is resumed, we should ERASE the data following the last checkpoint and resume from that point. This may cause the odd behavior where running for a short amount of time after a termination will actually remove samples, but I think it gives the most robust overall behavior.
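To make the resume rule concrete, here is a minimal sketch, assuming a hypothetical in-memory `RunLog` standing in for the NetCDF file. None of these names come from repex.py. Energies are written every iteration, full restart states only every `checkpoint_interval` iterations, and resuming discards anything recorded after the last checkpoint.

```python
from dataclasses import dataclass, field


@dataclass
class RunLog:
    """Hypothetical in-memory stand-in for the NetCDF file (not the repex.py API)."""
    checkpoint_interval: int
    energies: list = field(default_factory=list)     # one entry per iteration
    checkpoints: dict = field(default_factory=dict)  # iteration -> full restart state

    def write_iteration(self, iteration, energy, state):
        # Lightweight data is written every iteration.
        self.energies.append(energy)
        # Full restart snapshots are written only on checkpoint iterations.
        if iteration % self.checkpoint_interval == 0:
            self.checkpoints[iteration] = state

    def rewind_to_last_checkpoint(self):
        """Erase data after the last checkpoint and return the iteration to resume at."""
        if not self.checkpoints:
            self.energies.clear()
            return 0
        last = max(self.checkpoints)
        del self.energies[last + 1:]
        return last + 1
```

With an interval of 5 and a run killed after iteration 12, the log rewinds to iteration 10 and resumes at 11, so iterations 11 and 12 are thrown away. That is the sample-removing behavior described above.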

repex.py should also predict the final file sizes for you so you can tell if you are going to run out of storage.
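A rough version of that prediction could look like the sketch below. The layout is an assumption for illustration, not the actual repex.py file layout: positions are stored only at checkpoints (3 coordinates per atom per replica) and a replica-by-state energy matrix is stored every iteration. The byte widths are also assumed, and velocities, box vectors, metadata, and compression are ignored.

```python
def estimate_file_size_bytes(n_iterations, n_replicas, n_atoms, n_states,
                             checkpoint_interval=1,
                             position_bytes=4, energy_bytes=8):
    """Back-of-the-envelope size estimate under the assumed layout above."""
    # Checkpoints fall on iterations 0, k, 2k, ... so there are ceil(n / k) of them.
    n_checkpoints = (n_iterations + checkpoint_interval - 1) // checkpoint_interval
    # Positions: 3 coordinates per atom, per replica, per checkpoint.
    positions = n_checkpoints * n_replicas * n_atoms * 3 * position_bytes
    # Energies: one replica-by-state matrix per iteration.
    energies = n_iterations * n_replicas * n_states * energy_bytes
    return positions + energies


if __name__ == "__main__":
    every = estimate_file_size_bytes(1000, 24, 30000, 24, checkpoint_interval=1)
    sparse = estimate_file_size_bytes(1000, 24, 30000, 24, checkpoint_interval=10)
    print(f"checkpoint every iteration: {every / 1e9:.2f} GB")
    print(f"checkpoint every 10th:      {sparse / 1e9:.2f} GB")
```

For 24 replicas of a 30,000-atom system over 1000 iterations, this estimate gives about 8.6 GB when positions are written every iteration and about 0.87 GB when they are written every tenth iteration. Positions dominate, which is why decoupling the checkpoint frequency matters.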

Finally, I like the idea of a "plug-in" system for computing different properties to be included in the NetCDF file. Structuring in a manner similar to the Reporter objects is a good idea.
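One possible shape for such a plug-in system is a simple registry of named functions, sketched below. The registry, the decorator, and the `state` dictionary keys are all hypothetical and chosen only for illustration. This is not the OpenMM Reporter API, just something loosely in the same spirit.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a property name to a function of the current state.
PROPERTY_PLUGINS: Dict[str, Callable[[dict], float]] = {}


def register_property(name):
    """Decorator that registers a property calculator under the given name."""
    def decorator(func):
        PROPERTY_PLUGINS[name] = func
        return func
    return decorator


@register_property("potential_energy")
def potential_energy(state):
    return state["potential_energy"]


@register_property("box_volume")
def box_volume(state):
    a, b, c = state["box_lengths"]  # assumes an orthorhombic box
    return a * b * c


def report(state, enabled):
    """Compute only the properties the user asked to have written."""
    return {name: PROPERTY_PLUGINS[name](state) for name in enabled}
```

The user would then pick which properties to store per iteration, e.g. `report(state, ["box_volume"])`, and anything not requested never reaches the file.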

This feature will be very important as we start doing explicit solvent free energy calculations. This has recently been enabled by Peter's addition of a dispersion correction to the CustomNonbondedForce class.

I'll mark this high priority.


kyleabeauchamp commented on June 12, 2024

This might be something to punt until the 2.0 release. I think we could fix the outstanding minor issues (e.g. installation cleanup, GPU issues, MPI issues) for the 1.0 release and keep a list of "harder" changes that wait until 2.0.


jchodera commented on June 12, 2024

Good point. It's not essential for explicit solvent free energy calculations if you have sufficient storage, so we can punt this.


kyleabeauchamp commented on June 12, 2024

The other idea I had for reducing file sizes is to only output results for specific thermodynamic states. For example, when I do repex for conformational change, I don't actually care about what goes on at 500K. I just want those states to be exchanging with ambient temperatures. Thus, it would make sense to simulate 300-500K, but output only 300-320K.
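As a sketch of what that selection could look like: the function names and the per-state `frame` dictionary below are made up for illustration and are not part of repex.py. All states are still simulated and exchanged, but only those inside the requested temperature window are written.

```python
def select_output_states(temperatures_kelvin, t_min, t_max):
    """Return indices of thermodynamic states with temperature in [t_min, t_max]."""
    return [i for i, t in enumerate(temperatures_kelvin) if t_min <= t <= t_max]


def filter_frame(frame_by_state, output_states):
    """Keep only the per-state entries selected for output."""
    return {i: frame_by_state[i] for i in output_states}


if __name__ == "__main__":
    temperatures = [300, 310, 320, 350, 400, 450, 500]  # simulate all of these
    keep = select_output_states(temperatures, 300, 320)  # write only these
    frame = {i: f"positions at {t} K" for i, t in enumerate(temperatures)}
    print(filter_frame(frame, keep))
```

Here 7 states are simulated but only 3 are written, so the per-iteration position data shrinks by more than half. Note that this indexes by state rather than by replica, so whichever replica currently occupies a selected state is the one that gets written.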

I think this idea might be less useful for alchemical simulations, however.


jchodera commented on June 12, 2024

We can certainly allow more flexibility as to which data precisely is output every iteration. I'd still want to write out full-precision "checkpoints" every so often, to allow the simulation to cleanly resume from these checkpoints.

