openmm / openmm-ml Goto Github PK

High level API for using machine learning models in OpenMM simulations

License: Other

Python 100.00%

openmm-ml's Introduction

OpenMM: A High Performance Molecular Dynamics Library

Introduction

OpenMM is a toolkit for molecular simulation. It can be used either as a stand-alone application for running simulations, or as a library you call from your own code. It provides a combination of extreme flexibility (through custom forces and integrators), openness, and high performance (especially on recent GPUs) that make it truly unique among simulation codes.

Getting Help

Need Help? Check out the documentation and discussion forums.

openmm-ml's People

Contributors

Stargazers

Watchers

Forkers

peastman anthonialcaraz jthorton yueyericardo dominicrufa linhduongtuan mikemhenry rschireman stjordanis asclepiusinformatica jharrymoore sef43 johanneskarwou mfkiwl saratk1 jmorado cqpes dingye18

openmm-ml's Issues

`setVelocitiesToTemperature` throws `GIL` error

following this conda installation (i.e. conda create -n tmp-test -q -c conda-forge openmm-torch nnpops torchani openmmtools cudatoolkit=11.3) and adding a single line to TestMLPoential.py that looks like:

interpContext.setVelocitiesToTemperature(300*unit.kelvin)

passes the assertAlmostEqual but fails on the velocity setting with:

======================================================================
ERROR: testCreateMixedSystem (__main__.TestMLPotential)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/lila/home/rufad/github/openmm-ml/test/TestMLPotential.py", line 33, in testCreateMixedSystem
    interpContext.setVelocitiesToTemperature(300*unit.kelvin)
  File "/home/rufad/miniconda3/envs/omm_dev2/lib/python3.9/site-packages/openmm/openmm.py", line 12141, in setVelocitiesToTemperature
    return _openmm.Context_setVelocitiesToTemperature(self, *args)
openmm.OpenMMException: The autograd engine was called while holding the GIL. If you are using the C++ API, the autograd engine is an expensive operation that does not require the GIL to be held so you should release it with 'pybind11::gil_scoped_release no_gil;'. If you are not using the C++ API, please report a bug to the pytorch team.
Exception raised from execute at /home/conda/feedstock_root/build_artifacts/pytorch-recipe_1650965732461/work/torch/csrc/autograd/python_engine.cpp:120 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x68 (0x2afa6d274a78 in /home/rufad/miniconda3/envs/omm_dev2/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, char const*) + 0xe8 (0x2afa6d251d6e in /home/rufad/miniconda3/envs/omm_dev2/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #2: torch::autograd::python::PythonEngine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&) + 0xb0 (0x2afac8327030 in /home/rufad/miniconda3/envs/omm_dev2/lib/python3.9/site-packages/torch/lib/libtorch_python.so)
frame #3: <unknown function> + 0x2c1db16 (0x2afa39749b16 in /home/rufad/miniconda3/envs/omm_dev2/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #4: torch::autograd::backward(std::vector<at::Tensor, std::allocator<at::Tensor> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, c10::optional<bool>, bool, std::vector<at::Tensor, std::allocator<at::Tensor> > const&) + 0x69 (0x2afa3974a5d9 in /home/rufad/miniconda3/envs/omm_dev2/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #5: <unknown function> + 0x2c75fe1 (0x2afa397a1fe1 in /home/rufad/miniconda3/envs/omm_dev2/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #6: at::Tensor::_backward(c10::ArrayRef<at::Tensor>, c10::optional<at::Tensor> const&, c10::optional<bool>, bool) const + 0x49 (0x2afa378cdb69 in /home/rufad/miniconda3/envs/omm_dev2/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #7: TorchPlugin::ReferenceCalcTorchForceKernel::execute(OpenMM::ContextImpl&, bool, bool) + 0xdff (0x2afabef3828f in /home/rufad/miniconda3/envs/omm_dev2/lib/plugins/libOpenMMTorchReference.so)
frame #8: OpenMM::ContextImpl::calcForcesAndEnergy(bool, bool, int) + 0xc9 (0x2afa1ab310d9 in /home/rufad/miniconda3/envs/omm_dev2/lib/python3.9/site-packages/openmm/../../../libOpenMM.so.7.7)
frame #9: OpenMM::ReferenceCustomCVForce::calculateIxn(OpenMM::ContextImpl&, std::vector<OpenMM::Vec3, std::allocator<OpenMM::Vec3> >&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, double, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, double> > > const&, std::vector<OpenMM::Vec3, std::allocator<OpenMM::Vec3> >&, double*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, double, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, double> > >&) const + 0xde (0x2afa1ac3375e in /home/rufad/miniconda3/envs/omm_dev2/lib/python3.9/site-packages/openmm/../../../libOpenMM.so.7.7)
frame #10: OpenMM::ReferenceCalcCustomCVForceKernel::execute(OpenMM::ContextImpl&, OpenMM::ContextImpl&, bool, bool) + 0x302 (0x2afa1ac0c612 in /home/rufad/miniconda3/envs/omm_dev2/lib/python3.9/site-packages/openmm/../../../libOpenMM.so.7.7)
frame #11: OpenMM::ContextImpl::calcForcesAndEnergy(bool, bool, int) + 0xc9 (0x2afa1ab310d9 in /home/rufad/miniconda3/envs/omm_dev2/lib/python3.9/site-packages/openmm/../../../libOpenMM.so.7.7)
frame #12: OpenMM::Context::setVelocitiesToTemperature(double, int) + 0xd0 (0x2afa1ab2d990 in /home/rufad/miniconda3/envs/omm_dev2/lib/python3.9/site-packages/openmm/../../../libOpenMM.so.7.7)
frame #13: <unknown function> + 0x118456 (0x2afa1a88e456 in /home/rufad/miniconda3/envs/omm_dev2/lib/python3.9/site-packages/openmm/_openmm.cpython-39-x86_64-linux-gnu.so)
<omitting python frames>


----------------------------------------------------------------------
Ran 1 test in 18.655s

FAILED (errors=1)

Create a README with information on how to use this (or where the docs live)

Simulation performance steadily declines

I ran a pure waterbox simulation within a 25 Angstrom box with Ani-2x and the torchani implementation. The simulation performance decreased from ~2ns/day to 0.3 ns/day within the first 1ns of simulation time.

I've attached a script to reproduce this behavior and the output of the StateReporter. Is there anything that I need to do differently?

#"Step","Time (ps)","Potential Energy (kJ/mole)","Total Energy (kJ/mole)","Temperature (K)","Density (g/mL)","Speed (ns/day)"
500,0.5000000000000003,-100516778.12917194,-100514547.79711843,118.98305105506671,0.9591993474692536,0
1000,1.0000000000000007,-100515425.77060814,-100511819.93468693,192.36299762377715,0.9591993474692536,2.06
1500,1.4999999999999456,-100514835.4803809,-100510368.56665199,238.29950496707687,0.9591993474692536,1.77
2000,1.9999999999998905,-100514522.33117871,-100509525.64425702,266.5616781910217,0.9591993474692536,1.86
2500,2.4999999999998357,-100514196.04165852,-100508978.46768452,278.3454910015101,0.9591993474692536,1.9
3000,2.9999999999997806,-100514152.50934735,-100508801.46777023,285.465678603462,0.9591993474692536,1.93
3500,3.4999999999997256,-100513990.48632172,-100508530.74921964,291.26433319200413,0.9591993474692536,1.95
4000,3.9999999999996705,-100514099.18439446,-100508432.39956689,302.309849959495,0.9591993474692536,1.96
4500,4.4999999999998375,-100513837.67253573,-100508312.70271303,294.74434780102064,0.9591993474692536,1.97
...
995500,995.4999999833715,-100515995.70435952,-100510307.35485645,303.4602754673055,0.9591993474692536,0.304
996000,995.9999999833597,-100515819.44682398,-100510114.86655764,304.32614910742063,0.9591993474692536,0.304
996500,996.4999999833478,-100515949.61809933,-100510331.93236865,299.6905268216136,0.9591993474692536,0.304
997000,996.999999983336,-100516013.21894117,-100510187.2782918,310.80044455015786,0.9591993474692536,0.304
997500,997.4999999833242,-100515947.76273051,-100510117.73721096,311.01836291642246,0.9591993474692536,0.304
998000,997.9999999833124,-100516183.55732429,-100510327.88930751,312.38633077208556,0.9591993474692536,0.304
998500,998.4999999833005,-100515961.05077694,-100510513.25682221,290.6271939333003,0.9591993474692536,0.304
999000,998.9999999832887,-100515947.95052087,-100510372.42897089,297.4411654058028,0.9591993474692536,0.304
999500,999.4999999832769,-100515917.83395359,-100510158.53726706,307.2451434368191,0.9591993474692536,0.303
1000000,999.9999999832651,-100515956.23833577,-100510335.1903263,299.8698966099735,0.9591993474692536,0.303

waterbox_simulation_1ns.zip

Question about removing non bonded forces for ML potential substituion

I was trying to adapt the code to allow for more specific customisation of hybrid systems which allow ML potential for the protein and ligand, while using MM potentials for the solvent due to the VRAM limits on GPUs.

Within the createMixedSystem() function of MLPotential, there is this block of code which I don't quite understand why it is written in this way:

  atomList = list(atoms)
  for force in newSystem.getForces():
      if isinstance(force, openmm.NonbondedForce):
          for i in range(len(atomList)):
              for j in range(i):
                  force.addException(i, j, 0, 1, 0, True)

If atoms is not a continuous list of atom indexes (i.e. [1, 3, 5, 6, 7, 8]), won't the i in range(len(atomList)) fail to properly obtain the atom indexes for adding Exceptions?
Shouldn't the adding of exceptions be force.addException(atomList[i], atomList[j], 0, 1, 0, True)
Does the force.addException() function work separately with respect to the first and second particle? For instance can particle A experience the force contribution by particle B but not the other way around?

adding the torchani `EnergyShifter` to subtract the self energies of molecules

would it be possible to add EnergyShifter to the current torchani workflow in order to remove the self energy of molecules treated with ML? Energies at present are very large/small and there is concern that this could leading to numerical instability with finite precision.

Errors with ANI

I'm trying to use this on a new computer (an ARM Mac) for the first time, and I ran into an error I've never seen before. When it tries to compile the TorchANI model to TorchScript, I get the exception

RuntimeError: Could not cast attribute '_species_to_tensor' to type __torch__.torchani.utils.ChemicalSymbolsToInts (of Python compilation unit at: 0x600002facc98): Could not cast attribute 'rev_species' to type Dict[Tensor, int]: Unable to cast Python instance to C++ type (compile in debug mode for details)

I have PyTorch 1.10.0 and TorchANI 2.2, both installed from conda-forge. Has anyone seen this before?

How should we document this repo?

@peastman @giadefa : Any thoughts on how we should document this repository?

Would we want the API docs to be hosted in a separate location on the openmm.org site? Do we want docs besides the API docs?

For projects within the lab, we typically use readthedocs, and combine getting started guides, tutorials, and API docs in one place, as in openmmtools.

We should pull in Josh Mitchell into this discussion as well once he's onboard.

TestMLPotential.py fails

Sorry for the cross package issue --- I think this might involve openMM-torch, but I get the error executing the test script of openmm-ml, so I am posting here. Running the test script I get

======================================================================
ERROR: testCreateMixedSystem (__main__.TestMLPotential)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/mwieder/openmm-ml/test/TestMLPotential.py", line 19, in testCreateMixedSystem
    mixedContext = mm.Context(mixedSystem, mm.VerletIntegrator(0.001), platform)
  File "/data/shared/software/python_env/anaconda3/envs/rew/lib/python3.9/site-packages/openmm/openmm.py", line 16230, in __init__
    _openmm.Context_swiginit(self, _openmm.new_Context(*args))
openmm.OpenMMException: Specified a Platform for a Context which does not support all required kernels

I am not super sure where the problem originates from. I have built openMM-torch from source with the nightly build openMM and it seemed to have passed all the necessary tests. But when running make PythonInstall I get a lot of warnings (it runs successfully though):

[100%] Generating TorchPluginWrapper.cpp
/data/shared/software/python_env/anaconda3/envs/rew/include/swig/OpenMMSwigHeaders.i:2242: Warning 314: 'None' is a python keyword, renaming to '_None'
/data/shared/software/python_env/anaconda3/envs/rew/include/swig/OpenMMSwigHeaders.i:496: Warning 453: Can't apply (std::vector< double > &OUTPUT). No typemaps are defined.
/data/shared/software/python_env/anaconda3/envs/rew/include/swig/OpenMMSwigHeaders.i:503: Warning 453: Can't apply (OpenMM::Context &OUTPUT). No typemaps are defined.
/data/shared/software/python_env/anaconda3/envs/rew/include/swig/OpenMMSwigHeaders.i:538: Warning 453: Can't apply (std::vector< double > &OUTPUT). No typemaps are defined.
...

is this expected?

How is openmm-ml supposed to work with NNPOps

@peastman @raimis : I was chatting with @dominicrufa and @jeherr today, and we're unclear on how we are supposed to use the new NNPOps conda package with openmm-ml. Can you give us an example of how we are supposed to use NNPOps with openmm-ml?

Make openmm-ml conda-forge installable

Accessing the force as an attribute in a MLPotentialImpl

Hello!
I need to access just the TorchForce created for the system with the MLPotential. I was wondering if it would be possible to assign the force to an attribute so it can be accessed separately?
So for example in the mace potential, something like this:

        # Create the TorchForce and add it to the System.
        force = openmmtorch.TorchForce(module)
        force.setForceGroup(forceGroup)
        force.setUsesPeriodicBoundaryConditions(isPeriodic)
        self.force = force
        system.addForce(force)

In this way then I could get the force for my hypothetical system using:

        mace_system = deepcopy(system) # system being an openmm system created earlier

        macepotimpl = models.macepotential.MACEPotentialImpl(name="mace-off23-small", modelPath="")
        macepotimpl.addForces(
                        modeller.topology, # the modeller topology used to create the system above
                        mace_system,
                        ml_atoms, # indices of the ligand
                        0,
                        periodic = True,
        )

        ml_force = deepcopy(macepotimpl.force)

Alternatively if there is some other way to get the force from this that I missed?

Thank you so much!

Installing openmm-ml does not install nnpops

Following the directions in the README does not lead to NNPOps being installed, which results in slow performance:

$ mamba install -c conda-forge openmm-ml
...
  Package             Version  Build                  Channel                     Size
────────────────────────────────────────────────────────────────────────────────────────
  Install:
────────────────────────────────────────────────────────────────────────────────────────

  future               0.18.3  pyhd8ed1ab_0           conda-forge/noarch        Cached
  h5py                  3.7.0  py39h737f45e_0         pkgs/main/linux-64          1 MB
  importlib_metadata    5.0.0  hd8ed1ab_1             conda-forge/noarch          8 KB
  lark-parser          0.12.0  pyhd8ed1ab_0           conda-forge/noarch         78 KB
  ninja                1.11.1  h924138e_0             conda-forge/linux-64        2 MB
  openmm-ml               1.0  pyhd8ed1ab_0           conda-forge/noarch         14 KB
  openmm-torch            1.0  cuda112py39h440defe_1  conda-forge/linux-64      135 KB
  pytorch              1.12.1  cpu_py39he8d8e81_0     pkgs/main/linux-64         61 MB
  setuptools-scm        6.3.2  pyhd8ed1ab_0           conda-forge/noarch         29 KB
  setuptools_scm        6.3.2  hd8ed1ab_0             conda-forge/noarch          4 KB
  tomli                 2.0.1  pyhd8ed1ab_0           conda-forge/noarch        Cached
  torchani              2.2.2  cpu_py39hc31d6b3_6     conda-forge/linux-64       22 MB

We should (1) recommend nnpops be included, and (2) consider adding it to the openmm-ml feedstock recipe.

cc @peastman @raimis @mikemhenry

createMixedSystem() for switching a whole waterbox

I would like to use the OpenMM-ML package to switch a complete waterbox from MM to ML description using the createMixedSystem() functionality. I know that technically it is not made for that, but I wonder if that might - in theory - be possible.
For the simulation, I’m using the CutoffPeriodic option and set up the ML system the following way (I can provide a more detailed minimal example if needed):

potential = MLPotential("ani2x")
system = potential.createMixedSystem(
    psf.topology,
    mm_system,
    ml_atoms,
    interpolate=True,
    removeConstraints=True,
    implementation="torchani",
)
…
simulation.context.setParameter("lambda_interpolate", lamb)

I believe that when using a cutoff, the CustomBondForce within the CustomCVForces may cause issues in reproducing the Nonbonded Interaction for particles described only as ML atoms (https://github.com/openmm/openmm-ml/blob/c3d8c28eb92bf5c4b16efb81ad7a44b707fc5907/openmmml/mlpotential.py#LL298C14-L298C14). It seems that the PBC is not replicated for the ML portion of the system, which may not be problematic if only one small molecule is being treated by ML. However, when dealing with a system where all (or multiple) particles are described using ML (like the waterbox I would be interested in), this is not describing PBC correctly and leads to strange behavior of the box during the simulation (box blowing up). Therefore, I have created a Nonbonded Force using the openmm.CustomNonbondedForce according to the steps outlined in this discussion (openmm/openmm#3281). In the context of the CustomCVForce this might look like this:

for force in system.getForces():
  if isinstance(force, openmm.NonbondedForce):
    if topology.getNumAtoms() == len(atomList):
      eps_solvent = force.getReactionFieldDielectric()
      cutoff = force.getCutoffDistance()
      krf = (1/ (cutoff**3)) * (eps_solvent - 1) / (2*eps_solvent + 1)
      crf = (1/ cutoff) * (3* eps_solvent) / (2*eps_solvent + 1)
      ONE_4PI_EPS0 = 138.935456
      energy_expression = "4*epsilon*((sigma/r)^12 - (sigma/r)^6) + ONE_4PI_EPS0*chargeprod*(1/r + krf*r*r - crf);"
      energy_expression += "krf = {:f};".format(krf.value_in_unit(unit.nanometer**-3))
      energy_expression += "crf = {:f};".format(crf.value_in_unit(unit.nanometer**-1))
      energy_expression += "epsilon = sqrt(epsilon1*epsilon2);"
      energy_expression += "sigma = 0.5*(sigma1+sigma2);"
      energy_expression += "chargeprod = charge1*charge2;"
      energy_expression += "ONE_4PI_EPS0 = {:f};".format(ONE_4PI_EPS0)
      custom_nonbonded_force = openmm.CustomNonbondedForce(energy_expression)
      custom_nonbonded_force.addPerParticleParameter('charge')
      custom_nonbonded_force.addPerParticleParameter('sigma')
      custom_nonbonded_force.addPerParticleParameter('epsilon')
      custom_nonbonded_force.setNonbondedMethod(openmm.CustomNonbondedForce.CutoffPeriodic)
      custom_nonbonded_force.setCutoffDistance(cutoff)

      for index in range(force.getNumParticles()):
        charge, sigma, epsilon = force.getParticleParameters(index)
        custom_nonbonded_force.addParticle([charge, sigma, epsilon])
      for index in range(force.getNumExceptions()):
        j, k, chargeprod, sigma, epsilon = force.getExceptionParameters(index)
        custom_nonbonded_force.addExclusion(j, k)

      if custom_nonbonded_force.getNumParticles() > 0:
        name = f'mmForce{len(mmVarNames)+1}'
        cv.addCollectiveVariable(name, custom_nonbonded_force)
        mmVarNames.append(name)

Using this approach, I have been able to reproduce the expected energies that would result from simulating the box using only MM (ff.createSystem) or only ML (potential.createSystem).

force	MM (`createSystem`)	MM/ML (`createMixedSystem`)	MM/ML (`createMixedSystem`)	ML (`createSystem`)
lambda		0 (=MM)	1 (=ML)
HarmonicBondForce	4.4e-06	4.4e-06	4.4e-06	–
HarmonicAngleForce	3.8e-07	3.8e-07	3.8e-07	–
NonbondedForce	-1142.246	-1142.2456	-1142.246	–
TorchForce	–	-5993629	-5993629	-5993629
CustomCVForce		-1142.245	-5993629	–

In your opinion, do you believe that this is a valid approach for transitioning a box from MM to ML? Or is there anything I am missing?

is single precision required?

@peastman , in looking at the anipotential.py, i noticed that you are hardcoding single precision. is this a requirement/will it cause artifacts if I am using mixed precision on a GPU?

Failure to import openmmtorch

I'm stuck trying to make dependencies work, related to openmm/openmm-torch#127.

After standard mamba installation of openmm-ml I get:

File ~/miniforge3/envs/openmm-ml/lib/python3.11/site-packages/openmmml/models/anipotential.py:80, in ANIPotentialImpl.addForces(self, topology, system, atoms, forceGroup, filename, implementation, modelIndex, **args)
     78 import torchani
     79 import torch
---> 80 import openmmtorch
     82 # `nnpops` throws error if `periodic_table_index`=False if one passes `species` as `species_to_tensor` from `element`
     83 _kwarg_dict = {'periodic_table_index': True}

File ~/miniforge3/envs/openmm-ml/lib/python3.11/site-packages/openmmtorch.py:15
     13     from . import _openmmtorch
     14 else:
---> 15     import _openmmtorch
     17 try:
     18     import builtins as __builtin__

ImportError: libtorch_cuda.so: cannot open shared object file: No such file or directory

on trying to createSystem. I have the same problem with separately installing openmmtorch and this solution worked. But trying to install openmm-ml on top of that working openmmtorch (which requires using --no-deps) leads to the 'undefined symbol' error from torchani import, as seen here.

PBC and torchani/openMM

Hi Peter,

As @wardhaddadin1 mentioned yesterday in the openmm/EXS call, the neighbor list implemented in Torchani seems to assume that input coordinates are wrapped in the simulation box.

I have tested this with torchani (5 Angstrom box length, pbc set in the torchani input options) and a 2 particle system in which I moved one of the particles outside the defined simulation cell. I have attached the plots showing the trajectory of the particles (the box is indicated in black, note the different x,y,z scales) and the calculated energy.
The displacement outside of the initial box results in incorrect energies if distances are > box_length + cutoff.

I have also tested this with openmm-ml, with a 2 particle system running an MD simulation. The box length was 10 Angstrom, and the cutoff of ANI was 5 Angstrom. I have attached the input file and the script. The trajectory and the energy calculation seem to indicate that openmm-ml is also affected by this and the results are not correct. I have attached a plot of the potential energy at each time step. The trajectory can be reproduced using the script.

test.zip

Error when trying to create mixed system

Hi,

I've been trying to use ANI-2x to model the ligand during the simulation of a protein-ligand complex. I've been using the code below to set up the system.

prmtop = AmberPrmtopFile(prmtop_file)
inpcrd = AmberInpcrdFile(crd_file)

potential = MLPotential('ani2x')
ml_atoms  = [atom.index for atom in prmtop.topology.atoms() if atom.residue.name == "MOL"]
mm_system = prmtop.createSystem(nonbondedMethod=PME,
nonbondedCutoff=1.0*unit.nanometers, constraints=HBonds, rigidWater=True,
ewaldErrorTolerance=0.0005)
ml_system = potential.createMixedSystem(prmtop.topology, mm_system, ml_atoms)

The last line in the code raises the following error:

Traceback (most recent call last):
  File "/Users/victorprincipe/Local/project2021/local_tests/openmm_ani_test/md_ani.py", line 26, in <module>
    ml_system = potential.createMixedSystem(prmtop.topology, mm_system, ml_atoms)
  File "/Users/victorprincipe/miniconda3/envs/openmmani/lib/python3.9/site-packages/openmmml-1.0-py3.9.egg/openmmml/mlpotential.py", line 278, in createMixedSystem
    self._impl.addForces(topology, newSystem, atomList, forceGroup, **args)
  File "/Users/victorprincipe/miniconda3/envs/openmmani/lib/python3.9/site-packages/openmmml-1.0-py3.9.egg/openmmml/models/anipotential.py", line 85, in addForces
    species = model.species_to_tensor(elements).unsqueeze(0)
  File "/Users/victorprincipe/miniconda3/envs/openmmani/lib/python3.9/site-packages/torchani/models.py", line 165, in species_to_tensor
    return self._species_to_tensor(*args, **kwargs) \
  File "/Users/victorprincipe/miniconda3/envs/openmmani/lib/python3.9/site-packages/torchani/utils.py", line 237, in __call__
    rev = [self.rev_species[s] for s in species]
  File "/Users/victorprincipe/miniconda3/envs/openmmani/lib/python3.9/site-packages/torchani/utils.py", line 237, in <listcomp>
    rev = [self.rev_species[s] for s in species]
KeyError: 'l'

I'm not sure why this is occurring, so any help would be greatly appreciated. I've also attached the prmtop and inpcrd files below. Thanks in advance!

system_files.zip

calling `CustomCVForce_getCollectiveVariableValues(self, context)` throws a `GIL` error when collective variable is a `TorchForce`

I tried to call an energy/force operation on a TorchForce wrapped in a CustomCVForce and i ran into this error. Not sure if this should be fixed on the OpenMM level or pytorch level. my guess is the latter based on the message. i can construct a minimally-working example if necessary for further debugging if the solution isn't obvious.

File ~/anaconda3/envs/openmm_torch/lib/python3.9/site-packages/openmm/openmm.py:4935, in CustomCVForce.getCollectiveVariableValues(self, context)
   4920 def getCollectiveVariableValues(self, context):
   4921     r"""
   4922     getCollectiveVariableValues(self, context)
   4923     Get the current values of the collective variables in a Context.
   (...)
   4933         the values of the collective variables are computed and stored into this
   4934     """
-> 4935     return _openmm.CustomCVForce_getCollectiveVariableValues(self, context)

OpenMMException: The autograd engine was called while holding the GIL. If you are using the C++ API, the autograd engine is an expensive operation that does not require the GIL to be held so you should release it with 'pybind11::gil_scoped_release no_gil;'. If you are not using the C++ API, please report a bug to the pytorch team.
Exception raised from execute at /home/conda/feedstock_root/build_artifacts/pytorch-recipe_1640869844479/work/torch/csrc/autograd/python_engine.cpp:120 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x68 (0x7fa92b91d8c8 in /home/dominic/anaconda3/envs/openmm_torch/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, char const*) + 0xf4 (0x7fa92b8fdac5 in /home/dominic/anaconda3/envs/openmm_torch/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #2: torch::autograd::python::PythonEngine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&) + 0xb0 (0x7fa8ba6fee10 in /home/dominic/anaconda3/envs/openmm_torch/lib/python3.9/site-packages/torch/lib/libtorch_python.so)
frame #3: <unknown function> + 0x3b63512 (0x7fa973be9512 in /home/dominic/anaconda3/envs/openmm_torch/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #4: torch::autograd::backward(std::vector<at::Tensor, std::allocator<at::Tensor> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, c10::optional<bool>, bool, std::vector<at::Tensor, std::allocator<at::Tensor> > const&) + 0x6b (0x7fa973beb81b in /home/dominic/anaconda3/envs/openmm_torch/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #5: <unknown function> + 0x3bc5866 (0x7fa973c4b866 in /home/dominic/anaconda3/envs/openmm_torch/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #6: at::Tensor::_backward(c10::ArrayRef<at::Tensor>, c10::optional<at::Tensor> const&, c10::optional<bool>, bool) const + 0x49 (0x7fa9711f3d19 in /home/dominic/anaconda3/envs/openmm_torch/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #7: TorchPlugin::ReferenceCalcTorchForceKernel::execute(OpenMM::ContextImpl&, bool, bool) + 0xe7a (0x7fa8c891c70a in /home/dominic/anaconda3/envs/openmm_torch/lib/plugins/libOpenMMTorchReference.so)
frame #8: OpenMM::ContextImpl::calcForcesAndEnergy(bool, bool, int) + 0xc9 (0x7fa9782f06e9 in /home/dominic/anaconda3/envs/openmm_torch/lib/python3.9/site-packages/openmm/../../../libOpenMM.so.7.7)
frame #9: OpenMM::Context::getState(int, bool, int) const + 0x166 (0x7fa9782ef586 in /home/dominic/anaconda3/envs/openmm_torch/lib/python3.9/site-packages/openmm/../../../libOpenMM.so.7.7)

How do we select whether we want to use entire ANI ensemble or just one model?

Can we document how we select whether we use the entire ANI ensemble or just one model?
Presumably, this would be handled by one of the **args passed to ANIPotentialImpl, but I couldn't figure out if we actually support this or, if we do, what the args should be.

`createMixedSystem` enhancement for switching between MM and `TorchForce` with `GlobalParameter`s

I think it might be worthwhile to add an enhancement to the createMixedSystem functionality where instead of modelling a region (set of atoms) with a TorchForce and the rest of the system with conventional MM forcefields, we might want to have GlobalParameters that turn off the MM-ff of a region of interest and turn on the TorchForce with some GlobalParameters. I think it would also be very valuable to have a temperature-like scale factor that 'softens' the interactions between the MM/TorchForce and MM-only region in a REST-like capacity.

I have a (complicated) working implementation of this (note that this functionality only modifies the existing MM-forces. the user needs to specify a list of atoms constituting a region for which the MM-energies should be scaled down. The TorchForce generator lives here ). The most complicated part is defining the MM-forces (namely softcoring the intra-region nb interactions so as to avoid singularities when the MM-energies are switched off).

Would it be possible that we might integrate the linked code (or a variation of it) into openmm-ml and add some pytests? I'd be happy to help/troubleshoot.

NaNs when using torchani

I'm running simulations of mixed ML/MM systems where part is computed with ANI-2x and part with Amber. As long as I specify implementation='nnpops' in the call to createMixedSystem() it works well. But if I specify implementation='torchani', the simulation immediately blows up with NaN coordinates. I tried a few molecules and the result is the same for all of them.

Does anyone have an idea what could be causing this? I can put together a test case to reproduce the problem, if that's helpful. My current system is too big to post here. Here are the relevant packages from my conda environment.

nnpops                    0.4             cuda112py310hd4d1af5_0    conda-forge
openmm                    8.0.0           py310h5728c26_0    conda-forge
openmm-ml                 1.0                pyhd8ed1ab_0    conda-forge
openmm-torch              1.0             cuda112py310hbd91edb_0    conda-forge
pytorch                   1.13.1          cuda112py310he33e0d6_200    conda-forge
torchani                  2.2.2           cuda112py310haf08e2f_7    conda-forge

Generate Protein - Ligand System

Hi,

I am wondering what would be a clean way to create a protein ligand system where the ligand is parametrized with
one of the MLFFs ?

In the example it is shown how to create a mixed system but this requires already having a topology + system
parametrized without the MLFF.
Currently I am trying to do this by just creating a mixed system using one of the small molecule FFs together with
e.g. amber. So I get a system where my Protein is parametrized with amber and the Ligand with the small molecule FF.
I can then use this system and topology together with a list of atom ids for the ligand to use the mixed system functionality from
openmm-ml but this seems very messy.

Is there a better way, or am I maybe doing something horribly wrong in this approach to start with ?

Make conda-forge package

Set up continuous integration for openmm-ml

We should also set up continuous integration for this repo.

As part of this process, we should implement #3 and have @peastman upgrade to use the MolSSI cookiecutter for ease of maintenance.

Feature request: Interface with NEQUIP

Excited to see the ANI interface. I wonder how hard it would be to interface with NEQUIP? https://github.com/mir-group/nequip It's also PyTorch based and gets really exceptional performance on condensed matter systems. Currently NEQUIP runs MD with LAMMPS which is a pain.

failed to equip `nnpops` with error

Hi developers of openmm-ml,

I was trying to run a small MD simulation completely based on ani2x potentials but the system fails to be created at the potential.createSystem(modeller.topology) step throwing this error:

/home/joshua/anaconda3/envs/openmm_torch_py39/lib/python3.9/site-packages/torchani/resources/
failed to equip nnpops with error: /home/joshua/anaconda3/envs/openmm_torch_py39/lib/python3.9/site-packages/NNPOps/libNNPOpsPyTorch.so: undefined symbol: _ZNK3c106IValue12isDoubleListEv

I am trying to run it on a small protein of 10 amino acids in length, with the default solvation parameters in the openmm tutorial.
modeller.addSolvent(forcefield=forcefield, padding=1.0*nanometers)

Is this error due to unrecognized elements in the protein PDB or possibly something else?

I have attached the protein PDB file here for reference.

Packages:
openmm 7.7.0
openmmml 1.0
torchani 2.2.2
nnpops 0.2

Chignolin.zip

Problem with running ANI simulations

Hi, I've opened an issue in openmm/openmm-torch#26 about running simulations with ANI potentials but I thought I'd submit a ticket here since it might be more appropriate. I tried following the example in the docstring for creating a mixed system and run the code below.

prmtop = AmberPrmtopFile("molecule.prmtop")
inpcrd = AmberInpcrdFile("molecule.rst7")

potential = MLPotential('ani2x')
ml_atoms  = [atom.index for atom in prmtop.topology.atoms() if atom.residue.name == "MOL"]
mm_system = prmtop.createSystem(nonbondedMethod=NoCutoff)
ml_system = potential.createMixedSystem(prmtop.topology, mm_system, ml_atoms)

integrator = LangevinIntegrator(
    298.15 * unit.kelvin, 
    1 / unit.picosecond, 
    1.0 * unit.femtosecond,
)
simulation = Simulation(
    prmtop.topology,
    ml_system,
    integrator,
    Platform.getPlatformByName("CUDA"),
)
simulation.context.setPositions(inpcrd.positions)
simulation.minimizeEnergy()

But, I get the following error:

---------------------------------------------------------------------------
OpenMMException                           Traceback (most recent call last)
<ipython-input-6-e4ae2128ac25> in <module>
     27 )
     28 simulation.context.setPositions(inpcrd.positions)
---> 29 simulation.minimizeEnergy()

~/anaconda3/envs/openmm-torch/lib/python3.9/site-packages/simtk/openmm/app/simulation.py in minimizeEnergy(self, tolerance, maxIterations)
    126             to how many iterations it takes.
    127         """
--> 128         mm.LocalEnergyMinimizer.minimize(self.context, tolerance, maxIterations)
    129 
    130     def step(self, steps):

~/anaconda3/envs/openmm-torch/lib/python3.9/site-packages/simtk/openmm/openmm.py in minimize(context, tolerance, maxIterations)
  18990             the maximum number of iterations to perform. If this is 0, minimation is continued until the results converge without regard to how many iterations it takes. The default value is 0.
  18991         """
> 18992         return _openmm.LocalEnergyMinimizer_minimize(context, tolerance, maxIterations)
  18993     __swig_destroy__ = _openmm.delete_LocalEnergyMinimizer
  18994 

OpenMMException: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/torchani/aev.py", line 35, in forward
      _2, cell0 = False, unchecked_cast(Tensor, cell)
    if _2:
      aev0 = __torch__.torchani.aev.compute_aev(species, coordinates, self.triu_index, (self).constants(), (7, 16, 112, 32, 896), None, )
             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
      aev = aev0
    else:
  File "code/__torch__/torchani/aev.py", line 110, in compute_aev
  species120 = torch.index_select(species12, 1, even_closer_indices)
  vec1 = torch.index_select(vec, 0, even_closer_indices)
  central_atom_index, pair_index12, sign12, = _7(atom_index122, )
                                              ~~ <--- HERE
  _22 = torch.slice(species120, 0, 0, 9223372036854775807, 1)
  _23 = annotate(List[Optional[Tensor]], [None, pair_index12])
  File "code/__torch__/torchani/aev.py", line 267, in triple_by_molecule
  sorted_local_index12 = torch.index(_85, _86)
  _87 = torch.index_select(_78(counts, ), 0, pair_indices)
  sorted_local_index120 = torch.add_(sorted_local_index12, _87, alpha=1)
                          ~~~~~~~~~~ <--- HERE
  _88 = annotate(List[Optional[Tensor]], [sorted_local_index120])
  local_index12 = torch.index(rev_indices, _88)

Traceback of TorchScript, original code (most recent call last):
  File "/home/jsetiadi/anaconda3/envs/openmm-torch/lib/python3.9/site-packages/torchani/aev.py", line 301, in compute_aev

    # compute angular aev
    central_atom_index, pair_index12, sign12 = triple_by_molecule(atom_index12)
                                               ~~~~~~~~~~~~~~~~~~ <--- HERE
    species12_small = species12[:, pair_index12]
    vec12 = vec.index_select(0, pair_index12.view(-1)).view(2, -1, 3) * sign12.unsqueeze(-1)
  File "/home/jsetiadi/anaconda3/envs/openmm-torch/lib/python3.9/site-packages/torchani/aev.py", line 246, in triple_by_molecule
    mask = (torch.arange(intra_pair_indices.shape[2], device=ai1.device) < pair_sizes.unsqueeze(1)).flatten()
    sorted_local_index12 = intra_pair_indices.flatten(1, 2)[:, mask]
    sorted_local_index12 += cumsum_from_zero(counts).index_select(0, pair_indices)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE

    # unsort result from last part
RuntimeError: The size of tensor a (697) must match the size of tensor b (717) at non-singleton dimension 1

Is this a problem with how I set up my code above through the openmm-ml plugin or is this a bug in TorchANI?

molecule.zip

Errors while installing OpenMM 8.0

Hi openmm-ml developers,
I was trying to install the beta release of OpenMM 8.0 and I created a conda environment with compatible versions of all the packages as described here (conda env create mmh/openmm-8-beta-linux). When trying to run the openmm-ml tests (TestMLPotential.py) I got several error messages, eg.

E RuntimeError: Tried to instantiate class 'cuaev.CuaevComputer', but it does not exist! Ensure that it is registered via torch::class_

I also attach the full error log here: error_log.txt
Is there something I can do to fix this?

`TorchForce` inside `CustomCVForce` fails with `nnpops` implementation

I have been trying to create a minimally-modified implementation to get nnpops working on a conda-installable openmm-ml environment, and i am encountering an error in TestMLPotential.py with the implementation.

If you drop in/replace the anipotential.py with this and run the test, (making sure to set the platform to CUDA), you should see that the test fails on line 26,

======================================================================
ERROR: testCreateMixedSystem (__main__.TestMLPotential)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/lila/home/rufad/github/openmm-ml/test/TestMLPotential.py", line 26, in testCreateMixedSystem
    interpEnergy1 = interpContext.getState(getEnergy=True).getPotentialEnergy().value_in_unit(unit.kilojoules_per_mole)
  File "/home/rufad/miniconda3/envs/omm_dev2/lib/python3.9/site-packages/openmm/openmm.py", line 12318, in getState
    state = _openmm.Context_getState(self, types, enforcePeriodicBox, groups_mask)
openmm.OpenMMException: Error invoking kernel: CUDA_ERROR_INVALID_HANDLE (400)

meaning that the createMixedSystem(..., interpolate=False) is compatible with nnpops, but interpolate=True is not.
I have found a way around this (by taking the TorchForce out of the customCVForce and adding a scale parameter to the ANIForce manually), but I don't want to open a PR attempting to merge my modifications unless the compatibility issue with nnpops can't be fixed an easier way.

Allow use of NNPOps when creating an ANIForce

`TestMLPotential.py` fails with `nnpops` implementation

I'm not too familiar with torch tracebacks, but it seems like Torch isn't robust to the placement of arrays onto different devices:

ERROR: testCreateMixedSystem (__main__.TestMLPotential)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/lila/home/rufad/github/openmm-ml/test/TestMLPotential.py", line 27, in testCreateMixedSystem
    mixedEnergy = mixedContext.getState(getEnergy=True).getPotentialEnergy().value_in_unit(unit.kilojoules_per_mole)
  File "/home/rufad/miniconda3/envs/openmm-dev/lib/python3.9/site-packages/openmm/openmm.py", line 14580, in getState
    state = _openmm.Context_getState(self, types, enforcePeriodicBox, groups_mask)
openmm.OpenMMException: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/openmmml/models/anipotential/___torch_mangle_14.py", line 36, in forward
      _5 = torch.mul(boxvectors1, 10.)
      pbc0 = self.pbc
      _6, energy1, = (model0).forward(_4, _5, pbc0, )
                      ~~~~~~~~~~~~~~~ <--- HERE
      energy = energy1
    energyScale = self.energyScale
  File "code/__torch__/NNPOps/OptimizedTorchANI.py", line 19, in forward
    species_aevs = (aev_computer).forward(species_coordinates0, cell, pbc, )
    neural_networks = self.neural_networks
    species_energies = (neural_networks).forward(species_aevs, )
                        ~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    energy_shifter = self.energy_shifter
    species_energies0 = (energy_shifter).forward(species_energies, None, None, )
  File "code/__torch__/NNPOps/BatchedNN.py", line 10, in forward
    species_aev: Tuple[Tensor, Tensor]) -> __torch__.NNPOps.EnergyShifter.SpeciesEnergies:
    _0 = getattr(self, "0")
    return (_0).forward(species_aev, )
            ~~~~~~~~~~~ <--- HERE
  def __len__(self: __torch__.NNPOps.BatchedNN.TorchANIBatchedNN) -> int:
    return 1
  File "code/__torch__/NNPOps/BatchedNN.py", line 33, in forward
    layer0_weights = self.layer0_weights
    layer0_biases = self.layer0_biases
    vectors0 = ops.NNPOpsBatchedNN.BatchedLinear(vectors, layer0_weights, layer0_biases)
               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    vectors1 = __torch__.torch.nn.functional.celu(vectors0, 0.10000000000000001, False, )
    layer2_weights = self.layer2_weights

Traceback of TorchScript, original code (most recent call last):
  File "/home/rufad/miniconda3/envs/openmm-dev/lib/python3.9/site-packages/openmmml-1.0-py3.9.egg/openmmml/models/anipotential.py", line 135, in forward
                    self.pbc = self.pbc.to(positions.device)
                    boxvectors = boxvectors.to(torch.float32)
                    _, energy = self.model((self.species, positions), cell=10.0*boxvectors, pbc=self.pbc)
                                ~~~~~~~~~~ <--- HERE

                return energy * self.energyScale # Hartree --> kJ/mol
  File "/home/rufad/miniconda3/envs/openmm-dev/lib/python3.9/site-packages/NNPOps/OptimizedTorchANI.py", line 53, in forward
        species_coordinates = self.species_converter(species_coordinates)
        species_aevs = self.aev_computer(species_coordinates, cell=cell, pbc=pbc)
        species_energies = self.neural_networks(species_aevs)
                           ~~~~~~~~~~~~~~~~~~~~ <--- HERE
        species_energies = self.energy_shifter(species_energies)

  File "/home/rufad/miniconda3/envs/openmm-dev/lib/python3.9/site-packages/NNPOps/BatchedNN.py", line 122, in forward
    def forward(self, species_aev: Tuple[Tensor, Tensor]) -> SpeciesEnergies:
        return self[0].forward(species_aev)
               ~~~~~~~~~~~~~~~ <--- HERE
  File "/home/rufad/miniconda3/envs/openmm-dev/lib/python3.9/site-packages/NNPOps/BatchedNN.py", line 99, in forward
        vectors = aev.unsqueeze(-2).unsqueeze(-1)

        vectors = batchedLinear(vectors, self.layer0_weights, self.layer0_biases) # Linear 0
                  ~~~~~~~~~~~~~ <--- HERE
        vectors = F.celu(vectors, alpha=0.1)                                      # CELU   1
        vectors = batchedLinear(vectors, self.layer2_weights, self.layer2_biases) # Linear 2
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper__bmm)


----------------------------------------------------------------------
Ran 1 test in 32.967s

FAILED (errors=1)

@peastman , any idea what is going wrong here? or perhaps @raimis knows what is wrong.

alternatively, if i try to run this without GPUs, it throws a runtime error:

======================================================================
ERROR: testCreateMixedSystem (__main__.TestMLPotential)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/lila/home/rufad/github/openmm-ml/test/TestMLPotential.py", line 17, in testCreateMixedSystem
    mixedSystem = potential.createMixedSystem(pdb.topology, mmSystem, mlAtoms, interpolate=False)
  File "/home/rufad/miniconda3/envs/openmm-dev/lib/python3.9/site-packages/openmmml-1.0-py3.9.egg/openmmml/mlpotential.py", line 265, in createMixedSystem
    self._impl.addForces(topology, newSystem, atomList, forceGroup, **args)
  File "/home/rufad/miniconda3/envs/openmm-dev/lib/python3.9/site-packages/openmmml-1.0-py3.9.egg/openmmml/models/anipotential.py", line 91, in addForces
    model = OptimizedTorchANI(model, species).to(device)
  File "/home/rufad/miniconda3/envs/openmm-dev/lib/python3.9/site-packages/torch/nn/modules/module.py", line 899, in to
    return self._apply(convert)
  File "/home/rufad/miniconda3/envs/openmm-dev/lib/python3.9/site-packages/torch/nn/modules/module.py", line 570, in _apply
    module._apply(fn)
  File "/home/rufad/miniconda3/envs/openmm-dev/lib/python3.9/site-packages/torch/nn/modules/module.py", line 616, in _apply
    self._buffers[key] = fn(buf)
  File "/home/rufad/miniconda3/envs/openmm-dev/lib/python3.9/site-packages/torch/nn/modules/module.py", line 897, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
  File "/home/rufad/miniconda3/envs/openmm-dev/lib/python3.9/site-packages/torch/cuda/__init__.py", line 214, in _lazy_init
    torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available

do we generally want to make this package robust to the platform type? or only to CUDA?

Issue with StateDataReporter

Hi,

Wonderful tool! I have been using it to run simulations of ligands in aqueous solution and so far so good.

There is one small issue, however, that I have been facing, which has been preventing me to obtain important data from the simulations. Whenever I include the following line to print properties of the system:

sim.reporters.append(mm.app.StateDataReporter(stdout, 1000, step=True, time=True, potentialEnergy=True, kineticEnergy=True, totalEnergy=True, temperature=True, volume=True, density=True))

I get this error:

Traceback (most recent call last): File "ani.py", line 52, in <module> sim.step(10000) File "/home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/simtk/openmm/app/simulation.py", line 132, in step self._simulate(endStep=self.currentStep+steps) File "/home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/simtk/openmm/app/simulation.py", line 234, in _simulate self._generate_reports(wrapped, True) File "/home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/simtk/openmm/app/simulation.py", line 252, in _generate_reports state = self.context.getState(getPositions=getPositions, getVelocities=getVelocities, getForces=getForces, File "/home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/simtk/openmm/openmm.py", line 4888, in getState state = _openmm.Context_getState(self, types, enforcePeriodicBox, groups_mask) simtk.openmm.OpenMMException: The autograd engine was called while holding the GIL. If you are using the C++ API, the autograd engine is an expensive operation that does not require the GIL to be held so you should release it with 'pybind11::gil_scoped_release no_gil;'. If you are not using the C++ API, please report a bug to the pytorch team. Exception raised from execute at /tmp/pip-req-build-2handpz9/torch/csrc/autograd/python_engine.cpp:111 (most recent call first): frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6a (0x2b9fba3dcdba in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/torch/lib/libc10.so) frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xd8 (0x2b9fba3d9338 in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/torch/lib/libc10.so) frame #2: torch::autograd::python::PythonEngine::execute(std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator<torch::autograd::Edge> > const&) + 0x122 (0x2ba019cb5452 in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/torch/lib/libtorch_python.so) frame #3: <unknown function> + 0x2e900f2 (0x2b9f798150f2 in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so) frame #4: torch::autograd::backward(std::vector<at::Tensor, std::allocator<at::Tensor> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, c10::optional<bool>, bool, std::vector<at::Tensor, std::allocator<at::Tensor> > const&) + 0x6a (0x2b9f79815afa in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so) frame #5: <unknown function> + 0x3497f56 (0x2b9f79e1cf56 in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so) frame #6: at::Tensor::_backward(c10::ArrayRef<at::Tensor>, c10::optional<at::Tensor> const&, c10::optional<bool>, bool) const + 0x1c9 (0x2b9f77ed9db9 in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so) frame #7: TorchPlugin::OpenCLCalcTorchForceKernel::execute(OpenMM::ContextImpl&, bool, bool) + 0x8c1 (0x2ba00ad71941 in /home/jm4g18/miniconda3/envs/torchml/lib/plugins/libOpenMMTorchOpenCL.so) frame #8: OpenMM::ContextImpl::calcForcesAndEnergy(bool, bool, int) + 0xca (0x2b9f5c39ac8a in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/simtk/openmm/../../../../libOpenMM.so.7.5) frame #9: OpenMM::Context::getState(int, bool, int) const + 0x15a (0x2b9f5c39978a in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/simtk/openmm/../../../../libOpenMM.so.7.5) frame #10: <unknown function> + 0x169148 (0x2b9f5c168148 in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/simtk/openmm/_openmm.cpython-39-x86_64-linux-gnu.so) <omitting python frames> frame #34: __libc_start_main + 0xf5 (0x2b9f553c4555 in /lib64/libc.so.6)

Something seems to go wrong when using the getState method of Context, independently of the platform I use. Any idea of what is going on?

Thank you!

Best,
João

Dynamics failing for pure ML system

Hi everyone!

I have been following the example outlined in the test and wanted to run a short MD simulation with these systems.
This works great with the potential.createMixedSystem, but I ran into issues using a pure ANI system created with potential.createSystem since the simulation blows up within a few fs.

Am I missing something? Is this not the intended use case for createSystem?
I have attached a sample script to reproduce the error.

Supported NNP architectures

An initial goal is to support the NNP architectures provided by these projects:

TorchANI (https://github.com/aiqm/torchani/)
- Available on conda-forge (https://anaconda.org/conda-forge/torchani)
SchNetPack (https://github.com/atomistic-machine-learning/schnetpack/)
- Available on conda-forge (https://anaconda.org/conda-forge/schnetpack)
TorchMD-Net (https://github.com/torchmd/torchmd-net)
- Available on conda-forge

OpenMM-ML 1.1

A bug (#55) had been fixed (#56). It is time for the next release.

Related issues: there are several releases (https://anaconda.org/conda-forge/openmm-ml/files) but no corresponding release notes or even tags in this repo.

Ping: @peastman @mikemhenry

How to properly set up Periodic Box?

Hi, I'm trying to run molecular simulation using ML potential. Here is the code I used:

from sys import stdout
from openmmml import MLPotential
from openmm.app import *
from openmm import *
from openmm.unit import *
from openmm.unit import picosecond, picoseconds

pdb = PDBFile('input_new.pdb')

potential = MLPotential('ani2x')
system = potential.createSystem(pdb.topology)
integrator = LangevinIntegrator(300*kelvin, 1/picosecond, 0.002*picoseconds)
simulation = Simulation(pdb.topology, system, integrator)
simulation.context.setPositions(pdb.positions)
simulation.context.setPeriodicBoxVectors(Vec3(3.0, 0, 0), Vec3(0, 3.0, 0), Vec3(0, 0, 3.0))
simulation.minimizeEnergy()
simulation.reporters.append(PDBReporter('output.pdb', 1000))
simulation.reporters.append(StateDataReporter(stdout, 1000, step=True, potentialEnergy=True, temperature=True))
simulation.step(10000)

input_new.pdb.txt

However, the following error occurred:

/data/miniconda3/envs/opm8/lib/python3.9/site-packages/torchani/__init__.py:55: UserWarning: Dependency not satisfied, torchani.ase will not be available
  warnings.warn("Dependency not satisfied, torchani.ase will not be available")
Warning: importing 'simtk.openmm' is deprecated.  Import 'openmm' instead.
/data/miniconda3/envs/opm8/lib/python3.9/site-packages/torchani/resources/
Traceback (most recent call last):
  File "/data/opm8/bb.py", line 16, in <module>
    simulation.minimizeEnergy()
  File "/data/miniconda3/envs/opm8/lib/python3.9/site-packages/openmm/app/simulation.py", line 137, in minimizeEnergy
    mm.LocalEnergyMinimizer.minimize(self.context, tolerance, maxIterations)
  File "/data/miniconda3/envs/opm8/lib/python3.9/site-packages/openmm/openmm.py", line 8544, in minimize
    return _openmm.LocalEnergyMinimizer_minimize(context, tolerance, maxIterations)
openmm.OpenMMException: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/openmmml/models/anipotential.py", line 31, in forward
      _5 = torch.mul(boxvectors1, 10.)
      pbc = self.pbc
      _6, energy1, = (model0).forward(_4, _5, pbc, )
                      ~~~~~~~~~~~~~~~ <--- HERE
      energy = energy1
    energyScale = self.energyScale
  File "code/__torch__/NNPOps/OptimizedTorchANI.py", line 17, in forward
    species_coordinates0 = (species_converter).forward(species_coordinates, None, None, )
    aev_computer = self.aev_computer
    species_aevs = (aev_computer).forward(species_coordinates0, cell, pbc, )
                    ~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    neural_networks = self.neural_networks
    species_energies = (neural_networks).forward(species_aevs, )
  File "code/__torch__/NNPOps/SymmetryFunctions.py", line 37, in forward
      cell0 = cell
    holder = self.holder
    _4 = ops.NNPOpsANISymmetryFunctions.operation(holder, torch.select(positions, 0, 0), cell0)
         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    radial, angular, = _4
    features = torch.unsqueeze(torch.cat([radial, angular], 1), 0)

Traceback of TorchScript, original code (most recent call last):
  File "/data/miniconda3/envs/opm8/lib/python3.9/site-packages/openmmml/models/anipotential.py", line 125, in forward
                else:
                    boxvectors = boxvectors.to(torch.float32)
                    _, energy = self.model((self.species, 10.0*positions.unsqueeze(0)), cell=10.0*boxvectors, pbc=self.pbc)
                                ~~~~~~~~~~ <--- HERE
            
                return self.energyScale*energy
  File "/data/miniconda3/envs/opm8/lib/python3.9/site-packages/NNPOps/OptimizedTorchANI.py", line 52, in forward
    
        species_coordinates = self.species_converter(species_coordinates)
        species_aevs = self.aev_computer(species_coordinates, cell=cell, pbc=pbc)
                       ~~~~~~~~~~~~~~~~~ <--- HERE
        species_energies = self.neural_networks(species_aevs)
        species_energies = self.energy_shifter(species_energies)
  File "/data/miniconda3/envs/opm8/lib/python3.9/site-packages/NNPOps/SymmetryFunctions.py", line 124, in forward
                    raise ValueError('Only fully periodic systems are supported, i.e. pbc = [True, True, True]')
    
        radial, angular = operation(self.holder, positions[0], cell)
                          ~~~~~~~~~ <--- HERE
        features = torch.cat((radial, angular), dim=1).unsqueeze(0)
    
RuntimeError: Encountered error cudaErrorNotSupported at /home/conda/feedstock_root/build_artifacts/nnpops_1658858275941/work/src/ani/CudaANISymmetryFunctions.cu:43

It seems to be telling me that periodic is not set up correctly, but I've set it by

simulation.context.setPeriodicBoxVectors(Vec3(3.0, 0, 0), Vec3(0, 3.0, 0), Vec3(0, 0, 3.0))

following the example here.

Any idea? Thanks!

Create entry point for potentials

OpenMM-ML uses a plugin architecture for defining potential functions. It supports a few potentials out of the box. Other packages can add support for additional potentials. To do that, the potential has to be registered by calling MLPotential.registerImplFactory(). The user doesn't need to do that directly. Instead, the package can do it automatically when you import it.

This provides a mostly seamless experience for users, but it does have one minor issue: your script has to import the outside library before you can create a MLPotential from it. If you leave out the import, the name of the potential function won't be recognized.

We can fix this by defining an entry point for potential functions. That would let MLPotential automatically discover all available potentials without any other packages needing to by imported first.

Use the MolSSI cookiecutter to initialize the structure of this repo

We should use the MolSSI CMS cookiecutter to initialize the structure of this repo. This will make it easy to integrate CI, documentation, and development environments compliant with best practices.

@peastman : If you haven't used this before, @jaimergp (or potentially @Lnaden) can walk you through the process.

Add installation instructions to README

@peastman @mikemhenry : We should add installation instructions for what/how to conda install to the README, which currently only contains info on how to use (but not install) openmm-ml.

Error with addExclusion()

For certain molecules, like the phenylformate example here:
phenylformate.zip
I get an error when creating a mixedSystem:
ml_system = potential.createMixedSystem(psf.topology, mm_system, ml_atoms, interpolate=True)

TypeError: addExclusion() takes 3 positional arguments but 4 were given

When I look in the code of openmmm-ml:

openmm-ml/openmmml/mlpotential.py

Line 262 in 5b577c2

force.addExclusion(i, j, True)

I see that the particles i and j are passed to the function together with a boolean. Is this really necessary, because the corresponding addExclusion function is only expecting the two particles? (I'm using openmm version 7.7.0)

    def addExclusion(self, particle1, particle2):
        r"""
        addExclusion(self, particle1, particle2) -> int
        Add a particle pair to the list of interactions that should be excluded.

        In many cases, you can use createExclusionsFromBonds() rather than adding each exclusion explicitly.

        Parameters
        ----------
        particle1 : int
            the index of the first particle in the pair
        particle2 : int
            the index of the second particle in the pair

        Returns
        -------
        int
            the index of the exclusion that was added
        """
        return _openmm.CustomNonbondedForce_addExclusion(self, particle1, particle2)

When removing True everything seems to work fine...

ML intermolecular interactions between the ML and MM subregions of the system

We're interested in enhancing the ML/MM method by including intermolecular interactions between the ML and MM regions of the system. There is some discussion on the slack channel for this about which pitfalls we might run into by complicating the method beyond intramolecular interactions, so we should build a few different methods to find the most accurate and cost effective way we can do this without running the whole system in ML.

The easiest and cheapest method is to just include the intermolecular interactions between ML atoms with nearby MM atoms, but not the reverse for MM atoms with nearby ML atoms. The energy of the system is given by

U_{ML/MM}(x_ML, x_MM) = U_{MM}(x_MM) + U_{ML}(x_ML; x_MM) + 0.5 U_{MM}(x_ML; x_MM)

where U_{MM}(x_ML; x_MM) is divided in half to account for the ML interactions that have been included. This should be the cheapest method to include any ML intermolecular interactions because the only difference is including MM atoms in the AEVs of the ML atoms and dividing MM interactions in half. The downside is we are limiting ourselves to half of the accuracy boost that ML could provide.

For the second method, if we assume the ML model learns to split the energy of atom pair interactions evenly between the pair, then we can get the total ML intermolecular interaction energy by building AEVs for the ML region both with and without the MM subregion considered in the local environments of each atom. If we double the difference in energy between those two we get the full interaction energy between the ML/MM region and we can ignore the corresponding MM interactions. John wrote this in a flexible form where we can mix the ML and MM intermolecular interactions by interpolating between the two with a parameter s

U_{ML/MM}(x_ML, x_MM; s) = U_{MM}(x_MM) + U_{ML}(x_ML) + (1-s) U_{MM}(x_ML; x_MM) + 2*s [ U_{ML}(x_ML; x_MM) - U_{ML}(x_ML) ]

where s=0 is the full MM interaction case and s=1 is the full ML interaction case. This will probably be the best method in my opinion. If anything we can regularize the ML model by explicitly training these interactions to be split evenly to correct the model for this if need be.

The last method is to build the AEVs for the minimal number of atoms we need to in the MM subregion to get their ML interaction energies with the ML region. This will be the most costly because we have to build AEVs for the ML atoms, and a subset of the MM atoms. The bonus is we don't need to get interactions for the ML region this time, we can just compute that once with the full local environment. For this method we have to collect the set of atoms in the MM subregion which fall into the neighbor list of all the ML atoms at least once, and then build the AEVs for those MM atoms with and without the ML atoms in their local environments. The energy for the system can be given by

U_{ML/MM}(x_ML, x_MM) = U_{MM}(x_MM) + U_{ML}(x_ML; x_MM) + [ U_{ML}(x_MM; x_ML) - U_{ML}(x_MM) ]

This last method will be the most accurate, but may not be much better than the second method while costing significantly more.

Particle position is NaNs with hybrid system

Dear openmm-ml developers,

I am trying to use this tool to run a Hybrid MM/ML simulation of a single protein chain in solvent with the protein atoms modelled by torchani and NNPOps, and the solvent with tip3pfb, but the simulation seems to be giving errors even after running energy minimisation. I have attached my code and the input file for the protein PDB which also has been energy minimised in Schrodinger.

Hoping to get some help on this and possibly figure out if the issue is due to the PDB itself or a bug in my code...

import sys
from time import perf_counter

import openmm.unit as unit
from openmm import LangevinMiddleIntegrator
from openmm.app import Simulation, StateDataReporter
from openmm.app.dcdreporter import DCDReporter
from openmm.app.forcefield import ForceField
from openmm.app.pdbfile import PDBFile
from openmm.app.pdbreporter import PDBReporter
from openmm.unit import femtosecond, kelvin, kilojoule_per_mole, picosecond
from openmmml import MLPotential
from pdbfixer import PDBFixer



# Load in the PDB strucure
fixer = PDBFixer(filename="./systems/4KD1_protein_only.pdb")
fixer.addSolvent(padding=10 * unit.angstrom)
fixer.removeHeterogens(True)
PDBFile.writeFile(fixer.topology, fixer.positions, open("./systems/4KD1_protein_only_solvated.pdb", "w"))
pdb = PDBFile(file="./systems/4KD1_protein_only_solvated.pdb")



# Setup force field
forcefield = ForceField("amber14-all.xml", "amber14/tip3pfb.xml")
pdb.system = forcefield.createSystem(pdb.topology)



# Select protein atoms
amino_acid_residues = {
    "ALA", "ASN", "CYS", "GLU", "HIS", "LEU", "MET", "PRO", "THR", "TYR", "ARG", "ASP", "GLN", "GLY", "ILE", "LYS", "PHE",
    "SER", "TRP", "VAL", "ACE", "NME", "NMA"
}
protein_atoms = set()

for residue in pdb.topology.residues():
    if residue.name in amino_acid_residues:
        for atom in residue.atoms():
            protein_atoms.add(atom.index)



# Create ML potential (single model to fit into GPU VRAM)
potential = MLPotential('ani2x')
pdb.system = potential.createMixedSystem(pdb.topology, pdb.system, protein_atoms, modelIndex=0)



# Setup simulation
n_fs = 1

temperature = 298.15 * kelvin
frictionCoeff = 1 / picosecond
timeStep = n_fs * femtosecond
integrator = LangevinMiddleIntegrator(temperature, frictionCoeff, timeStep)

simulation = Simulation(pdb.topology, pdb.system, integrator)
simulation.context.setPositions(pdb.positions)
simulation.context.setVelocitiesToTemperature(temperature)

reporter = StateDataReporter(file=sys.stdout, reportInterval=1000, step=True, time=True, potentialEnergy=True, temperature=True)
simulation.reporters.append(reporter)

trajectory = DCDReporter(file="./trajectories/4KD1_hybrid_1fs.dcd", reportInterval=10, enforcePeriodicBox=True)
simulation.reporters.append(trajectory)



# Run the simulation
n_steps = int(10_000 / n_fs)
simulation.minimizeEnergy()
simulation.step(n_steps)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[12], line 8
      4 simulation.minimizeEnergy()
      6 start = perf_counter()
----> 8 simulation.step(n_steps)
     10 end = perf_counter()
     12 print(end - start)

File ~/miniforge3/envs/nnff_py310/lib/python3.10/site-packages/openmm/app/simulation.py:141, in Simulation.step(self, steps)
    139 def step(self, steps):
    140     """Advance the simulation by integrating a specified number of time steps."""
--> 141     self._simulate(endStep=self.currentStep+steps)

File ~/miniforge3/envs/nnff_py310/lib/python3.10/site-packages/openmm/app/simulation.py:241, in Simulation._simulate(self, endStep, endTime)
    238 # Generate the reports.
    240 if len(wrapped) > 0:
--> 241     self._generate_reports(wrapped, True)
    242 if len(unwrapped) > 0:
    243     self._generate_reports(unwrapped, False)

File ~/miniforge3/envs/nnff_py310/lib/python3.10/site-packages/openmm/app/simulation.py:263, in Simulation._generate_reports(self, reports, periodic)
    259 state = self.context.getState(getPositions=getPositions, getVelocities=getVelocities, getForces=getForces,
    260                               getEnergy=getEnergy, getParameters=True, enforcePeriodicBox=periodic,
    261                               groups=self.context.getIntegrator().getIntegrationForceGroups())
    262 for reporter, next in reports:
--> 263     reporter.report(self, state)

File ~/miniforge3/envs/nnff_py310/lib/python3.10/site-packages/openmm/app/dcdreporter.py:107, in DCDReporter.report(self, simulation, state)
    102 if self._dcd is None:
    103     self._dcd = DCDFile(
    104         self._out, simulation.topology, simulation.integrator.getStepSize(),
    105         simulation.currentStep, self._reportInterval, self._append
    106     )
--> 107 self._dcd.writeModel(state.getPositions(), periodicBoxVectors=state.getPeriodicBoxVectors())

File ~/miniforge3/envs/nnff_py310/lib/python3.10/site-packages/openmm/app/dcdfile.py:125, in DCDFile.writeModel(self, positions, unitCellDimensions, periodicBoxVectors)
    123     positions = positions.value_in_unit(nanometers)
    124 if any(math.isnan(norm(pos)) for pos in positions):
--> 125     raise ValueError('Particle position is NaN.  For more information, see https://github.com/openmm/openmm/wiki/Frequently-Asked-Questions#nan')
    126 if any(math.isinf(norm(pos)) for pos in positions):
    127     raise ValueError('Particle position is infinite.  For more information, see https://github.com/openmm/openmm/wiki/Frequently-Asked-Questions#nan')

ValueError: Particle position is NaN.  For more information, see https://github.com/openmm/openmm/wiki/Frequently-Asked-Questions#nan

4KD1_protein_only.zip

Atomic Cluster Expansion Force Fields in OpenMM

I am trying to implement the Atomic Cluster Expansion (ACE) force fields into OpenMM using this library. For details of the method see https://doi.org/10.1021/acs.jctc.1c00647

I have the following functions implemented in C:

double energy(char*, double* X, int* Z, double* cell, int* pbc, int Nat);

where
X : position of atoms (flattened)
Z: Atomic number of atoms
cell: simulation cell vectors
pbc: whether to use periodic boundary conditions
Nat: Total number of atoms

void forces(char*, double *F, double* X, int* Z, double* cell, int* pbc, int Nat);

where the arguments are analogous to the energy function, F is just an empty array that is pre-allocated for the results.

I have implemented Python wrappers for these functions, so I would need to get the positions of atoms, the atomic numbers, cell vectors all as numpy arrays.

I would really appreciate if someone with experience in OpenMM data structures could help me a little.

When I am implementing MLPotential , MLPotentialImpl and MLPotentialImplFactory I don't really see how I can obtain the above data needed and how I should return the energies and forces. I made very little progress so far, you can find it in the following repo: https://github.com/ACEsuit/ACEinterfaces/tree/OpenMM/OpenMM

Particle coordinate is NaN

Hi developers:
I'm trying to simulate a ligand in the solvent with the ML potential, but it fails immediately with the error Particle coordinate is NaN., the system runs fine under MM forcefield. Any idea? Thanks!

Code and input:

from sys import stdout
from openmmml import MLPotential
from openmm.app import *
from openmm import *
from openmm.unit import *
from openmm.unit import picosecond, picoseconds

prmtop = AmberPrmtopFile('box.prm7')
inpcrd = AmberInpcrdFile('box.rst7')

mm_system = prmtop.createSystem(nonbondedMethod=PME, nonbondedCutoff=1*nanometer, constraints=None)

res = list(prmtop.topology.residues())
ml_atoms = [atom.index for atom in res[0].atoms()]

potential = MLPotential('ani2x')
system = potential.createMixedSystem(prmtop.topology, mm_system, ml_atoms)

integrator = LangevinIntegrator(300*kelvin, 1/picosecond, 0.002*picoseconds)
simulation = Simulation(prmtop.topology, system, integrator)
simulation.context.setPositions(inpcrd.positions)
simulation.minimizeEnergy()
simulation.reporters.append(PDBReporter('output.pdb', 1000))
simulation.reporters.append(StateDataReporter(stdout, 1000, step=True, potentialEnergy=True, temperature=True))
simulation.step(300000)

temp.zip

Here is the full output:

Traceback (most recent call last):
File "test.py", line 40, in <module>
simulation.step(300000)
File "/root/mambaforge/envs/opm8/lib/python3.9/site-packages/openmm/app/simulation.py", line 141, in step
self._simulate(endStep=self.currentStep+steps)
File "/root/mambaforge/envs/opm8/lib/python3.9/site-packages/openmm/app/simulation.py", line 206, in _simulate
self.integrator.step(10) # Only take 10 steps at a time, to give Python more chances to respond to a control-c.
File "/root/mambaforge/envs/opm8/lib/python3.9/site-packages/openmm/openmm.py", line 12696, in step
return _openmm.LangevinIntegrator_step(self, steps)
openmm.OpenMMException: Particle coordinate is NaN. For more information, see https://github.com/openmm/openmm/wiki/Frequently-Asked-Questions#nan

Interface to AIMNet2

AIMNet2 looks like a good candidate for another model to include. It supports 14 elements and charged molecules. It includes an explicit Coulomb term, which should help with accuracy on large molecules. The current implementation has O(N^2) time and memory in the number of atoms. They say a linear scaling version is coming soon.

Failure to remove CMAPTorsionForce from the ML region

Hi,

When removing all bonded interactions between atoms within the ML region, it seems that the case involving CMAPs is not considered. This situation arises, for example, when using the Amber ff19SB force field. If I execute the following script:

diala_solvated.zip

import openmm
import openmm.unit as unit
import openmm.app as app
from openmmml import MLPotential

inpcrd = app.AmberInpcrdFile("diala_solvated.inpcrd")
prmtop = app.AmberPrmtopFile("diala_solvated.prmtop")

mm_system = prmtop.createSystem(nonbondedMethod=app.PME, nonbondedCutoff=0.9*unit.nanometer, constraints=app.HBonds)
ml_atoms = [atom.index for res in list(prmtop.topology.residues())[:3] for atom in list(res.atoms())]

potential = MLPotential("ani2x")
ml_system = potential.createMixedSystem(prmtop.topology, mm_system, ml_atoms)

I get the the following KeyError:

Traceback (most recent call last):
  File "/home/joaomorado/software/test/test/ani_simulation.py", line 23, in <module>
    ml_system = potential.createMixedSystem(prmtop.topology, mm_system, ml_atoms)
  File "/home/joaomorado/software/openmm/openmmml/mlpotential.py", line 247, in createMixedSystem
    newSystem = self._removeBonds(system, atoms, True, removeConstraints)
  File "/home/joaomorado/software/openmm-ml/openmmml/mlpotential.py", line 390, in _removeBonds
    torsionAtoms = [int(torsion.attrib[p]) for p in ('p1', 'p2', 'p3', 'p4')]
  File "/home/joaomorado/software/openmm-ml/openmmml/mlpotential.py", line 390, in <listcomp>
    torsionAtoms = [int(torsion.attrib[p]) for p in ('p1', 'p2', 'p3', 'p4')]
KeyError: 'p1'

In principle, resolving this issue should involve changing from:

openmm-ml/openmmml/mlpotential.py

Lines 388 to 392 in bc414b4

    
           for torsions in root.findall('./Forces/Force/Torsions'): 
        
               for torsion in torsions.findall('Torsion'): 
        
                   torsionAtoms = [int(torsion.attrib[p]) for p in ('p1', 'p2', 'p3', 'p4')] 
        
                   if shouldRemove(torsionAtoms): 
        
                       torsions.remove(torsion)

to:

for torsions in root.findall('./Forces/Force/Torsions'):
    for torsion in torsions.findall('Torsion'):
        torsionAtomsLabels =  ('p1', 'p2', 'p3', 'p4') if all(p in torsion.attrib for p in ('p1', 'p2', 'p3', 'p4')) else ('a1', 'a2', 'a3', 'a4', 'b1', 'b2', 'b3', 'b4')
        torsionAtoms = [int(torsion.attrib[p]) for p in torsionAtomsLabels]
        if shouldRemove(torsionAtoms):
            torsions.remove(torsion)

I am happy to do a PR with this fix, but I am wondering if there are any other cases to consider.

Many thanks.

Challenges with Hybrid OpenMM-ML Simulation: Unable to Parametrize Ligand with Force Field

Context: As a biology student from Brazil, I'm not typically involved in programming but rather in using bioinformatics tools. I am seeking to learn how to conduct hybrid simulations with ligands/proteins using ANI in OpenMM-ML, where these ligands can be carbohydrates or small molecules. The challenge I face is that to apply ANI to the ligand, the entire system must be parametrized with a classical force field, which has been problematic. Additionally, the PDB files I use are complexes of the protein with the ligand. I would greatly appreciate guidance on how to properly set up the force field for the ligand in such simulations.

Problem Description

I am attempting to conduct a hybrid simulation using OpenMM, where the ligand is treated with the ANI potential (through OpenMM-ML) and the protein with a classical force field (I've tried both CHARMM and AMBER). However, I am consistently encountering issues in parametrizing the force field for the ligand. Whether it's a small molecule or a carbohydrate, the force field (including GAFF, Glycam, Amber or Charmm) fails to recognize the ligand.

Approaches Attempted

Using GAFF with OpenFF Toolkit: I tried generating templates for the ligand using GAFF and the OpenFF Toolkit, but faced issues with ValueError: No template found for residue 1 (LIG).
code: molecule = Molecule.from_file('/home/user/Documentos/ligand.sdf', file_format='sdf')
gaff = GAFFTemplateGenerator(molecules=molecule)
forcefield = ForceField('amber14-all.xml','amber/tip3p_HFE_multivalent.xml')
forcefield.registerTemplateGenerator(gaff.generator)
pdbfile = PDBFile('/home/user/Documentos/test.pdb')
system = forcefield.createSystem(pdbfile.topology, nonbondedMethod=PME, nonbondedCutoff=1*nanometer, constraints=HBonds)

-The ligand.sdf file is the .sdf version of the ligand alone from the test.pdb complex (which has ligand + protein).

Using Classical Force Fields (CHARMM and AMBER): I attempted to create systems with classical force fields for the protein but kept running into the same template issue for the ligand.
Using GLYCAM: Parametrizing carbohydrates for the GLYCAM force field in PDB files has been challenging. Modifying the file manually to suit GLYCAM seems intricate, and I haven't found a tool that simplifies this process. The GLYCAM force field doesn't recognize the carbohydrates in the PDB files I've used.

My goal is not to run a specific system in OpenMM-ML (though I was initially advised to try with the PDB complex: 3QUD), but to learn how to perform hybrid simulations of ligands/proteins with ANI. Any guidance on how to properly parametrize the entire system with a classical force field before applying ANI to the ligand, and how to overcome the current challenges, would be immensely helpful.

My system: Ubuntu 22.04
All packages installed in a conda environment with python 3.10, including: cudatoolkit 11.8.0 ; pytorch 2.0.0 with cuda support; openmm 8.0.0; openmm-ml 1.1; openmm-setup 1.3; openmm-torch 1.4; openmmforcefields 0.12.0; torchani 2.2.3

Models requiring more information

createSystem() takes a Topology as its input. That's fine for ANI, but some models will require more information. We'll definitely need formal charges, and we might need hybridization states or bond orders. We need to extend the API in some way to allow this information to be determined. Here are some ideas for approaches we might take. These aren't exclusive. We could do more than one of them.

We could extend Topology to store more information. It already can store bond orders, and it would be easy to add formal charges. On its own, this is only one piece of a solution. It still leaves the problem of where to get the information from. None of the standard file formats OpenMM imports contains it. That's why in practice Topologies never specify bond orders, even though in principle they can.

We could copy the approach used by ForceField. It would be easy to create a pseudo-forcefield that would fill in chemical information for standard residues by matching templates. Nonstandard ones could be specified manually similar to the way SMIRNOFFTemplateGenerator works.

Another possibility is to allow createSystem() to accept an OpenFF Topology in place of the OpenMM Topology. It already provides mechanisms for building descriptions of the relevant information. Models that don't need additional information would work with either type of topology.

We also could try to determine the missing information automatically based on what we do have (elements, bonds, positions). RDKit can do this. In practice I find it isn't very robust, though, so this probably isn't a good idea.

waterbox translation

Hi everyone,

We have been running waterbox simulations and noticed that the CMMotionRemover force is not set by default in the createSystem() method. While calculating diffusion coefficient we noticed that the mean square displacement of waters in the simulation increases exponentially (this happens in simulations with MACE and ANI).

I was expecting some drift, but not necessarily a drift that is steadily increasing. Anyways, adding the CMMotionRemover force to the simulation resulted in expected mean squared displacement. I have added a script to reproduce the issue. Would it make sense to add the CMMotionRemover as default?

mace_msd.zip

`createMixedSystem` can yield unstable dynamics due to CVForce's `mmForce1` (HarmonicBondForce)

I've been playing around with CustomCVForce-equipped openmm.System object returned by openmm_ml.mlpotential.MLPotential.createMixedSystem with ani2x and I'm seeing some nans in my simulations; I dug through the offending force objects that were contributing to unusually large forces (>1e5 kJ/nm*mol). I found that the ML-treated subsystem has atoms with some of these unusually large forces. specifically, I found that the mmForce1 (HarmonicBondForce) see here is contributing to the large forces when lambda_interpolate=0.88888888. since this is a harmonic bond, it suggests that the stiff spring constant is to blame for some reason?

given this, would it make sense to add another global lambda parameter that allows for the softening of these spring constants?

The serialized state of the nanned simulations is attached; however, I cannot attach the TorchForce.pt since it is too large for github
xml.tar.gz

cc: @jchodera @peastman

	for torsions in root.findall('./Forces/Force/Torsions'):
	for torsion in torsions.findall('Torsion'):
	torsionAtoms = [int(torsion.attrib[p]) for p in ('p1', 'p2', 'p3', 'p4')]
	if shouldRemove(torsionAtoms):
	torsions.remove(torsion)