Coder Social home page Coder Social logo

open an sdf file about molly.jl HOT 10 OPEN

maartensandbox avatar maartensandbox commented on June 21, 2024
open an sdf file

from molly.jl.

Comments (10)

jgreener64 avatar jgreener64 commented on June 21, 2024 1

Yes there should, I think this a problem with Chemfiles only reading bonds for non-standard residues when there is a CONECT record in the PDB file. I can add a mention to the docs, but in general non-standard residues (including ACE/NME terminal caps) aren't tested yet.

See https://github.com/noeblassel/SINEQSummerSchool2023/blob/main/notebooks/dipeptide_nowater.pdb for an alanine dipeptide that reads in okay:

using Molly
ff_dir = joinpath(dirname(pathof(Molly)), "..", "data", "force_fields")
ff = MolecularForceField(joinpath.(ff_dir, ["ff99SBildn.xml", "tip3p_standard.xml"])...)
sys = System("dipeptide_nowater.pdb", ff; rename_terminal_res=false)

from molly.jl.

maartensandbox avatar maartensandbox commented on June 21, 2024 1

that clarifies some stuff! I'm really struggling with basic things but from testing it looks like molly is at least a factor 10 faster than openmm (for my admittedly weird use case).

from molly.jl.

maartensandbox avatar maartensandbox commented on June 21, 2024 1

The plan is to wrap it up in a paper (though the draft is progressing slowly), so molly will definitely be cited! We really only need a way to evaluate energies and manipulate positions. Molly is nice because unlike pythonic packages that bind to optimized C code, I can directly profile and manipulate the system structure.

At the moment I'm only interested in all-to-all, but there are two other projects I want to try molly on, one which requires mixing periodic boundary conditions in one direction with open boundary conditions in another, and another where I'll be using conventional periodic boundary conditions. Thanks a lot for the hard work that went into the package! It was bit rough to get started simulating my first molecule, but after that it's really only been smooth sailing!

from molly.jl.

jgreener64 avatar jgreener64 commented on June 21, 2024

We use Chemfiles for the file reading and residues are not currently supported there for SDF (https://chemfiles.org/chemfiles/latest/formats.html#list-of-supported-formats), so the setup fails.

If you want to read in data from the PDB you can use PDB format, see the example at https://juliamolsim.github.io/Molly.jl/stable/docs/#Simulating-a-protein, though currently atom names must exactly match the residue templates. This is a topic of future work.

from molly.jl.

maartensandbox avatar maartensandbox commented on June 21, 2024

And with a PDB file, how can I make the atom names match exactly while also indicating that multiple hydrogen atoms are connected to the same molecule? For example, if we follow the last example from https://www.biostat.jhsph.edu/~iruczins/teaching/260.655/links/pdbformat.pdf we see that certain hydrogen atoms will be named as "1HG1". I take it that's not supported, and I should manually place the atoms and fix up the topology? Is there yet another format that is better supported? Do hydrogen atoms follow a different convention in molly.jl?

from molly.jl.

jgreener64 avatar jgreener64 commented on June 21, 2024

The residue template in the force field xml file defines the atom names (which currently must be matched exactly) and the connectivity between them. For example for alanine in ff99SBildn.xml:

    <Residue name="ALA">
      <Atom name="N" type="N" charge="-0.4157"/>
      <Atom name="H" type="H" charge="0.2719"/>
      <Atom name="CA" type="CT" charge="0.0337"/>
      <Atom name="HA" type="H1" charge="0.0823"/>
      <Atom name="CB" type="CT" charge="-0.1825"/>
      <Atom name="HB1" type="HC" charge="0.0603"/>
      <Atom name="HB2" type="HC" charge="0.0603"/>
      <Atom name="HB3" type="HC" charge="0.0603"/>
      <Atom name="C" type="C" charge="0.5973"/>
      <Atom name="O" type="O" charge="-0.5679"/>
      <Bond atomName1="N" atomName2="H"/>
      <Bond atomName1="N" atomName2="CA"/>
      <Bond atomName1="CA" atomName2="HA"/>
      <Bond atomName1="CA" atomName2="CB"/>
      <Bond atomName1="CA" atomName2="C"/>
      <Bond atomName1="CB" atomName2="HB1"/>
      <Bond atomName1="CB" atomName2="HB2"/>
      <Bond atomName1="CB" atomName2="HB3"/>
      <Bond atomName1="C" atomName2="O"/>
      <ExternalBond atomName="N"/>
      <ExternalBond atomName="C"/>
    </Residue>

The PDB file would have to have ALA as the residue name and every atom name appearing exactly once for a residue to match this template. Future work will allow flexibility in residue names and atom names by searching for the best template.

At the minute you are probably best off removing all hydrogens and adding them back with OpenMM, which will give atom names consistent with OpenMM XML force field files. All residues will need a template available, though that is the case with most software.

from molly.jl.

maartensandbox avatar maartensandbox commented on June 21, 2024

I think I correctly implemented alanine for use with ff99

ATOM      1  H1  ACE     1       2.000   1.000  -0.000
ATOM      2  CH3 ACE     1       2.000   2.090   0.000
ATOM      3  H2  ACE     1       1.486   2.454   0.890
ATOM      4  H3  ACE     1       1.486   2.454  -0.890
ATOM      5  C   ACE     1       3.427   2.641  -0.000
ATOM      6  O   ACE     1       4.391   1.877  -0.000
ATOM      7  N   ALA     2       3.555   3.970  -0.000
ATOM      8  H   ALA     2       2.733   4.556  -0.000
ATOM      9  CA  ALA     2       4.853   4.614  -0.000
ATOM     10  HA  ALA     2       5.408   4.316   0.890
ATOM     11  CB  ALA     2       5.661   4.221  -1.232
ATOM     12 HB1  ALA     2       5.123   4.521  -2.131
ATOM     13 HB2  ALA     2       6.630   4.719  -1.206
ATOM     14 HB3  ALA     2       5.809   3.141  -1.241
ATOM     15  C   ALA     2       4.713   6.129   0.000
ATOM     16  O   ALA     2       3.601   6.653   0.000
ATOM     17  N   NME     3       5.846   6.835   0.000
ATOM     18  H   NME     3       6.737   6.359  -0.000
ATOM     19  CH3 NME     3       5.846   8.284   0.000
ATOM     20 HH31 NME     3       4.819   8.648   0.000
ATOM     21 HH32 NME     3       6.360   8.648   0.890
ATOM     22 HH33 NME     3       6.360   8.648  -0.890
TER
END

However, when I create the system and look at the interactions:
sys = System("alanine.pdb",ff,boundary = CubicBoundary(4.0u"nm"), loggers=(coords=CoordinateLogger(50),),rename_terminal_res = false)

then I only find sys.specific_inter_lists to contain interactions between atoms in the ALA group. Shouldn't there also be a harmonicbond force between for example the C and O atoms in the ACE group?

from molly.jl.

maartensandbox avatar maartensandbox commented on June 21, 2024

One more issue:

using Molly
ff = MolecularForceField("ff14SB.xml");

sys = System("big_alanine.pdb",ff,boundary = CubicBoundary(1000.0u"nm"),rename_terminal_res = false)
total_energy(sys,find_neighbors(sys))

prints 1128.45113918646 kJ mol⁻¹

the python equivalent

import openmm
from openmm import app, unit
forcefield = app.ForceField('amber/ff14SB.xml')
f = app.PDBFile('big_alanine.pdb')
system = forcefield.createSystem(f.getTopology(), nonbondedMethod=app.NoCutoff)
integrator = openmm.LangevinMiddleIntegrator(310 * unit.kelvin,1.0 / unit.picosecond,1 * unit.femtosecond)
sim = app.Simulation(f.getTopology(), system, integrator)
sim.context.setPositions(f.getPositions())
state = sim.context.getState(getEnergy=True)
state.getPotentialEnergy()

prints Quantity(value=547.6336059570312, unit=kilojoule/mole)

Other quantities are also incorrect, so presumably my file is still being read in incorrectly? Or was a different convention used between openmm and molly?

big_alanine.pdb.txt

from molly.jl.

jgreener64 avatar jgreener64 commented on June 21, 2024

It seems okay, the equivalent OpenMM call would be:

import openmm
from openmm import app, unit

forcefield = app.ForceField('amber/ff14SB.xml')
f = app.PDBFile('big_alanine.pdb')

system = forcefield.createSystem(
    f.getTopology(),
    nonbondedMethod=app.CutoffPeriodic,
    nonbondedCutoff=1*unit.nanometer,
    constraints=None,
    rigidWater=False,
    switchDistance=None,
    useDispersionCorrection=False,
)

integrator = openmm.LangevinMiddleIntegrator(310 * unit.kelvin,1.0 / unit.picosecond,1 * unit.femtosecond)
sim = app.Simulation(f.getTopology(), system, integrator)
sim.context.setPositions(f.getPositions())
state = sim.context.getState(getEnergy=True)
state.getPotentialEnergy()
Quantity(value=1128.4508056640625, unit=kilojoule/mole)

Which is much better, there is still a 3E-4 kJ/mol discrepancy.

Note you need a line like

CRYST1  100.000  100.000  100.000  90.00  90.00  90.00 P 1           1

in the PDB file for CutoffPeriodic to work.

from molly.jl.

jgreener64 avatar jgreener64 commented on June 21, 2024

Cool, I'd be interested to hear roughly what you are doing if you are able to share, we might be able to add more features and docs to help.

Btw make sure you are on Molly.jl v0.18.2 or later as that version had a significant performance improvement for periodic boundary conditions.

from molly.jl.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.