amber-md / cpptraj

Biomolecular simulation trajectory/data analysis.

License: Other

Makefile 0.30% C++ 70.32% C 19.24% Fortran 3.64% Shell 3.06% Awk 0.11% Cuda 0.63% CMake 2.09% Batchfile 0.03% Roff 0.58%

cpptraj's Introduction

CPPTRAJ

Fast, parallelized molecular dynamics trajectory data analysis.

Description

CPPTRAJ is a program designed to process and analyze molecular dynamics trajectories and relevant data sets derived from their analysis. CPPTRAJ supports many popular MD software packages including Amber, CHARMM, Gromacs, and NAMD.

CPPTRAJ is also distributed as part of the freely available AmberTools software package. The official AmberTools release version of CPPTRAJ can be found at the Amber website.

For those wanting to use CPPTRAJ in their Python scripts, see Pytraj.

See what's new in CPPTRAJ. For those just starting out you may want to check out some CPPTRAJ tutorials or Amber-Hub which contains many useful "recipes" for CPPTRAJ.

For more information (or to cite CPPTRAJ) see the following publication:

Daniel R. Roe and Thomas E. Cheatham, III, "PTRAJ and CPPTRAJ: Software for Processing and Analysis of Molecular Dynamics Trajectory Data". J. Chem. Theory Comput., 2013, 9 (7), pp 3084-3095.

For more information regarding trajectory/ensemble parallelism via MPI in CPPTRAJ see the following publication:

Daniel R. Roe and Thomas E. Cheatham, III, "Parallelization of CPPTRAJ enables large scale analysis of molecular dynamics trajectory data". J. Comp. Chem., 2018, DOI: 10.1002/jcc.25382.

Disclaimer and Copyright

CPPTRAJ is Copyright (c) 2010-2023 Daniel R. Roe. The terms for using, copying, modifying, and distributing CPPTRAJ are specified in the file LICENSE.

Documentation

The /doc subdirectory contains PDF and LyX versions of the CPPTRAJ manual. The latest version of the manual is available for download here. An HTML version can be found here. There is also limited help for commands in interactive mode via help [<command>]; help with no arguments lists all known commands.

Doxygen code documentation can be generated with the command make docs. A limited developers guide is available here and limited HTML-formatted documentation is available here.

Some examples are available in the examples subdirectory.

Installation & Testing

Run ./configure --help for a short list of configure options. ./configure --full-help will list all available configure options. For full functionality, CPPTRAJ makes use of the following libraries:

  • NetCDF
  • BLAS
  • LAPACK
  • Gzip
  • Bzip2
  • Parallel NetCDF (-mpi build only, for NetCDF trajectory output in parallel)
  • CUDA (-cuda build only)
  • HIP (-hip build only)
  • FFTW (mostly optional; required for PME functionality and very large FFTs)

CPPTRAJ also makes use of the following libraries that are bundled with CPPTRAJ. External ones can be used in place of these if desired.

  • ARPACK; without this diagonalization of sparse matrices in diagmatrix will be slow.
  • helPME by Andy Simmonett, required for PME functionality.
  • XDR for reading GROMACS XTC trajectories.
  • TNG for reading GROMACS TNG trajectories.

C++11 support is required to enable particle mesh Ewald (PME) calculation support via helPME. CPPTRAJ also uses the PCG32 and Xoshiro 128++ pseudo-random number generators.

./configure gnu should be adequate to set up compilation for most systems. For systems without BLAS/LAPACK, FFTW, and/or NetCDF libraries installed, CPPTRAJ's configure can attempt to download and install any enabled library into $CPPTRAJHOME. By default CPPTRAJ will ask if these should be installed; the '--buildlibs' option can be specified to try to automatically install any missing enabled library. For example, ./configure -fftw3 --buildlibs gnu will tell CPPTRAJ to build missing libraries including FFTW (if it is not available). To prevent CPPTRAJ from asking about building external libraries, use the '--nobuildlibs' option.

If Amber is installed and $AMBERHOME is properly set, the -amberlib flag can be specified to use the libraries already compiled in an AmberTools installation, e.g. ./configure -amberlib gnu.

For multicore systems, the -openmp flag can be specified to enable OpenMP parallelization, e.g. ./configure -openmp gnu. An MPI-parallelized version of CPPTRAJ can also be built using the -mpi flag. CPPTRAJ can be built with both MPI and OpenMP; when running this build users should take care to properly set OMP_NUM_THREADS if using more than 1 MPI process per node (the number of processes * threads should not be greater than the number of physical cores on the machine).
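The core-count rule above can be turned into a quick check. This is a minimal sketch and not part of CPPTRAJ; the function name and its arguments are hypothetical and only illustrate the arithmetic (MPI processes per node times OpenMP threads should not exceed the physical cores).

```python
def omp_threads_per_process(physical_cores: int, mpi_processes_per_node: int) -> int:
    """Largest thread count such that processes * threads <= physical cores.

    Hypothetical helper for illustration; CPPTRAJ itself does not provide it.
    """
    if physical_cores < 1 or mpi_processes_per_node < 1:
        raise ValueError("core and process counts must be positive")
    if mpi_processes_per_node > physical_cores:
        raise ValueError("more MPI processes than physical cores")
    # Integer division rounds down, so the product never exceeds the core count.
    return physical_cores // mpi_processes_per_node


# Example: a 16-core node running 4 MPI processes.
print(omp_threads_per_process(16, 4))
```

The returned value is what one would export as OMP_NUM_THREADS before launching the MPI+OpenMP build.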

A CUDA build is available via the -cuda configure flag and a HIP build via the -hip flag; the two are mutually exclusive. Currently only a few commands benefit from these builds (see the manual for details). By default CPPTRAJ will be configured for multiple shader models; to restrict the CUDA build to a single shader model set the SHADER_MODEL environment variable before running configure.

Any combination of -cuda (or -hip), -mpi, and -openmp may be used. The configure script by default sets everything up to link dynamically. The -static flag can be used to force static linking. If linking errors are encountered you may need to specify library locations using the --with-LIB= options. For example, to use NetCDF compiled in /opt/netcdf use the option --with-netcdf=/opt/netcdf. Alternatively, individual libraries can be disabled with the -no<LIB> options. The -libstatic flag can be used to statically link only the libraries that have been specified.

CPPTRAJ can also be built with support for OpenMM by specifying '--with-openmm=PATH', where PATH is the OpenMM directory containing the OpenMM library, i.e. PATH/lib/libOpenMM.so. Currently the only command that uses OpenMM is emin, so compiling with OpenMM is typically not required at this time.

After configure has been successfully run, make install will compile and place the cpptraj binary in the $CPPTRAJHOME/bin subdirectory. Note that on multithreaded systems make -j X install (where X is an integer > 1 and less than the maximum number of cores on your system) will run much faster. After installation, it is highly recommended that make check be run as well to test the basic functionality of CPPTRAJ.

There is an independently-maintained VIM syntax file for CPPTRAJ by Emmett Leddin available here.

CPPTRAJ Authors

Lead Author: Daniel R. Roe, Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, MD.

CPPTRAJ began as a C++ rewrite of PTRAJ by Thomas E. Cheatham, III (Department of Medicinal Chemistry, University of Utah, Salt Lake City, UT, USA) and many routines from PTRAJ were adapted for use in CPPTRAJ, including code used in the following classes: Analysis_CrankShaft, Analysis_Statistics, Action_DNAionTracker, Action_RandomizeIons, Action_Principal, Action_Grid, GridAction, Action_Image, and ImageRoutines.

Contributors to CPPTRAJ

  • James Maier (Stony Brook University, Stony Brook, NY, USA) Code for calculating J-couplings (used in Action_Jcoupling).

  • Jason M. Swails (University of Florida, Gainesville, FL, USA) Action_LIE, Analysis_RunningAvg, Action_Volmap, Grid OpenDX output.

  • Jason M. Swails (University of Florida, Gainesville, FL, USA) and Guanglei Cui (GlaxoSmithKline, Upper Providence, PA, USA) Action_SPAM.

  • Mark J. Williamson (Unilever Centre for Molecular Informatics, Department of Chemistry, Cambridge, UK) Action_GridFreeEnergy.

  • Hannes H. Loeffler (STFC Daresbury, Scientific Computing Department, Warrington, WA4 4AD, UK) Action_Density, Action_OrderParameter, Action_PairDist.

  • Crystal N. Nguyen (University of California, San Diego) and Romelia F. Salomon (University of California, San Diego) Original Action_Gist.

  • Pawel Janowski (Rutgers University, NJ, USA) Normal mode wizard (nmwiz) output, original code for ADP calculation in Action_AtomicFluct.

  • Zahra Heidari (Faculty of Chemistry, K. N. Toosi University of Technology, Tehran, Iran) Original code for Analysis_Wavelet.

  • Chris Lee (University of California, San Diego) Support for processing force information in NetCDF trajectories.

  • Steven Ramsey (CUNY Lehman College, Bronx, NY) Enhancements to entropy calculation in original Action_Gist.

  • Amit Roy (University of Utah, UT) Code for the CUDA version of the 'closest' Action.

  • Andrew Simmonett (National Institutes of Health) Code for the reciprocal part of the particle mesh Ewald calculation (electrostatic and Lennard-Jones).

  • Christina Bergonzo (National Institute of Standards and Technology, Gaithersburg, MD) Fixes and improvements to nucleic acid dihedral angle definitions (DihedralSearch).

  • David S. Cerutti (Rutgers University, Piscataway, NJ, USA) Original code for the 'xtalsymm' Action.

  • Johannes Kraml, Franz Waibl & Klaus R. Liedl (Department of General, Inorganic, and Theoretical Chemistry, University of Innsbruck) Improvements and enhancements for GIST.

Various Contributions

  • David A. Case (Rutgers University, Piscataway, NJ, USA)
  • Hai Nguyen (Rutgers University, Piscataway, NJ, USA)
  • Robert T. McGibbon (Stanford University, Stanford, CA, USA)

Code in CPPTRAJ that originated in PTRAJ

  • Holger Gohlke (Heinrich-Heine-University, Düsseldorf, Germany) Alrun N. Koller (Heinrich-Heine-University, Düsseldorf, Germany) Original implementation of matrix/vector functionality in PTRAJ, including matrix diagonalization, IRED analysis, eigenmode analysis, and vector time correlations.

  • Michael Crowley (University of Southern California, Los Angeles, CA, USA) Original code for dealing with truncated octahedral unit cells.

  • Viktor Hornak (Merck, NJ, USA) Original code for mask expression parser.

  • John Mongan (UCSD, San Diego, CA, USA) Original implementation of the Amber NetCDF trajectory format.

  • Hannes H. Loeffler (STFC Daresbury, Scientific Computing Department, Warrington, WA4 4AD, UK) Diffusion calculation code adapted for use in Action_STFC_Diffusion.

External code/libraries bundled with CPPTRAJ

  • CPPTRAJ makes use of the GNU readline library for the interactive command line.

  • CPPTRAJ uses the ARPACK library to calculate eigenvalues/eigenvectors from large sparse matrices.

  • CPPTRAJ uses the xdrfile library for reading XTC files; specifically a somewhat updated version from MDTRAJ that includes some bugfixes and enhancements. See src/xdrfile/README for details.

  • CPPTRAJ uses the GROMACS TNG library for reading TNG files. See src/tng/README for details.

  • The reciprocal part of the PME calculation is handled by the helPME library by Andy Simmonett.

  • Support for reading DTR trajectories uses the VMD DTR plugin.

  • CPPTRAJ uses code for the permuted congruential pseudo-random number generator PCG32 by Melissa O'Neill and the Xoshiro 128++ pseudo-random number generator by David Blackman and Sebastiano Vigna.

  • The code for quaternion RMSD calculation was adapted from code in qcprot.c originally written by Douglas L. Theobald and Pu Liu (Brandeis University).

  • The code for reading numpy arrays in src/libnpy is from libnpy written by Leon Merten Lohse et al. (Universität Göttingen).

cpptraj's People

Contributors

acruzpr, agoetz, amit56r, ctlee, drroe, ericchen521, ex-rzr, fwaibl, gosldorf, hainm, halx, hanatok, jokr91, mjw99, multiplemonomials, rmcgibbo, rosswalker, slochower, swails

cpptraj's Issues

git stuff

So you (Dan) are on GitHub now. Here are several things you might want to know (if you don't already).

basically you can write code nicely like this

print("hello world")

turn off verbose for clustering and my questions about clustering

hi,

this is what pytraj prints when calling kmeans from cpptraj

In [4]: from pytraj.cluster import kmeans

In [5]: kmeans(traj, n_clusters=3)
#Clustering: 3 clusters 10 frames
#Cluster 0 has average-distance-to-centroid 5.643003
#Cluster 1 has average-distance-to-centroid 5.645873
#Cluster 2 has average-distance-to-centroid 4.141970
#DBI: 1.298084
#pSF: 3.236341
#Algorithm: Kmeans nclusters 3 maxit 100
#Representative frames: 9 5 1

Out[5]: array([2, 2, 1, 1, 1, 1, 0, 0, 0, 0], dtype=int32)

Although pytraj turns off cpptraj's verbose output, I still get this on stdout.

  • Can we turn it off? (If yes, I can make a PR, or it's OK if it's quick for you to do.)
  • Do you intend to include clustering info (average distance to centroid, rep frame number, ...) in a Dataset? (Currently it is printed to stdout.)
  • I have an idea: delay writing cluster trajectories to disk (only do it when calling Print). What do you think about the performance?
  • I would like to construct a pytraj workflow like this:
    • load the traj
    • perform clustering, getting the frame indices for each cluster and the rep frame numbers
    • after playing with other stuff, call write_cluster for specific clusters (for example, the top 5)

What do you think?
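On the first question: until the messages can be switched off inside cpptraj, one possible workaround on the Python side is to redirect file descriptor 1 while the call runs, since output written by a C/C++ library bypasses sys.stdout. The sketch below is only an illustration; capture_fd_stdout is a hypothetical helper, not a pytraj function, and os.write stands in for the library's output.

```python
import os
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def capture_fd_stdout():
    """Temporarily point file descriptor 1 at a temp file and yield that file."""
    sys.stdout.flush()        # do not let pending Python output leak into the sink
    saved_fd = os.dup(1)      # remember the real stdout
    with tempfile.TemporaryFile(mode="w+b") as sink:
        os.dup2(sink.fileno(), 1)
        try:
            yield sink
        finally:
            os.dup2(saved_fd, 1)   # restore the real stdout
            os.close(saved_fd)


with capture_fd_stdout() as sink:
    # Stand-in for a library call that writes directly to fd 1.
    os.write(1, b"#Clustering: 3 clusters 10 frames\n")
    sink.seek(0)
    captured = sink.read()

print(len(captured), "bytes were kept off the terminal")
```

The captured bytes could also be parsed instead of discarded, although storing the values in a Dataset (second question) would be the cleaner route.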

Example code: mostly I just need to write PDB files to view in VMD. So for rep frames, I can write them all to a single file like this.

rep_frame_indices = get_from_clustering...
traj[rep_frame_indices].save('reps.pdb', mode='model')
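The "frame indices for each cluster" step of the workflow can be sketched from the label array that kmeans already returns. This is a plain-Python illustration; frames_by_cluster is a hypothetical helper and the labels are the ones shown in the output above.

```python
from collections import defaultdict


def frames_by_cluster(labels):
    """Map each cluster id to the list of frame indices assigned to it."""
    groups = defaultdict(list)
    for frame_index, cluster_id in enumerate(labels):
        groups[cluster_id].append(frame_index)
    return dict(groups)


labels = [2, 2, 1, 1, 1, 1, 0, 0, 0, 0]   # cluster id per frame
groups = frames_by_cluster(labels)

# Order clusters by population (largest first, ties broken by cluster id),
# e.g. to pick the "top N" clusters to write out later.
largest_first = sorted(groups, key=lambda c: (-len(groups[c]), c))

for cluster_id in largest_first[:2]:
    print(cluster_id, groups[cluster_id])
```

Each index list could then be used to slice the trajectory in the same way as the rep-frame example.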

SetupX and SetX

do you think it's a good idea to have better names for those two routines?

this is from cpptraj's doc in Frame class

  * In addition to the constructors, there are two classes of routine that
  * can be used to set up Frames. The SetupX routines do any memory allocation,
  * and assign masses, and the SetX routines assign coordinates/velocities. The
  * SetX routines will dynamically adjust the size of the frame up to maxnatom,
  * but no reallocation will occur so the frame should be set up for the largest
  * possible # of atoms it will hold. This avoids expensive reallocations.
  * The representation of coordinates (X) and velocities (V) are double*
  * instead of STL vectors so as to easily interface with the FileIO routines
  * which tend to be much faster than iostream ops.

this is only a minor thing since they are used internally.

PCA flip sign

In the process of satisfying my curiosity, I compared PCA between cpptraj and sklearn.

Both give the same absolute result for the projection values. However, the sign is opposite.

Please check the plots here: [figures: pca_cpptraj, pca]

Is this intentional? (I already tried searching Google for "PCA flip sign cpptraj".)
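For context, the sign of an eigenvector is mathematically arbitrary: if v satisfies C v = λ v then so does -v, and projections onto the two differ only in sign. The sketch below illustrates this with a made-up 2x2 covariance matrix; fix_sign is a hypothetical convention for comparing results from two programs, not something taken from cpptraj or sklearn.

```python
import math


def matvec(matrix, vector):
    """Multiply a small dense matrix by a vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def fix_sign(vector):
    """Flip the vector so that its largest-magnitude component is positive."""
    pivot = max(vector, key=abs)
    return [-x for x in vector] if pivot < 0 else list(vector)


cov = [[2.0, 1.0], [1.0, 2.0]]              # toy covariance matrix
v = [1 / math.sqrt(2), 1 / math.sqrt(2)]    # eigenvector with eigenvalue 3
minus_v = [-x for x in v]

# Both v and -v satisfy cov @ x = 3 * x.
for vec in (v, minus_v):
    assert all(math.isclose(a, 3.0 * b) for a, b in zip(matvec(cov, vec), vec))

point = [0.5, -0.2]
proj_plus = sum(p * x for p, x in zip(point, v))
proj_minus = sum(p * x for p, x in zip(point, minus_v))
print(proj_plus, proj_minus)                # same magnitude, opposite sign
```

Applying the same sign convention to the eigenvectors from both programs before projecting makes the plots directly comparable.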

wrapping Trajin or DataSet_Coords_TRJ

Dan,

pytraj uses the Trajin class in cpptraj for its TrajectoryIterator. Now that I have had time to look more closely at DataSet_Coords_TRJ, I find it more interesting.

  • it can hold multiple trajectories and can randomly access frames.
    • so in pytraj, I just need to create
traj = DataSet_Coords_TRJ()
traj.top = io.load("./myparm.top")
traj.load(a_list_of_file_names)
  • it has "void GetFrame(int idx, Frame& fIn, AtomMask const& mIn);" so I can move mask selection from the pytraj level to the cpptraj level for the frame iterator (for frame in traj(mask='@CA')).

What do you think about using DataSet_Coords_TRJ for TrajectoryIterator? Are there any technical issues I should expect if I implement this?

Hai

AddSet(DataSet*): how can I get this right?

hi @Mojyt,

I am playing with the AddSet methods of the DataSetList class in pytraj. I have three tests here and expected to get size=1 for the DataSetList after using the AddSet(DataSet*) method (where size is the number of DataSets in the DataSetList). However, the first two tests resulted in size=0. I got lost here. Can you tell me what is going on? Thanks.

def test_0(self):
    dset_traj = DataSet_Coords_TRJ()
    dslist = DataSetList()

    # wrapper of "AddSet(DataSet*)"
    dslist.add_existing_set(dset_traj)
    print(dslist.size)  # = 0, but I expected "=1"

def test_1(self):
    dset_traj = DataSet_MatrixDbl()
    dslist = DataSetList()

    # wrapper of "AddSet(DataSet*)"
    dslist.add_existing_set(dset_traj)
    print(dslist.size)  # = 0, but I expected "=1"

def test_2(self):
    dslist = DataSetList()

    # wrapper of "AddSet(DataType, name, default_name)"
    dslist.add_set("coords", "name", "funny_name")
    print(dslist[0])
    print(dslist.size)  # = 1 (is what I expected)

(You can pull Amber master, recompile pytraj and go to the tests folder to run python ./test_DataSetList_add_set_question.py)

add `parm` to cpptraj input for testing

I would prefer cpptraj test inputs to have a consistent syntax. That makes them easier to parse.

For example, this is a good traj.in that pytraj can use to compare data with cpptraj.

cat traj.in

parm 2koc.parm7 # need to have parm here
trajin traj.x
....

pytraj will use the above input to create a CpptrajState

(dir: cpptraj/test/Test_DRMSD)

In [8]: state = pt.load_cpptraj_file('./drmsd.in')

In [9]: state.run()
Out[9]: 0

In [10]: state.datasetlist.values
Out[10]:
array([[  0.00000000e+00,   2.95729211e+00,   4.49201771e+00, ...,
          5.18656700e+00,   5.37325298e+00,   4.82324321e+00],
       [  0.00000000e+00,   4.02238772e+00,   6.57384971e+00, ...,
          9.72083983e+00,   1.02589014e+01,   9.38592381e+00],
       [  2.43182129e-07,   4.01623189e+00,   6.41421043e+00, ...,
          8.27504991e+00,   8.19405473e+00,   7.77917637e+00],
       [  0.00000000e+00,   2.95729211e+00,   4.49201771e+00, ...,
          5.18656700e+00,   5.37325298e+00,   4.82324321e+00]])

For example:

  • this test does not have parm in its input: cpptraj/test/Test_IRED
  • this one does: cpptraj/test/Test_DRMSD

If you think this is an OK proposal, I will remind you in the future when you add more test folders.
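A check like the one proposed here is easy to automate. The sketch below scans the text of an input file for a parm command; has_parm_command is a hypothetical helper, and it assumes that '#' starts a comment (as in the traj.in example above) and that the command is written exactly as "parm".

```python
def has_parm_command(input_text: str) -> bool:
    """Return True if any non-comment line starts with the 'parm' command."""
    for line in input_text.splitlines():
        command = line.split("#", 1)[0].strip()   # drop trailing comments
        if command and command.split()[0] == "parm":
            return True
    return False


good_input = """parm 2koc.parm7 # need to have parm here
trajin traj.x
"""
bad_input = """trajin traj.x
rms
"""

print(has_parm_command(good_input), has_parm_command(bad_input))
```

Running such a check over every test input would flag folders whose inputs rely on a topology given outside the file.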

add more test for cpptraj

I (again) encourage uploading more of cpptraj's tests.

We could make 'Amber-MD/CpptrajAdditionalTestSuite' (or any interesting name).

and add to .travis.yml file

git clone https://github.com/Amber-MD/CpptrajAdditionalTestSuite
cd CpptrajAdditionalTestSuite && make test

Benefits

  • pytraj can pull and test
  • less chance of breaking cpptraj's code when someone makes a PR
  • travis runs all the tests, so we don't 'forget'

Update:
Some of the tests I would like to have on GitHub

  • lifetime

conda build for libcpptraj on travis

What do you think about adding a 2nd build on travis for libcpptraj?

My idea is to build libcpptraj when you successfully merge to upstream repo, then push to https://anaconda.org/ambermd.

Whenever I build pytraj, I only need to conda install -c ambermd libcpptraj. pytraj will be pushed to https://anaconda.org/ambermd too.

If you're OK with this, I can make a PR. This is a sample build script in pytraj for libcpptraj, but I would prefer to have it in cpptraj.

https://github.com/Amber-MD/pytraj/tree/master/devtools/conda_recipe/libcpptraj

Let me know if anything is unclear.

reorganize cpptraj folder

I got lost when I first started looking at cpptraj's code, mostly because there are so many files in a single folder. I think it's a good idea to reorganize it. Take plumed as an example: it has a similar coding style to cpptraj.

Another advantage is that the code becomes much easier to browse on a phone. (It took a lot of thumb swiping for me to read the Trajout_Single.cpp file, which is near the end.)

PS: to keep issues from falling off your (Dan's) radar, you can create different labels for different topics, just like in the pandas repo. In my opinion (and Jason's too?), issues on GitHub are much more convenient than g-doc.

Action_Density, write to DataSets

I am working on this Action so that it writes data to a Dataset. Any suggestion as to which Dataset I should use?

(It's OK to say "figure it out yourself".)

Action_Mask: dataset

What do you think about adding Datasets for this Action? If you think it's ok, I can spend time exploring. :D

how to tag ensemble run for Dispatch?

So I have two situations with identical input that give different results.

input

parm ala2.99sb.mbondi2.parm7
trajin rem.nc.000 remdtraj remdtrajtemp 300.
rms out test.dat

If I use Command::ProcessInput(CpptrajState&, std::string const&) to load the input into CpptrajState, I get what I expect (RMSD calculated for only the 300 K frames).

However, if I use Command::Dispatch to dispatch each line of the above input (with CpptrajState), I get the RMSD for rem.nc.000 instead.

Looking at cpptraj's code, the TrajinList needs to be tagged with 'ENSEMBLE' to get the correct result.

So my questions are:

  • How do I tag it with the current CpptrajState?
  • Is it possible to make RunEnsemble so we don't need to tag at all?
  • What's your best solution?

vmd style for atom selection?

hi @swails @Mojyt

what do you guys think about adding VMD-style AtomMask parsing ("water and CA", ...) to cpptraj/pytraj/ParmEd? I guess both of you have already thought about this and there must be a reason not to do this in AMBER (a much more concise syntax?).

Hai

lifetime stdout

Dan,

I have now implemented lifetime (from cpptraj analysis) in pytraj. It seems that the statistics are printed to stdout:

In [14]: pdb = pt.load_pdb_rcsb("1l2y")

In [15]: dslist = pdb.search_hbonds()

In [22]: pt.common_actions.lifetime(dslist[1])
#Set       Nlifetimes      MaxLT      AvgLT  TotFrames SetName
         0          2          1     1.0000          2 d0
Out[22]: array([ 1.])

it seems that cpptraj's manual does not have info about dumping to dataset.

In [17]: pt.info("lifetime")
        [out <filename>] <dsetarg0> [ <dsetarg1> ... ]
        [window <windowsize> [name <setname>]] [averageonly]
        [cumulative] [delta] [cut <cutoff>] [greater | less] [rawcurve]
        [fuzz <fuzzcut>] [nosort]
  Calculate lifetimes for specified data set(s), i.e. time that data is
  either greater than or less than <cutoff> (default: > 0.5). If <windowsize>
  is given calculate lifetimes over windows of given size.
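To make the definition in the help text concrete, here is a minimal sketch of a lifetime calculation: a lifetime is taken to be a run of consecutive frames whose value is greater than the cutoff (default 0.5). This is an illustration only and not cpptraj's implementation; it ignores the window, fuzz, cumulative and other options, and the function names are hypothetical.

```python
def lifetimes(values, cutoff=0.5):
    """Return the lengths of consecutive runs where value > cutoff."""
    runs = []
    current = 0
    for value in values:
        if value > cutoff:
            current += 1
        elif current:
            runs.append(current)   # a run just ended
            current = 0
    if current:
        runs.append(current)       # run that lasts until the final frame
    return runs


def summarize(runs):
    """Collect count, maximum, average and total frames of the runs."""
    total = sum(runs)
    return {
        "n_lifetimes": len(runs),
        "max": max(runs, default=0),
        "avg": total / len(runs) if runs else 0.0,
        "total_frames": total,
    }


series = [1, 0, 1]                 # made-up data: present, absent, present
print(summarize(lifetimes(series)))
```

A series such as [1, 0, 1] gives two lifetimes of one frame each, which matches the summary row printed above, although the actual contents of d0 are not shown in the post.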

add set_frame_pointer too?

I am trying this issue (#44) from your branch.
So far it works great. So I am re-designing in-memory Trajectory as a wrapper of numpy array (instead of Frame vector). The iterating time is still a bit slow compared to iterating the raw numpy array

In [58]: t0
Out[58]: <pytraj.api.Trajectory with 1000 frames, 17443 atoms>

In [59]: %timeit  for frame in t0: pass
100 loops, best of 3: 8 ms per loop

In [60]: %timeit for xyz in t0.xyz: pass
1000 loops, best of 3: 315 µs per loop

In [61]: 8000 / 315.
Out[61]: 25.396825396825395

This is OK for a single loop (for frame in traj), but quite slow for a nested loop (for frame0 in traj: for frame1 in traj), which takes about 8 seconds for the above example.

I think it is slow because pytraj needs to create a new Frame (as a view) for each iteration.

So I propose adding set_frame_pointer (or any name you prefer).

void Frame::set_frame_pointer(double *ptr) {
    X_ = ptr;               // point at coordinates held elsewhere
    memIsExternal_ = true;  // mark the memory as external to this Frame
}

With this method, pytraj only needs to allocate a Frame once

xyz0 = traj.xyz
# create a frame view pointing to 1st element of memory block
frame = Frame(n_atoms, xyz0[0])
for xyz in traj.xyz:
    frame.set_frame_pointer(xyz)

What do you think?
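The idea can be tried out in pure Python with the buffer protocol. In the sketch below, FrameView is a hypothetical stand-in for Frame (it is not the pytraj or cpptraj class): one object is allocated once and re-pointed at successive slices of a single coordinate block, with no copies made.

```python
from array import array


class FrameView:
    """Hypothetical frame that borrows its coordinates instead of owning them."""

    def __init__(self, n_atoms):
        self.n_atoms = n_atoms
        self.xyz = None            # borrowed buffer, set by set_frame_pointer

    def set_frame_pointer(self, buffer):
        if len(buffer) != 3 * self.n_atoms:
            raise ValueError("buffer must hold 3 coordinates per atom")
        self.xyz = buffer          # no copy: just re-point the view


n_frames, n_atoms = 4, 2
block = array("d", range(n_frames * n_atoms * 3))   # all frames in one block
flat = memoryview(block)

frame = FrameView(n_atoms)         # allocated once, reused for every frame
sums = []
for i in range(n_frames):
    start = i * n_atoms * 3
    frame.set_frame_pointer(flat[start:start + n_atoms * 3])
    sums.append(sum(frame.xyz))

frame.xyz[0] = -1.0                # writes go straight through to the block
print(sums, block[18])
```

Because the view shares memory with the block, whoever owns the block must keep it alive for as long as the frame is in use; the same caveat would apply to the C++ version.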

Internal Error: Adding DataSet test copy to invalid list.

So I am having trouble adding an Analysis class to CpptrajState.

I just created a State and added a MatrixDouble dataset with the name 'mat'

In [105]: s2
Out[105]:
CpptrajState, include:
<datasetlist: 1 datasets>

In [106]: s2.datasetlist[0]
Out[106]: <pytraj.datasets.DatasetMatrixDouble: size=666, key=mat>

I just want to add Analysis_Matrix class to analyze the matrix:

In [109]: s2.add_analysis(Analysis_Matrix(), ArgList('matrix mat name test'))
Internal Error: Adding DataSet test copy to invalid list.
Error: Could not setup analysis [matrix]
Out[109]: 1

It's clear that I don't understand how cpptraj works at all. Can you point me to where I can read about this? Thanks.

new Frame constructor: wrong rmsd result for ActionList

Dan,

I am using ActionList to calculate rmsd for a series of masks. It works fine (I got what I expected) with the immutable TrajectoryIterator, but I get 0.0 results if I use the mutable Trajectory (numpy). Any idea why?

Note that if I am using Action_Rmsd directly, the result is correct.

Here is my python code

This gives correct results for both the mutable and the immutable Trajectory.

        def test_rmsd(input_traj):
            from pytraj.actions.CpptrajActions import Action_Rmsd
            from pytraj.datasets import DataSetList
            dslist = DataSetList()
            act = Action_Rmsd()
            act.read_input('first @CA', top=input_traj.top, dslist=dslist)
            act.process(input_traj.top)

            for frame in input_traj:
                act.do_action(frame)
            print(dslist.values)

This gives correct results for the immutable Trajectory but wrong results for the mutable Trajectory.

        def test_rmsd_actlist(input_traj):
            from pytraj.actions.CpptrajActions import Action_Rmsd
            from pytraj.core.ActionList import ActionList
            from pytraj.datasets import DataSetList

            alist = ActionList()
            dslist = DataSetList()
            act = Action_Rmsd()
            alist.add_action(act, 'first @CA', top=input_traj.top, dslist=dslist)

            for frame in input_traj:
                alist.do_actions(frame)
            print(dslist.values)

turn verbose for ensemble

I am playing with remd stuff and CpptrajState.

With the trajin keyword, I get no verbose output (since pytraj turns it off)

state = pt.datafiles.load_cpptraj_state('''
        parm ala2.99sb.mbondi2.parm7
        trajin rem.nc.000
        rms
        ''')

state.run()
print(state.data)

--> output <pytraj.datasets.DatasetList - 1 datasets>

but when using the ensemble keyword, I get a bunch of lines.
Input code:

state = pt.datafiles.load_cpptraj_state('''
        parm ala2.99sb.mbondi2.parm7
        ensemble rem.nc.000 remdtraj remdtrajtemp 300.
        rms
        ''')
state.run()
print(state.data)

output:

TIME: Run Initialization took 0.0001 seconds.

BEGIN ENSEMBLE PROCESSING:
        ENSEMBLE: OPENING 4 REMD TRAJECTORIES
.....................................................
ACTION SETUP FOR PARM 'ala2.99sb.mbondi2.parm7' (1 actions):
  0: [rms]
        Target mask: [*](22)
        Reference mask: [*](22)
----- rem.nc.000 (1-10, 1) -----
 0% 11% 22% 33% 44% 56% 67% 78% 89% 100% Complete.

Read 10 frames and processed 10 frames.
TIME: Trajectory processing: 0.0008 s
TIME: Avg. throughput= 12626.2626 frames / second.

ENSEMBLE ACTION OUTPUT:

DATASETS:
  4 data sets:
        RMSD_00000%0 "RMSD_00000" (double, rms), size is 10
        RMSD_00001%1 "RMSD_00001%1" (double, rms), size is 10
        RMSD_00002%2 "RMSD_00002%2" (double, rms), size is 10
        RMSD_00003%3 "RMSD_00003%3" (double, rms), size is 10
---------- RUN END ---------------------------------------------------
---------- RUN BEGIN -------------------------------------------------
Warning: No actions/output trajectories specified.

DATASETS:
  4 data sets:
        RMSD_00000%0 "RMSD_00000" (double, rms), size is 10
        RMSD_00001%1 "RMSD_00001%1" (double, rms), size is 10
        RMSD_00002%2 "RMSD_00002%2" (double, rms), size is 10
        RMSD_00003%3 "RMSD_00003%3" (double, rms), size is 10
---------- RUN END ---------------------------------------------------
<pytraj.datasets.DatasetList - 4 datasets>

better (and nicer) data label?

currently pytraj uses cpptraj's Dataset legends as keys for a Python dictionary.

for example

In [3]: d = pt.multidihedral(traj, resrange=[0, 3 ,5])

In [4]: d.keys()
Out[4]:
['psi:1',
 'phi:4',
 'psi:4',
 'chip:4',
 'omega:4',
 'phi:6',
 'psi:6',
 'chip:6',
 'omega:6']

It's nice that when converting to a pandas DataFrame, we can access each key as an attribute

In [5]: df = pt.multidihedral(traj, resrange=[0, 3 ,5], dtype='dataframe')

In [6]: df.psi_1
Out[6]:
0    176.615564
1    166.821296
2    168.795100
3    167.425619
4    151.183350
5    134.176110
6    160.992079
7    165.112697
8    147.943321
9    145.429014
Name: psi_1, dtype: float64

So my question to discuss is:

Should we spend a bit of time renaming the labels to make them nicer? Or should I just go ahead and change them in pytraj to what I want, while cpptraj keeps its naming convention? If Dan agrees that we can change the cpptraj side, I can spend more time making suggestions about naming (although I know that @swails is really good at naming stuff). :D

For example, hbond in cpptraj uses 'UU' and 'UV' for legends. In my view, those are hard to understand and remember, especially since cpptraj prints them in its output.

And to use pandas like above, I need to replace : with _ (psi:1 becomes psi_1) and - with _ in Python.
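The renaming described above amounts to two string replacements. A minimal sketch, where to_attribute_name is a hypothetical helper and the keys are the ones from the multidihedral example:

```python
def to_attribute_name(label: str) -> str:
    """Replace ':' and '-' with '_' so the label works as an attribute name."""
    return label.replace(":", "_").replace("-", "_")


keys = ['psi:1', 'phi:4', 'psi:4', 'chip:4', 'omega:4',
        'phi:6', 'psi:6', 'chip:6', 'omega:6']
renamed = [to_attribute_name(key) for key in keys]

print(renamed)
# Every renamed key is now a valid Python identifier.
print(all(name.isidentifier() for name in renamed))
```

Labels that start with a digit or contain other punctuation would need extra handling, which is one argument for agreeing on a convention on the cpptraj side.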

segmentation fault for TRJ dataset

This is directly related to this issue in pytraj: Amber-MD/pytraj#807 (comment)

pytraj is using DataSet_Coords_TRJ as the engine for TrajectoryIterator. I double-checked the pytraj code and don't see anything obviously wrong yet.

I tried with cpptraj and got segmentation fault:

parm system.prmtop
loadtraj ./trunc.nc name test
crdaction test radgyr out test.dat

cpptraj -i water_issue807.in

CPPTRAJ: Trajectory Analysis. V16.00b
    ___  ___  ___  ___
     | \/ | \/ | \/ | 
    _|_/\_|_/\_|_/\_|_

| Date/time: 09/16/15  18:26:38
| Available memory: 1686.41 MB

INPUT: Reading Input from file water_issue807.in
  [parm system.prmtop]
    Reading 'system.prmtop' as Amber Topology
  [loadtraj ./trunc.nc name test]
    Reading './trunc.nc' as Amber NetCDF
  [crdaction test radgyr out test.dat]
    Using set 'test'
    RADGYR: Calculating for atoms in mask *.
    * (5808 atoms).
 0% Segmentation fault

Note: it's OK with CRD dataset and regular cpptraj workflow

parm system.prmtop
trajin trunc.nc
radgyr out test.dat

cpptraj/pytraj todolist

I created this issue to keep track of new code/features that should be in cpptraj for pytraj's convenience. Hopefully Dan does not mind.

CpptrajState.h

add GetTrajinList(), GetActionList(), GetAnalysisList(), GetTrajoutList()

adding --shared and libcpptraj

I am going to release pycpptraj-0.1 and will include it in AMBER in the near future. It would be really great if you could add an option to install libcpptraj for dynamic linking.

Thanks

Hai

could not load pdb from `reduce` program

Neither cpptraj nor parmed can open the SAM_addH.pdb file (output from the reduce program). I already opened an issue in the parmed repo and am opening a new one here (with the message changed a bit).

cpptraj -p SAM_addH.pdb

CPPTRAJ: Trajectory Analysis. V16.00b
    ___  ___  ___  ___
     | \/ | \/ | \/ |
    _|_/\_|_/\_|_/\_|_

| Date/time: 08/17/15  19:10:37
| Available memory: 2196.01 MB

Error: Could not determine format of topology 'SAM_addH.pdb'
Error: Could not open topology 'SAM_addH.pdb'

This is how I got the pdb file.

import parmed as pmd

p = pmd.download_PDB('3gx5')
p[':SAM'].save('SAM.pdb')

then using reduce to get SAM_addH.pdb

$AMBERHOME/bin/reduce $pdbname.pdb > ${pdbname}_addH.pdb

all files are here: https://github.com/hainm/fftools/tree/master/sam

Note: neither cpptraj nor parmed recognizes the pdb file, but VMD can read it.

in-memory reference

As we discussed a while ago via email, cpptraj currently loads reference structures from disk.

It would be more convenient to use an in-memory reference (especially in pytraj). For example, we could create a reference structure on the fly from any source (by loading xyz coordinates, such as from Gaussian output) and use that reference for further actions.

PS: I am raising this issue again after looking at Dan's example:

parm ../tz2.parm7
reference ../tz2.rst7
trajin pp2.rst7.save
# Apply backbone dihedrals from reference structure residues 1-13 to residues 1-13
makestructure "ref:1-13:tz2.rst7"

PS2 (for Dan): you can search for "how to use 'reference' without loading from file" in your email.

sync with Amber master?

Hi @Mojyt

Can you please sync the GitHub version with Amber master? I guess you have made significant changes (such as the Trajout class, ...) that I have not seen on GitHub yet. I need to follow your changes too.

Hai

Expand mask syntax

This is related to ParmEd/ParmEd#209.

I would like to expand the mask parser to be able to select molecules, chain IDs, etc.

Current proposals from @swails

A couple ideas:

  1. Do what Chimera does (see https://www.cgl.ucsf.edu/chimera/docs/UsersGuide/midas/frameatom_spec.html)
  2. Use :: for chain IDs (in the same way that @% indicates atom type names)

An example of 1:

# Select residues 1-10 in chain A
:1-10.A
# Select residues 1-10 in chains A, B, and C
:1-10.A-C # I *think*; not sure if you can use - for string range, maybe .A,B,C
# etc...

An example of 2:

# Select chain A
::A
# Select residues 1-10 of chains A, B, and C
::A,B,C:1-10
# Select protein backbone atoms of residues 2-20 in chains A and B
::A,B:2-20@CA,C,O,N

Another idea for flexibility is to say that chains are strings, and use the same "chain" syntax to denote molecules when the chain is given a numerical value.
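
To make proposal 2 concrete, here is a minimal Python sketch of how such a mask could be split into its chain, residue, and atom parts. The function name `parse_chain_mask` and the returned dictionary layout are hypothetical and only cover the forms shown in the examples above; this is not the cpptraj or ParmEd mask parser, and it ignores operators, wildcards, and distance-based selection.

```python
import re

# Hypothetical helper: handles only '::chains', ':residues', '@atoms' in that order.
_MASK_PATTERN = re.compile(
    r"^(?:::(?P<chains>[^:@]+))?"    # '::A,B'  -> chain IDs
    r"(?::(?P<residues>[^:@]+))?"    # ':2-20'  -> residue numbers or ranges
    r"(?:@(?P<atoms>[^:@]+))?$"      # '@CA,C'  -> atom names
)


def parse_chain_mask(mask):
    """Split a proposal-2 style mask into lists of chains, residues, and atoms."""
    match = _MASK_PATTERN.match(mask)
    if not mask or match is None:
        raise ValueError(f"unsupported mask: {mask!r}")

    def split(part):
        # A missing part means "no restriction at this level".
        return part.split(",") if part else []

    return {
        "chains": split(match.group("chains")),
        "residues": split(match.group("residues")),
        "atoms": split(match.group("atoms")),
    }


print(parse_chain_mask("::A,B:2-20@CA,C,O,N"))
# {'chains': ['A', 'B'], 'residues': ['2-20'], 'atoms': ['CA', 'C', 'O', 'N']}
```

Because `::` is matched before a single `:`, an existing residue mask such as `:1-10` still parses as a residue selection with no chain restriction.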

My Thoughts

I do like 2) for chain IDs - I think we could potentially get into trouble with the . syntax. I don't know why but for some reason I feel I've come across atom names with . in them, although nothing is coming to mind now.

For molecule selection, personally I like the idea of having a separate character that can represent molecules, the way there is currently @ for atoms and : for residues. Based on what is currently used in the mask parser, I think that the most likely candidates for "molecule" are ; and %. Both have drawbacks: ; looks enough like : that things could easily get confused (and it's on the same key), while % is already used in a subset of @ to denote "type", so it would have different meanings in different contexts.

Whatever is decided, it should be consistent between ParmEd, Cpptraj, and the rest of Amber.

remlog command

Can we try to make the remlog command work with pH-REMD style remlogs?

wrong number of residues when reading from babel file

pdb file is here: https://github.com/hainm/pytraj/blob/master/tests/data/A.pdb

> resinfo
#Res  Name First  Last Natom #Orig  #Mol
    1 A        1     1     1     1     1 A
    2 A        2    35    34     1     2 A
    3 A       36    36     1     1     3 A

full output

$ cpptraj -p A.pdb

CPPTRAJ: Trajectory Analysis. V16.00b
    ___  ___  ___  ___
     | \/ | \/ | \/ | 
    _|_/\_|_/\_|_/\_|_

| Date/time: 08/05/15  11:27:42
| Available memory: 688.141 MB

    Reading 'A.pdb' as PDB File
Warning: Malformed CONECT record: CONECT    3                                                           
Warning: Malformed CONECT record: CONECT    7                                                           
Warning: Malformed CONECT record: CONECT    8                                                           
Warning: Malformed CONECT record: CONECT   10                                                           
Warning: Malformed CONECT record: CONECT   12                                                           
Warning: Malformed CONECT record: CONECT   14                                                           
    A.pdb: determining bond info from distances.
Warning: A.pdb: Determining default bond distances from element types.
Warning: 2 or more molecules share residue numbers.
Warning:   Either residue information is incorrect or molecule determination was inaccurate.
Warning:   Basing residue information on molecules.
Warning:   Old # residues= 1, new # residues = 3

Note: parmed detects it correctly.

In [2]: p = pmd.load_file('./data/A.pdb')

In [3]: p.residues
Out[3]: 
ResidueList([
    <Residue A[1]; chain=A>
])
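
For comparison, the residue count implied by the PDB records themselves can be checked independently of any bond or molecule detection. The sketch below counts unique residues from the fixed-column residue fields of ATOM/HETATM records. Both function names are hypothetical, the sample records are made up for illustration (they are not the contents of A.pdb), and this is not how cpptraj or parmed is implemented internally.

```python
def count_pdb_residues(pdb_lines):
    """Count residues keyed by (chain ID, residue number, insertion code, name)."""
    seen = set()
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            res_name = line[17:20].strip()   # columns 18-20
            chain_id = line[21:22]           # column 22
            res_seq = line[22:26].strip()    # columns 23-26
            i_code = line[26:27]             # column 27
            seen.add((chain_id, res_seq, i_code, res_name))
    return len(seen)


def make_atom_line(serial, name, res_name, chain_id, res_seq):
    """Build a minimal fixed-column ATOM record (coordinates set to zero)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain_id:1s}"
            f"{res_seq:4d}    {0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")


# Illustrative records: three atoms that all belong to residue A 1 in chain A.
sample = [
    make_atom_line(1, "P", "A", "A", 1),
    make_atom_line(2, "C1'", "A", "A", 1),
    make_atom_line(3, "N9", "A", "A", 1),
]
print(count_pdb_residues(sample))  # 1
```

A count obtained this way depends only on the residue fields, so atoms that end up in separate molecules after distance-based bond detection are still counted as one residue.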

simple pytraj test for cpptraj

Hi @Mojyt

I've just finished updating pytraj to catch up with cpptraj-dev. Please follow these steps to run the tests.

  1. export CPPTRAJHOME (pytraj will find the header files and libcpptraj in $CPPTRAJHOME/src and $CPPTRAJHOME/lib/)
  2. Install pytraj (works well with Python 2.7, 3.3, 3.4)
  3. add libcpptraj to LD_LIBRARY_PATH before running pytraj
  4. Run simple test
    • from anywhere: python -c 'import pytraj; pytraj.run_tests()'
    • from pytraj root folder:
      • cd tests
      • python ./run_simple_test.py

Note: pytraj's compile time (linking to libcpptraj.so) is currently a few times longer than that of libcpptraj itself. :D

Let me know if you have any questions. Thanks.

Hai

arbitrarily iterate Trajectory with given frame indices: parallel benefit

Dan,

AFAIK, cpptraj does not have an option to iterate over a trajectory with given frame indices (it only supports start, stop, stride).

I am thinking about parallelizing CpptrajState for simple actions.

For example

parm 2koc.parm7
trajin md0.nc 0 1000 10
trajin md1.nc 0 1000 10
trajin md2.nc 0 1000 10
....
ref restart.nc
autoimage
rms reference @CA
distance :1 :2

In pytraj, the above script would look like this

traj = pt.iterload('md*.nc', '2koc.parm7', frame_slice=[(0, 1000, 10),]*3)
data = pt.load_batch(traj, '''
ref restart.nc
autoimage
rms reference @CA
distance :1 :2
''', n_cores=8)

So my wish is to have frame_indices = range(traj.n_frames): with n_cores=8, pytraj would create 8 CpptrajState objects, one per core, and let each state perform the calculation on its own chunk of frame indices.

What's your advice here?

My idea is quite simple: just make the parallel run more abstract.

  • load a bunch of files by iterload with arbitrary frames
  • tell pytraj/cpptraj to run the job on N nodes
  • collect the data
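
The chunking step described above can be sketched in plain Python. The helper name `split_frame_indices` is hypothetical and not part of pytraj or cpptraj; the sketch assumes every frame can be processed independently, which holds for simple per-frame actions but not for actions that accumulate state across frames.

```python
def split_frame_indices(n_frames, n_cores):
    """Split range(n_frames) into n_cores contiguous, near-equal chunks."""
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    base, extra = divmod(n_frames, n_cores)
    chunks = []
    start = 0
    for rank in range(n_cores):
        # The first `extra` workers take one additional frame each.
        size = base + (1 if rank < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


for rank, chunk in enumerate(split_frame_indices(10, 3)):
    print(rank, list(chunk))
# 0 [0, 1, 2, 3]
# 1 [4, 5, 6]
# 2 [7, 8, 9]
```

Each worker would then run its own state over the frames in its chunk, and the per-chunk results would be concatenated in rank order to recover the original frame order.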

"Active reference" not passed to COORDS data sets

When a COORDS data set is created it creates a separate copy of the associated topology. However, this means that any selection by distance that relies on the COORDS internal topology reference coordinates will fail since these are not ever updated. It may be time to rework the "internal reference" coords of topology files used for distance-based mask selection.

multidihedral: supposed to calculate all supported dihedral types but only calculates a few

hi,

this is my input

parm G5.pdb
trajin G5.pdb
multidihedral out test_all.out
multidihedral delta out test_delta.out

For the multidihedral out test_all.out line, cpptraj said

[multidihedral out test_all.out]
    MULTIDIHEDRAL: Calculating phi psi chip omega alpha beta gamma delta epsilon 
zeta nu1 nu2 chin chin dihedrals for all solute residues.
    Output to test_all.out
    Output range is -180 to 180 degrees.

But this is all I got

#Frame        gamma:1      delta:1        nu1:1        nu2:1       chin:1
       1      60.6213      96.8920     -35.6519      35.4184    -176.7765

To make sure I can get the delta value, I explicitly added multidihedral delta out test_delta.out and got

#Frame        delta:1
       1      96.8920

Is this your intention?

Full output is here: https://gist.github.com/hainm/54883c68d13f3d0cb3a7
pdb file is here: https://github.com/pytraj/pytraj/blob/master/tests/data/Test_NAstruct/G5.pdb

thanks

Hai

Improve SPAM functionality

Current functionality can be improved. From @swails: "The version included in AmberTools 14 and 15 is more recent, but it is still not well-automated -- it's work in progress. The suggested method for running SPAM calculations are to use the "volmap" command in cpptraj to identify peaks in solvent density, then feed those peaks to the "spam" command which can be used to reorganize water molecule indices (so the same water appears in the same site every frame) and compute an energy-shifted electrostatic and van der Waals interaction for that water.

It then dumps a time series of energies for each water site, which you need to post-process (using numpy and scipy, for instance) into a free energy (from which you can back-out the enthalpy and entropy). The code in SPAM.py provides code for turning this energy time-series into free energies, although SPAM.py relies on a patched cpptraj from AmberTools12. You can use that as a template. Hopefully SPAM.py can be updated to work with the latest version of AmberTools and automate the part of the procedure that cpptraj doesn't yet."

trajout: int Trajout_Single::SetupTrajWrite

Dan,

Previously, with this method in cpptraj,
int Trajout_Single::SetupTrajWrite(Topology* tparmIn)

pytraj was able to open a trajout by calling int Trajout_Single::SetupTrajWrite(Topology* tparmIn), then iterate over Frames and write them, then close the file without knowing the number of frames ahead of time.

Is there any reason to change this behavior?
