polymerist's Introduction

Polymer-Oriented LibrarY of Monomer Expression Rules and In-silico Synthesis Tools

A unified set of tools for monomer template generation, topology building, force-field parameterization, and MD simulations of general organic polymer systems within the OpenFF framework

Source code for Davel, Connor M., Bernat, Timotej, Wagner, Jeffrey R., and Shirts, Michael R., "Parameterization of General Organic Polymers within the Open Force Field Framework"

Installation

Currently, this package only supports a "dirty" developer install using conda/mamba and pip.
First, you will need to install the conda environment manager (either the lightweight Miniconda Distribution (recommended) or the bulkier Anaconda Distribution) if you don't already have it. Further, it is highly recommended (but optional) that you also have the mamba package manager installed (either through Miniforge or Conda); this will greatly accelerate download times.

Once mamba is installed, you can proceed with the dirty polymerist install into a safe virtual environment (named "polymerist-env" here). To install, execute the following set of commands in a command line interface (CLI) in whichever directory you'd like the dev installation to live:

git clone https://github.com/timbernat/polymerist
cd polymerist
mamba env create -n polymerist-env -f devtools/conda-envs/polymerist-env.yml
mamba activate polymerist-env
pip install -e .

The third command will take at least a few minutes, and will make the CLI terminal quite busy; remain calm, that's normal!

Equivalent commands using just conda (in case mamba has not been installed) are below. These will perform the same installation, just much more slowly:

git clone https://github.com/timbernat/polymerist
cd polymerist
conda env create -n polymerist-env -f devtools/conda-envs/polymerist-env.yml
conda activate polymerist-env
pip install -e .

As an optional last step, it is recommended that you correctly set up paths to your OpenEye License, if you have access to one. Portions of conformer-generation and partial-charge assignment in polymerist will work more effectively with the OpenEye toolkit installed and licensed, but 'polymerist' is set up to not require these closed-source dependencies.

From here, you should be able to run polymerist-dependent scripts with the polymerist-env virtual environment active, either from the command line or from a Jupyter Notebook!
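
One quick way to confirm that the environment is usable is to ask Python which packages it can locate. The sketch below uses only the standard library; the helper name is_importable is hypothetical, and the import names "polymerist" and "openeye" are assumptions about how the packages are exposed once installed.

```python
from importlib.util import find_spec

def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located (without importing it)."""
    return find_spec(module_name) is not None

# Assumed import names; adjust if your installation differs
for name in ("polymerist", "openeye"):
    status = "found" if is_importable(name) else "not found"
    print(f"{name}: {status}")
```

A "not found" result for openeye is expected on machines without the OpenEye toolkit and does not by itself indicate a broken install.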

Copyright

Copyright (c) 2024, Timotej Bernat

Acknowledgements

Project based on the Computational Molecular Science Python Cookiecutter version 1.1.

polymerist's Issues

Expansion of portlib

Several refinements and additions planned for the rdutils.amalgamation.portlib module:

  • Considering making part of rdutils directly
  • Deduplicate enumeration of port pairs (preclude need for set operations)
  • Determine proper function for counting number of bonds in the general case (RDAtom.GetDegree(), RDAtom.GetTotalDegree(), len(RDAtom.GetBonds()), or RDAtom.GetExplicitDegree())
  • Deprecate support for flavor-0 ports as being able to bond with anything
  • Add support for using Enum to describe flavors (would still be ints and isotope number under the hood)
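
As a rough illustration of two of the items above, the following sketch shows how named flavors could remain plain ints and how port pairs could be enumerated once each without set operations. The class PortFlavor, its member names, and the helper unique_port_pairs are hypothetical and are not part of portlib; port IDs are assumed to be unique ints.

```python
from enum import IntEnum
from itertools import combinations

class PortFlavor(IntEnum):
    """Hypothetical named flavors; each member is still a plain int under the hood."""
    HEAD = 1
    TAIL = 2
    BRANCH = 3

def unique_port_pairs(port_ids):
    """Enumerate each unordered pair of distinct ports exactly once (assumes unique IDs)."""
    return list(combinations(sorted(port_ids), 2))

# An IntEnum member can be used wherever an int (e.g. an isotope number) is expected
flavor_as_int = int(PortFlavor.TAIL)
pairs = unique_port_pairs([3, 1, 2])
```

Because itertools.combinations never yields both (a, b) and (b, a), no after-the-fact deduplication is needed.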

Expansion of port-bonding

Several refinements and additions planned for the rdutils.amalgamation.bonding module:

  • Considering making part of rdutils directly
  • Need to test and verify that saturated structures generate (under normal usage) valid RDKit Mols
  • Add support for labeling newly-created ports uniquely via int_complement (as in the code currently commented out in decrease_bond_order)
  • Re-implement increase_bond_order to change bond in "one-shot" (rather than by iterating single-order upconversions). This would remove the need to relocate atom IDs by map number after each upconversion and would simplify sanitization
  • Canonicalize alternate implementation of _increase_bond_order backend method
  • Revisit implementations of dissolve_bond and saturate_ports

Stronger typing

Typing and type hints need strengthening in a handful of places:

  • genutils.typetools : Replace custom Numeric type with builtin numbers.Number
  • genutils.typetools.numpytypes : Implement arithmetic operation into array size generics (“N”, “M”, etc.) to allow arithmetic in annotations (e.g. NDArray[Shape[N/2, M**2], int] ). Relevant starting points can be found in PEP 646
  • genutils.decorators.meta : Add detailed type signature transfer to all meta-decorators
  • genutils.decorators.functional : Fix TypeErrors raised by Callables modified with optional_in_place decorator when passing positional-only args as kwargs
  • openfftools.boxvectors: most of the type-checking here could be migrated to genutils.typetools.numpytypes, while the unit checking could be migrated to unitutils

Improved checks to sequential parameters in OpenMM simulation schedules

A handful of documented issues with loading serialized States and Systems in the sequential OpenMM simulation runner need to be resolved:

  • Ensemble-specific system force parameters (namely Barostats) are propagated where they shouldn't be (e.g. a schedule which specifies NPT -> NVT will actually end up being run as
  • Currently no check for when prior state is empty / invalid upon read
  • No protocol for interruption if a simulation fails along the schedule
  • Integrators are currently not being serialized to XML as part of openmmtools.serialization, nor are they supported by SimulationPaths; plan to incorporate these into the workflow templates
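
The second and third items above (validating the prior state and interrupting on failure) could look roughly like the following. This is a library-agnostic sketch using only the standard library: Step, check_prior_state, and run_schedule are hypothetical names, state files are treated as opaque paths, and no OpenMM calls are made.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional

@dataclass
class Step:
    """Hypothetical schedule entry: a name plus a callable that runs the step."""
    name: str
    run: Callable[[Optional[Path]], Path]  # takes prior state file, returns new state file

def check_prior_state(path: Optional[Path]) -> None:
    """Raise if a prior state file is expected but missing or empty."""
    if path is None:
        return  # the first step has no prior state to validate
    if not path.is_file() or path.stat().st_size == 0:
        raise ValueError(f"Prior state {path} is missing or empty")

def run_schedule(steps: list[Step]) -> list[str]:
    """Run steps in order; stop at the first failure and report what completed."""
    completed: list[str] = []
    prior: Optional[Path] = None
    for step in steps:
        try:
            check_prior_state(prior)
            prior = step.run(prior)
        except Exception as err:
            print(f"Schedule interrupted at '{step.name}': {err}")
            break
        completed.append(step.name)
    return completed
```

Returning the list of completed step names gives the caller enough information to resume or clean up; a real runner would likely also want to distinguish failure types.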

Refinements to polymer_examples-derived code

Certain polymerist modules are essentially clones of scripts found in openff/polymer_examples, written by Connor Davel. These scripts would be greatly improved by being brought in line with the rest of polymerist in terms of functionality and readability. The most obvious of such changes are:

  • openfftools.partition : would be desirable to separate single-mol partitioning from whole topology partitioning
  • openfftools.partition : would like to have a separate method to merely check whether a partition already exists (rather than just attempting to repartition always)
  • openfftools.partition : would like to migrate graph traversal to other, standardized library (perhaps rdutils.rdgraphs)
  • monomers.conversion : finish absorbing this module into monomers.specification
  • monomers.specification : implement support for SDF, RDKit Mol, and SMILES file I/O. Implement limit on replacements for similar environments (via recursive SMARTS)

Enhancements to reaction templates and assemblers

Numerous feature expansions are planned for rdutils.reactions and the related rdutils.bonding, including:

  • reactions.assembly : Implementing JSONification of reaction assemblers. This would involve implementing JSONSerializers for RDKit Mols to SMILES and SMARTS
  • reactions.assembly : implementing product stereoisomer enumeration and probabilistic reaction pathways. Would require, as a subgoal, detailed option passing for rdChemReactions.ChemicalReaction sanitization flags (which are separate to the default SANITIZE_FLAGS)
  • reactions.fragment : implementing an IntermonomerBondIdentificationStrategy which supports ringed molecule cleavage via search for bridge edges coinciding with newly-formed bonds
  • bonding.permutation : add support for Cycle-based bond remap input (either via dict or directly via maths.combinatorics.permutations.Cycle/Permutation) to reduce redundancy in current notation
  • bonding.substitution : implement capping such that newly-inserted atoms in cap groups are appended to the list of atom ids, rather than mixed in (i.e. reimplement hydrogenate_mol_ports and saturate_ports to preserve initial atom order via Chem.rdmolops.RenumberAtoms)

Support for OpenMM-style units in LatticeParameters

Currently, the maths.lattices.bravais.LatticeParameters class, while correctly implemented mathematically and supporting radians or degrees, does not provide support for OpenMM Quantity values for lattice parameters. Additionally, many legacy components of polymerist could be unified by reimplementation via LatticeParameters, rather than bespoke array methods as currently implemented. Namely:

  • LatticeParameters should support length-valued unit vector lengths and angle-valued axial angles. This should be reflected in the "in_degrees" parameter in the angles property
  • lammpstools.lammpseval.get_lammps_unit_cell(): should support return of a LatticeParameters object, rather than just a dict
  • openfftools.boxvectors: similarly, much of the functionality in this module could be supplanted by functionality LatticeParameters already provides

Fileutils extension improvement

Would like to encapsulate and better document file extensions

  • Want dedicated Extension class which implements from_path, dotless, path append (check that this hasn't already been implemented in an existing package!)
  • Provide database of extensions (likely based on mimetypes)
  • Consider moving fileutils outside of genutils entirely (make parallel submodule)

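A dedicated extension class along the lines described above might look like the following sketch, built on pathlib and mimetypes from the standard library. The class Extension and its method names append_to and guess_mimetype are hypothetical illustrations of the requested from_path, dotless, and path-append features, not existing fileutils code.

```python
import mimetypes
from dataclasses import dataclass
from pathlib import Path

@dataclass(frozen=True)
class Extension:
    """Hypothetical wrapper around a file suffix, always stored with a leading dot."""
    suffix: str

    def __post_init__(self):
        # Normalize "sdf" to ".sdf"; frozen dataclasses require object.__setattr__
        if self.suffix and not self.suffix.startswith("."):
            object.__setattr__(self, "suffix", "." + self.suffix)

    @classmethod
    def from_path(cls, path) -> "Extension":
        """Build an Extension from the final suffix of a path."""
        return cls(Path(path).suffix)

    @property
    def dotless(self) -> str:
        """The suffix without its leading dot."""
        return self.suffix.lstrip(".")

    def append_to(self, path) -> Path:
        """Return a new path with this extension added after any existing suffix."""
        path = Path(path)
        return path.with_name(path.name + self.suffix)

    def guess_mimetype(self):
        """Look the suffix up in the stdlib mimetypes table; may return None."""
        return mimetypes.guess_type("file" + self.suffix)[0]
```

Normalizing the leading dot in one place means Extension("sdf") and Extension(".sdf") compare equal, which avoids a common source of suffix-matching bugs.
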