openfreeenergy / gufe
grand unified free energy by OpenFE
Home Page: https://gufe.readthedocs.io
License: MIT License
Originally here: #68 (comment)
It would be nice if this worked out of the box:
import numpy as np
import gufe

class Thing(gufe.tokenization.GufeTokenizable):
    def __init__(self, v):
        self.vals = v

    def _defaults(self):
        return {}

    def _to_dict(self):
        return {'vals': self.vals}

    @classmethod
    def _from_dict(cls, dct):
        return cls(**dct)

def test_thing():
    t = Thing(np.arange(5))
    dd = t.to_dict()
    t2 = Thing.from_dict(dd)
    assert t == t2
And the same for pathlib and openff-units objects. We then also need to properly document how to extend this list of allowed types.
Currently most of the special casing in gufe tokenization is good at handling gufe objects, but we also need to be able to handle foreign objects just as well. #68 probably adds a lot of the code we need for this (it turns foreign objects into primitives).
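A minimal sketch of what such foreign-object handling could look like: a codec registry that turns a numpy array into a tagged, JSON-safe dict and back. The registry shape and the `":is:"` tag are hypothetical, not gufe's actual machinery.

```python
import json
import numpy as np

# Hypothetical codec registry: each entry maps a type tag to a pair of
# (to_primitive, from_primitive) functions.
CODECS = {
    "ndarray": (
        lambda arr: {"dtype": str(arr.dtype), "data": arr.tolist()},
        lambda d: np.array(d["data"], dtype=d["dtype"]),
    ),
}

def encode(obj):
    """Turn a foreign object into a tagged, JSON-safe dict."""
    if isinstance(obj, np.ndarray):
        to_prim, _ = CODECS["ndarray"]
        return {":is:": "ndarray", **to_prim(obj)}
    return obj  # already a primitive

def decode(obj):
    """Reverse of encode(): rebuild foreign objects from tagged dicts."""
    if isinstance(obj, dict) and ":is:" in obj:
        _, from_prim = CODECS[obj[":is:"]]
        return from_prim(obj)
    return obj

arr = np.arange(5)
roundtripped = decode(json.loads(json.dumps(encode(arr))))
assert (roundtripped == arr).all() and roundtripped.dtype == arr.dtype
```

Extending the allowed types would then just mean registering another `(to_primitive, from_primitive)` pair, which is also what we'd document.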
So I know we have it lying around because of the old Molecule stuff, but since we can do gufe -> OFF -> OpenEye, can we just get rid of this explicit dependency?
Settings (#37) was merged while it was failing on tests for the old version of openff-toolkit (0.10).
Are we removing that from the test matrix, or does it need to be fixed? Should we revert the merge for now, so that the red doesn't make it harder to review PRs that aren't failing?
How much longer do we need to support the old toolkit?
Related discussion
The container class for Components needs a good name. We currently landed on ChemicalState after Microstate was deemed too much of a loaded stat-mech term. @jchodera suggested ChemicalSystem, ComponentCollection, or ChemicalComponents, all of which seem appropriate.
I think I like ComponentCollection as it describes what it does, with ChemicalSystem second as it describes what we're trying to express with the object.
Running the following, I get:
$ pip install git+https://github.com/openfe/gufe.git
Collecting git+https://github.com/openfe/gufe.git
Cloning https://github.com/openfe/gufe.git to /tmp/pip-req-build-6686emdb
Running command git clone --filter=blob:none -q https://github.com/openfe/gufe.git /tmp/pip-req-build-6686emdb
Username for 'https://github.com':
The same occurs when I instead try to install openfe/openfe.git. This does not happen with other repos, such as datreant/datreant.git or openforcefield/openff-toolkit.git.
Note that doing a:
git clone https://github.com/OpenFreeEnergy/gufe.git
works as expected, but doing a:
git clone --filter=blob:none --quiet https://github.com/openfe/gufe.git
results in the username prompt.
Is there an org-level setting in OpenFreeEnergy that's causing this behavior?
In a notebook,
Chem.MolFromSmiles("COC")
gives
However,
SmallMoleculeComponent(rdkit=Chem.MolFromSmiles("COC")).to_rdkit()
gives the output after raising the following:
---------------------------------------------------------------------------
UnboundLocalError Traceback (most recent call last)
File ~/mambaforge/envs/openfe-notebooks/lib/python3.9/site-packages/IPython/core/formatters.py:343, in BaseFormatter.__call__(self, obj)
341 method = get_real_method(obj, self.print_method)
342 if method is not None:
--> 343 return method()
344 return None
345 else:
File ~/mambaforge/envs/openfe-notebooks/lib/python3.9/site-packages/rdkit/Chem/Draw/IPythonConsole.py:130, in _toHTML(mol)
127 else:
128 content = Draw._moltoSVG(mol, molSize, [], nm, kekulize=kekulizeStructures,
129 drawOptions=drawOptions)
--> 130 res.append(f'<tr><td colspan=2 style="text-align:center">{content}</td></tr>')
132 for i, (pn, pv) in enumerate(props.items()):
133 if ipython_maxProperties >= 0 and i >= ipython_maxProperties:
UnboundLocalError: local variable 'content' referenced before assignment
This also happens with:
rdkit_mol = SmallMoleculeComponent(rdkit=Chem.MolFromSmiles("COC")).to_rdkit()
# cell break
rdkit_mol
# error here
I can't tell if this is a bug in how we create the copy of the rdkit molecule, or a bug in rdkit itself.
# packages in environment at /Users/dwhs/mambaforge/envs/openfe-notebooks:
#
# Name Version Build Channel
ambertools 21.12 py39hf80593e_0 conda-forge
amberutils 21.0 pypi_0 pypi
appnope 0.1.3 pyhd8ed1ab_0 conda-forge
argon2-cffi 21.3.0 pyhd8ed1ab_0 conda-forge
argon2-cffi-bindings 21.2.0 py39h89e85a6_1 conda-forge
arpack 3.7.0 hefb7bc6_2 conda-forge
asttokens 2.0.5 pyhd8ed1ab_0 conda-forge
astunparse 1.6.3 pyhd8ed1ab_0 conda-forge
attrs 21.4.0 pyhd8ed1ab_0 conda-forge
backcall 0.2.0 pyh9f0ad1d_0 conda-forge
backports 1.0 py_2 conda-forge
backports.functools_lru_cache 1.6.4 pyhd8ed1ab_0 conda-forge
beautifulsoup4 4.10.0 pyha770c72_0 conda-forge
biopython 1.79 py39h89e85a6_1 conda-forge
bleach 4.1.0 pyhd8ed1ab_0 conda-forge
blosc 1.21.0 he49afe7_0 conda-forge
boost 1.74.0 py39ha1f3e3e_5 conda-forge
boost-cpp 1.74.0 h8b082ac_8 conda-forge
brotli 1.0.9 h5eb16cf_7 conda-forge
brotli-bin 1.0.9 h5eb16cf_7 conda-forge
bzip2 1.0.8 h0d85af4_4 conda-forge
c-ares 1.18.1 h0d85af4_0 conda-forge
ca-certificates 2021.10.8 h033912b_0 conda-forge
cairo 1.16.0 h8023c5d_1
certifi 2021.10.8 py39h6e9494a_2 conda-forge
cffi 1.15.0 py39hc55c11b_1
cftime 1.6.0 py39h86b5767_0 conda-forge
click 8.1.2 py39h6e9494a_0 conda-forge
colorama 0.4.4 pyh9f0ad1d_0 conda-forge
coverage 6.3.2 py39h63b48b0_2 conda-forge
curl 7.82.0 h9f20792_0 conda-forge
cycler 0.11.0 pyhd8ed1ab_0 conda-forge
cyjupyter 0.2.0 pypi_0 pypi
cython 0.29.28 py39hfd1d529_2 conda-forge
debugpy 1.5.1 py39h9fcab8e_0 conda-forge
decorator 5.1.1 pyhd8ed1ab_0 conda-forge
defusedxml 0.7.1 pyhd8ed1ab_0 conda-forge
entrypoints 0.4 pyhd8ed1ab_0 conda-forge
execnet 1.9.0 pyhd8ed1ab_0 conda-forge
executing 0.8.3 pyhd8ed1ab_0 conda-forge
expat 2.4.8 h96cf925_0 conda-forge
fftw 3.3.10 nompi_hf082fe4_102 conda-forge
flit-core 3.7.1 pyhd8ed1ab_0 conda-forge
font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge
font-ttf-inconsolata 3.000 h77eed37_0 conda-forge
font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge
font-ttf-ubuntu 0.83 hab24e00_0 conda-forge
fontconfig 2.14.0 h676cef8_0 conda-forge
fonts-conda-ecosystem 1 0 conda-forge
fonts-conda-forge 1 0 conda-forge
fonttools 4.31.2 py39h63b48b0_0 conda-forge
freetype 2.10.4 h4cff582_1 conda-forge
gettext 0.21.0 h7535e17_0
giflib 5.2.1 hbcb3906_2 conda-forge
glib 2.68.4 he49afe7_0 conda-forge
glib-tools 2.68.4 he49afe7_0 conda-forge
greenlet 1.1.2 py39hfd1d529_2 conda-forge
griddataformats 0.7.0 pyhd8ed1ab_0 conda-forge
gsd 2.5.1 py39hc89836e_0 conda-forge
gufe 0.2 pyhd8ed1ab_0 conda-forge
hdf4 4.2.15 hefd3b78_3 conda-forge
hdf5 1.12.1 nompi_ha60fbc9_104 conda-forge
icu 70.1 h96cf925_0 conda-forge
importlib-metadata 4.11.3 py39h6e9494a_1 conda-forge
importlib_resources 5.6.0 pyhd8ed1ab_0 conda-forge
iniconfig 1.1.1 pyh9f0ad1d_0 conda-forge
ipykernel 6.12.1 py39h71a6800_0 conda-forge
ipython 8.2.0 py39h6e9494a_0 conda-forge
ipython_genutils 0.2.0 py_1 conda-forge
ipywidgets 7.7.0 pyhd8ed1ab_0 conda-forge
jbig 2.1 h0d85af4_2003 conda-forge
jedi 0.18.1 py39h6e9494a_1 conda-forge
jinja2 3.1.1 pyhd8ed1ab_0 conda-forge
joblib 1.1.0 pyhd8ed1ab_0 conda-forge
jpeg 9e h0d85af4_0 conda-forge
jsonschema 4.4.0 pyhd8ed1ab_0 conda-forge
jupyter_client 7.2.1 pyhd8ed1ab_0 conda-forge
jupyter_core 4.9.2 py39h6e9494a_0 conda-forge
jupyterlab_pygments 0.1.2 pyh9f0ad1d_0 conda-forge
jupyterlab_widgets 1.1.0 pyhd8ed1ab_0 conda-forge
khronos-opencl-icd-loader 2022.01.04 h0d85af4_0 conda-forge
kiwisolver 1.4.2 py39h7248d28_1 conda-forge
krb5 1.19.3 hb49756b_0 conda-forge
lcms2 2.12 h577c468_0 conda-forge
lerc 3.0 he49afe7_0 conda-forge
libblas 3.9.0 13_osx64_openblas conda-forge
libbrotlicommon 1.0.9 h5eb16cf_7 conda-forge
libbrotlidec 1.0.9 h5eb16cf_7 conda-forge
libbrotlienc 1.0.9 h5eb16cf_7 conda-forge
libcblas 3.9.0 13_osx64_openblas conda-forge
libcurl 7.82.0 h9f20792_0 conda-forge
libcxx 13.0.1 hc203e6f_0 conda-forge
libdeflate 1.10 h0d85af4_0 conda-forge
libedit 3.1.20191231 h0678c8f_2 conda-forge
libev 4.33 haf1e3a3_1 conda-forge
libffi 3.3 h046ec9c_2 conda-forge
libgfortran 5.0.0 9_3_0_h6c81a4c_23 conda-forge
libgfortran5 9.3.0 h6c81a4c_23 conda-forge
libglib 2.68.4 hd556434_0 conda-forge
libiconv 1.16 haf1e3a3_0 conda-forge
liblapack 3.9.0 13_osx64_openblas conda-forge
libllvm10 10.0.1 h009f743_3 conda-forge
libnetcdf 4.8.1 nompi_h6609ca0_101 conda-forge
libnghttp2 1.47.0 h942079c_0 conda-forge
libopenblas 0.3.18 openmp_h3351f45_0 conda-forge
libpng 1.6.37 h7cec526_2 conda-forge
libsodium 1.0.18 hbcb3906_1 conda-forge
libssh2 1.10.0 h52ee1ee_2 conda-forge
libtiff 4.3.0 h17f2ce3_3 conda-forge
libwebp 1.2.2 h28dabe5_0 conda-forge
libwebp-base 1.2.2 h0d85af4_1 conda-forge
libxcb 1.13 h0d85af4_1004 conda-forge
libxml2 2.9.12 he03b247_2 conda-forge
libxslt 1.1.33 h5bff336_4 conda-forge
libzip 1.8.0 h8b0c345_1 conda-forge
libzlib 1.2.11 h6c3fc93_1014 conda-forge
llvm-openmp 13.0.1 hcb1a161_1 conda-forge
llvmlite 0.36.0 py39h798a4f4_0 conda-forge
lomap 2.0a0 pypi_0 pypi
lomap2 2.0.0 pyhd8ed1ab_0 conda-forge
lxml 4.8.0 py39h63b48b0_1 conda-forge
lz4-c 1.9.3 he49afe7_1 conda-forge
markupsafe 2.1.1 py39h63b48b0_1 conda-forge
matplotlib-base 3.5.1 py39hb07454d_0 conda-forge
matplotlib-inline 0.1.3 pyhd8ed1ab_0 conda-forge
mdanalysis 2.1.0 py39hfd1d529_1 conda-forge
mdtraj 1.9.7 py39h996af62_1 conda-forge
mistune 0.8.4 py39h89e85a6_1005 conda-forge
mmtf-python 1.1.2 py_0 conda-forge
mpiplus v0.0.1 pyhd8ed1ab_1003 conda-forge
mrcfile 1.3.0 pyh44b312d_0 conda-forge
msgpack-python 1.0.3 py39h7248d28_1 conda-forge
munkres 1.1.4 pyh9f0ad1d_0 conda-forge
nbclient 0.5.13 pyhd8ed1ab_0 conda-forge
nbconvert 6.4.5 pyhd8ed1ab_2 conda-forge
nbconvert-core 6.4.5 pyhd8ed1ab_2 conda-forge
nbconvert-pandoc 6.4.5 pyhd8ed1ab_2 conda-forge
nbformat 5.3.0 pyhd8ed1ab_0 conda-forge
nbval 0.9.6 pyh9f0ad1d_0 conda-forge
ncurses 6.3 he49afe7_0 conda-forge
nest-asyncio 1.5.5 pyhd8ed1ab_0 conda-forge
netcdf-fortran 4.5.4 nompi_h9ed14b0_100 conda-forge
netcdf4 1.5.8 nompi_py39he7d1c46_101 conda-forge
networkx 2.8 pyhd8ed1ab_0 conda-forge
nglview 3.0.3 pyh8a188c0_0 conda-forge
nose 1.3.7 py_1006 conda-forge
notebook 6.4.10 pyha770c72_0 conda-forge
numba 0.53.1 py39h32e38f5_1 conda-forge
numexpr 2.8.0 py39h4d6be9b_1 conda-forge
numpy 1.22.3 py39hf56e92f_2 conda-forge
ocl_icd_wrapper_apple 1.0.0 hbcb3906_0 conda-forge
openfe 0.1 pypi_0 pypi
openff-forcefields 2.0.0 pyh6c4a22f_0 conda-forge
openff-toolkit 0.10.4 pyhd8ed1ab_0 conda-forge
openff-toolkit-base 0.10.4 pyhd8ed1ab_0 conda-forge
openff-units 0.1.5 pyh6c4a22f_0 conda-forge
openff-utilities 0.1.3 pyh6c4a22f_0 conda-forge
openjpeg 2.4.0 h6e7aa92_1 conda-forge
openmm 7.7.0 py39h8d72adf_0_khronos conda-forge
openmmforcefields 0.11.0 pyhd8ed1ab_0 conda-forge
openmmtools 0.21.2 pyhd8ed1ab_0 conda-forge
openssl 1.1.1n h6c3fc93_0 conda-forge
packaging 21.3 pyhd8ed1ab_0 conda-forge
packmol 20.010 h508aa58_0 conda-forge
packmol-memgen 1.2.1rc0 pypi_0 pypi
pandas 1.4.2 py39hbd61c47_0 conda-forge
pandoc 2.17.1.1 h694c41f_0 conda-forge
pandocfilters 1.5.0 pyhd8ed1ab_0 conda-forge
parmed 3.4.3 py39h9fcab8e_1 conda-forge
parso 0.8.3 pyhd8ed1ab_0 conda-forge
patsy 0.5.2 pyhd8ed1ab_0 conda-forge
pcre 8.45 he49afe7_0 conda-forge
pdbfixer 1.8.1 pyh6c4a22f_0 conda-forge
perl 5.32.1 2_h0d85af4_perl5 conda-forge
pexpect 4.8.0 pyh9f0ad1d_2 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pillow 9.1.0 py39hd2c7aa1_0 conda-forge
pint 0.19.1 pyhd8ed1ab_0 conda-forge
pip 22.0.4 pyhd8ed1ab_0 conda-forge
pixman 0.40.0 hbcb3906_0 conda-forge
plugcli 0.0.1 pyhd8ed1ab_0 conda-forge
pluggy 1.0.0 py39h6e9494a_3 conda-forge
prometheus_client 0.13.1 pyhd8ed1ab_0 conda-forge
prompt-toolkit 3.0.29 pyha770c72_0 conda-forge
psutil 5.9.0 py39h63b48b0_1 conda-forge
pthread-stubs 0.4 hc929b4f_1001 conda-forge
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pure_eval 0.2.2 pyhd8ed1ab_0 conda-forge
py 1.11.0 pyh6c4a22f_0 conda-forge
pycairo 1.21.0 py39ha25c624_1 conda-forge
pycparser 2.21 pyhd8ed1ab_0 conda-forge
pydantic 1.9.0 py39h63b48b0_1 conda-forge
pygments 2.11.2 pyhd8ed1ab_0 conda-forge
pymbar 3.0.6 py39hc89836e_0 conda-forge
pyparsing 3.0.7 pyhd8ed1ab_0 conda-forge
pyrsistent 0.18.1 py39h63b48b0_1 conda-forge
pytables 3.7.0 py39hfd850c7_0 conda-forge
pytest 7.1.1 py39h6e9494a_1 conda-forge
pytest-cov 3.0.0 pyhd8ed1ab_0 conda-forge
pytest-forked 1.4.0 pyhd8ed1ab_0 conda-forge
pytest-xdist 2.5.0 pyhd8ed1ab_0 conda-forge
python 3.9.0 h4f09611_5_cpython conda-forge
python-constraint 1.4.0 py_0 conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python-fastjsonschema 2.15.3 pyhd8ed1ab_0 conda-forge
python_abi 3.9 2_cp39 conda-forge
pytraj 2.0.6 pypi_0 pypi
pytz 2022.1 pyhd8ed1ab_0 conda-forge
pyyaml 6.0 py39h63b48b0_4 conda-forge
pyzmq 22.3.0 py39hc2dc7ec_2 conda-forge
rdkit 2022.03.1 py39h1ae426c_0 conda-forge
readline 8.1 h05e3726_0 conda-forge
reportlab 3.5.68 py39hf37cc50_1 conda-forge
sander 16.0 pypi_0 pypi
scikit-learn 1.0.2 py39hd4eea88_0 conda-forge
scipy 1.8.0 py39h056f1c0_1 conda-forge
seaborn 0.11.2 hd8ed1ab_0 conda-forge
seaborn-base 0.11.2 pyhd8ed1ab_0 conda-forge
send2trash 1.8.0 pyhd8ed1ab_0 conda-forge
setuptools 62.0.0 py39h6e9494a_0 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
smirnoff99frosst 1.1.0 pyh44b312d_0 conda-forge
snappy 1.1.8 hb1e8313_3 conda-forge
soupsieve 2.3.1 pyhd8ed1ab_0 conda-forge
sqlalchemy 1.4.34 py39h63b48b0_1 conda-forge
sqlite 3.37.1 hb516253_0 conda-forge
stack_data 0.2.0 pyhd8ed1ab_0 conda-forge
statsmodels 0.13.2 py39hc89836e_0 conda-forge
terminado 0.13.3 py39h6e9494a_1 conda-forge
testpath 0.6.0 pyhd8ed1ab_0 conda-forge
threadpoolctl 3.1.0 pyh8a188c0_0 conda-forge
tinydb 4.7.0 pyhd8ed1ab_0 conda-forge
tk 8.6.12 h5dbffcc_0 conda-forge
toml 0.10.2 pyhd8ed1ab_0 conda-forge
tomli 2.0.1 pyhd8ed1ab_0 conda-forge
tornado 6.1 py39h63b48b0_3 conda-forge
tqdm 4.64.0 pyhd8ed1ab_0 conda-forge
traitlets 5.1.1 pyhd8ed1ab_0 conda-forge
typing-extensions 4.1.1 hd8ed1ab_0 conda-forge
typing_extensions 4.1.1 pyha770c72_0 conda-forge
tzdata 2022a h191b570_0 conda-forge
unicodedata2 14.0.0 py39h63b48b0_1 conda-forge
validators 0.18.2 pyhd3deb0d_0 conda-forge
wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge
webencodings 0.5.1 py_1 conda-forge
wheel 0.37.1 pyhd8ed1ab_0 conda-forge
widgetsnbextension 3.6.0 py39h6e9494a_0 conda-forge
xmltodict 0.12.0 py_0 conda-forge
xorg-kbproto 1.0.7 h35c211d_1002 conda-forge
xorg-libice 1.0.10 h0d85af4_0 conda-forge
xorg-libsm 1.2.3 h0d85af4_1000 conda-forge
xorg-libx11 1.7.2 h0d85af4_0 conda-forge
xorg-libxau 1.0.9 h35c211d_0 conda-forge
xorg-libxdmcp 1.1.3 h35c211d_0 conda-forge
xorg-libxext 1.3.4 h0d85af4_1 conda-forge
xorg-libxt 1.2.1 h0d85af4_2 conda-forge
xorg-xextproto 7.3.0 h35c211d_1002 conda-forge
xorg-xproto 7.0.31 h35c211d_1007 conda-forge
xz 5.2.5 haf1e3a3_1 conda-forge
yaml 0.2.5 h0d85af4_2 conda-forge
zeromq 4.3.4 he49afe7_1 conda-forge
zipp 3.8.0 pyhd8ed1ab_0 conda-forge
zlib 1.2.11 h6c3fc93_1014 conda-forge
zstd 1.5.2 h582d3a0_0 conda-forge
The Serialization mixin we're using from openff-toolkit requires to_dict and from_dict to implement a lot of other serialisation (which is great), but it also requires to_toml and from_xml. My IDE is constantly complaining that the abc isn't properly filled in. Maybe we can strip the xml and toml out of this mixin?
I'd pop a gufe.__version__ into the GufeTokenizable.to_dict() output, which would help if we ever make a break there.
Currently we're using rdkit's Chem.MolToSequence as a first pass of hashing a ProteinComponent. This might not cover everything, so some thought is required.
@ijpulidos and I are working on nonequilibrium cycling in perses, and were wondering if LigandAtomMapping from OpenFreeEnergy/openfe made more sense here in gufe?
The AtomMapping base class is an abstract base class, so we can't directly use that, and we think it best if we avoid implementing our own child class in perses for the mappings used there.
Updated to v0.990 and got some new errors that must have snuck in.
Looks like it's this:
Should we add a black check for PRs? We have one in fah-alchemy, for example.
In execute_DAG, when we don't specify a work directory, we default to a tmpdir: https://github.com/OpenFreeEnergy/gufe/blob/main/gufe/protocols/protocoldag.py#L236
This is causing errors (for me) along the lines of OSError: /tmp/tmpyggicnli/c8c410b8-040b-480a-b0c8-8ff0ca0207f6/complex_rbfe.nc does not exist, or similar, when I then try to access files that were inside the shared directory (e.g. for dependent Units).
I think a saner default for shared could be os.getcwd()?
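A sketch of the proposed defaulting, with a hypothetical resolve_shared helper (not the gufe API):

```python
import os
import tempfile

def resolve_shared(shared=None, keep=True):
    """Pick a shared directory for a ProtocolDAG execution.

    Defaulting to the current working directory, rather than a tmpdir
    that vanishes when execution finishes, keeps result files reachable
    afterwards. The `keep` flag and the function name are illustrative.
    """
    if shared is not None:
        return shared
    if keep:
        return os.getcwd()
    return tempfile.mkdtemp()  # explicit opt-in to throwaway storage

assert resolve_shared("/scratch/run1") == "/scratch/run1"
assert resolve_shared() == os.getcwd()
```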
It's possible to create a pydantic model for the ffxml spec which moves us very close to defining the reduced energy for a given ChemicalSystem
.
We're often using toolkit.Molecule, but as this is meant to hold only a single covalently connected entity, and our definition of Component is looser than this, we should probably pass Topology around instead, or alternatively a list of Molecules if Topology is too opinionated.
@dotsdl has been digging around on redundant labels. One place where we can possibly slice one off is the signature for Protocol.create here: https://github.com/OpenFreeEnergy/gufe/blob/main/gufe/protocols/protocol.py#L193 (and the matching Transformation.__init__ here: https://github.com/OpenFreeEnergy/gufe/blob/main/gufe/transformations/transformation.py#L47)
Currently this would look something like:
protocol.create(
    stateA=ChemicalSystem(components={'ligand': SmallMoleculeComponent(...), ...}),
    stateB=ChemicalSystem(components={'ligand': SmallMoleculeComponent(...), ...}),
    mapping={'ligand': LigandAtomMapping(...)},
)
As there is a reference to either side of the mapping within the LigandAtomMapping itself, we could probably drop the labels here and allow:
protocol.create(
    stateA=ChemicalSystem(components={'ligand': SmallMoleculeComponent(...), ...}),
    stateB=ChemicalSystem(components={'ligand': SmallMoleculeComponent(...), ...}),
    mapping=[LigandAtomMapping(...)],
)
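For illustration, a sketch of how the labels could be recovered from a bare list of mappings, using stand-in classes and assuming each mapping references its two end-state components as componentA/componentB (the LigandAtomMapping naming):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Mapping:
    componentA: str  # stand-in for a Component reference
    componentB: str

def label_mappings(stateA: dict, stateB: dict, mappings: list) -> dict:
    """Match each mapping's end-state components against the two
    ChemicalSystems (here plain dicts) to rebuild the label -> mapping dict."""
    labelled = {}
    for m in mappings:
        for label, comp in stateA.items():
            if comp == m.componentA and stateB.get(label) == m.componentB:
                labelled[label] = m
                break
        else:
            raise ValueError(f"no component pair matches mapping {m}")
    return labelled

stateA = {"ligand": "benzene", "solvent": "water"}
stateB = {"ligand": "toluene", "solvent": "water"}
result = label_mappings(stateA, stateB, [Mapping("benzene", "toluene")])
assert result == {"ligand": Mapping("benzene", "toluene")}
```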
This test is failing:
https://github.com/OpenFreeEnergy/gufe/blob/main/gufe/tests/test_models.py#L16
Looking at it, I'm not sure why it should pass, L28 will give a dict and L30 seems to want str/bytes
FAILED [100%]
gufe/tests/test_models.py:15 (test_json_round_trip)
> ???
pydantic/main.py:539:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
pydantic/parse.py:37:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
data = {'forcefield_file': 'note.xml', 'forcefield_settings': {'author': 'The Open Force Field Initiative', 'date': '2021-08-...1', 'solute_dielectric': 1.0, 'solvent_dielectric': 78.5}, ...}, 'protocol_settings': None, 'settings_version': 0, ...}
def json_loader(data: str) -> dict:
"""Load JSON containing custom unit-tagged quantities."""
# TODO: recursively call this function for nested models
> out: Dict = json.loads(data)
../../../../miniconda3/envs/openfe/lib/python3.9/site-packages/openff/models/types.py:127:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
s = {'forcefield_file': 'note.xml', 'forcefield_settings': {'author': 'The Open Force Field Initiative', 'date': '2021-08-...1', 'solute_dielectric': 1.0, 'solvent_dielectric': 78.5}, ...}, 'protocol_settings': None, 'settings_version': 0, ...}
cls = None, object_hook = None, parse_float = None, parse_int = None
parse_constant = None, object_pairs_hook = None, kw = {}
def loads(s, *, cls=None, object_hook=None, parse_float=None,
parse_int=None, parse_constant=None, object_pairs_hook=None, **kw):
"""Deserialize ``s`` (a ``str``, ``bytes`` or ``bytearray`` instance
containing a JSON document) to a Python object.
``object_hook`` is an optional function that will be called with the
result of any object literal decode (a ``dict``). The return value of
``object_hook`` will be used instead of the ``dict``. This feature
can be used to implement custom decoders (e.g. JSON-RPC class hinting).
``object_pairs_hook`` is an optional function that will be called with the
result of any object literal decoded with an ordered list of pairs. The
return value of ``object_pairs_hook`` will be used instead of the ``dict``.
This feature can be used to implement custom decoders. If ``object_hook``
is also defined, the ``object_pairs_hook`` takes priority.
``parse_float``, if specified, will be called with the string
of every JSON float to be decoded. By default this is equivalent to
float(num_str). This can be used to use another datatype or parser
for JSON floats (e.g. decimal.Decimal).
``parse_int``, if specified, will be called with the string
of every JSON int to be decoded. By default this is equivalent to
int(num_str). This can be used to use another datatype or parser
for JSON integers (e.g. float).
``parse_constant``, if specified, will be called with one of the
following strings: -Infinity, Infinity, NaN.
This can be used to raise an exception if invalid JSON numbers
are encountered.
To use a custom ``JSONDecoder`` subclass, specify it with the ``cls``
kwarg; otherwise ``JSONDecoder`` is used.
"""
if isinstance(s, str):
if s.startswith('\ufeff'):
raise JSONDecodeError("Unexpected UTF-8 BOM (decode using utf-8-sig)",
s, 0)
else:
if not isinstance(s, (bytes, bytearray)):
> raise TypeError(f'the JSON object must be str, bytes or bytearray, '
f'not {s.__class__.__name__}')
E TypeError: the JSON object must be str, bytes or bytearray, not dict
../../../../miniconda3/envs/openfe/lib/python3.9/json/__init__.py:339: TypeError
During handling of the above exception, another exception occurred:
all_settings_path = '/home/richard/code/gufe/gufe/tests/data/all_settings.json'
tmp_path = PosixPath('/tmp/pytest-of-richard/pytest-3/test_json_round_trip0')
def test_json_round_trip(all_settings_path, tmp_path):
with open(all_settings_path) as fd:
settings = Settings.parse_raw(fd.read())
assert settings == Settings(**settings.dict())
d = tmp_path / "test"
d.mkdir()
with open(d / "settings.json", "w") as fd:
fd.write(settings.json())
with open(d / "settings.json") as fd:
settings_from_file = json.load(fd)
> assert settings == Settings.parse_raw(settings_from_file)
test_models.py:30:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ???
E pydantic.error_wrappers.ValidationError: 1 validation error for Settings
E __root__
E the JSON object must be str, bytes or bytearray, not dict (type=type_error)
pydantic/main.py:548: ValidationError
We probably ought to offer installation via pip. IIRC openff-toolkit doesn't play nice with PyPI, but we can probably make that optional and use rdkit as the fundamental toolkit (which now has a PyPI install route).
There are a few things that we should follow up on from #110. This issue will track those so they can be dealt with in the future.
"app.HBonds"? https://github.com/OpenFreeEnergy/gufe/pull/110/files#r1097848589
https://github.com/OpenFreeEnergy/gufe/actions/runs/4074917726/jobs/7020642731
=================================== FAILURES ===================================
____________ TestProteinComponent.test_from_pdb_file[cmet_protein] _____________
[gw1] linux -- Python 3.9.16 /usr/share/miniconda3/envs/gufe/bin/python3.9
self = <gufe.tests.test_proteincomponent.TestProteinComponent object at 0x7fc3b0a0ac10>
in_pdb_path = 'cmet_protein'
@pytest.mark.parametrize('in_pdb_path', ALL_PDB_LOADERS.keys())
def test_from_pdb_file(self, in_pdb_path):
> in_pdb_io = ALL_PDB_LOADERS[in_pdb_path]()
gufe/tests/test_proteincomponent.py:99:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
gufe/tests/conftest.py:36: in __call__
req = urllib.request.urlopen(self.url)
/usr/share/miniconda3/envs/gufe/lib/python3.9/urllib/request.py:214: in urlopen
return opener.open(url, data, timeout)
/usr/share/miniconda3/envs/gufe/lib/python3.9/urllib/request.py:523: in open
response = meth(req, response)
/usr/share/miniconda3/envs/gufe/lib/python3.9/urllib/request.py:632: in http_response
response = self.parent.error(
/usr/share/miniconda3/envs/gufe/lib/python3.9/urllib/request.py:561: in error
return self._call_chain(*args)
/usr/share/miniconda3/envs/gufe/lib/python3.9/urllib/request.py:494: in _call_chain
result = func(*args)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <urllib.request.HTTPDefaultErrorHandler object at 0x7fc3b1301b20>
req = <urllib.request.Request object at 0x7fc3adbf92b0>
fp = <http.client.HTTPResponse object at 0x7fc3adbf9070>, code = 500
msg = 'Internal Server Error'
hdrs = <http.client.HTTPMessage object at 0x7fc3b032d0d0>
def http_error_default(self, req, fp, code, msg, hdrs):
> raise HTTPError(req.full_url, code, msg, hdrs, fp)
E urllib.error.HTTPError: HTTP Error 500: Internal Server Error
/usr/share/miniconda3/envs/gufe/lib/python3.9/urllib/request.py:641: HTTPError
__________ TestProteinComponent.test_to_pdb_round_trip[cmet_protein] ___________
[gw1] linux -- Python 3.9.16 /usr/share/miniconda3/envs/gufe/bin/python3.9
self = <gufe.tests.test_proteincomponent.TestProteinComponent object at 0x7fc3b084f490>
in_pdb_path = 'cmet_protein'
tmp_path = PosixPath('/tmp/pytest-of-runner/pytest-0/popen-gw1/test_to_pdb_round_trip_cmet_pr0')
@pytest.mark.parametrize('in_pdb_path', ALL_PDB_LOADERS.keys())
def test_to_pdb_round_trip(self, in_pdb_path, tmp_path):
> in_pdb_io = ALL_PDB_LOADERS[in_pdb_path]()
gufe/tests/test_proteincomponent.py:182:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
gufe/tests/conftest.py:36: in __call__
req = urllib.request.urlopen(self.url)
/usr/share/miniconda3/envs/gufe/lib/python3.9/urllib/request.py:214: in urlopen
return opener.open(url, data, timeout)
/usr/share/miniconda3/envs/gufe/lib/python3.9/urllib/request.py:523: in open
response = meth(req, response)
/usr/share/miniconda3/envs/gufe/lib/python3.9/urllib/request.py:632: in http_response
response = self.parent.error(
/usr/share/miniconda3/envs/gufe/lib/python3.9/urllib/request.py:561: in error
return self._call_chain(*args)
/usr/share/miniconda3/envs/gufe/lib/python3.9/urllib/request.py:494: in _call_chain
result = func(*args)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <urllib.request.HTTPDefaultErrorHandler object at 0x7fc3b1301b20>
req = <urllib.request.Request object at 0x7fc3adeec310>
fp = <http.client.HTTPResponse object at 0x7fc3b0293460>, code = 500
msg = 'Internal Server Error'
hdrs = <http.client.HTTPMessage object at 0x7fc3b02a3670>
def http_error_default(self, req, fp, code, msg, hdrs):
> raise HTTPError(req.full_url, code, msg, hdrs, fp)
E urllib.error.HTTPError: HTTP Error 500: Internal Server Error
Looks like the flyweight pattern isn't working for a Transformation
https://github.com/OpenFreeEnergy/gufe/blob/main/gufe/tokenization.py#L252
Code to reproduce:
t1 = Transformation(...)
t1.dump('out.json')
t2 = Transformation.load('out.json')
t1 == t2  # True, great
t1 is t2  # False, not the same object in memory
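For reference, a toy version of the intended flyweight behaviour: deserialization consults a registry keyed on a content-based token, and only constructs a new instance on a miss. Names here are illustrative, not gufe's tokenization internals.

```python
import json
import weakref

# Registry of live instances, keyed on the content-based token.
_REGISTRY = weakref.WeakValueDictionary()

class Flyweight:
    def __init__(self, payload):
        self.payload = payload

    @classmethod
    def from_dict(cls, dct):
        # deterministic token derived purely from content
        key = json.dumps(dct, sort_keys=True)
        try:
            return _REGISTRY[key]  # reuse the existing instance
        except KeyError:
            obj = cls(dct)
            _REGISTRY[key] = obj
            return obj

t1 = Flyweight.from_dict({"name": "transformation"})
t2 = Flyweight.from_dict({"name": "transformation"})
assert t1 is t2  # same object, not just equal
```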
Currently there's not really a way for a ProtocolUnit to communicate if or how it failed. I imagine that as a subclass of ProtocolUnitResult we can have a failure class, which could encapsulate any logs/debug info. This should also properly hold up any dependency chains, so something like Result.ok() -> bool returning False for this failure class.
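A sketch of that failure subclass (class names follow the issue's suggestion but are hypothetical), capturing the exception and its traceback so dependents can short-circuit on ok():

```python
import traceback

class ProtocolUnitResult:
    def ok(self) -> bool:
        return True

class ProtocolUnitFailure(ProtocolUnitResult):
    """Result carrying the exception and debug info from a failed unit."""
    def __init__(self, exception: BaseException):
        self.exception = exception
        # capture debug info; here just the formatted traceback
        self.traceback = "".join(
            traceback.format_exception(
                type(exception), exception, exception.__traceback__
            )
        )

    def ok(self) -> bool:
        return False

try:
    raise RuntimeError("simulation blew up")
except RuntimeError as e:
    result = ProtocolUnitFailure(e)

assert not result.ok()
assert "simulation blew up" in result.traceback
```

Downstream units would then check result.ok() before executing, propagating the failure up the dependency chain.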
I'm excited by the rapid progress on this repo!
I was wondering if I could persuade you to consider a more informative name, perhaps in a unified Open Free Energy namespace. The Open Force Field folks eventually came to the conclusion that it was important to have a unified namespace where various component packages could live in a relevant name subspace:
- conda install openff-toolkit would install the openff.toolkit package
- conda install openff-evaluator would install the openff.evaluator package
- conda install openff-forcefields would install the force fields

Everything shared the openff prefix, and the various packages installed cleanly into namespace subspaces. As an alternative to the goofy name "gufe", perhaps OpenFE might consider adopting a common namespace (openfe is an obvious choice) and clear, succinct naming conventions? For example:

- conda install openfe-core could install the openfe.core common component object models; alternatives could be openfe.common or openfe.components
- conda install openfe-toolkit could install openfe.toolkit with common toolkit components (currently in openfe/openfe)
- conda install openfe-lomap could install openfe.networkmapper.lomap if you want to break up mappers by flavor, or you could unify everything under openfe.networkmapper to install one or more network mappers
- conda install openfe-perses could install openfe.engine.perses to manage perses calculations, while other simulation engine wrappers could be installed similarly (openfe-gromacs -> openfe.engine.gromacs)
- conda install openfe-atommapper could install openfe.atommapper.leadoptmap as an atom mapper

See: #17
Our current Settings object does not appear to be hashable. This becomes a problem when trying to add a Transformation that includes a Settings object (via its attached Protocol) to an AlchemicalNetwork, as the AlchemicalNetwork deduplicates Transformations via a frozenset.
Currently we have individual __hash__ implementations for many GufeTokenizables in gufe, such as Transformation, ChemicalSystem and Protocol. These should probably all use the same __hash__ implementation as GufeTokenizable itself, and doing so should avoid the above.
This first surfaced in updating the example notebook for building an AlchemicalNetwork.
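A sketch of the unified approach: a single content-key-based __hash__/__eq__ on the base class, which makes frozenset deduplication work for any subclass. The key() machinery here is a stand-in for gufe's real tokenization.

```python
import json

class GufeTokenizable:
    def _to_dict(self):
        raise NotImplementedError

    def key(self) -> str:
        # deterministic, content-based identifier (stand-in for the real token)
        return f"{type(self).__name__}-{json.dumps(self._to_dict(), sort_keys=True)}"

    def __hash__(self):
        return hash(self.key())

    def __eq__(self, other):
        return type(self) is type(other) and self.key() == other.key()

class Settings(GufeTokenizable):
    def __init__(self, temperature):
        self.temperature = temperature

    def _to_dict(self):
        return {"temperature": self.temperature}

# equal content -> equal hash, so frozenset deduplication works
assert len(frozenset([Settings(298.15), Settings(298.15)])) == 1
```

Subclasses then inherit hashability for free and per-class __hash__ implementations can be deleted.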
Instead, do munging of data inside Result.__init__() methods.
perses has some handy methods attached to its AtomMapping class; notably, versions of these should be migrated to gufe's LigandAtomMapping, preferably implemented using rdkit.
We would like docstrings on the methods and properties of LigandAtomMapping, in particular the componentX_to_componentY methods.
This is an API-breaking change that I think should be included before releasing a 1.0. We're currently assuming that shared is a directory. The argument I'm making is that there are reasonable approaches where this will not be valid, e.g., if shared is accessed securely (scp, sftp, etc.). So shared, as a Protocol author uses it, should be abstracted out.
What is needed here is a subset of the API already implemented in externalresource. Therefore, the code shouldn't be hard to write, and for now we can just use the existing externalresource objects. Eventually we may split the smaller, required API out from the aspects of externalresource that will only be required for openfe-specific storage.
While working on the NEQ cycling protocol in perses it was noted that we might want to run many cycles/replicates of the simulation unit of the protocol. Say we want to run 100 cycles of a transformation and get a free energy estimate from it.
So far this is stored as a protocol-specific setting, but it made me wonder if we may want to make this a setting at the gufe Settings object level instead, since other protocols might benefit from having it.
A benefit of this is that we would then have a standard way of accessing this information if needed, and could expect protocol implementations to use this setting. From a user perspective, it also avoids having to know which protocol-specific setting deals with this when reusing it across different protocols.
Could this be a setting that we want to have in gufe, with protocols optionally using it?
We want a DummyProtocol or similar that allows users to prototype alchemical networks without having to make any actual Protocol choices upfront. This would be especially useful right now, as actual protocol implementations are being actively developed.
Our objects are not returning useful signatures. They all return <Signature (*args, **kwargs)>.
This is annoying when using our objects in IDEs, e.g., when using shift-tab for quick help in Jupyter.
The expected behavior would be to see the actual signature coming from __init__.
I suspect the source of this problem is our metaclass. Specifically, I think we're actually seeing the signature from _GufeTokenizableMeta.__call__
. Not an urgent issue, but I wanted to document it since I'm pretty sure I've identified the reason, although I don't have a trivial solution.
In [1]: import inspect
In [2]: import gufe
In [3]: gufe.__version__
Out[3]: 'v0.2.post227+g69ad4d2'
In [4]: inspect.signature(gufe.ChemicalSystem)
Out[4]: <Signature (*args, **kwargs)>
In [5]: inspect.signature(gufe.SmallMoleculeComponent)
Out[5]: <Signature (*args, **kwargs)>
In [6]: inspect.signature(gufe.ProteinComponent)
Out[6]: <Signature (*args, **kwargs)>
In [7]: inspect.signature(inspect.Signature)
Out[7]: <Signature (parameters=None, *, return_annotation, __validate_parameters__=True)>
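One possible workaround, sketched on a toy metaclass rather than gufe's actual _GufeTokenizableMeta: expose __init__'s signature through a __signature__ property on the metaclass, which inspect.signature consults before falling back to the metaclass __call__.

```python
import inspect


class OpaqueMeta(type):
    # a custom __call__ like this is what hides the __init__ signature
    def __call__(cls, *args, **kwargs):
        return super().__call__(*args, **kwargs)


class FixedMeta(OpaqueMeta):
    @property
    def __signature__(cls):
        # report __init__'s signature, minus the 'self' parameter
        sig = inspect.signature(cls.__init__)
        return sig.replace(parameters=list(sig.parameters.values())[1:])


class Before(metaclass=OpaqueMeta):
    def __init__(self, vals):
        self.vals = vals


class After(metaclass=FixedMeta):
    def __init__(self, vals):
        self.vals = vals


print(inspect.signature(Before))  # (*args, **kwargs)
print(inspect.signature(After))   # (vals)
```

This keeps the metaclass __call__ intact while giving IDEs and Jupyter the real constructor signature.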
@jchodera will draft a proposal on this issue for representing free energies and uncertainties (of various flavors) that generalize well across different alchemical protocols. We'll use that proposal as the substrate for discussion, followed by any PR(s) for implementation.
See https://docs.python.org/3/reference/datamodel.html#object.__hash__ — Python's builtin hash() is randomised per process for strings, so it isn't suitable for stable keys.
Probably need to go to md5.
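A sketch of the md5-based approach (the content_token helper below is hypothetical, not gufe's actual implementation): hash a canonical serialisation instead of relying on the builtin hash().

```python
import hashlib
import json


def content_token(dct: dict) -> str:
    # canonical JSON (sorted keys) -> md5 hex digest; the same dict
    # always yields the same token, in any process
    blob = json.dumps(dct, sort_keys=True).encode("utf-8")
    return hashlib.md5(blob).hexdigest()
```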
We need to add some machinery so that when a user defines a transformation, we can ensure that the two settings objects it includes are compatible; for example, the two end states should not run at different temperatures.
We should leverage the hierarchy of our settings to accomplish this. For example, we could compare a hash of the thermo settings and ensure that it is the same for both ends of the transformation.
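A sketch of that hash comparison. The key name "thermo_settings" and the dict layout are assumptions for illustration, not gufe's real schema:

```python
import hashlib
import json


def thermo_fingerprint(settings: dict) -> str:
    # hash only the thermodynamic sub-settings, so protocol-specific
    # knobs are free to differ between the two ends
    thermo = settings["thermo_settings"]
    blob = json.dumps(thermo, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def check_compatible(settings_a: dict, settings_b: dict) -> None:
    if thermo_fingerprint(settings_a) != thermo_fingerprint(settings_b):
        raise ValueError(
            "incompatible thermodynamic settings between end states"
        )
```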
This seems to only happen on CI rather than locally, see #53
Another option for not specifying a box dimension could be setting that dimension to 0; this would preserve the dtype of the array, unlike using None.
In particular, is_gufe_obj should do hasattr checks; then we can allow a deserialise and serialise ducktype, for example for Path and Quantity / unit objects.
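A sketch of the duck-typed check (the hook names to_dict/from_dict are illustrative assumptions, not a confirmed gufe contract):

```python
def is_serialisable(obj) -> bool:
    # duck-typed alternative to isinstance(obj, GufeTokenizable):
    # anything exposing the serialise/deserialise pair qualifies,
    # including foreign objects we don't control
    return hasattr(obj, "to_dict") and hasattr(type(obj), "from_dict")


class Foreign:
    """A non-gufe object that opts in by providing the two hooks."""
    def __init__(self, x):
        self.x = x

    def to_dict(self):
        return {"x": self.x}

    @classmethod
    def from_dict(cls, dct):
        return cls(**dct)
```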
We should be using patches to replace the tokenization registries when we need a clear registry, as opposed to clearing the registry with .clear() on teardown.
If GufeTokenizableTestsMixin.instance (or something included in it) is a session-scoped fixture, then it is not recreated for each test -- so __new__ is not called, and it does not get re-registered in the cleared registry. This leads to unexpected KeyErrors on the keyed dict roundtrip test.
The solution is to use unittest.mock.patch.dict, and to patch with an empty dict when the test needs to run in isolation.
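A self-contained sketch of the pattern, with a plain module-level dict standing in for gufe's tokenization registry:

```python
from unittest import mock

REGISTRY = {"existing-key": object()}  # stand-in for the real registry


def test_in_isolation():
    # clear=True empties the dict only inside the with-block; the
    # original contents are restored on exit, even if the test fails,
    # so there is no .clear() teardown that could wipe entries other
    # tests still rely on
    with mock.patch.dict(REGISTRY, {}, clear=True):
        assert REGISTRY == {}
        REGISTRY["temp-key"] = object()
```

After the with-block exits, "existing-key" is back and "temp-key" is gone, regardless of what the test did to the registry.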