oximachine_featurizer's Introduction

oximachine_featurizer

Mine oxidation states for structures from the metal-organic framework (MOF) subset of the Cambridge Structural Database (CSD) and calculate features for them. Runscripts for the most important steps are installed automatically. Some of these runscripts contain hard-coded paths that need to be updated. This code generates inputs that can be used with the learnmofox package to replicate our work [1].

If you're just interested in using a pre-trained model, use the oximachinerunner package.

⚠️ Warning: To mine the oxidation states, you need the CSD Python API, and you need to export the CSD_HOME path. Due to licensing restrictions, this cannot be done automatically.
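A mining script could check that CSD_HOME is exported before it starts. The sketch below is only an illustration of such a check: the function name is hypothetical and not part of this package, and no particular installation path is assumed.

```python
import os


def csd_home_from_env(environ=os.environ):
    """Return the exported CSD_HOME path, or raise if it is missing.

    Hypothetical helper, not part of oximachine_featurizer.
    """
    csd_home = environ.get("CSD_HOME")
    if not csd_home:
        # Fail early with a clear message instead of deep inside the CSD API
        raise EnvironmentError(
            "CSD_HOME is not set; export it before mining oxidation states."
        )
    return csd_home
```

Passing the environment as an argument keeps the check easy to test without touching the real shell environment.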

Installation

The commands below automatically install several command-line interface (CLI) tools, which are described later in this document.

The full installation should take only a few seconds.

Latest version

To install the latest version of the software with all dependencies, you can use

pip install git+https://github.com/kjappelbaum/oximachine_featurizer.git

Stable release

pip install oximachine_featurizer

How to use it

To run the default featurization on one structure you can use the CLI

run_featurization <structure> <outdir>

For each metal center, this should take on the order of seconds if the structure has no disorder.
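To featurize several structures, the CLI can be called in a loop. The following is a minimal sketch, assuming that the structures are CIF files in one directory (the "*.cif" pattern is an assumption) and that run_featurization is on the PATH. The helper names are hypothetical; only the command line itself comes from the usage shown above.

```python
import subprocess
from pathlib import Path


def featurization_command(structure, outdir):
    """Build the documented call: run_featurization <structure> <outdir>."""
    return ["run_featurization", str(structure), str(outdir)]


def featurize_directory(structure_dir, outdir, pattern="*.cif"):
    """Run the CLI once per matching file and return the processed paths."""
    # Creating the output directory up front is a precaution (assumption:
    # it is not known here whether the CLI creates it itself).
    Path(outdir).mkdir(parents=True, exist_ok=True)
    processed = []
    for structure in sorted(Path(structure_dir).glob(pattern)):
        # check=True raises CalledProcessError if the CLI exits with an error
        subprocess.run(featurization_command(structure, outdir), check=True)
        processed.append(structure)
    return processed
```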

Some output can be found on the MaterialsCloud Archive (doi: 10.24435/materialscloud:2019.0085/v1).

More details can be found in the documentation.

Example usage

The use of the main functions of this package is shown in the Jupyter notebook in the example directory, which contains some example structures and their output. The output should be produced in seconds.

Testing the installation

For testing, you can use pytest to run the files in the test directory, as is done in the continuous integration (CI). For example:

pip install pytest
pytest test/main

References

[1] Jablonka, Kevin Maik; Ongari, Daniele; Moosavi, Seyed Mohamad; Smit, Berend (2020): Using Collective Knowledge to Assign Oxidation States. ChemRxiv. Preprint. https://doi.org/10.26434/chemrxiv.11604129.v1

oximachine_featurizer's Issues

package 'test' installed

You're trying to avoid this here:
https://github.com/kjappelbaum/oximachine_featurizer/blob/master/setup.py#L63

But since the folder is called "test", it slips through and gets installed as a top-level package:

$ pip uninstall oximachine-featurizer
Found existing installation: oximachine-featurizer 0.2.14
Uninstalling oximachine-featurizer-0.2.14:
  Would remove:
    /Users/leopold/Applications/miniconda3/envs/aiida-lsmo-2/bin/run_featurization
    /Users/leopold/Applications/miniconda3/envs/aiida-lsmo-2/bin/run_mine_mp
    /Users/leopold/Applications/miniconda3/envs/aiida-lsmo-2/bin/run_parsing
    /Users/leopold/Applications/miniconda3/envs/aiida-lsmo-2/bin/run_parsing_reference
    /Users/leopold/Applications/miniconda3/envs/aiida-lsmo-2/lib/python3.8/site-packages/oximachine_featurizer-0.2.14.dist-info/*
    /Users/leopold/Applications/miniconda3/envs/aiida-lsmo-2/lib/python3.8/site-packages/oximachine_featurizer/*
    /Users/leopold/Applications/miniconda3/envs/aiida-lsmo-2/lib/python3.8/site-packages/run/*
    /Users/leopold/Applications/miniconda3/envs/aiida-lsmo-2/lib/python3.8/site-packages/test/*
  Would not remove (might be manually added):
    /Users/leopold/Applications/miniconda3/envs/aiida-lsmo-2/lib/python3.8/site-packages/test/test_oximachine.py
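A rough illustration of how such a package can slip through: as far as I know, setuptools matches each exclude pattern against the full dotted package name with shell-style wildcards, so a pattern written for a folder named "tests" does not match "test". The patterns below are hypothetical and are not copied from this repository's setup.py.

```python
from fnmatch import fnmatchcase


def is_excluded(package, exclude_patterns):
    """Mimic wildcard matching of exclude patterns against a package name."""
    return any(fnmatchcase(package, pattern) for pattern in exclude_patterns)


# Top-level packages seen in the pip output above
packages = ["oximachine_featurizer", "run", "test"]

# Hypothetical pattern that only covers a folder named "tests"
kept_before = [p for p in packages if not is_excluded(p, ["tests", "tests.*"])]

# Pattern that matches the actual folder name and its subpackages
kept_after = [p for p in packages if not is_excluded(p, ["test", "test.*"])]

print(kept_before)  # "test" is still included
print(kept_after)   # "test" is filtered out
```

Renaming the folder to match the existing pattern would be an alternative to changing the pattern.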

matminer dependency now broken due to pymatgen update

Steps to reproduce the problem

noticed in aiida-lsmo https://github.com/lsmo-epfl/aiida-lsmo/runs/2459918160

/opt/hostedtoolcache/Python/3.7.10/x64/lib/python3.7/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
tests/test_oxidation_state.py:11: in <module>
    from aiida_lsmo.calcfunctions.oxidation_state import compute_oxidation_states
aiida_lsmo/calcfunctions/oxidation_state.py:5: in <module>
    import oximachinerunner
/opt/hostedtoolcache/Python/3.7.10/x64/lib/python3.7/site-packages/oximachinerunner/__init__.py:15: in <module>
    from oximachine_featurizer import featurize
/opt/hostedtoolcache/Python/3.7.10/x64/lib/python3.7/site-packages/oximachine_featurizer/__init__.py:9: in <module>
    from .featurize import FeatureCollector, GetFeatures, featurize
/opt/hostedtoolcache/Python/3.7.10/x64/lib/python3.7/site-packages/oximachine_featurizer/featurize.py:16: in <module>
    from matminer.featurizers.site import CrystalNNFingerprint, GaussianSymmFunc
/opt/hostedtoolcache/Python/3.7.10/x64/lib/python3.7/site-packages/matminer/featurizers/site.py:8: in <module>
    from matminer.utils.data import MagpieData
/opt/hostedtoolcache/Python/3.7.10/x64/lib/python3.7/site-packages/matminer/utils/data.py:17: in <module>
    from pymatgen import Element
E   ImportError: cannot import name 'Element' from 'pymatgen' (unknown location)

I guess you'll want to update the matminer dependency to a version that isn't broken (or restrict pymatgen to versions <2021).
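The second option could look roughly like the following setup.py fragment. This is only a sketch: the version bound is the one suggested in this issue and has not been verified here, and the list is not the package's actual dependency list.

```python
# Hypothetical fragment for setup.py; not the package's real requirements.
install_requires = [
    # Left unpinned here; the alternative fix is to require a matminer
    # release that imports Element from a location that still exists.
    "matminer",
    # Bound taken from the suggestion in this issue (assumption, unverified)
    "pymatgen<2021",
]
```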

Move the exclusion list to separate module and better indicate why they are excluded

Currently there is one long list in the class, and I have started mixing up structures that are excluded for different reasons: because they are special test structures, because of EDA, or because of errors found in previous iterations of the model. We should split this into separate lists that we import from a separate module. In the longer run, this part of the code also needs to be fixed, because looping over the whole list for each structure is needlessly slow. It can easily be improved quite a lot, e.g. by using sets.
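One possible shape for such a module is sketched below. The module layout, the set names, and the structure identifiers are all made-up placeholders; the point is only that each reason gets its own set and that membership tests on sets avoid looping over a list.

```python
# Hypothetical exclusions module; all identifiers below are placeholders.
TEST_STRUCTURES = frozenset({"TESTAA", "TESTAB"})
EDA_EXCLUSIONS = frozenset({"EDAXAA"})
MODEL_ERROR_EXCLUSIONS = frozenset({"ERRXAA", "ERRXAB"})

# Keeping the reason next to each set documents why a structure is excluded
EXCLUSION_REASONS = {
    "special test structure": TEST_STRUCTURES,
    "excluded after EDA": EDA_EXCLUSIONS,
    "error found in a previous model iteration": MODEL_ERROR_EXCLUSIONS,
}

# Union of all sets, built once at import time
ALL_EXCLUDED = frozenset().union(*EXCLUSION_REASONS.values())


def exclusion_reason(name):
    """Return why a structure is excluded, or None if it is not excluded."""
    for reason, names in EXCLUSION_REASONS.items():
        if name in names:
            return reason
    return None


def keep_structures(names):
    """Filter out excluded structures; each lookup is a hash-based set test."""
    return [name for name in names if name not in ALL_EXCLUDED]
```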

concurrent featurization

The issue is that we're CPU-bound, so it is probably easier to parallelize over structures rather than over sites.
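Parallelizing over structures could be sketched with a process pool from the standard library, as below. The function featurize_one stands for whatever per-structure featurization call is used and is a placeholder, as are the other names; nothing here is the package's existing API.

```python
from concurrent import futures


def featurize_structures(
    paths,
    featurize_one,
    max_workers=None,
    executor_cls=futures.ProcessPoolExecutor,
):
    """Apply featurize_one to every structure path, one task per structure.

    Separate processes are the default because the work is CPU-bound.
    Returns a dict that maps each path to its result.
    """
    paths = list(paths)
    with executor_cls(max_workers=max_workers) as executor:
        # map preserves input order, so results line up with paths
        results = executor.map(featurize_one, paths)
        return dict(zip(paths, results))
```

With the default process pool, featurize_one has to be a picklable, module-level function, and on platforms that spawn new interpreters the call should sit under an `if __name__ == "__main__":` guard. The executor_cls argument allows swapping in a thread pool for quick checks.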
