molssi / qcarchiveexamples

Getting started docs, examples, tutorials, and use cases.

Home Page: https://docs.qcarchive.molssi.org/en/latest/

License: BSD 3-Clause "New" or "Revised" License


qcarchiveexamples's Introduction

QCArchiveTutorials

Tips and help getting started with the QCArchive ecosystem.

Build

The docs for this project are built with Sphinx. To compile the docs, first ensure that Sphinx, the ReadTheDocs theme, and nbsphinx are installed.

conda install sphinx sphinx_rtd_theme nbsphinx

Once these are installed, use the Makefile in this directory to compile static HTML pages:

make html

The compiled docs will be in the _build directory and can be viewed by opening index.html (which may itself be inside a directory called html/ depending on what version of Sphinx is installed).
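
Because the location of index.html depends on the Sphinx version, a small helper can check both layouts. This is a minimal sketch, not part of the repository: the function name find_index is hypothetical, and the demo builds a throwaway directory tree rather than running Sphinx.

```python
from pathlib import Path
import tempfile


def find_index(build_dir):
    """Return the path to index.html under build_dir, or None if absent."""
    build_dir = Path(build_dir)
    # Check the nested html/ layout first, then the flat layout.
    for candidate in (build_dir / "html" / "index.html", build_dir / "index.html"):
        if candidate.is_file():
            return candidate
    return None


# Demo on a temporary tree that imitates the nested layout.
with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / "_build" / "html"
    nested.mkdir(parents=True)
    (nested / "index.html").write_text("<html></html>")
    found = find_index(Path(tmp) / "_build")
    demo_relative = found.relative_to(tmp).as_posix()

print(demo_relative)
```

In a real checkout, the returned path could be handed to webbrowser.open() from the standard library to view the docs.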

Contributing

These Jupyter notebooks should run from top to bottom using the MolSSI QCArchive dataset. This particular set of examples should require minimal compute resources and should be limited to datasets of fewer than 10,000 rows.

qcarchiveexamples's People

Contributors

bennybp, dgasmith, lnaden, loriab, mattwelborn, sjayellis

qcarchiveexamples's Issues

index.rst should prominently feature a link to the Binder

The landing page (index.rst) directs the user to install qcportal. The first tutorial then gives the user the choice to use a Binder instead of a local Python environment. The choice between Binder and a local install should be presented on the landing page.

Unexpected output from print(mol)

Hi! Following the tutorial at https://qcarchivetutorials.readthedocs.io/en/latest/cookbook/molecules.html I ran the following code in a Jupyter notebook:

    import qcportal as ptl
    client = ptl.FractalClient()
    mol = client.query_molecules(1234)[0]
    print(mol)

It should print the following table, but it doesn't.

     Geometry (in Angstrom), charge = 0.0, multiplicity = 1:
    
           Center              X                  Y                   Z       
        ------------   -----------------  -----------------  -----------------
        C                 0.776479871994     1.156134463385     0.121542591228
        C                 0.438429690334     0.679567908122    -1.141595091975
        C                 0.439577078821     0.423533055514     1.255585387764
    ...

Instead it prints

Molecule(name='C9H12', formula='C9H12', hash='572b510')

This behavior is reproduced on a Mac client installation and on the Binder.
It is not a big deal, but I guess you want to update the tutorial page accordingly.
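
The difference reported above is the usual split between an object's short string form and a separate, explicit table formatter. The sketch below illustrates that split with a hypothetical stand-in class (DemoMolecule and pretty_table are invented for illustration and are not the qcportal API); the coordinates are the first three rows of the table quoted in this issue.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DemoMolecule:
    """Hypothetical stand-in for a molecule object; not the qcportal class."""
    name: str
    symbols: List[str]
    geometry: List[Tuple[float, float, float]]  # Cartesian coordinates in Angstrom
    charge: float = 0.0
    multiplicity: int = 1

    def __str__(self):
        # Short summary: this is what print(mol) shows.
        return f"DemoMolecule(name='{self.name}', atoms={len(self.symbols)})"

    def pretty_table(self):
        # Full geometry table, produced only on explicit request.
        lines = [
            f"Geometry (in Angstrom), charge = {self.charge}, "
            f"multiplicity = {self.multiplicity}:",
            "",
            f"{'Center':<12}{'X':>18}{'Y':>18}{'Z':>18}",
        ]
        for sym, (x, y, z) in zip(self.symbols, self.geometry):
            lines.append(f"{sym:<12}{x:>18.12f}{y:>18.12f}{z:>18.12f}")
        return "\n".join(lines)


mol = DemoMolecule(
    name="demo",
    symbols=["C", "C", "C"],
    geometry=[
        (0.776479871994, 1.156134463385, 0.121542591228),
        (0.438429690334, 0.679567908122, -1.141595091975),
        (0.439577078821, 0.423533055514, 1.255585387764),
    ],
)

print(mol)                 # short summary
print(mol.pretty_table())  # full table
```

If the library changed print(mol) to the short form, the tutorial would need to call whatever explicit formatter the installed version provides in order to show the table.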

ML cookbook ideas (suggestions welcome)

This issue proposes some ideas for ML tutorial content using the data on QCArchive. These examples should focus on use cases for quantum chemical data in ML, including e.g. supervised learning of relationships between:

  • structure and QC properties
  • QC properties and function

and e.g. unsupervised learning of:

  • molecule clusters on the basis of QC properties
  • interest in different classes of molecules or theory methods based on distributions found in the QCArchive data

These examples should demonstrate the key advantages of QCArchive as a distribution method for ML data versus the current model of SI and Figshare: uniform data formats, interoperability/composability, trusted provenance, and discovery of new datasets.

Some specific ideas for examples:

  • Train a model on QM7b to predict DFT energy from molecular geometries using Coulomb, SLATM, and SOAP features with a kernel method. Test the model on QM9 or GDB-13. Train a model with a combination of datasets (e.g. QM7b + QM9).
  • Fit a water model using THG's water cluster dataset to the TIP-4P functional form, perhaps using Bayesian regression.
  • Placeholder for something using a generative model.
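
As a starting point for the first idea, the Coulomb matrix feature can be computed in a few lines. This is a minimal sketch under stated assumptions: it uses the commonly cited definition (diagonal 0.5 * Z^2.4, off-diagonal Z_i * Z_j / |R_i - R_j|), takes plain Python lists rather than QCArchive records, and assumes coordinates in bohr. The function name coulomb_matrix is chosen here for illustration.

```python
import math


def coulomb_matrix(charges, coords):
    """Build the Coulomb matrix for nuclear charges and Cartesian coordinates."""
    n = len(charges)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal term: fitted atomic energy approximation.
                matrix[i][j] = 0.5 * charges[i] ** 2.4
            else:
                # Off-diagonal term: nuclear Coulomb repulsion.
                distance = math.dist(coords[i], coords[j])
                matrix[i][j] = charges[i] * charges[j] / distance
    return matrix


# Two hydrogen nuclei placed 2.0 bohr apart (illustrative geometry only).
h2 = coulomb_matrix([1, 1], [(0.0, 0.0, 0.0), (0.0, 0.0, 2.0)])
print(h2)
```

The matrix depends on atom ordering, so a real pipeline would typically sort rows by norm or use the eigenvalue spectrum before feeding it to a kernel method.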

Issue with ds.status(["default"]) for optimization data sets

The error TypeError: unhashable type: 'OptimizationRecord' occurs when running ds.status(["default"]) for optimization datasets.

I noticed the error in two locations:
(1) optimization_datasets.ipynb located in QCArchiveExamples/basic_examples/optimization_datasets.ipynb
(2) the "Optimization Datasets" > "Exploring the Dataset" page in the QCA documentation (https://qcarchive.molssi.org/examples/)

QCPortal Version: 0.13.1.
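
For context, Python raises this kind of TypeError whenever an object whose class defines __eq__ without __hash__ is used as a dictionary key or set member, which is what counting values does. The sketch below reproduces the error class with a hypothetical stand-in (DemoRecord is not the qcportal OptimizationRecord, and whether this is the actual cause inside ds.status is an assumption) and shows counting by a hashable attribute instead.

```python
from collections import Counter


class DemoRecord:
    """Hypothetical stand-in for a record object; not the qcportal class."""

    def __init__(self, record_id, status):
        self.id = record_id
        self.status = status

    def __eq__(self, other):
        # Defining __eq__ without __hash__ makes instances unhashable.
        return isinstance(other, DemoRecord) and self.id == other.id


records = [
    DemoRecord("1", "COMPLETE"),
    DemoRecord("2", "COMPLETE"),
    DemoRecord("3", "ERROR"),
]

# Counting the objects themselves requires hashing them and fails.
try:
    Counter(records)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Counting a hashable attribute (the status string) works.
status_counts = Counter(record.status for record in records)

print(error_message)
print(dict(status_counts))
```

Until the notebook is fixed, tallying record.status values by hand along these lines may serve as a workaround, assuming the records expose a status attribute.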

Internal server error

Running molecules.ipynb locally results in an OSError: Server communication failure. Reason: Internal Server Error when executing the following cell (in the From a Dataset section):

molecules = ds.get_molecules()
molecules

The following code reproduces the issue outside of the Jupyter notebook:

import qcportal as ptl
client = ptl.FractalClient()

ds = client.get_collection("Dataset", "SMIRNOFF Coverage Set 1")

molecules = ds.get_molecules()

Full stack trace:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-7-9e5ab07f5cc6> in <module>
----> 1 molecules = ds.get_molecules()
      2 molecules

~/miniconda3/envs/chemgen/lib/python3.9/site-packages/qcportal/collections/dataset.py in get_molecules(self, subset, force)
   1548         """
   1549         indexer = self._molecule_indexer(subset=subset, force=force)
-> 1550         df = self._get_molecules(indexer, force)
   1551 
   1552         if isinstance(subset, str):

~/miniconda3/envs/chemgen/lib/python3.9/site-packages/qcportal/collections/dataset.py in _get_molecules(self, indexer, force)
   1074             molecules: List["Molecule"] = []
   1075             for i in range(0, len(molecule_ids), self.client.query_limit):
-> 1076                 molecules.extend(self.client.query_molecules(id=molecule_ids[i : i + self.client.query_limit]))
   1077             # XXX: molecules = pd.DataFrame({"molecule_id": molecule_ids, "molecule": molecules}) fails
   1078             #      test_gradient_dataset_get_molecules and I don't know why

~/miniconda3/envs/chemgen/lib/python3.9/site-packages/qcportal/client.py in query_molecules(self, id, molecule_hash, molecular_formula, limit, skip, full_return)
    407             "data": {"id": id, "molecule_hash": molecule_hash, "molecular_formula": molecular_formula},
    408         }
--> 409         response = self._automodel_request("molecule", "get", payload, full_return=full_return)
    410         return response
    411 

~/miniconda3/envs/chemgen/lib/python3.9/site-packages/qcportal/client.py in _automodel_request(self, name, rest, payload, full_return, timeout)
    272             raise TypeError(str(exc))
    273 
--> 274         r = self._request(rest, name, data=payload.serialize(self.encoding), timeout=timeout)
    275         encoding = r.headers["Content-Type"].split("/")[1]
    276         response = response_model.parse_raw(r.content, encoding=encoding)

~/miniconda3/envs/chemgen/lib/python3.9/site-packages/qcportal/client.py in _request(self, method, service, data, noraise, timeout)
    234 
    235         if (r.status_code != 200) and (not noraise):
--> 236             raise IOError("Server communication failure. Reason: {}".format(r.reason))
    237 
    238         return r

OSError: Server communication failure. Reason: Internal Server Error

The remainder of the notebook seems to run smoothly.
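
The stack trace shows the molecule IDs being requested in slices of client.query_limit. One way to narrow down which IDs trigger the server error is to repeat that chunking and retry a failing chunk one ID at a time. The sketch below is a hedged illustration only: chunked, fetch_in_chunks, and fake_fetch are hypothetical names, and fake_fetch simulates the server instead of contacting it.

```python
def chunked(ids, limit):
    """Yield consecutive slices of ids, each at most limit long."""
    for start in range(0, len(ids), limit):
        yield ids[start:start + limit]


def fetch_in_chunks(ids, limit, fetch):
    """Fetch ids in chunks; on failure, retry singly to isolate bad ids."""
    results, failed = [], []
    for chunk in chunked(ids, limit):
        try:
            results.extend(fetch(chunk))
        except OSError:
            # The whole chunk failed: retry each id alone.
            for single in chunk:
                try:
                    results.extend(fetch([single]))
                except OSError:
                    failed.append(single)
    return results, failed


def fake_fetch(chunk):
    """Simulated server call (assumption): fails if the chunk contains 'bad'."""
    if "bad" in chunk:
        raise OSError("Server communication failure. Reason: Internal Server Error")
    return [f"mol-{molecule_id}" for molecule_id in chunk]


results, failed = fetch_in_chunks(["1", "2", "bad", "4", "5"], 2, fake_fetch)
print(results)
print(failed)
```

Against a real server, fetch would wrap the client's molecule query; the list of failed IDs could then be attached to the bug report.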

Error using ds.status(["default"])

Issue:
The error TypeError: unhashable type: 'OptimizationRecord' occurs when running ds.status(["default"]) for optimization datasets.

Location of Issue:
(1) optimization_datasets.ipynb located in QCArchiveExamples/basic_examples/optimization_datasets.ipynb
(2) Optimization Datasets & Exploring the Dataset page in QCA documentation (https://qcarchive.molssi.org/examples/).

Versions:
QCPortal: v0.13.1
