openff's Issues

Generate tests for developmental unit testing; use pytest

Unit testing is useful in limited situations for Open-FF, mostly for basic utilities. Higher-level functional tests that can be run routinely are probably more important.

Functional tests: embedded in code and performed with every execution.

Unit-type tests: performed by pytest, usually before committing to the repo.
If they are on this list, they have been started. Checked means completed.

  • build.core.cas_tools
  • common.nb_helper
  • common.file_handlers
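
As an illustration of the kind of pytest unit test meant here, the sketch below checks a CAS number validator. The function `is_valid_cas` is hypothetical and is not taken from `build.core.cas_tools`; it only applies the standard CAS check-digit rule so that the tests have something concrete to exercise.

```python
import re

# Two to seven digits, two digits, one check digit, separated by hyphens.
CAS_PATTERN = re.compile(r"(\d{2,7})-(\d{2})-(\d)")


def is_valid_cas(cas):
    """Hypothetical helper: True if `cas` is well formed and its check digit matches."""
    match = CAS_PATTERN.fullmatch(cas)
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... from right to left, excluding the check digit.
    total = sum(int(d) * weight for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


# pytest collects functions whose names start with "test_".
def test_accepts_valid_cas():
    assert is_valid_cas("7732-18-5")  # water


def test_rejects_bad_check_digit():
    assert not is_valid_cas("7732-18-4")


def test_rejects_malformed_text():
    assert not is_valid_cas("proprietary")
```

Saved as a file named like `test_cas_tools.py`, this would be picked up by running `pytest` from the repo before committing.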

Create draft of all documentation

Drafts in place and completed

  • README
  • Top
  • What is FracFocus
  • Open-FF overview
  • Resolving chemical ID
  • Calculating mass
  • Standardizing text fields
  • Proprietary records
  • Generating the Open-FF data set
  • The Open-FF data repository
  • Getting openFF data
  • External data in openFF
  • Browser overview
  • Generating the browser
  • Notebook overview
  • How to use notebooks

Incorporate the ChemInformatics data into the build process

  • In build_nb, add the necessary instructions and infrastructure to capture the ChemInformatics (CI) info for as many chemicals as possible. This should include Hazards (.xlsx and .sdf) and the safety data (.xlsx)
  • At a minimum, record the presence/absence of the CI analysis in the bgCAS table so that downstream functions can access it.
  • A more thorough treatment would be to create a third table, indexed by bgCAS but with all the info scraped from CI
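
The minimum option above, flagging presence or absence of CI data per bgCAS, might look roughly like the following. This is a sketch only: the field name `has_CI`, the function `flag_ci_presence`, and the use of plain dictionaries for table rows are assumptions for illustration, not the actual build code.

```python
def flag_ci_presence(bgcas_rows, ci_available):
    """Return copies of the bgCAS rows with a boolean `has_CI` field added.

    bgcas_rows: list of dicts, each with a "bgCAS" key (assumed layout).
    ci_available: set of bgCAS values for which CI files were collected.
    """
    return [{**row, "has_CI": row["bgCAS"] in ci_available} for row in bgcas_rows]


# Illustrative data only.
bgcas_table = [
    {"bgCAS": "7732-18-5"},
    {"bgCAS": "50-00-0"},
    {"bgCAS": "proprietary"},
]
ci_available = {"7732-18-5", "50-00-0"}

flagged = flag_ci_presence(bgcas_table, ci_available)
# Downstream code can then filter on the flag.
missing = [row["bgCAS"] for row in flagged if not row["has_CI"]]
```

The same membership test carries over to a table column if the bgCAS table is held in a data frame.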

Detecting and flagging erroneous duplicate records

While Open-FF has been able to find and flag these erroneous duplicates since 2019, a change in the data due to FracFocus V4 is undermining the ability to do that. The current workaround is to use archives to identify the extra records. That's not ideal for the long term. We are waiting for FracFocus to respond to the issue.

CAS/Ing table reanalyze

The CAS/Ing translation table has developed over years with much of it curated before I developed the current set of helper tools.

At some point it would be worthwhile to reanalyze the whole table with the new tools, especially those pairs that are not both matches.

Check new bulk download for unknown fields

FracFocus has occasionally announced that new features will be included, such as water source. This would be a change that we would want to detect as early as possible. Make a function to check if column names have changed at all after the initial bulk download is completed and flag if a change is detected.
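
A minimal sketch of such a check is shown below. It assumes the bulk download is CSV with a header row; the names `read_header`, `detect_column_changes`, and `EXPECTED_COLUMNS`, as well as the column names themselves, are placeholders rather than the real FracFocus fields.

```python
import csv
import io


def read_header(csv_file):
    """Return the column names from the first row of an open CSV file."""
    return next(csv.reader(csv_file))


def detect_column_changes(expected_columns, found_columns):
    """Report columns that appeared or disappeared relative to the expected set."""
    expected = set(expected_columns)
    found = set(found_columns)
    return {
        "added": sorted(found - expected),
        "removed": sorted(expected - found),
    }


# Placeholder column list; the real list would come from the last known-good download.
EXPECTED_COLUMNS = ["DisclosureId", "CASNumber", "IngredientName"]

# Stand-in for a freshly downloaded file.
sample = io.StringIO(
    "DisclosureId,CASNumber,IngredientName,WaterSource\n"
    "1,7732-18-5,Water,surface\n"
)

changes = detect_column_changes(EXPECTED_COLUMNS, read_header(sample))
has_changed = bool(changes["added"] or changes["removed"])
if has_changed:
    print("Column change detected:", changes)
```

Comparing sets rather than ordered lists means a simple reordering of columns is not flagged; if column order matters to the build, the two lists could be compared directly instead.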

To update Disclosure index and reports

  • Move disclosure meta table maker to common
  • Color values based on source (FF or OpenFF)
  • Show important flags
  • For disclosures where no mass was calculated, explain why.
  • Include explanation tables (not directly in the html)
  • Can I make column-hiding work?
  • Can I make floating header work?
  • In DI or somewhere, make an API entry that will show and link to all disclosures
  • For DI, because it is so big, point to (and make) county-based versions.
  • Does it make sense to store the tables separately and 'include' them in a template?

Refactoring to integrated repo

In the process of moving to this more integrated version of Open-FF, we want to refactor code to make it cleaner and more centralized. We will do this over the course of the move, one section at a time:

  • file handling - builder tasks
  • file handling - builder core
  • file handling - notebooks
  • file handling - browser
