biosimulators / biosimulators_test_suite

Tool for validating that biosimulation software tools implement the BioSimulators standards for simulators

Home Page: https://docs.biosimulators.org/Biosimulators_test_suite/

License: MIT License

Python 99.52% CSS 0.48%
systems-biology mathematical-modelling computational-biology sbml cellml sed-ml docker biosimulators combine-archive omex-metadata

biosimulators_test_suite's Introduction

BioSimulators

More comprehensive and more predictive models have the potential to advance biology, bioengineering, and medicine. Building more predictive models will likely require the collaborative efforts of many investigators. This requires teams to be able to share and reuse model components and simulations. Despite extensive efforts to develop standards such as COMBINE/OMEX, SBML, and SED-ML, it remains difficult to reuse many models and simulations. One challenge to reusing models and simulations is the diverse array of incompatible modeling formats and simulation tools.

BioSimulators addresses this challenge by providing a registry of simulation tools, many of which provide consistent interfaces. These standardized simulation tools make it easier to find and run simulations. They build upon BioSimulators' standard for command-line interfaces for simulation tools, standard structure for Docker images of simulation tools, and format for capturing the capabilities (e.g., supported modeling frameworks, simulation algorithms, and modeling formats) of a simulation tool.

The BioSimulators website provides a web application for browsing this registry. This website provides links to the individual simulators and their containers. Instructions for using the containers are available at https://docs.biosimulations.org/users/simulating-projects/. Information about how to containerize a simulation tool and submit it to the registry is also available at https://docs.biosimulations.org/users/creating-tools/.

runBioSimulations provides a simple web application for using the containerized simulation tools in the BioSimulators registry to execute simulations. This makes it easy to run a broad range of simulations without having to install any software. BioSimulations provides a platform for sharing modeling studies, modifying published studies, and executing published studies using runBioSimulations.

This repository serves several functions:

  • This repository is a central place for users to contribute and discuss issues related to BioSimulators.
  • This repository is a central place for simulation software developers to submit simulation tools to the BioSimulators registry.
  • This repository contains code for automated verification of the capabilities of containerized simulation tools.
  • This repository contains a Pipenv configuration and a Dockerfile for a Docker image that provides a Python environment with most of the validated simulation tools, as well as tests for this image.

The code for the BioSimulators web application, REST API, and database is in the Biosimulations repository. The code for verifying the capabilities and accuracy of containerized simulation tools is in the BioSimulators test suite repository. The code for the individual simulation tools is spread across numerous repositories, including several owned by the BioSimulators GitHub organization.

Getting started

Users

We recommend that users use the hosted version of runBioSimulations at https://run.biosimulations.org to execute simulations.

Each validated simulation tool is available as a Docker image. Most of the validated simulation tools are also available as Python APIs. See https://biosimulators.org for information about the interfaces available for each tool and where they can be obtained.

A Docker image with a Python environment with APIs for most of the validated simulation tools is available at https://github.com/orgs/biosimulators/packages/container/package/biosimulators. Information about using the Python APIs in the image is available at https://docs.biosimulations.org/users/simulating-projects/. An iPython shell for this environment can be launched by installing Docker and running the commands below:

docker pull ghcr.io/biosimulators/biosimulators
docker run -it --rm ghcr.io/biosimulators/biosimulators
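
For example, within this shell a COMBINE/OMEX archive can be executed via a tool's Python API. Below is a minimal sketch, assuming the tool follows the BioSimulators convention of exposing exec_sedml_docs_in_combine_archive in its core module; the choice of biosimulators_copasi and the file paths are illustrative, so check each tool's documentation for its exact API:

# Minimal sketch: run a COMBINE/OMEX archive with one of the bundled Python APIs.
# Assumes the tool exposes `exec_sedml_docs_in_combine_archive(archive_filename, out_dir)`
# in its `core` module; the paths below are placeholders.
from biosimulators_copasi.core import exec_sedml_docs_in_combine_archive

results, log = exec_sedml_docs_in_combine_archive(
    '/path/to/project.omex',  # COMBINE/OMEX archive to execute (placeholder path)
    '/path/to/output',        # directory where reports and plots will be written
)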

Interactive tutorials for the Python APIs for simulation tools and for BioSimulators' API are available from Binder.

Simulation software developers

Information about how to containerize a simulation tool and how to submit simulation tools to the registry is available at https://docs.biosimulations.org/users/publishing-tools/. We encourage developers to containerize their tools. However, BioSimulators also accepts simulation tools that don't support BioSimulators' standards.

Developers

We welcome contributions to BioSimulators! Please see the Guide to Contributing for information about how to get started.

Technical documentation

Please see the links below for additional technical documentation.

Known issues

Installation of individual simulation tools

  • Several simulation tools are not available from PyPI
    • There is an open issue to publish LibSBMLSim to PyPI.
    • The version of RBApy used by BioSimulators is a fork. This fork adds the ability to run simulations with GLPK and Gurobi, in addition to CPLEX. There is an open pull request to merge this fork. There is also an open issue to publish RBA to PyPI.
    • BioNetGen, VCell and XPP cannot be installed from PyPI because they are not Python packages.
    • Most simulation tools require dependencies which must be installed separately from pip. See each tool for its installation instructions.

BioSimulators consolidated Docker image

  • OpenCOR is not currently installed because OpenCOR is distributed as its own Python environment. As a result, OpenCOR is difficult to install into other environments, such as the consolidated BioSimulators environment. In addition, OpenCOR is currently pinned to Python 3.7.
  • VCell is not currently installed because limited installation instructions are available. VCell also does not provide a compatible Python API.

Utilizing multiple simulation tools within a single Python environment

  • PySCeS and NEURON/NetPyNe cannot be imported into the same Python process. This appears to be due to their use of different versions of SUNDIALS; importing both causes segmentation faults. One workaround is to import the tools in separate forked processes, each with its own memory (see the sketch after this list).
  • CBMPy requires versions of SymPy that have micro version numbers. Currently, this is incompatible with the latest version of AMICI, which requires SymPy >= 1.9; no micro release of 1.9 is yet available.
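
A minimal sketch of the fork workaround mentioned above; the per-tool task functions are hypothetical placeholders for whatever work each tool actually needs to do:

# Import each tool in a separate child process so their SUNDIALS symbols never
# share an address space. `run_pysces_task` and `run_neuron_task` are
# hypothetical placeholders.
import multiprocessing

def run_pysces_task(queue):
    import pysces  # imported only inside this child process
    queue.put('PySCeS imported')

def run_neuron_task(queue):
    import neuron  # imported only inside a separate child process
    queue.put('NEURON imported')

if __name__ == '__main__':
    context = multiprocessing.get_context('fork')  # each child gets its own memory
    queue = context.Queue()
    for task in (run_pysces_task, run_neuron_task):
        process = context.Process(target=task, args=(queue,))
        process.start()
        process.join()
        print(queue.get())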

License

This package is released under the MIT license.

Development team

This package was developed by the Karr Lab at the Icahn School of Medicine at Mount Sinai in New York and the Center for Cell Analysis and Modeling at UConn Health as part of the Center for Reproducible Biomodeling Modeling with assistance from the contributors listed here.

Funding

This package was developed with support from the National Institute for Bioimaging and Bioengineering (award P41EB023912).

Questions and comments

Please contact us at [email protected] with any questions or comments.

biosimulators_test_suite's People

Contributors

bilalshaikh42, codebydrescher, eagmon, gmarupilla, jonrkarr, luciansmith, moraru, ryannjordan

biosimulators_test_suite's Issues

Improve the details of simulator specification errors

The validator returns this

{
  "error": [
    {
      "status": "400",
      "title": "ValidatorError",
      "detail": "Algorithms must be annotated with unique KiSAO terms",
      "source": {
        "pointer": "/algorithms"
      }
    }
  ]
}

However, the test suite displays this

- Simulator specifications from https://raw.githubusercontent.com/biosimulators/Biosimulators_VCell/00d08a031a0788435b44989e73ae3f43f33e4cb0/biosimulators.json are not valid. Specifications must be adhere to the BioSimulators schema. Documentation is available at https://api.biosimulators.org/.
- 
-   400 Client Error: Bad Request for url: https://api.biosimulators.org/simulators/validate

The test suite should display the source and detail attributes from the validator's response.
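
A minimal sketch of how the test suite could surface these fields, assuming the endpoint returns a JSON:API-style error document like the one above; the file name, request body structure, and message format are illustrative:

# Surface the validator's `source` and `detail` fields instead of only the
# HTTP status line. The request body structure is an assumption.
import json
import requests

with open('biosimulators.json') as file:
    specifications = json.load(file)

response = requests.post('https://api.biosimulators.org/simulators/validate', json=specifications)
if not response.ok:
    messages = []
    for error in response.json().get('error', []):
        pointer = error.get('source', {}).get('pointer', '')
        messages.append('{}: {}'.format(pointer, error.get('detail') or error.get('title')))
    raise ValueError('Simulator specifications are not valid:\n  ' + '\n  '.join(messages))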

Setup system for managing access to revising specifications of simulators

@bilalshaikh42 here's an outline of what we discussed today about how to implement permissions for revising simulators

  • Setup teams for the existing approved tools
    • Create teams
    • Add biosimulators-daemon to teams
    • Add collaborators to teams
    • Give teams admin access to their repos
  • Upon the first submission of a simulation tool
    • Use the GitHub API to create a GitHub team with id @biosimulators/{simulator-id}
    • Use the GitHub API to add the submitter to that team.
    • Use the GitHub API to make the submitter a maintainer of the team so they can add collaborators if needed.
    • Use the GitHub API to post a message to the GitHub issue for the submission that a team was created and that the submitter can manage the team
  • Upon subsequent submissions of a simulation tool
    • Use the GitHub API to check whether the submitter is a member of the team for the simulator
    • If not, use the GitHub API to post an error message
  • Add documentation about this permissions system to

This is probably easiest to implement in the Python package in this repo.
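
A minimal sketch of the first-submission steps using the GitHub REST API; the token, simulator id, and user name are placeholders, and a production implementation would need error handling:

# Create a team for a newly submitted simulator and make the submitter a
# maintainer. The token, simulator id, and submitter are placeholders.
import requests

GITHUB_API = 'https://api.github.com'
ORG = 'biosimulators'
HEADERS = {
    'Authorization': 'token ' + 'GITHUB_TOKEN_PLACEHOLDER',
    'Accept': 'application/vnd.github+json',
}
simulator_id = 'tellurium'       # illustrative
submitter = 'example-submitter'  # illustrative

# create a team for the simulator
team = requests.post(
    '{}/orgs/{}/teams'.format(GITHUB_API, ORG),
    headers=HEADERS,
    json={'name': simulator_id, 'privacy': 'closed'},
).json()

# add the submitter to the team as a maintainer so they can add collaborators
requests.put(
    '{}/orgs/{}/teams/{}/memberships/{}'.format(GITHUB_API, ORG, team['slug'], submitter),
    headers=HEADERS,
    json={'role': 'maintainer'},
)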

Add test case to check that simulators support their curated outputVariablePatterns

  1. Find a suitable example model encoded in XML
  2. Use dependentVariableTargetPatterns XPATH patterns to generate a list of variable targets for the example model.
    • Find all objects that match the XPATH pattern
    • Generate unique XPATHs for each matching object
  3. Create a data generator and data set for each matching object
  4. Simulate example model
  5. Check that data sets were produced for each matching object

(2) is likely difficult to automate because the specifications of dependentVariableTargetPatterns may not capture quite enough information to automatically generate XPaths for specific objects.
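
A minimal sketch of step 2 for an SBML-encoded model; the XPath pattern, namespace prefix, and reliance on an id attribute are illustrative:

# Find all objects matching an XPath pattern and generate a unique XPath for each.
from lxml import etree

document = etree.parse('model.xml')
namespaces = {'sbml': 'http://www.sbml.org/sbml/level3/version1/core'}
pattern = '/sbml:sbml/sbml:model/sbml:listOfSpecies/sbml:species'

targets = []
for obj in document.xpath(pattern, namespaces=namespaces):
    targets.append("{}[@id='{}']".format(pattern, obj.get('id')))

# `targets` can then be used to build one data generator and data set per object
print(targets)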

[Bug]: Log validation did not catch recent errors with vcell image

What is the bug?

Recent versions of vcell have not been producing logs as expected on Biosimulations.
See biosimulations/biosimulations#4134 and biosimulations/biosimulations#4035
However, these versions (7.4.0.34 and 7.4.0.32) passed validation and were marked as valid in the database.

When does the bug occur?

No response

How should BioSimulators have behaved?

Ideally, these errors would be caught by the validation. It is possible that there is some sort of environment difference on the HPC that is causing different behavior between the HPC and the validation suite.

Screenshots

No response

Which types of devices does the bug occur in?

No response

Which operating systems does the bug occur in?

No response

Which browsers does the bug occur in?

No response

Add test case for simulation re-start-ability

One use case for re-start-ability is co-simulation (e.g., Vivarium). As a pre-requisite, simulation tools need to be re-start-able (i.e., a simulation can be instantiated from the result of a previous step, ideally with all simulation state fed in externally so that the global state of a co-simulation can be managed externally).

We already have re-start-ability tests for the most relevant tools:

  • Kinetic
  • Logical:
    • BoolNet -- initial state cannot be set
    • GINsim -- initial state cannot be set
  • Flux balance: N/A because simulations don't really have state
    • CBMPy
    • COBRApy
    • RBApy

In addition to tests for individual tools, a test here would be useful.

  • Execute a simulation
  • Compare to
    • Execute the first half of the time course
    • Independently instantiate a second simulation
    • Copy the output of the first simulation to the input to a second simulation
    • Execute the second simulation
    • Check that the end state of the second time course is the same as that of executing the entire time course all at once

This is the test case discussed with @eagmon and @prismofeverything.
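
A minimal sketch of the proposed comparison, assuming a hypothetical helper run_time_course(initial_state, start, end, n_steps) that returns the simulated trajectories as a dict of NumPy arrays keyed by variable id:

import numpy

# `run_time_course` is a hypothetical placeholder for the simulator's API
initial_state = {'A': 10.0, 'B': 0.0}  # illustrative initial conditions

# (a) execute the entire time course at once
full = run_time_course(initial_state, start=0.0, end=10.0, n_steps=100)

# (b) execute the first half, then restart an independently instantiated second
#     simulation from the end state of the first
first_half = run_time_course(initial_state, start=0.0, end=5.0, n_steps=50)
restart_state = {variable: values[-1] for variable, values in first_half.items()}
second_half = run_time_course(restart_state, start=5.0, end=10.0, n_steps=50)

# (c) the end state of the restarted simulation should match the full run
for variable in full:
    numpy.testing.assert_allclose(second_half[variable][-1], full[variable][-1], rtol=1e-6)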

Add additional test cases for validating simulators

Todo

  • Add tests for Docker image
    • User is root
    • Declares supported environment variables
    • Has expected OCI labels
    • Has expected BioContainers labels
  • Add more tests for simulator CLI interface
    • Describes supported environment variables
    • -h, --help arguments display inline help
    • -v, --version arguments display inline version information
  • Add more tests for COMBINE support
    • When a file is marked as "master", only this file is executed
    • When no file is master, all SED-ML files are executed
    • Multiple SED documents
      • Outputs saved to distinct locations
  • Add more tests for SED-ML support
    • Core elements
      • Tasks
        • Models without changes
        • Simulations without algorithm parameter changes
      • Reports
        • Datasets
          • data generators for individual variables
    • Model sources
      • Local file
      • Another model in the same SED-ML document
        • Changes inherited
    • Model changes -- requires many curated archives with model changes
      • changeAttribute
      • addXml
      • removeXml
      • changeXml
      • computeChange
    • algorithmParameter values
    • uniformTimeCourse with outputStartTime != 0
    • uniformTimeCourse with initialTime != 0
    • repeatedTask
      • UniformRange
      • VectorRange
      • FunctionalRange
      • FunctionalRange variables
      • Nested functional ranges
      • SetValue
      • Multiple subtasks
      • Nested repeated tasks
      • Subtasks mixing Task and RepeatedTask
      • Subtasks with differently shaped results
      • Subtask sorting
    • plot2d
    • plot3d
    • multiple tasks
    • multiple reports
    • multiple plots
    • data generators whose variables have different shapes
    • outputs whose data generators have different shapes
  • Reports of simulation results
  • Documents of project execution status

In other issues

  • Model sources: #16
    • URL
    • MIRIAM URN for BioModels

Most of the above tests require COMBINE archives. To make this tractable, the tests need to be automatically generated. Below is one possible approach:

  1. Find 1 COMBINE archive that is compatible with the simulator (i.e., model format, simulation algorithm) from among the manually curated examples in this repository.
  2. Parse out the model and simulation from the archive.
  3. Use this information to automatically generate additional archives to probe the above issues

Testing that a simulation tool supports the algorithm parameters described in its specifications can be achieved similarly:

  1. Read specifications
  2. For each algorithm, computationally generate a SED-ML simulation task with an sedml:algorithmParameter element for each parameter. Set sedml:algorithmParameter/@value to the default value of the parameter annotated in the specifications. Skip parameters whose default values are null because SED-ML can't represent null. One possible implementation is sketched below.
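
A minimal sketch of step 2, building the SED-ML algorithm element from a simulator's specifications; the structure of algorithm_spec only loosely mirrors the BioSimulators schema, and the KiSAO ids and default values are illustrative:

from lxml import etree

SEDML_NS = 'http://sed-ml.org/sed-ml/level1/version3'

# loosely mirrors one algorithm entry in a simulator's specifications (illustrative)
algorithm_spec = {
    'kisaoId': 'KISAO_0000019',
    'parameters': [
        {'kisaoId': 'KISAO_0000209', 'default': '1e-6'},
        {'kisaoId': 'KISAO_0000211', 'default': None},  # null default: skipped below
    ],
}

algorithm = etree.Element(
    '{%s}algorithm' % SEDML_NS,
    kisaoID=algorithm_spec['kisaoId'].replace('_', ':'),
)
parameters = etree.SubElement(algorithm, '{%s}listOfAlgorithmParameters' % SEDML_NS)
for parameter in algorithm_spec['parameters']:
    if parameter['default'] is None:
        continue  # SED-ML cannot represent null values
    etree.SubElement(
        parameters,
        '{%s}algorithmParameter' % SEDML_NS,
        kisaoID=parameter['kisaoId'].replace('_', ':'),
        value=parameter['default'],
    )

print(etree.tostring(algorithm, pretty_print=True).decode())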

Remove Ciliberto discrete example

  • Remove because this isn't strictly valid as a discrete model; VCell picks up on this.
  • Check that COPASI, GillesPy2, tellurium still pass the test suite

Export test results to JSON

  • Save to JSON (one possible structure is sketched after this list)
    • test id
    • description
    • result (pass, fail, skip)
    • duration
    • stdout/stderr
    • exception
    • warnings
  • Add command-line option to save to file
  • Post to BioSimulators API during GH workflow
  • Enable BioSimulators API to accept test results
  • Display test results in simulator view component
  • Verify everything is working once deployed
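
One possible shape for the exported JSON, as a minimal sketch; the field names mirror the checklist above, the test id is illustrative, and none of this is a fixed schema:

import json

result = {
    'testId': 'sedml.SimulatorSupportsMultipleTasksPerSedDocument',  # illustrative id
    'description': 'Simulator supports multiple tasks per SED-ML document',
    'result': 'pass',   # one of: pass, fail, skip
    'duration': 12.3,   # seconds
    'output': '...',    # captured stdout/stderr
    'exception': None,
    'warnings': [],
}

with open('results.json', 'w') as file:
    json.dump([result], file, indent=2)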

Test suite documentation

Hello @jonrkarr,

Is there any documentation for biosimulators_test_suite that explains which test shown in the GitHub Actions logs checks for exactly what? It's not very intuitive from the test names themselves.

I tried looking into the code for a few minutes, but mapping out the complete functioning of the code would take a lot of time because of the high level of inheritance among the classes, and even more so for all the test cases.

It would help a great deal to have written documentation of which test checks for what (possibly with examples).

Incorporate tests of support for model languages, numerical accuracy, and reproducibility

The current tests focus on support for SED-ML, COMBINE, and the BioSimulators conventions. The current tests implicitly assume that the simulation tools have already been tested for correctness, ideally using something like the SBML test suite. However, this is not guaranteed.

Part of the reason why the test suite doesn't currently test numerical results more deeply is that this requires tests for each model language, and BioSimulators is intentionally architected to support any language. As a result, it's not trivial to build rigorous tests for each model language.

Nevertheless, the test suite could be more powerful if merged with something like the SBML test suite.

  • Reproducibility (multiple executions produce the same results)

Add test cases for support of environment variables

Test cases

  • ALGORITHM_SUBSTITUTION_POLICY
  • VERBOSE

The attributes of the Docker image can be used to determine which environment variables the simulator supports.

However, I can't think of a way to automatically formulate test archives to probe support because we're not asking simulators to declare the alternative algorithms that they recognize. This would need to be added to the simulator specs, although we want it to be implemented through the hierarchy of KiSAO terms rather than as something that individual simulation tools need to curate.
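
A minimal sketch of probing ALGORITHM_SUBSTITUTION_POLICY with the Docker SDK for Python; the image tag and mount paths are illustrative, the command-line arguments assume the tool follows the BioSimulators CLI convention (-i archive, -o output directory), and whether the relevant metadata appears in the image labels is an assumption:

import docker

client = docker.from_env()
image = client.images.get('ghcr.io/biosimulators/tellurium:latest')  # illustrative tag
print(image.labels)  # inspect the image's declared metadata

# execute an archive with the strictest substitution policy and inspect the output
output = client.containers.run(
    image,
    ['-i', '/root/in/project.omex', '-o', '/root/out'],
    environment={'ALGORITHM_SUBSTITUTION_POLICY': 'NONE'},
    volumes={'/path/to/archives': {'bind': '/root/in', 'mode': 'ro'},
             '/path/to/outputs': {'bind': '/root/out', 'mode': 'rw'}},
    remove=True,
)
print(output.decode())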

Wrong warning when running the test suite.

For published_project.SimulatorCanExecutePublishedProject:sbml-core/Ciliberto-J-Cell-Biol-2003-morphogenesis-checkpoint-Fehlberg

Plots were not produced:
simulation.sedml/Figure_3a
Extra plots were produced:
simulation_1.sedml/Figure_3a

The expected folder "simulation.sedml" is wrong, the correct folder name is "simulation_1.sedml"

invalid test suite published model template

The test case published_project.SimulatorCanExecutePublishedProject:sbml-core/Parmar-BMC-Syst-Biol-2017-iron-distribution requests an invalid algorithm [KISAO_000019; non-existent], apparently a typo [should be KISAO_0000019; CVODE]. Thus any simulator will skip.

Increase rigor of tests for model changes

The tests for support for model changes basically just test that the simulation tool doesn't raise a run-time error when a SED-ML model has changes.

The test suite should more rigorously verify that simulation tools implement model changes correctly.

  • changeAttribute
  • addXML
  • removeXML
  • changeXML
  • computeChange
  • setValue

Since the test suite can't directly observe model specifications, the test suite has to verify this by checking that simulation tools produce one or more data sets that demonstrate the impact of the model change. For example, a change could set an initial value of a model variable to zero, and then the test suite could verify that the first value of a data set for a data generator for the variable is zero, indicating that the change was correctly applied. This is challenging to do in a model language and algorithm agnostic way as this test suite aims to do.
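
A minimal sketch of this kind of check, assuming the change sets a variable's initial value to zero and that the report has already been read into NumPy arrays keyed by data-set id; the names are illustrative:

import numpy

# `report_results` is a hypothetical placeholder for the parsed report (e.g.,
# read from the HDF5 file produced by the simulator), keyed by data-set id
data_set = report_results['data_set_species_A']

# if the changeAttribute was applied, the first value of the trajectory is zero
numpy.testing.assert_allclose(data_set[0], 0.0, atol=1e-12)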

Wrong warning when running the test suite.

For published_project.SimulatorCanExecutePublishedProject:sbml-core/Caravagna-J-Theor-Biol-2010-tumor-suppressive-oscillations

Plots were not produced:
BIOMD0000000912_sim.sedml/plot_1
Extra plots were produced:
BIOMD0000000912_sim.sedml/Figure_1_bottom_left

The expected "plot_1" doesn't exist in the sedml file, the correct name is "Figure_1_bottom_left"

Use GitHub API to programmatically make new Docker images public

Currently, images for new simulators have to manually be made public (the first time a simulator is submitted to BioSimulators). The GitHub API doesn't currently provide a mechanism to make images public. This can only be done through the GitHub web app. When GitHub provides the capability to set this via the API, this should be used to programmatically make new Docker images public.

Make repository private

If this is just for testing, it might be best to make it private to avoid cluttering the organization view

Add metadata about examples following OMEX Metadata guidelines

  • OMEX Meta guidelines: https://biosimulators.dev/conventions/metadata

  • Complete example: https://github.com/biosimulators/Biosimulators_utils/blob/dev/tests/fixtures/omex-meta/biosimulations-with-file-annotations.rdf

  • sbml-core/

    • Caravagna-J-Theor-Biol-2010-tumor-suppressive-oscillations
    • Ciliberto-J-Cell-Biol-2003-morphogenesis-checkpoint-continuous
    • Ciliberto-J-Cell-Biol-2003-morphogenesis-checkpoint-Felhberg
    • Edelstein-Biol-Cybern-1996-Nicotinic-excitation
    • Parmar-BMC-Syst-Biol-2017-iron-distribution
    • Szymanska-J-Theor-Biol-2009-HSP-synthesis
    • Tomida-EMBO-J-2003-NFAT-translocation
    • Varusai-Sci-Rep-2018-mTOR-signaling-LSODA-LSODAR-SBML
    • Vilar-PNAS-2002-minimal-circardian-clock
    • Vilar-PNAS-2002-minimal-circardian-clock-continuous (metadata is the same as Vilar-PNAS-2002-minimal-circardian-clock)
    • Vilar-PNAS-2002-minimal-circardian-clock-discrete-NRM (metadata is the same as Vilar-PNAS-2002-minimal-circardian-clock)
    • Vilar-PNAS-2002-minimal-circardian-clock-discrete-SSA (metadata is the same as Vilar-PNAS-2002-minimal-circardian-clock)
  • sbml-fbc/

    • Escherichia-coli-core-metabolism
  • sbml-qual/

    • Chaouiya-BMC-Syst-Biol-2013-EGF-TNFa-signaling
  • cellml/

    • Lorenz-system
  • neuroml-lems/

    • Hodgkin-Huxley-cell-CVODE
    • Hodgkin-Huxley-cell-Euler (annotation will be the same as for Hodgkin-Huxley-cell-CVODE)
  • bngl/

    • Dolan-PLoS-Comput-Biol-2015-NHEJ
    • test-bngl
  • smoldyn/

    • Lotka-Volterra

Move results and vega figures into folders?

Would it make sense to include the results and the Vega figures in the folders that are zipped into the COMBINE archives? It would make the repository a bit easier to navigate to keep everything in one folder.
The figures are also something that we are planning to include in the OMEX archives, and the results HDF5 files could be included as "reference" result sets for the test suite and other uses.
