complexify's Introduction

complexify

Toolbox of compute resources for the Specify Network.

complexify's People

Contributors

cjgrady, zzeppozz

Watchers

Jim Beach

complexify's Issues

Update diagrams to show updated design and storage volumes

We decided to skip the controller container for the first version and just have daemon processes on the makeflow containers.

We will also need somewhere to store job configuration files, as well as job data such as user uploads and environment data for modeling.

Diagram what makeflow container looks like

We settled on the makeflow container (maybe rename) running a daemon process that gets jobs and runs one or more makeflows. Create a diagram that shows these interactions and makes clear how the process runs a job.

Implement first version of complexify web services

Needed services:

  • Submit job
  • Get job status
  • Retrieve outputs

I am not sure whether the first version needs to accept uploads or whether we want to add that later. This list may need to expand depending on whether, and how, we want to handle users from the start, or whether this will be for internal use only to begin with.
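As a rough sketch of the three services listed above, the following uses an in-memory store with no web framework. The class, method, and field names (`JobService`, `submit_job`, `get_job_status`, `retrieve_outputs`, `status`, `outputs`) and the status strings are assumptions made for illustration, not an agreed API.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical job record; the field names are illustrative only."""
    config: dict
    status: str = "queued"
    outputs: dict = field(default_factory=dict)


class JobService:
    """In-memory stand-in for the submit / status / outputs services."""

    def __init__(self):
        self._jobs = {}

    def submit_job(self, config):
        """Store a job configuration and return a new job identifier."""
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = Job(config=config)
        return job_id

    def get_job_status(self, job_id):
        """Return the status string for a job (KeyError if unknown)."""
        return self._jobs[job_id].status

    def retrieve_outputs(self, job_id):
        """Return the outputs, but only once the job is complete."""
        job = self._jobs[job_id]
        if job.status != "complete":
            raise RuntimeError(f"job {job_id} is {job.status}, not complete")
        return job.outputs
```

A web layer could wrap each method in one endpoint; whether uploads are accepted would only change what `submit_job` takes as input.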

Implement a test case for Caryophyllales

Implement a test case for the complexify framework. This test case starts with a list of species names to include and will need to incorporate data from a variety of occurrence sources. Models should use maxent or rare species modeling. The workflow should use masks and end with a set of SDMs grouped by taxonomic group.

Document all tasks available in Complexify v1

Add documentation for all currently available tasks so we can start putting together test job configurations:

  • README.md file in tasks directory
  • lmpy/build_shapegrid
  • lmpy/clean_occurrences
  • lmpy/convert_csv_to_lmm
  • lmpy/convert_lmm_to_csv
  • lmpy/create_rare_species_model
  • lmpy/create_tree_matrix
  • lmpy/encode_layers
  • lmpy/randomize_pam
  • lmpy/split_occurrence_data
  • lmpy/wrangle_matrix
  • lmpy/wrangle_tree
  • biotaphypy/ancestral_distribution
  • biotaphypy/phylo_beta_diversity
  • lmtools/create_sdm

Create a biotaphy webinar infrastructure diagram

We may need to iterate over this a few times but, as of now, we plan to produce a Docker container, or set of containers, accessible via a web service so that someone could connect to them from an R or Python client library.

Some likely components to this diagram:

  • R client
  • Python client
  • Flask layer
  • lmpy layer
  • biotaphypy layer
  • Storage volume?

Document all data types available in complexify via API

Document all of the data types we are exposing via our tasks. This includes primitive types and derived types. It should cover at least all of the input and output types exposed by our current tasks, and may include types for some planned future tasks.

Implement a worker container

This container should include all of our computational tools (lmpy, lmtools, biotaphy, syftr) in whatever form they are currently in, while remaining easy to update. It should probably run a daemon process like the Work Queue worker factory.

Update container diagram and documentation

Note where containers are evolutions of old Lifemapper components (like MattDaemon). Update documentation and add new container types as the previous iteration is being split.

Create a demo makeflow for Heuchera SDMs

The first demo makeflow we need is one that can run SDMs from known occurrence data. Use Heuchera as a test case and generate a demo makeflow that cleans the occurrence records before running them through maxent.
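A minimal sketch of how such a two-step makeflow could be generated. Makeflow rules use a Make-like layout (a `targets: sources` line followed by a tab-indented command). The command lines below are placeholders: the task names echo `clean_occurrences` and `create_sdm`, but their arguments, and the output file names, are assumptions rather than the real command-line interfaces.

```python
def makeflow_rule(targets, sources, command):
    """Render one rule: 'targets: sources', then a tab-indented command."""
    return f"{' '.join(targets)}: {' '.join(sources)}\n\t{command}\n"


def heuchera_sdm_makeflow(occurrence_csv, species):
    """Build makeflow text that cleans occurrences, then models them.

    The command arguments are hypothetical placeholders, not real CLIs.
    """
    cleaned = f"{species}_clean.csv"
    model = f"{species}_sdm.tif"
    rules = [
        # Step 1: clean the raw occurrence records.
        makeflow_rule(
            [cleaned], [occurrence_csv],
            f"clean_occurrences {occurrence_csv} {cleaned}",
        ),
        # Step 2: model from the cleaned records; depends on step 1's output.
        makeflow_rule(
            [model], [cleaned],
            f"create_sdm {cleaned} {model}",
        ),
    ]
    return "\n".join(rules)


if __name__ == "__main__":
    print(heuchera_sdm_makeflow("heuchera.csv", "Heuchera_americana"))
```

Because the second rule lists the cleaned file as its source, makeflow would order the cleaning step before the modeling step without any explicit sequencing.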

Create diagrams and documentation for archive / syftorium workflow

There will most likely be at least two workflow types. A first workflow will process the input occurrences, alter them slightly, and create one or more output files of grouped (probably by species) occurrences, as well as a file, or files, indicating which species are present in those occurrence files. This is necessary so that we know which files will be generated by the various tasks.

The other workflow(s) will process the occurrence data in groups to assess the records, create SDMs, create species manifests and syftorium files for cataloging, and create an output package.

It is important to also figure out where we can include multi-species processing in the workflow.
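The first workflow described above (group occurrences by species and record which species are present) could be sketched as follows. The column name `species`, the one-file-per-species layout, and the manifest file name `species_manifest.txt` are assumptions for illustration only.

```python
import csv
from collections import defaultdict
from pathlib import Path


def split_by_species(occurrence_csv, out_dir, species_field="species"):
    """Write one CSV per species plus a manifest listing the species found.

    Returns a dict mapping species name to its output file name.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group rows by the value in the (assumed) species column.
    groups = defaultdict(list)
    with open(occurrence_csv, newline="") as in_file:
        reader = csv.DictReader(in_file)
        fieldnames = reader.fieldnames
        for row in reader:
            groups[row[species_field]].append(row)

    # Write each group to its own file, in sorted species order.
    manifest = {}
    for species, rows in sorted(groups.items()):
        out_path = out_dir / f"{species.replace(' ', '_')}.csv"
        with open(out_path, "w", newline="") as out_file:
            writer = csv.DictWriter(out_file, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        manifest[species] = out_path.name

    # The manifest tells later tasks which species files to expect.
    manifest_path = out_dir / "species_manifest.txt"
    manifest_path.write_text("".join(f"{name}\n" for name in manifest))
    return manifest
```

A later workflow could read the manifest to decide which per-species tasks to generate, which is the reason the text gives for producing it.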

Create a Complexify API client

This client should match the API as it currently stands. I don't know if this will ever be widely distributed, but there should still be some protections and forethought in case we do decide to distribute it for some reason (in its entirety or just a portion).

Create a multi-species demo makeflow for Heuchera models

This demo makeflow is more for testing complexify than for testing lmpy scripts. Aimee can help debug any problems found within lmpy or logic errors encountered. This establishes a workflow to build off of for multi-species operations and statistics.

Add pre-commit hooks to complexify repository

Add whatever hooks we think we need for the complexify repository to ensure acceptable code quality and consistency, as well as automated testing and any appropriate CI / CD.

Implement a makeflow container

There may be a few questions left to answer before this can be done, but at its core, this container needs to run a makeflow or multiple makeflows.

It may need to do some job configuration processing and do some things outside of makeflow as well.

We also need to determine whether this is a "one-shot" (single-run) container or whether a daemon process will keep it running continuously.

Create job processing tool

This tool should take a job configuration as an argument and process the job appropriately, including running one or more makeflows.
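A minimal sketch of such a tool, assuming the job configuration is a JSON file with a `makeflows` list of makeflow file paths. That format, the key name, and the function names are assumptions. The only external command used is `makeflow <file>`, and the runner is injectable so the logic can be exercised without makeflow installed.

```python
import argparse
import json
import subprocess


def process_job(config, runner=subprocess.run):
    """Run each makeflow named in the job configuration, in order.

    Returns a list of (makeflow_file, return_code) pairs.
    """
    results = []
    for makeflow_file in config.get("makeflows", []):
        completed = runner(["makeflow", makeflow_file], check=False)
        results.append((makeflow_file, completed.returncode))
    return results


def main(argv=None):
    """Parse the job configuration path, run the job, return an exit code."""
    parser = argparse.ArgumentParser(description="Process a job configuration.")
    parser.add_argument("job_config", help="Path to a JSON job configuration.")
    args = parser.parse_args(argv)

    with open(args.job_config) as config_file:
        config = json.load(config_file)

    results = process_job(config)
    # Exit code 0 only if every makeflow succeeded.
    return 0 if all(code == 0 for _, code in results) else 1
```

A script entry point would call `main()` and pass its return value to `sys.exit`. Any job configuration processing that has to happen outside makeflow would go before the loop in `process_job`.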

Create initial job daemon process

The first version of the job daemon process can be pretty simple and be focused on running our established job workflows. Try to make it easy to extend to more generic workflows but getting something that we can use quickly is important.

It will run as the daemon process on makeflow container instances.
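One simple shape for the first version is a polling loop that pulls job configurations from a queue and hands each to a processing function. In this sketch the in-process `queue.Queue`, the `stop_event`, and the function names are assumptions. A real deployment would more likely poll a shared store or service.

```python
import queue
import threading


def run_job_daemon(job_queue, process_job, stop_event, poll_seconds=1.0):
    """Process jobs from job_queue until stop_event is set.

    Returns the number of jobs handled, including failed ones.
    """
    handled = 0
    while not stop_event.is_set():
        try:
            # Wait briefly for work so the stop flag is checked regularly.
            config = job_queue.get(timeout=poll_seconds)
        except queue.Empty:
            continue
        try:
            process_job(config)
        except Exception as err:
            # One failed job should not bring the daemon down.
            print(f"job failed: {err}")
        finally:
            job_queue.task_done()
        handled += 1
    return handled
```

Passing `process_job` as a parameter keeps the loop independent of any particular workflow, which is one way to leave room for more generic workflows later.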
