sdtaylor / pyphenology

Plant phenology models in python with a scikit-learn inspired API

Home Page: https://pyphenology.readthedocs.io

License: MIT License

Python 97.02% TeX 2.98%
ecology ecology-modelling science plants

pyphenology's Introduction

pyPhenology

Full documentation

http://pyphenology.readthedocs.io/en/master/

Installation

Requires: scipy, pandas, joblib, and numpy

Install via pip

pip install pyPhenology

Or install the latest version from GitHub

pip install git+https://github.com/sdtaylor/pyPhenology

Usage

A Thermal Time growing degree day model:

from pyPhenology import models, utils
observations, predictors = utils.load_test_data(name='vaccinium')
model = models.ThermalTime()
model.fit(observations, predictors)
model.get_params()
{'t1': 85.704951490688927, 'T': 7.0814430573372666, 'F': 185.36866570243012}
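To make the three fitted parameters concrete, here is a minimal numpy sketch of the general thermal time idea. It assumes t1 is the day of year on which accumulation starts, T is the temperature threshold, and F is the total forcing required for the event. The function name and the toy temperatures are hypothetical, and this is not the package's own implementation.

```python
import numpy as np

def thermal_time_doy(daily_temp, doys, t1, T, F):
    """Return the first day of year on which accumulated forcing reaches F.

    Returns 999 when the requirement is never met (the "no prediction" value).
    """
    daily_temp = np.asarray(daily_temp, dtype=float)
    doys = np.asarray(doys)
    # Forcing is the temperature excess above the threshold T, never negative.
    forcing = np.clip(daily_temp - T, 0, None)
    # Days before the start day t1 contribute nothing.
    forcing[doys < t1] = 0
    accumulated = np.cumsum(forcing)
    reached = accumulated >= F
    return int(doys[np.argmax(reached)]) if reached.any() else 999

# Toy data: ten days warming steadily from 2 to 20 degrees C.
doys = np.arange(1, 11)
temps = np.array([2, 4, 6, 8, 10, 12, 14, 16, 18, 20])
predicted = thermal_time_doy(temps, doys, t1=3, T=5, F=20)
never_reached = thermal_time_doy(temps, doys, t1=3, T=5, F=1000)
```

With these toy values the accumulated forcing first reaches 20 on day 7. Fitting a model amounts to searching for the t1, T and F that best reproduce the observed dates.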

Any of the parameters in a model can be set to a fixed value. For example, the thermal time model with the threshold T set to 0 degrees C:

model = models.ThermalTime(parameters={'T':0})
model.fit(observations, predictors)
model.get_params()
{'t1': 26.369813953905265, 'F': 333.76534368004388, 'T': 0}

Citation

If you use this software in your research please cite it as:

Taylor, S. D. (2018). pyPhenology: A python framework for plant phenology modelling. Journal of Open Source Software, 3(28), 827. https://doi.org/10.21105/joss.00827

Bibtex:

@article{Taylor2018,
author = {Taylor, Shawn David},
doi = {10.21105/joss.00827},
journal = {Journal of Open Source Software},
mendeley-groups = {Software/Data},
month = {aug},
number = {28},
pages = {827},
title = {{pyPhenology: A python framework for plant phenology modelling}},
url = {http://joss.theoj.org/papers/10.21105/joss.00827},
volume = {3},
year = {2018}
}

Acknowledgments

Development of this software was funded by the Gordon and Betty Moore Foundation's Data-Driven Discovery Initiative through Grant GBMF4563 to Ethan P. White.

pyphenology's People

Contributors

arfon, chilipp, ethanwhite, sdtaylor

pyphenology's Issues

documentation

  • badges
  • all primary functions/classes documented
  • short vignette/site
  • one more short example
    rmse of all models on held out data
  • seed optimizer parameter
  • graph showing timings
  • Julian date in the data structure needs completion
  • basin hopping optimizer

setup sensible optimizer defaults

Sensible defaults have been set up for the differential evolution and basin hopping optimizers for various scenarios.

  • testing: A quick method meant for testing code.
  • practical: Should produce realistic results on desktop systems in a short period. (The name of this one is open to suggestions.)
  • intensive: Designed to find the absolute optimal solution. Can potentially take hours to days when there is a large number of observations.

improve testing

-have a single fit/predict test for the individual models to test their only real method (_apply_model)

-pick one model to do most of the testing with, each within a test__ function for pytest

-use @pytest.mark.parametrize to iterate over known value testing

  • 1 test_ script for each of those ^

other optimizer methods

  • differential evolution
  • simulated annealing
  • basin hopping
    BH in scipy doesn't constrain to (low,high), so will have to do that manually
  • full search /brute force
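One way to handle the basin hopping bounds manually, sketched here under assumptions: scipy's basinhopping accepts an accept_test callable and passes minimizer_kwargs to the local minimizer, so the bounds can be enforced in both places. The toy loss function and the (low, high) values are hypothetical.

```python
import numpy as np
from scipy.optimize import basinhopping

low, high = 0.0, 2.0  # hypothetical parameter bounds

def loss(x):
    # Toy objective whose unconstrained minimum (x = 3) lies outside the bounds.
    return float((x[0] - 3.0) ** 2)

def within_bounds(**kwargs):
    # basinhopping passes the candidate point as the keyword argument x_new.
    x = kwargs['x_new']
    return bool(np.all(x >= low) and np.all(x <= high))

result = basinhopping(
    loss,
    x0=np.array([1.0]),
    niter=20,
    # A bounded local minimizer keeps each local search inside (low, high).
    minimizer_kwargs={'method': 'L-BFGS-B', 'bounds': [(low, high)]},
    # Reject any hop that lands outside the bounds.
    accept_test=within_bounds,
)
best_x = float(result.x[0])
```

The bounded local minimizer does most of the work here; the accept_test is a second guard in case a different local method is swapped in.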

sequential model issue

triangle response should work the same with the following two setups but it doesn't

setup 1:

    temperature[right_side] -= t_opt
    temperature[right_side] /= (t_max - t_opt)

setup 2:

    temperature[right_side] -= t_max
    temperature[right_side] /= (t_opt - t_max)
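A quick numeric check suggests why the two setups differ. The sketch assumes right_side selects temperatures between t_opt and t_max, and uses arbitrary values for both. Algebraically, (T - t_opt) / (t_max - t_opt) and (T - t_max) / (t_opt - t_max) sum to 1, so they are complements rather than equals.

```python
import numpy as np

t_opt, t_max = 20.0, 35.0  # arbitrary illustrative values
temperature = np.linspace(t_opt, t_max, 7)

# First setup: subtract t_opt, divide by (t_max - t_opt).
setup_a = (temperature - t_opt) / (t_max - t_opt)
# Second setup: subtract t_max, divide by (t_opt - t_max).
setup_b = (temperature - t_max) / (t_opt - t_max)

# The two responses are mirror images of each other.
sums_to_one = np.allclose(setup_a + setup_b, 1.0)
```

The first form rises from 0 at t_opt to 1 at t_max, while the second falls from 1 at t_opt to 0 at t_max. Only the second matches a response that peaks at the optimum temperature.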

spring warming and parallel models

spring warming model is essentially a uniforc model w/ some fixed params

parallel model is a mix of the alternating and parallel model (i.e. triangle response for chilling and an exponential function for F)

see implementation in Melaas et al. 2016
but note these are implemented slightly differently in Basler 2016 (and probably other papers)

configurable optimization

differential evolution, simulated annealing, full search, etc, with configurable parameters for all

change saved model format to json

Need to be able to store the model type in the file, and whether it's a bootstrap model or not.

would give the ability to load a model with a generic function like

model = utils.load_saved_model('params_file.json')

To that end it would also be good to have a descriptor in the model class available, such as

model.model_metadata
{'model_type':'ThermalTime',
'parameters_fit':True,
etc

or

model.model_type
{'model_type':'Bootstrap',
'core_model':'ThermalTime',
'num_bootstraps':50,
'parameters_fit':True}
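A minimal sketch of how a generic loader could work once the model type is stored in the JSON file. Everything here is hypothetical: the stand-in ThermalTime class, the MODEL_REGISTRY lookup, and the save_model helper are illustrations, not the package's API.

```python
import json
import os
import tempfile

class ThermalTime:
    """Stand-in for a model class; not the package's real implementation."""
    def __init__(self, parameters=None):
        self.parameters = dict(parameters or {})

# Hypothetical registry mapping the stored model_type string to a class.
MODEL_REGISTRY = {'ThermalTime': ThermalTime}

def save_model(model, filename):
    # Store the class name next to the parameters so the file describes itself.
    metadata = {'model_type': type(model).__name__,
                'parameters': model.parameters}
    with open(filename, 'w') as f:
        json.dump(metadata, f)

def load_saved_model(filename):
    # Read the model type from the file, then build the matching class.
    with open(filename) as f:
        metadata = json.load(f)
    model_class = MODEL_REGISTRY[metadata['model_type']]
    return model_class(parameters=metadata['parameters'])

path = os.path.join(tempfile.mkdtemp(), 'params_file.json')
save_model(ThermalTime({'t1': 26.4, 'T': 0, 'F': 333.8}), path)
restored = load_saved_model(path)
```

A bootstrap model could follow the same pattern by storing a core_model entry and a list of parameter sets, one per bootstrap.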

parallelization

for the ensemble models, make internal parallelization an option by passing the number of cores to use. but also make more complex parallel stuff (MPI, Dask) an option by having a method which just passes each job as an iter object.

general data cleaning methods

potentially methods to clean some raw data and put it in a format used by the package

  • take dates of phenophases (a la NPN) and convert them to DOY with some filter
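The date-to-DOY step could look roughly like this with pandas. The column names (site_id, date, year, doy) and the cutoff of day 180 are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical raw observations with calendar dates.
observations = pd.DataFrame({
    'site_id': [1, 1, 2],
    'date': ['2016-03-01', '2016-12-31', '2017-03-01'],
})

dates = pd.to_datetime(observations['date'])
observations['year'] = dates.dt.year
# dayofyear accounts for leap years (2016-03-01 is day 61, 2017-03-01 is day 60).
observations['doy'] = dates.dt.dayofyear

# Example filter: keep only observations from the first half of the year.
spring_only = observations[observations['doy'] <= 180]
```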

update docs

lots of updates needed for new data structure, and new ways to save/load files

fix bootstrap prediction

bootstrapmodel.predict() without any arguments should return the prediction of the input data. But after some thought I realized that isn't actually working, since the input data is being shuffled underneath. So I need to fix this.

bootstrapmodel.predict(newstuff = blah blah), i.e. with new data, should be working.

models to add

  • uniforc
  • unichill
  • alternating
  • macro scale budburst
  • m1 - requires daylength calculation and associated site_info
  • linear
  • sequential
  • parallel
  • DORMPHOT

add schwartz spring index models

Like what the NPN maps use. These would be "fixed" in that they couldn't be fit on new data, but can only be used for predictions.

ensemble model

need an ensemble model to combine different core models. probably with a few different methods: simple means, Bayesian model averaging, AIC weights.
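As a rough illustration of two of those weighting schemes, the sketch below combines predictions from three models with a simple mean and with Akaike weights, w_i = exp(-0.5 * delta_i) / sum_j exp(-0.5 * delta_j), where delta_i is each model's AIC minus the smallest AIC. The AIC values and predictions are made up.

```python
import numpy as np

def aic_weights(aic_values):
    """Convert AIC scores into weights that sum to 1 (lower AIC, higher weight)."""
    aic = np.asarray(aic_values, dtype=float)
    delta = aic - aic.min()
    relative_likelihood = np.exp(-0.5 * delta)
    return relative_likelihood / relative_likelihood.sum()

# Made-up AIC scores for three core models.
weights = aic_weights([100.0, 102.0, 110.0])

# Made-up DOY predictions: one row per model, one column per observation.
predictions = np.array([[120.0, 130.0],
                        [124.0, 128.0],
                        [110.0, 140.0]])

simple_mean = predictions.mean(axis=0)
aic_weighted = weights @ predictions
```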

make a base ensemble class

There are a lot of duplicated things in ensemble_models.py. especially with the newly added Ensemble()

use TypeError

I think a lot of my assert statements can be replaced with raise TypeError to be more descriptive
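A small sketch of the difference, using a hypothetical validation helper. An explicit raise also keeps working when Python runs with -O, which strips assert statements.

```python
import pandas as pd

def validate_observations(observations):
    # Hypothetical helper: an explicit, descriptive error instead of a bare assert.
    if not isinstance(observations, pd.DataFrame):
        raise TypeError('observations must be a pandas DataFrame, got '
                        + type(observations).__name__)
    return observations

try:
    validate_observations([1, 2, 3])
except TypeError as error:
    message = str(error)
```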

add in model fitting time

would be useful for debugging or optimizing
this is mostly dependent on the timing for _apply_model()
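One lightweight way to record timing, sketched with a hypothetical wrapper around any callable. It uses time.perf_counter from the standard library, and the wrapped function below is only a stand-in for a fitting call.

```python
import time

def timed_call(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start

# Stand-in workload; in practice this would be the model fitting call.
result, seconds = timed_call(sum, range(1000))
```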

be more clear about data structure

I've pulled the data structures straight from my original dataset study models, so I need to clarify (and probably provide code for) how things get converted to that structure.

make missing temperature check more robust

If there are unexpected columns in the obs data.frame with NA's, they will be dropped due to "lack" of temp data, because the super long temp data.frame will end up with NA's

incorporate weather data retrieval

accept a list of lat/long/ years, check to see if a nearby station is available, and download and format weather data to pyPhenology format

utils.get_weather_data('site_info.csv', data_file='', temp_folder='', accept_all=False, distance_buffer)

distance_buffer:
allowed distance from a site to a weather station

accept_all:
be prompted before downloading large files (can exceed 100mb). or proceeding if

data_file:
final weather data file to be written. will be in the format used in this package.

make linear model a bit better

instead of having start date and end date for spring, do start date and length. that way it'd be easier to have them as parameters to estimate.

put in check for bad fits

If fitted parameters produce a bunch of non-predictions (doy 999), then put in a warning saying so.

maybe:
"Model did not converge well, doy values of 999 in prediction output. Perhaps try with optimizer_params='intensive', or with fewer parameters to estimate".
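Such a check could be sketched as follows. The function name is hypothetical, and the missing-value marker of 999 is taken from the description above.

```python
import warnings
import numpy as np

def warn_on_bad_fit(predicted_doy, missing_value=999):
    """Warn if any predictions equal the missing-value marker; return their count."""
    n_missing = int(np.sum(np.asarray(predicted_doy) == missing_value))
    if n_missing > 0:
        warnings.warn(
            'Model did not converge well, doy values of 999 in prediction '
            "output. Perhaps try with optimizer_params='intensive', or with "
            'fewer parameters to estimate.'
        )
    return n_missing

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    n_bad = warn_on_bad_fit([120, 999, 131, 999])
```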
