leotrs / decu

decu stands for "Decu is a Experimental Computation Utility"

License: MIT License

Python 100.00%

decu's Introduction

decu

decu stands for "Decu is a Experimental Computation Utility". decu is a suite of command-line tools that automate the menial tasks involved in developing experimental computation projects.

Experimental Computation

We define an "experimental computation" script as a script that reads some data, performs an experiment (running an algorithm, training a model, plotting a figure, etc.), saves some results to disk, and then quits. Tasks usually involved in this include:

file I/O: choosing file names for results and plots
multiprocessing: running many experiments/scripts in parallel
timekeeping: how long algorithms take to run
cloud service integration
logging

decu was born from the realization that none of these tasks have anything to do with (and in fact get in the way of) the actual experimentation being done. Furthermore, experimental computation scripts tend to mix together code dedicated to these tasks with code intended to run the actual algorithms of interest, thus making said algorithms harder to maintain and debug.

The main goal of decu is to standardize and automate these tasks, so that the resulting code clearly separates experimental computation from bookkeeping and other auxiliary code.
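The separation described above can be sketched with a plain Python decorator. This is a minimal illustration and not decu's own API: the name `record_run`, the JSON output format, and the file naming scheme are all assumptions made for the example.

```python
import functools
import json
import time
from datetime import datetime
from pathlib import Path


def record_run(results_dir="results"):
    """Hypothetical decorator (not part of decu): time a function and save its result."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Timekeeping: measure how long the wrapped function takes.
            start = time.perf_counter()
            result = func(*args, **kwargs)
            elapsed = time.perf_counter() - start

            # File I/O: choose a file name and write the result to disk.
            # The result must be JSON-serializable for this sketch to work.
            out_dir = Path(results_dir)
            out_dir.mkdir(parents=True, exist_ok=True)
            stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
            out_file = out_dir / f"{func.__name__}-{stamp}.json"
            out_file.write_text(json.dumps({"result": result, "seconds": elapsed}))

            # Logging: report what happened.
            print(f"{func.__name__} took {elapsed:.4f}s, saved to {out_file}")
            return result
        return wrapper
    return decorator


@record_run(results_dir="results")
def mean(values):
    # Only the computation of interest lives here.
    return sum(values) / len(values)
```

Calling `mean([1, 2, 3])` would return the mean as usual, while the timing, file naming, and logging stay out of the function body.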

Installation

Clone this repo, cd to the decu directory, and run

$ pip install .

Now you have a local installation of decu. If you plan to edit decu itself, install it in editable mode by adding the -e flag (pip install -e .) so that your changes take effect without reinstalling.

Usage

For a simple example, please see the quick start page. For more, see the tutorial and the documentation page.

Best practices

decu is built with Best Practices for Data Science in mind. For more, see

Wilson, Greg, et al. Good Enough Practices in Scientific Computing. PLOS Computational Biology, vol. 13, no. 6, 2017, doi:10.1371/journal.pcbi.1005510.

Wilson, Greg, et al. Best Practices for Scientific Computing. PLoS Biology, vol. 12, no. 1, 2014, doi:10.1371/journal.pbio.1001745.

decu's Issues

we don't need the Script class

can we add a pytest-discoverable test that runs `python -m decu src/testscript.py` inside the test/testscript directory?

leave the `__init__.py` file empty and separate everything into files

The reason is that `__init__.py` and `__main__.py` currently use the same code to load the config parser. If we need to change that code in the future (for example, we may change the interpolation keyword argument), we will probably introduce a bug by changing one and not the other. One of the new files should have a read_config function that both use.
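A shared loader along these lines could look like the sketch below. It uses only the standard `configparser` module; the function signature and the default interpolation are assumptions, not decu's current code.

```python
import configparser


def read_config(paths, interpolation=None):
    """Hypothetical single entry point for loading configuration.

    Both the package init module and the main module would call this,
    so a change to the interpolation happens in exactly one place.
    """
    if interpolation is None:
        # Assumed default; swap in whichever interpolation the project settles on.
        interpolation = configparser.BasicInterpolation()
    parser = configparser.ConfigParser(interpolation=interpolation)
    # ConfigParser.read silently skips files that do not exist.
    parser.read(paths)
    return parser
```

With this in place, switching to, say, `configparser.ExtendedInterpolation()` would be a one-line change in the default rather than an edit in two modules.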

add "saved figure to file" message to log

Would using string templates make the configuration file more readable?

All the {} in the strings in config.ini make no sense. Would using string templates improve this?
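For comparison, here is how the two styles look side by side with the standard library's `string.Template`. The option value and field names are made up for illustration and are not copied from decu's config.ini.

```python
from string import Template

# Hypothetical file name patterns; not taken from decu's config.ini.
format_style = "{script_name}--{now}.txt"
template_style = "${script_name}--${now}.txt"

fields = {"script_name": "testscript", "now": "2018-01-01"}

# str.format fills the {} placeholders.
formatted = format_style.format(**fields)

# Template.substitute fills the ${} placeholders and raises KeyError
# if a placeholder has no matching field.
substituted = Template(template_style).substitute(fields)

print(formatted)
print(substituted)
```

Both calls produce the same file name. One caveat: `${...}` is also the syntax of `configparser.ExtendedInterpolation`, so templates would clash with it if the project ever enabled that interpolation mode.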

shorten the result file name in log output

add a test case that checks that the output of an @experiment or @figure has been saved to disk correctly

Implement a run_parallel that works on @experiment-decorated methods

should we have named tuples for experimental results?
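To make the question concrete, here is a small sketch of what a named tuple result might look like. The `FitResult` type and the `fit_line` experiment are hypothetical and exist only for this example.

```python
from typing import NamedTuple


class FitResult(NamedTuple):
    """Hypothetical result record for a line-fitting experiment."""
    slope: float
    intercept: float


def fit_line(xs, ys):
    # Ordinary least-squares fit of a straight line.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return FitResult(slope=slope, intercept=mean_y - slope * mean_x)


result = fit_line([0, 1, 2], [1, 3, 5])
print(result.slope, result.intercept)
```

A named tuple still unpacks like a plain tuple (`slope, intercept = result`), and `result._asdict()` gives a dictionary that is easy to log or write to a results file.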

add a `project init` executable

add a CLI option for the executable that reads some data and runs a @figure-decorated method on it

Right now, main creates a Script object by passing NOW. Should the Script class itself handle this?

add "data registering" functionality

log experimental results

in results/

add config file options to change the log file, figures file, results file (filenames, not dir names)

add a test case for testing the show and save arguments of @figure

The experiment decorator needs to know the NOW time to generate the result file name. Where is it going to get it from?

define an `__all__`

add a test case for command line options

for example, if we call decu inspect or decu exec without a file parameter, decu should print an error.
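One way to get this behavior, and to test it, is sketched below with `argparse`. The subcommand names come from the text above; the parser layout itself is an assumption and may differ from how decu actually parses its arguments.

```python
import argparse


def build_parser():
    """Hypothetical parser with `exec` and `inspect` subcommands."""
    parser = argparse.ArgumentParser(prog="decu")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name in ("exec", "inspect"):
        sub = subparsers.add_parser(name)
        # A required positional argument: omitting it is a usage error.
        sub.add_argument("file", help="path of the script to act on")
    return parser


def exits_with_usage_error(argv):
    # argparse prints a message to stderr and raises SystemExit(2) on bad input.
    try:
        build_parser().parse_args(argv)
    except SystemExit as exc:
        return exc.code == 2
    return False
```

A test case can then assert that `exits_with_usage_error(["exec"])` and `exits_with_usage_error(["inspect"])` are both true, while a call that includes a file name parses normally.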

return decorated methods that receive self as first argument, instead of doing obj=args[0]
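The difference is small but makes the wrapper easier to read. The sketch below uses a hypothetical `logged` decorator and `Demo` class, neither of which is part of decu.

```python
import functools


def logged(method):
    """Hypothetical decorator for instance methods."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        # `self` is named explicitly, so there is no need for obj = args[0].
        self.calls.append(method.__name__)
        return method(self, *args, **kwargs)
    return wrapper


class Demo:
    def __init__(self):
        self.calls = []

    @logged
    def double(self, x):
        return 2 * x


demo = Demo()
print(demo.double(4), demo.calls)
```

Because the wrapper's signature starts with `self`, calling a decorated function without an instance fails with a clear TypeError instead of an IndexError on `args[0]`.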

add a test case checking that the dictionary returned by run_parallel has the elements in the correct order

that is, check that every value corresponds to the key it was associated to
equivalently, make sure that pool.starmap doesn't change the order of the results with respect to the order of the iterable
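A test along these lines might look like the following sketch. The `run_parallel` here is a simplified stand-in, not decu's implementation, and it takes the pool as an argument. A `ThreadPool` is used so the example runs without process start-up concerns; it exposes the same `starmap` method as `multiprocessing.Pool`, which likewise returns results in the order of the input iterable.

```python
import time
from multiprocessing.pool import ThreadPool


def slow_power(base, exponent, delay):
    # The delay lets us force tasks to finish out of submission order.
    time.sleep(delay)
    return base ** exponent


def run_parallel(func, params, pool):
    """Hypothetical stand-in: map each parameter tuple to its result."""
    params = list(params)
    return dict(zip(params, pool.starmap(func, params)))


def check_order():
    # Earlier tasks sleep longer, so they complete last.
    params = [(2, 3, 0.2), (3, 2, 0.1), (5, 1, 0.0)]
    with ThreadPool(3) as pool:
        results = run_parallel(slow_power, params, pool)
    # Keys appear in submission order and each value matches its own key.
    assert list(results) == params
    for (base, exponent, _delay), value in results.items():
        assert value == base ** exponent
    return results
```

Making the first task the slowest is the important design choice: if results were collected in completion order rather than input order, the value check would fail.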