stettberger / versuchung

versuchung - a toolbox for experiments

License: GNU General Public License v3.0

Python 98.55% Makefile 0.23% Jupyter Notebook 1.22%

versuchung's Introduction

versuchung - a toolbox for experiments

In some fields of science it is common practice not to publish experiment descriptions or the raw measurement results of an experiment. This not only runs counter to the idea of science, where results should be reproducible so that a larger audience can verify them, but also harms other researchers who want to extend the results in the field.

This lack of experiment descriptions and raw data is not necessarily the result of ill intent on the part of the original researcher, but rather of a lack of tools for documenting and specifying experiments and the raw data analysis. This is especially true for some fields of computer science. But as these people are experts in using formal descriptions (also known as programming languages), it is surprising that there is no easy-to-use tool for specifying experiments and analyses.

Versuchung tries to fill this gap by providing such a tool: experiments can be specified with proper input and output parameter declarations and versioned result sets, and the result sets can be processed further.

Documentation

For the documentation, please refer to https://versuchung.readthedocs.io

Getting the latest version

versuchung is hosted on GitHub: https://github.com/stettberger/versuchung

versuchung's People


siccegge sl33k derf bjfiedler iostapyshyn marzelpan halbuer nfuhler gerion0

versuchung's Issues

Tests fail for Python3 only systems and do not respect the Python3 version

Apparently, invoking ./setup.py test triggers tests for both Python 2 and Python 3 (actually for the python2 and python3 symlinks).

On my system the package is built multiple times for different Python versions. This is done by invoking setup.py once per Python version with that specific interpreter, e.g. pypy3 setup.py test --verbose (with the standard symlinks aliased to exactly this version). For pypy3, this results in:

>>> Test phase: dev-python/versuchung-9999
 * pypy3: running distutils-r1_run_phase python_test
pypy3 setup.py test --verbose
running test
make -C tests PYTHON=/usr/bin/pypy3
make: Entering directory '/var/tmp/portage/dev-python/versuchung-9999/work/versuchung-9999/tests'
Running test python2: two_experiments...: _python_wrapper_setup: python2 is not supported by pypy3 (PYTHON_COMPAT)
make: *** [Makefile:22: py2-two_experiments] Error 127
make: Leaving directory '/var/tmp/portage/dev-python/versuchung-9999/work/versuchung-9999/tests'
error: command 'make' failed with exit status 2

It would be nice if the Makefile used only the interpreter that executes setup.py.
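As a rough illustration of the request, the sketch below builds the test invocations from a single interpreter, defaulting to the one that is currently running. The helper names and the assumption that each test is a standalone *.py script in one directory are hypothetical; the real test layout in the repository may differ.

```python
import subprocess
import sys
from pathlib import Path


def collect_test_commands(test_dir, interpreter=None):
    """Return one command per test script, all using the same interpreter.

    Hypothetical helper: assumes every test is a standalone *.py file.
    """
    interpreter = interpreter or sys.executable
    scripts = sorted(Path(test_dir).glob("*.py"))
    return [[interpreter, str(script)] for script in scripts]


def run_tests(test_dir, interpreter=None):
    """Run every collected command and return the number of failures."""
    failures = 0
    for command in collect_test_commands(test_dir, interpreter):
        if subprocess.call(command) != 0:
            failures += 1
    return failures
```

Because the interpreter defaults to sys.executable, running this under pypy3 would never try to start python2.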

Allow running experiments from ipython REPL

versuchung's search utilities are quite useful from within an IPython-style environment, in cases where hardcoding searches in the input parameters is not good enough and manually grepping through metadata is not a solution either.

This should already kind of work, but it requires constructing an argv array. A solution with *args would be a lot nicer!
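Until such an interface exists, a small adapter can build the argv array from keyword arguments. This is a minimal sketch: kwargs_to_argv is a hypothetical helper, not part of versuchung, and it assumes that every input maps to a "--name value" option pair.

```python
def kwargs_to_argv(program="experiment", **params):
    """Turn keyword arguments into an argv-style list.

    Hypothetical helper; assumes each input is exposed as "--<name> <value>".
    """
    argv = [program]
    for name, value in params.items():
        argv.append("--" + name)
        argv.append(str(value))
    return argv
```

In a REPL, kwargs_to_argv(variant="syscall", runs=10) would produce the list that otherwise has to be typed out by hand.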

should only issue same experiment hash for same experiment

Relative paths are canonicalized for metadata hashing, but the non-canonicalized version is used in the experiment. Together with the chdir on with statements, the resulting experiments are totally different but get hashed the same.
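The mismatch can be shown without versuchung itself. In the sketch below, canonical and path_id are hypothetical stand-ins for the canonicalization and hashing steps; the point is only that the same relative path names two different files once the working directory changes, so the identifier and the file actually opened should both come from the same canonical path.

```python
import hashlib
import os.path


def canonical(path, cwd):
    """Resolve a possibly relative path against an explicit working directory."""
    return os.path.normpath(os.path.join(cwd, path))


def path_id(path, cwd):
    """Hash the canonical form of a path (stand-in for the metadata hash)."""
    return hashlib.sha256(canonical(path, cwd).encode("utf-8")).hexdigest()


# The directories below are made up for illustration.
hashed_path = canonical("data/input.txt", "/home/user/project")
used_path = canonical("data/input.txt", "/home/user/project/results")
```

Here hashed_path and used_path differ, which mirrors the situation where the hash was computed before a chdir and the relative path was opened after it.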

handling of filesystem objects inconsistent

When using an executable as input, the hash over this executable is calculated.

When using a tar or directory input, the identifying characteristic hashed into the ID seems to be only a (canonicalized) filename.

Especially when copying experiments between hosts, the actual tar / directory / whatever is the important thing that needs to stay the same for the experiment to be the same, not some canonicalized file location.
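One way to make the identifier depend on content rather than location is to hash file contents together with their paths relative to the input's root. The sketch below does this for a directory with the standard library only; hash_file and hash_directory are hypothetical names and not how versuchung computes its IDs.

```python
import hashlib
import os


def hash_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_directory(root):
    """Hash relative file names and contents, ignoring where root lives."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for filename in sorted(filenames):
            full = os.path.join(dirpath, filename)
            relative = os.path.relpath(full, root)
            digest.update(relative.encode("utf-8"))
            digest.update(b"\0")
            digest.update(hash_file(full).encode("ascii"))
    return digest.hexdigest()
```

Two copies of the same directory on different hosts then yield the same digest, while any changed file changes it. A tar archive could be treated like a single file with hash_file.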

Argument parsing fails with unrecognized arguments

Following cfb784e (#27), the argument parsing appears to be broken:

(.venv) user@host:~/dir/data $ ../latency.py --variant syscall
usage: %prog <options> [-h] [-d BASE_DIR] [--dummy] [-s] [-v] [--title TITLE] [--variant VARIANT] [--runs RUNS] [--csv CSV]
%prog <options>: error: unrecognized arguments: ../latency.py
(.venv) user@host:~/dir/data $ ../latency.py --variant=syscall
usage: %prog <options> [-h] [-d BASE_DIR] [--dummy] [-s] [-v] [--title TITLE] [--variant VARIANT] [--runs RUNS] [--csv CSV]
%prog <options>: error: unrecognized arguments: ../latency.py
(.venv) user@host:~/dir/data $ ../latency.py --help
usage: %prog <options> [-h] [-d BASE_DIR] [--dummy] [-s] [-v] [--title TITLE] [--variant VARIANT] [--runs RUNS] [--csv CSV]

options:
  -h, --help            show this help message and exit
  -d BASE_DIR, --base-dir BASE_DIR
                        Directory which is used for storing the experiment data
  --dummy               Use dummy result directory
  -s, --symlink         symlink the result dir (as newest)
  -v, --verbose         increase verbosity (specify multiple times for more)
  --title TITLE         custom title of the experiment (default: Experiment class-name)
  --variant VARIANT     (default: syscall)
  --runs RUNS           (default: 32000000)
  --csv CSV             (default: False)

For the following definition of inputs:

inputs = {
    "variant": String("syscall"),
    "backend": get_backend,
    "runs": Integer(32000000),
    "csv": Bool(False),

    "arch":   lambda self: String(uname().machine),
    "host":   lambda self: String(uname().node),
    "kernel": lambda self: String(" ".join([
        uname().system, uname().release, uname().version
    ])),
}
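The error message is what argparse prints when the full argv, including the program name, is handed to the parser. Whether that is the actual cause here is an assumption; the source only shows the symptom. The sketch below reproduces the symptom with a stand-in parser (the option names are taken from the usage line above, everything else is illustrative). Separately, the literal "%prog" in the usage line is optparse syntax; argparse expects "%(prog)s".

```python
import argparse


def build_parser():
    """Stand-in parser with two of the options from the usage line."""
    parser = argparse.ArgumentParser(usage="%(prog)s <options>")
    parser.add_argument("--variant", default="syscall")
    parser.add_argument("--runs", type=int, default=32000000)
    return parser


full_argv = ["../latency.py", "--variant", "syscall"]

# Passing argv[0] along leaves the program name as an unrecognized argument.
_, leftover = build_parser().parse_known_args(full_argv)

# Dropping argv[0] first parses cleanly.
options = build_parser().parse_args(full_argv[1:])
```

With parse_args instead of parse_known_args, the first call would exit with "unrecognized arguments: ../latency.py", matching the output in the report.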

tar archives and using experiments as inputs

ahoi

FWIW, if you use a tar archive in an experiment (as input!) and use that experiment as input for a second experiment (A inputs TarArchive() ... B inputs A), running versuchung will fail:

The tar archive will try to initialize itself and find the canonical path of the tar file (which has no value and can't be given one when running B), so it gets None as the canonical path and then calls .startswith("/") on that.

does not allow spaces in String

Hi!

inputs = { 'labels' : List(String) }

Now, on the command line, if I pass --labels "foo bar", I get an entry spelled "foo" in the labels list inside the experiment. The " bar" part of the string seems to be silently omitted.
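For comparison, the shell already delivers "foo bar" as a single argv entry, so a parser that keeps entries intact preserves the space. The sketch below uses plain argparse with a repeatable option; it is not versuchung's List implementation, and the cause of the truncation there is not known from this report.

```python
import argparse


def parse_labels(argv):
    """Collect every --labels occurrence as one list entry, spaces included."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--labels", action="append", default=[])
    return parser.parse_args(argv).labels


labels = parse_labels(["--labels", "foo bar", "--labels", "baz"])
```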

Allow searching for class hierarchy

One may want to find all experiments inheriting from some baseclass as input
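On already loaded experiment objects, such a filter amounts to an isinstance check. The sketch below uses made-up stand-in classes rather than versuchung's Experiment; how stored results on disk would record their base classes is left open.

```python
class BaseBenchmark:
    """Stand-in for a common experiment base class."""


class LatencyBenchmark(BaseBenchmark):
    """Stand-in for a concrete experiment."""


class Unrelated:
    """Stand-in for an experiment outside the hierarchy."""


def select_by_base(experiments, base):
    """Keep every experiment whose class inherits from base."""
    return [exp for exp in experiments if isinstance(exp, base)]


candidates = [LatencyBenchmark(), Unrelated(), BaseBenchmark()]
matches = select_by_base(candidates, BaseBenchmark)
```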

Should err on invalid input experiment

With an input of type List(Experiment), one can specify a non-existing, invalid experiment name and versuchung does not complain or otherwise signal a problem. Instead, one gets an empty experiment class initialized.
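A check of the following kind would surface the problem early. It is only a sketch: require_experiment_dir is a hypothetical helper, and the marker file name "metadata" is an assumption about what a valid result directory contains.

```python
import os


def require_experiment_dir(path, marker="metadata"):
    """Raise ValueError unless path looks like an experiment result directory.

    The marker file name is an assumption made for this sketch.
    """
    if not os.path.isdir(path):
        raise ValueError("not an experiment directory: %r" % (path,))
    if not os.path.exists(os.path.join(path, marker)):
        raise ValueError("no %s file found in %r" % (marker, path))
    return path
```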

Impossible to perform a search based on function-initialized inputs

It is impossible to perform a search using versuchung.search.search_experiment_results based on input parameters which are initialized using a function.

When instantiating the experiment, the LambdaType inputs are reset to None:

versuchung/src/versuchung/experiment.py

Lines 192 to 201 in 844baf3

    
if hasattr(inp, "__reinit__"):
    try:
        inp.__reinit__(metadata[name])
    except:
        logging.debug('Cannot reinit field %s. Setting it to None', name)
        self.inputs[name] = None
else:
    # We cannot reinit this input from metadata. Therefore it is better to clear it.
    logging.debug('Cannot reinit field %s. Setting it to None', name)
    self.inputs[name] = None

The selector is then receiving an instance with the input being None.

However, since the type of the input is not known ahead of time, we cannot simply __reinit__ it from the metadata. Any ideas what would constitute a good fix?
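One possible direction, offered only as a discussion sketch: instead of clearing such inputs, wrap the raw metadata value in a placeholder so that a selector still has something to compare against. RawInput and reinit_inputs are hypothetical names, and the .value attribute is an assumption about what selectors read.

```python
class RawInput:
    """Placeholder that carries the stored metadata value untyped."""

    def __init__(self, value):
        self.value = value


def reinit_inputs(inputs, metadata):
    """Reinitialize inputs from metadata, keeping raw values as a fallback."""
    for name, inp in list(inputs.items()):
        if hasattr(inp, "__reinit__"):
            try:
                inp.__reinit__(metadata[name])
                continue
            except Exception:
                pass  # fall through to the raw placeholder
        inputs[name] = RawInput(metadata.get(name))
    return inputs
```

The trade-off is that the placeholder has no type-specific behavior, so selectors could only rely on the raw value.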

Behaviour with -d is awkward

moin!

Apart from the recent PR, the behavior with -d is still somewhat awkward. For example, inputs need to be specified relative to the directory given with -d. I'd say it's more intuitive if all parameters are relative to the place of command invocation. At the least, it should be documented better.
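The more intuitive behavior could be approximated by resolving every path-like parameter against the invocation directory before any change into the -d directory happens. The helper below is a hypothetical sketch of that step, not existing versuchung code.

```python
import os


def resolve_from_invocation(paths, invocation_dir=None):
    """Make paths absolute relative to where the command was started."""
    invocation_dir = invocation_dir or os.getcwd()
    return [os.path.normpath(os.path.join(invocation_dir, path)) for path in paths]
```

Absolute paths pass through unchanged, so only relative parameters are affected.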

