
tillahoffmann / summaries2


License: BSD 3-Clause "New" or "Revised" License

Python 99.26% Stan 0.74%
approximate-bayesian-computation likelihood-free-inference simulation-based-inference summary-statistics


Minimizing the Expected Posterior Entropy Yields Optimal Summaries

This repository contains code and data to reproduce the results presented in the manuscript Minimizing the Expected Posterior Entropy Yields Optimal Summaries.

Figures and tables can be regenerated by executing the following steps:

  • Ensure a recent Python version is installed; this code has been tested with Python 3.10 on Ubuntu and macOS.
  • Optionally, create a new virtual environment.
  • Install the Python requirements by executing pip install -r requirements.txt from the root directory of the repository.
  • Install CmdStan by executing python -m cmdstanpy.install_cmdstan --version 2.31.0. Other recent versions of CmdStan may also work but have not been tested.
  • Optionally, verify the installation by executing pytest -v.
  • Execute cook exec "*:evaluation" to run all experiments and generate the evaluation metrics, which are saved at workspace/[experiment name]/evaluation.csv.
  • Execute each of the Jupyter notebooks (saved as markdown files) in the notebooks folder to generate the figures.
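
For convenience, the command-line steps above can be collected in a small driver script. The following is only a sketch: the commands are copied verbatim from the list, but the script itself, the `--execute` flag, and the `plan`/`run` helper names are hypothetical and not part of the repository. It assumes it is run from the root directory of the repository, and it only prints the commands unless `--execute` is passed.

```python
import shlex
import subprocess
import sys

# Commands copied from the steps above; the last field marks optional steps.
STEPS = [
    ("install requirements", "pip install -r requirements.txt", False),
    ("install CmdStan", "python -m cmdstanpy.install_cmdstan --version 2.31.0", False),
    ("verify installation", "pytest -v", True),
    ("run experiments", 'cook exec "*:evaluation"', False),
]


def plan(include_optional=True):
    """Return the argument list of each step, in order."""
    return [
        shlex.split(command)
        for _, command, optional in STEPS
        if include_optional or not optional
    ]


def run(include_optional=True, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for args in plan(include_optional):
        print("$", shlex.join(args))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(args, check=True)


if __name__ == "__main__":
    # Hypothetical flag: without --execute the commands are only printed.
    run(dry_run="--execute" not in sys.argv)
```

Creating the virtual environment and executing the notebooks are left out because the steps above do not prescribe specific commands for them.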

Results Structure

After running the experiments (see above), the workspace folder contains all results. It is structured as follows, and the folder structure is repeated for each experiment.

benchmark-large  # One folder for each experiment.
    data  # Train, validation, and test split as pickle files; other temp files may also be present.
        test.pkl
        train.pkl
        validation.pkl
        ...
    samples  # (Approximate) posterior samples as pickle files.
        [sampler configuration name].pkl
        ...
    transformers  # Trained transformers, e.g., posterior mean estimators, as pickle files.
        [transformer configuration name]-[digits].pkl  # One of three replications with different seeds.
        [transformer configuration name].pkl  # Best transformer amongst the three replications.
    evaluation.csv  # Evaluation of different summary statistic extraction methods.
benchmark-small
    ...
coalescent
    ...
tree-large
    ...
tree-small
    ...
figures  # Contains PDF figures after executing notebooks.
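
As an illustration of the layout above, the following sketch locates and loads transformer files for one experiment. The helper names (`best_transformer_path`, `replication_paths`, `load_pickle`) are hypothetical and not part of the repository; the sketch assumes only the naming scheme shown above and that the files are standard Python pickles.

```python
import pickle
from pathlib import Path


def best_transformer_path(workspace, experiment, config):
    """Path of the best transformer for a configuration: [config].pkl."""
    return Path(workspace) / experiment / "transformers" / f"{config}.pkl"


def replication_paths(workspace, experiment, config):
    """Sorted paths of the per-seed replications: [config]-[digits].pkl."""
    folder = Path(workspace) / experiment / "transformers"
    return sorted(
        path
        for path in folder.glob(f"{config}-*.pkl")
        # Keep only files whose suffix after "[config]-" consists of digits.
        if path.stem[len(config) + 1:].isdigit()
    )


def load_pickle(path):
    """Load a single pickle file, e.g., a data split or a trained transformer."""
    with open(path, "rb") as fp:
        return pickle.load(fp)
```

Unpickling a trained transformer requires the classes it was built from to be importable, so this is best run in the environment created from requirements.txt. As with any pickle, only load files you trust.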

Each evaluation.csv file has seven columns:

  • path which refers to one of the methods used to extract summaries.
  • three columns {nlp,rmise,mise} which are best estimates of negative log probability loss, root mean integrated squared error, and mean integrated squared error, respectively. The estimates are obtained by averaging over all samples in the corresponding test set.
  • three columns {nlp,rmise,mise}_err which are standard errors obtained as sqrt(var / (n - 1)), where var is the variance of the metric in the test set, and n is the size of the test set.
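
The standard error formula and the column layout can be made concrete with a short sketch. It rests on two assumptions that the description above does not settle: var is taken to be the population variance (divisor n), in which case sqrt(var / (n - 1)) equals the usual sample standard deviation divided by sqrt(n); and evaluation.csv is assumed to be comma-separated with a header row using the column names listed above. The function names are hypothetical.

```python
import csv
import io
import math


def standard_error(values):
    """Compute sqrt(var / (n - 1)), assuming var is the population variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((x - mean) ** 2 for x in values) / n  # Divisor n (assumption).
    return math.sqrt(var / (n - 1))


def summarize(csv_text, metric="nlp"):
    """Map each method (the path column) to (estimate, standard error)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["path"]: (float(row[metric]), float(row[f"{metric}_err"]))
        for row in reader
    }
```

A file on disk can be passed as `summarize(Path("workspace/[experiment name]/evaluation.csv").read_text())` after importing `Path` from `pathlib` and substituting an experiment name.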
