Watchme Sklearn

This is an example of using watchme, specifically the psutils decorator, to monitor the resource usage of Python functions as they run. Because the dependencies are built into a Singularity container, and Singularity has access to our home directory, the watcher and its data are saved on the host with no extra work.

Note that I created the watcher repository with watchme first, and then added the README.md and container files. If you use a decorator, you don't technically need to do this: the Python files being decorated can live separately from the watchme base that holds the results. I wanted to keep them together, so I added these files afterward.
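
To make the decorator idea concrete, here is a minimal sketch of how a sampling decorator of this kind could work. It is not watchme's implementation: the `monitor` name, its `seconds` and `sample` parameters, and the `samples` attribute are all hypothetical, and it samples CPU time from the standard library rather than the psutil metrics that watchme records.

```python
import functools
import threading
import time


def monitor(seconds=0.25, sample=time.process_time):
    """Hypothetical decorator: call `sample` every `seconds` while the function runs."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            samples = []
            done = threading.Event()

            def poll():
                # Record one sample right away, then one per interval until done.
                while True:
                    samples.append({"time": time.time(), "value": sample()})
                    if done.wait(seconds):
                        break

            thread = threading.Thread(target=poll, daemon=True)
            thread.start()
            try:
                result = func(*args, **kwargs)
            finally:
                # Stop the sampler even if the decorated function raises.
                done.set()
                thread.join()
            wrapper.samples = samples  # timepoints from the most recent call
            return result
        return wrapper
    return decorator


@monitor(seconds=0.05)
def busy_work(n):
    return sum(i * i for i in range(n))
```

After calling `busy_work(100000)`, the list in `busy_work.samples` plays the role of the timepoints that watchme would write to a result.json and commit.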

1. Build the Container

First, build the Singularity container with Python dependencies installed:

sudo singularity build watchme-sklearn.sif Singularity

2. Run

Next, running the container creates a watcher called "watchme-sklearn", which by default goes into your $HOME/.watchme folder. You'll see the watcher generated, followed by the function runs.

singularity run watchme-sklearn.sif

Adding watcher /home/vanessa/.watchme/watchme-sklearn...
Generating watcher config /home/vanessa/.watchme/watchme-sklearn/watchme.cfg

=============================================================================
Manifold learning on handwritten digits: Locally Linear Embedding, Isomap...
=============================================================================

An illustration of various embeddings on the digits dataset.

The RandomTreesEmbedding, from the :mod:`sklearn.ensemble` module, is not
technically a manifold embedding method, as it learn a high-dimensional
representation on which we apply a dimensionality reduction method.
However, it is often useful to cast a dataset into a representation in
which the classes are linearly-separable.

t-SNE will be initialized with the embedding that is generated by PCA in
this example, which is not the default setting. It ensures global stability
of the embedding, i.e., the embedding does not depend on random
initialization.

Linear Discriminant Analysis, from the :mod:`sklearn.discriminant_analysis`
module, and Neighborhood Components Analysis, from the :mod:`sklearn.neighbors`
module, are supervised dimensionality reduction method, i.e. they make use of
the provided labels, contrary to other methods.

Computing random projection
Computing PCA projection
Computing Linear Discriminant Analysis projection
Computing Isomap projection
Done.
Computing LLE embedding
Done. Reconstruction error: 1.63546e-06
Computing modified LLE embedding
Done. Reconstruction error: 0.360659
Computing Hessian LLE embedding
Done. Reconstruction error: 0.212804
Computing LTSA embedding
Done. Reconstruction error: 0.212804
Computing MDS embedding
Done. Stress: 157308701.864713
Computing Spectral embedding
Computing t-SNE embedding

The functions run fairly quickly, so we measure every quarter of a second. Watchme creates the git repo and commits data to it: each time a decorated function is run, a decorator-psutils-<name> folder is created with a result.json. Every commit coincides with a list of timepoints for a single function run. Here is what the repository looks like after the run (before adding the README.md and container files):

$ tree
.
├── decorator-psutils-hessian_lle_embedding
│   ├── result.json
│   └── TIMESTAMP
├── decorator-psutils-isomap_projection
│   ├── result.json
│   └── TIMESTAMP
├── decorator-psutils-lda_projection
│   ├── result.json
│   └── TIMESTAMP
├── decorator-psutils-lle_embedding
│   ├── result.json
│   └── TIMESTAMP
├── decorator-psutils-ltsa_embedding
│   ├── result.json
│   └── TIMESTAMP
├── decorator-psutils-mds_embedding
│   ├── result.json
│   └── TIMESTAMP
├── decorator-psutils-modified_lle_embedding
│   ├── result.json
│   └── TIMESTAMP
├── decorator-psutils-pca_projection
│   ├── result.json
│   └── TIMESTAMP
├── decorator-psutils-plot_digits
│   ├── result.json
│   └── TIMESTAMP
├── decorator-psutils-plot_embedding
│   ├── result.json
│   └── TIMESTAMP
├── decorator-psutils-random_2d_projection
│   ├── result.json
│   └── TIMESTAMP
├── decorator-psutils-spectral_embedding
│   ├── result.json
│   └── TIMESTAMP
├── decorator-psutils-tsne_embedding
│   ├── result.json
│   └── TIMESTAMP
└── watchme.cfg

13 directories, 27 files

You can then push directly to a new GitHub repository:

cd $HOME/.watchme/watchme-sklearn
git remote add origin https://github.com/vsoch/watchme-sklearn.git
git push -u origin master

(Add a README to better document what you've done.)
Or you can export the full data for any particular decorator to analyze:

watchme export watchme-sklearn decorator-psutils-plot_digits result.json --json

As noted above, each commit coincides with a list of timepoints for a single function run. Here is a programmatic way to export all results to a "data" folder in the repository:

mkdir -p data
# Run from the watcher folder: export each decorator's result.json to data/<folder>.json
for folder in decorator*/; do
    folder="${folder%/}"   # strip the trailing slash left by the glob
    watchme export watchme-sklearn "$folder" --out "data/$folder.json" result.json --json --force
done
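
Once the exports exist, a short Python sketch like the following could give a quick overview of what was collected. It assumes each exported file is a JSON object with parallel "commits", "dates" and "content" lists; that layout is an assumption on my part, so check it against one of your own files first. The `summarize_exports` helper is hypothetical and not part of watchme.

```python
import json
from pathlib import Path


def summarize_exports(data_dir="data"):
    """Map each exported file name (without .json) to the number of commits it records."""
    summary = {}
    for path in sorted(Path(data_dir).glob("decorator*.json")):
        with path.open() as handle:
            export = json.load(handle)
        # Assumed layout: parallel lists under "commits", "dates" and "content".
        summary[path.stem] = len(export.get("commits", []))
    return summary
```

Calling `summarize_exports()` from the watcher folder returns a dictionary keyed by names such as decorator-psutils-plot_digits, which is a convenient starting point for loading the per-function data into an analysis.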

Advanced

If you already have a watchme repository, and it's located somewhere non-traditional, you can have watchme generate results in the folder where you happen to be by exporting WATCHME_BASE_DIR first. The base directory is the parent of the watcher folder, which is why the export uses dirname:

export WATCHME_BASE_DIR=$(dirname $PWD)

For a run from within a Singularity container, you would need to export the variable with the SINGULARITYENV_ prefix:

export SINGULARITYENV_WATCHME_BASE_DIR=$(dirname $PWD)
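
The lookup described above can be sketched in Python as follows. This only illustrates the behavior described in this README (a default of $HOME/.watchme, overridden by WATCHME_BASE_DIR, with SINGULARITYENV_ variables appearing unprefixed inside the container). Both helper names are hypothetical, and the code is not taken from watchme or Singularity.

```python
import os


def resolve_watcher_dir(name, env=None):
    """Return the folder a watcher called `name` would use (illustrative only)."""
    env = os.environ if env is None else env
    home = env.get("HOME", os.path.expanduser("~"))
    # WATCHME_BASE_DIR, when set, replaces the default $HOME/.watchme base.
    base = env.get("WATCHME_BASE_DIR", os.path.join(home, ".watchme"))
    return os.path.join(base, name)


def container_env(host_env, prefix="SINGULARITYENV_"):
    """Mimic how prefixed host variables show up, unprefixed, inside the container."""
    return {
        key[len(prefix):]: value
        for key, value in host_env.items()
        if key.startswith(prefix)
    }
```

For example, passing `{"SINGULARITYENV_WATCHME_BASE_DIR": "/data"}` through `container_env` yields `{"WATCHME_BASE_DIR": "/data"}`, and `resolve_watcher_dir` then places the watcher under /data instead of the home directory.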

