
rsokl / noggin


A simple tool for logging and plotting measurements during machine learning experiments

Home Page: https://noggin.readthedocs.io/en/latest

License: MIT License

Python 100.00%
data-visualization livedata machine-learning matplotlib neural-network python real-time

noggin's Introduction

Hello! I do work in the areas of machine learning, physics, and software development. I also work on improving methods for testing scientific/research software, and am a maintainer for the Hypothesis testing library. I am passionate about education, and created the CogWorks course at the MIT Beaver Works Summer Institute as well as the website Python Like You Mean It.

Libraries for accelerating and improving ML research

  • hydra-zen: Making Hydra more Pythonic and easier to use at scale for ML workflows and experiments
  • responsible-ai-toolbox: PyTorch-centric library for evaluating and enhancing the robustness of AI technologies.

Other open source projects

  • MyGrad: Drop-in automatic differentiation for NumPy
  • noggin: A simple tool for logging and plotting metrics in real time
  • custom_inherit: inheriting and merging docstrings in customizable ways (my first ever open source project!)

Tutorials

noggin's People

Contributors

davidmascharka, dependabot[bot], rsokl

Forkers

lgtm-migrator

noggin's Issues

recreate_plot should take a figsize argument

It would be lovely if recreate_plot accepted a figsize argument, so that I don't end up with a minuscule plot. When I work outside of interactive mode (e.g. when I work in emacs), I'd like to be able to construct the plot at the size I want via an interface like:

plotter, fig, ax = recreate_plot(train_metrics=train, test_metrics=test, figsize=(8, 12))

rather than:

plotter, fig, ax = recreate_plot(train_metrics=train, test_metrics=test)
fig.set_size_inches(8, 12)
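
Until such an argument exists, the two-step workaround can be wrapped in a small helper. This is only a minimal sketch: `recreate_with_figsize` is a hypothetical name, not part of noggin, and it assumes nothing more than that the callable it is given returns a `(plotter, fig, ax)` tuple whose `fig` exposes matplotlib's `set_size_inches` method.

```python
def recreate_with_figsize(recreate, figsize, **kwargs):
    """Call `recreate(**kwargs)` and resize the returned figure.

    `recreate` is any callable returning (plotter, fig, ax); this helper
    is a hypothetical wrapper, not a function provided by noggin.
    """
    plotter, fig, ax = recreate(**kwargs)
    width, height = figsize
    fig.set_size_inches(width, height)  # same call as the manual workaround
    return plotter, fig, ax
```

With this in place, `recreate_with_figsize(recreate_plot, (8, 12), train_metrics=train, test_metrics=test)` would stand in for the two lines above.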

Plotting in server mode

Add the ability to serve logged data to a plotter. This would let people manage a live plot from a separate notebook, or from several notebooks at once.

This is an ambitious enhancement that has the potential for a large payoff. I would like to carefully consider the best means for serving/listening to data in a simple but robust way. I'd like to get input from others about how to move forward with this (@davidmascharka, @ptran516, @arjunmajum)
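
To make the discussion concrete, here is one possible transport, sketched with the standard library's `multiprocessing.connection` module. It is not a proposal for noggin's design: the function names are hypothetical, the data is a plain list of metric dictionaries, there is no authentication, and only a single client is handled.

```python
import threading
from multiprocessing.connection import Client, Listener


def serve_metrics(listener, batches):
    """Send each metrics dict to the first client that connects."""
    with listener.accept() as conn:
        for batch in batches:
            conn.send(batch)  # objects are pickled by the connection
    # Leaving the `with` block closes the connection; the client sees EOF.


def receive_metrics(address):
    """Collect metrics dicts from the server until it closes the connection."""
    received = []
    with Client(address) as conn:
        while True:
            try:
                received.append(conn.recv())
            except EOFError:
                break
    return received


def demo():
    """Run the server in a thread and the listener in the main thread."""
    batches = [{"loss": 0.9}, {"loss": 0.5}]
    # Port 0 asks the operating system for any free port.
    with Listener(("localhost", 0)) as listener:
        server = threading.Thread(target=serve_metrics, args=(listener, batches))
        server.start()
        received = receive_metrics(listener.address)
        server.join()
    return received
```

In a real setup the receiving side would live in a different notebook or process and would hand each received dictionary to a plotter instead of appending it to a list.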

Add support for alternate plotting backends

Abstract away the specific plotting backend (i.e. matplotlib) from LivePlot. Thus the current version of LivePlot would become MatplotlibLivePlot, and would retain the matplotlib-specific functionality. Otherwise LivePlot will serve as an abstract base class that handles all of the metric logging, saving, refresh logic, etc.

Ultimately, it would be nice to support bokeh and toyplot as backends.
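
The split described above could look roughly like the following sketch. All names here (`LivePlotBase`, `log_batch`, `_draw`, `TextLivePlot`) are hypothetical and the logging is heavily simplified; the point is only that the base class owns the metric bookkeeping while each backend implements a single drawing hook.

```python
from abc import ABC, abstractmethod
from collections import defaultdict


class LivePlotBase(ABC):
    """Backend-agnostic metric logging (hypothetical, not noggin's API)."""

    def __init__(self, metrics):
        self.metrics = tuple(metrics)
        self.batch_data = defaultdict(list)

    def log_batch(self, batch_metrics):
        """Record one value per metric; unknown metric names are rejected."""
        for name, value in batch_metrics.items():
            if name not in self.metrics:
                raise KeyError(f"unknown metric: {name!r}")
            self.batch_data[name].append(float(value))

    def plot(self):
        """Ask the backend to draw every registered metric."""
        for name in self.metrics:
            self._draw(name, self.batch_data[name])

    @abstractmethod
    def _draw(self, name, values):
        """Backend-specific rendering of one metric's values."""


class TextLivePlot(LivePlotBase):
    """Toy backend that 'draws' by recording a summary line per metric."""

    def __init__(self, metrics):
        super().__init__(metrics)
        self.lines = []

    def _draw(self, name, values):
        last = values[-1] if values else None
        self.lines.append(f"{name}: {len(values)} points, last={last}")
```

A matplotlib, bokeh, or toyplot backend would subclass the base in the same way and replace the body of `_draw` with calls into its own plotting library.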

x-axis values

Iteration number can be pretty unwieldy. It would be nice to have an option to label the x-axis by iteration number, epoch, etc.
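
As a rough illustration of the epoch option, iteration counts can be rescaled into fractional epochs before they are used as x-values. The helper name and the `batches_per_epoch` parameter are assumptions made for this sketch; it also assumes every epoch contains the same number of batches.

```python
def iterations_to_epochs(iterations, batches_per_epoch):
    """Convert iteration counts to (fractional) epoch numbers.

    Hypothetical helper: assumes a fixed number of batches per epoch.
    """
    if batches_per_epoch <= 0:
        raise ValueError("batches_per_epoch must be positive")
    return [i / batches_per_epoch for i in iterations]
```

For example, with 100 batches per epoch, iteration 150 would be labeled as epoch 1.5.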

Is noggin coming to conda?

I like to use conda rather than pip to keep all of my packages in one place. Will noggin be coming to conda via conda install at any point?

Make metrics saveable/loadable as xarrays

Live metrics are already handled as ordered dictionaries of numpy arrays; this is nearly exactly the data format needed to form an xarray of the metrics.

This would permit users to seamlessly access their data as N-dimensional arrays with labeled axes.
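
A minimal sketch of the conversion, under a simplifying assumption: the metrics are a flat mapping from metric name to a 1D array, all of the same length. The function name and the `iterations` dimension name are hypothetical; noggin's real metric dictionaries may be nested differently.

```python
from collections import OrderedDict

import numpy as np
import xarray as xr


def metrics_to_dataset(metrics):
    """Build an xarray.Dataset with one variable per metric.

    Assumes `metrics` maps each name to a 1D array of equal length
    (an assumption of this sketch, not a statement about noggin).
    """
    data_vars = {
        name: ("iterations", np.asarray(values))
        for name, values in metrics.items()
    }
    n = len(next(iter(metrics.values())))
    # Label the shared axis with 1-based iteration numbers.
    return xr.Dataset(data_vars, coords={"iterations": np.arange(1, n + 1)})


example = OrderedDict(
    loss=np.array([1.0, 0.5, 0.25]),
    accuracy=np.array([0.2, 0.6, 0.8]),
)
dataset = metrics_to_dataset(example)
```

Labeled selection such as `dataset["loss"].sel(iterations=2)` then works without tracking array positions by hand.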

Create gif of liveplot in action

The README needs a brief gif that shows liveplot in action. It should show at least two metrics (e.g. loss and accuracy) being plotted with both batch and epoch-level statistics.

fix indentation

    # record training epoch
    if i % 10 == 0 and i > 0:
        plotter.plot_train_epoch()

        # cue test-evaluation of model
        for x in np.linspace(0, 10, 5):
            x += (np.random.rand(1) - 0.5) * 5
            test_metrics = {"accuracy": x**2}
            plotter.set_test_batch(test_metrics, batch_size=1)
        plotter.plot_test_epoch()
plotter.plot()  # ensures final data gets plotted

Limit data rate for plotting

Currently liveplot will plot all available data regardless of how much data that is. This can lead to large computational costs, making plotting a bottleneck.

We should establish a heuristic for limiting the amount of data being plotted. Ideally this would involve estimating the computational cost of each "draw" during live plotting, and how this scales with the amount of data available.

We would also want to estimate the maximum visually-resolvable density of data. That is, if I am drawing 10,000 points on a typically-sized plot, does drawing every 10th point look just the same as drawing every point?

With these two pieces of analysis, we should be able to arrive at a sensible default for limiting the number of points that we draw in a given call. We could potentially plot sliding-window averages to coarsen the plot.
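
Two simple ways to coarsen the data are sketched below: keeping every k-th point, and averaging over windows. Both helpers are hypothetical, the `max_points` budget is a placeholder rather than a measured value, and the averaging uses non-overlapping windows, which is a cheaper relative of a true sliding-window average.

```python
import math


def decimate(values, max_points):
    """Keep every k-th value so that at most `max_points` values remain."""
    if max_points <= 0:
        raise ValueError("max_points must be positive")
    step = max(1, math.ceil(len(values) / max_points))
    return values[::step]


def window_means(values, window):
    """Average consecutive, non-overlapping windows of `window` values.

    The final window may be shorter than the others.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    means = []
    for start in range(0, len(values), window):
        chunk = values[start:start + window]
        means.append(sum(chunk) / len(chunk))
    return means
```

For the 10,000-point case mentioned above, `decimate(points, 1000)` keeps every 10th point. Decimation preserves individual samples but can miss spikes, whereas window averages smooth the curve.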
