Coder Social home page Coder Social logo

stark's Introduction

Stark: Distributed Inference with Stan & Spark

The idea of stark is to adapt a powerful single-node modeling tool (Stan) for running on a distributed platform (Spark). The naive way of doing this is to copy the data to each worker, and each worker produces poster samples. This emberassingly parallel model could be useful, but running in a distributed setting facilitates other possibilities, such as running on much larger datasets.

One strategy for handling more data is to run the same model where each one is provided a subset of data. Each model will then generate samples from a subposterior. Combining subposteriors is not trivial, and seems to be a current area of research. See the reading section for more resources.

Other possibilties could be running the same model with different datasets or hyperparameters (e.g. for a sensitivity analysis). If MAP estimates were found instead of full posteriors then one might have something akin to a distributed bootstrap.

This project is experimental-quality

Modes

The current status is that only the naive mode and simple weighted averaging are implemented.

Example

Currently there is just a single example, built on the 8 schools dataset. Spark is not useful for such a small dataset, it is used just for illustration.

From the stark/example directory, try:

spark-submit stark_ex.py

(I've only tried this with Spark 1.6.2)

Reading

Related projects

stark's People

Contributors

strongh avatar

Watchers

Alexander VR avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.