PyOrbital

Distribute private resources, such as machine learning models, through AWS.

Motivation

Sputnik is a great library that manages data packages for another library, e.g. trained models for a machine learning library. However, Sputnik assumes packages will be hosted behind a webserver, which creates a fair bit of scaffolding work. We would like data packages to live on Amazon S3 instead.

Installation

pip install sputnik-orbital

Usage

Please refer to Sputnik's README for full details on how to structure a package so it can be managed by Sputnik. Essentially, the process is:

  1. Create a resource (on machine A)
  2. Publish resource (on machine A)
  3. Install resource (on machine B)

A full example can be found in orbital/test/test_orbital.py.

Creation

Write your data resource as follows:

.
└── sputnik_sample
    ├── data
    │   └── model.pkl
    └── package.json

Here, model.pkl is the model that we want to distribute, and package.json is a manifest containing metadata about the model, e.g.

{
    "name": "orbital_test_model",
    "description": "This is a demo model, but it is still awesome.",
    "include": [["data", "*"]],
    "version": "2.0.0",
    "license": "Proprietary",
    "compatibility": {
        "my_library": ">=1.1.1"
    }
}
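The layout and manifest above can also be generated from Python. The following is a minimal sketch using only the standard library; the helper name write_sample_package is hypothetical and not part of Orbital or Sputnik, and the manifest fields are copied from the example above.

```python
import json
import pickle
from pathlib import Path


def write_sample_package(root, model):
    """Create <root>/sputnik_sample with data/model.pkl and package.json.

    Hypothetical helper for illustration; not part of Orbital or Sputnik.
    """
    package_dir = Path(root) / "sputnik_sample"
    data_dir = package_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    # Serialise the model into the data directory.
    with open(data_dir / "model.pkl", "wb") as f:
        pickle.dump(model, f)

    # Manifest fields copied from the example manifest.
    manifest = {
        "name": "orbital_test_model",
        "description": "This is a demo model, but it is still awesome.",
        "include": [["data", "*"]],
        "version": "2.0.0",
        "license": "Proprietary",
        "compatibility": {"my_library": ">=1.1.1"},
    }
    with open(package_dir / "package.json", "w") as f:
        json.dump(manifest, f, indent=4)

    return package_dir
```

Calling write_sample_package(".", model) would produce the directory tree shown earlier, ready to be passed to the build step.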

Then build the package for distribution:

from orbital import sputnik

package = sputnik.build("sputnik_sample")

Note we do not import Sputnik directly, but through Orbital. This applies the patches needed to use S3 as the storage layer.

Publishing

from orbital import sputnik

sputnik.upload("myapp", "1.0.0", package.path)

This uploads the package to an S3 bucket. The bucket can be public or private.

Installation

from orbital import sputnik

sputnik.install("my_library", "1.1.3", "orbital_test_model==2.0.0")

This downloads and unpacks the required model version into a local directory.

Use installed model

from orbital import sputnik

package = sputnik.package("my_library", "1.1.3", "orbital_test_model==2.0.0")
path_to_load = package.file_path(model_file_name)  # model_file_name: location of the model file inside the package

Then load the model as usual, e.g. with pickle.
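As one possible way to do the loading step with pickle, the sketch below assumes the model was saved with pickle.dump; load_model is a hypothetical helper name, not an Orbital or Sputnik function.

```python
import pickle


def load_model(path_to_load):
    """Unpickle and return the object stored at path_to_load."""
    # Only unpickle files from sources you trust: pickle can run
    # arbitrary code while loading.
    with open(path_to_load, "rb") as f:
        return pickle.load(f)
```

With the path obtained from the installed package, this would be called as model = load_model(path_to_load).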

S3 setup

Orbital does not create the S3 bucket where resources will be stored. You have to do that manually. The name of the bucket has to be provided as an environment variable to the upload script, e.g.

BUCKET="my_private_s3_bucket" python upload_all_models.py

To upload to a private bucket, you also need to create an AWS IAM user and give that user read/write access to the bucket. Provide the user's credentials to your script, as described in the boto tutorial. The easiest way is to specify the credentials as environment variables, e.g.

AWS_ACCESS_KEY_ID=AAAA AWS_SECRET_ACCESS_KEY=BBB BUCKET="my_private_s3_bucket" python upload_all_models.py

Alternatively, put the credentials in ~/.aws/credentials or ~/.boto.
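An upload script may want to fail early with a clear message when the bucket name is missing. The following is a minimal sketch of such a check; read_upload_settings is a hypothetical helper, and how Orbital itself reads these variables is not shown here.

```python
import os


def read_upload_settings(environ=None):
    """Return (bucket_name, has_env_credentials) from the environment.

    Hypothetical pre-flight check for an upload script; not part of Orbital.
    """
    environ = os.environ if environ is None else environ

    bucket = environ.get("BUCKET")
    if not bucket:
        raise RuntimeError(
            "Set the BUCKET environment variable to the target S3 bucket name"
        )

    # Credentials may instead come from ~/.aws/credentials or ~/.boto,
    # so missing keys are reported rather than treated as an error.
    has_env_credentials = bool(
        environ.get("AWS_ACCESS_KEY_ID") and environ.get("AWS_SECRET_ACCESS_KEY")
    )
    return bucket, has_env_credentials
```

Accepting the environment as an argument keeps the check easy to exercise without touching the real process environment.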

Running tests

PYTHONPATH=. py.test

Contributors

mbatchkarov
