shovel's People

Contributors

calvingiles

shovel's Issues

Idea: model shovel more closely on git

Git LFS has some nice properties, but doesn't really map well to large datasets used for analysis. A git model of checking in all resources is good for reproducibility, but it is nice to separate the data from the code.

A proposed future direction for shovel is to support `shovel <git command>`, where shovel intercepts some commands and swaps out parts of their behaviour. This can likely be done using git hooks, so it may be possible to initialise those hooks and then use git directly.

One benefit of the shovel model over LFS is that it lets you version datasets separately from a git repo and share them across multiple repos. For that to work, the git hooks would need to inspect the state of the filesystem and manage shovel's dig and bury steps themselves.

These thoughts are very undeveloped.
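
To make the hook idea slightly more concrete, here is a minimal sketch of what "init those hooks" could look like. Everything in it is an assumption rather than part of shovel: the `install_hook` helper is hypothetical, `post-checkout` is just one standard git hook that might be a natural place to run dig, and the hook body is a placeholder that only prints a message instead of calling any real shovel command.

```python
import stat
from pathlib import Path

# Placeholder hook body. It does not call shovel; a real hook would
# inspect the filesystem and run the dig/bury steps discussed above.
HOOK_BODY = """#!/bin/sh
# Hypothetical shovel hook (illustration only).
echo "shovel hook: would inspect datasets and run dig here"
"""


def install_hook(repo_root, hook_name="post-checkout"):
    """Write an executable hook script into <repo_root>/.git/hooks.

    Hypothetical helper; returns the path of the hook that was written.
    """
    hooks_dir = Path(repo_root) / ".git" / "hooks"
    hooks_dir.mkdir(parents=True, exist_ok=True)
    hook_path = hooks_dir / hook_name
    hook_path.write_text(HOOK_BODY)
    # git only runs hooks that are executable.
    hook_path.chmod(hook_path.stat().st_mode | stat.S_IXUSR)
    return hook_path
```

Once such hooks are installed, plain `git checkout` would trigger them, which matches the idea of using git directly rather than wrapping every command.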

Project Name Collision - Requirement already satisfied

There is already a Python project called shovel (currently >600 stars on GitHub):

https://github.com/seomoz/shovel

This causes a problem for pip install and module imports, because pip sees the name shovel as already installed:

pip install git+https://github.com/lyst/shovel.git#egg=shovel
Requirement already satisfied: shovel from git+https://github.com/lyst/shovel.git#egg=shovel
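
One way to confirm which distribution pip believes is installed under that name is to query the installed package metadata. This is a small diagnostic sketch using the standard library's `importlib.metadata` (Python 3.8+); the `installed_distribution` helper name is made up for illustration.

```python
from importlib import metadata


def installed_distribution(name):
    """Return (name, version) of the installed distribution, or None.

    Hypothetical helper for checking what currently occupies a
    distribution name such as "shovel" in the active environment.
    """
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None
    return dist.metadata["Name"], dist.version
```

Calling `installed_distribution("shovel")` in the affected environment would show whether another shovel distribution is what pip is treating as "already satisfied".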

Potential Solutions

PEP 423 covers naming conventions for Python packages: https://www.python.org/dev/peps/pep-0423/

  • Rename the package and project from shovel to lyst.shovel
  • Come up with another name that doesn't already correspond to a package in PyPI

Add ignore_exists argument to bury

It is common to want to re-run a notebook that buries data. It should be possible to leave the bury command in place and have it not raise an error if the data has already been buried.

Read dataset params from a `.shovel` file in the dataset root

To work better with git status, versioning, etc., the actual version of a dataset should live in a `.shovel` file in the dataset's root directory. This allows peek to check the version that the file system says the dataset should be at, and bury to update the file in a way that shows up in git status.

This is in response to #20
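
As an illustration of the `.shovel` file idea, the sketch below reads and updates dataset parameters stored in the dataset root. The file name comes from the issue title; the JSON layout, the `version` key, and the `read_params` and `write_version` helpers are assumptions made for the example, not an existing shovel format.

```python
import json
from pathlib import Path

SHOVEL_FILE = ".shovel"  # name from the issue; the JSON layout is assumed


def read_params(dataset_root):
    """Load dataset params from <dataset_root>/.shovel (what peek might read)."""
    path = Path(dataset_root) / SHOVEL_FILE
    return json.loads(path.read_text())


def write_version(dataset_root, version):
    """Record a new version in the .shovel file (what bury might do).

    Other keys in the file are preserved. Output is sorted and indented
    so that changes produce small, readable diffs in git.
    """
    path = Path(dataset_root) / SHOVEL_FILE
    params = json.loads(path.read_text()) if path.exists() else {}
    params["version"] = version
    path.write_text(json.dumps(params, indent=2, sort_keys=True) + "\n")
    return params
```

Because the file is plain text committed alongside the code, a version bump made by bury appears as a modified file in `git status`.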
