
System-wide local storage (quilt issue, closed)

quiltdata commented on May 14, 2024
System-wide local storage

from quilt.

Comments (18)

dimaryaz commented on May 14, 2024

Hi @ellisonbg,

XDG_DATA_HOME is used by lots of other programs, not just Quilt, so changing it could break things. Also, that directory is meant to be private, so sharing it between different users is a bad idea. (E.g., Quilt stores login credentials there, not just packages.)

Moving just the quilt_packages directory would be possible with simple code changes - though even that may be a bad idea: 1) malicious users could easily mess with others' data; 2) if two users started downloading the same package simultaneously, they'd be writing to the same files and likely corrupt the data.

The only "safe" way to do this would be a global directory that's readable by everyone, but only writeable by an admin who pre-installs a few useful packages there. Quilt almost supports reading packages from multiple directories; we've discussed using an environment variable like QUILT_PACKAGE_DIRS, but haven't actually implemented it yet. This would be the place to do it: https://github.com/quiltdata/quilt/blob/master/compiler/quilt/tools/store.py#L85
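To make the QUILT_PACKAGE_DIRS idea concrete, here is a minimal sketch of how such a variable could be turned into an ordered search list. This is not Quilt's code: the function name shared_package_dirs is hypothetical, and the PATH-style separator (os.pathsep) is an assumption, since the thread does not say how multiple directories would be delimited.

```python
import os


def shared_package_dirs(environ=None):
    """Return the directories named in QUILT_PACKAGE_DIRS, in listed order.

    Hypothetical helper for illustration; not part of Quilt.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("QUILT_PACKAGE_DIRS", "")
    # Assumption: entries are separated like PATH (os.pathsep, ':' on Linux).
    # Empty entries (e.g. from a trailing separator) are skipped.
    return [os.path.expanduser(entry) for entry in raw.split(os.pathsep) if entry]
```

An unset or empty variable yields an empty list, so a client without any shared store configured would behave exactly as before.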

ellisonbg commented on May 14, 2024

asah commented on May 14, 2024

hi Brian--

We're on it, and excited to add this to Quilt right away; this is an obvious and killer feature. In the meantime, can I ask a few "requirements"-type questions?

  1. what are the goal(s) of sharing the storage? to save storage space? pre-install packages for users? something else?

  2. are all of your students sharing a single Linux server/instance? if so, how are they isolated from one another (e.g. Linux user accounts)? if not (separate instances), how do you plan to share the filesystem (e.g. NFS mounts)? Can students run code (and install data) on their own computers/laptops? If so, is it OK that they have a copy of the data locally (e.g. disconnected operation)?

  3. if you're trying to save space, what are the numbers? # of files/dataframes? typical size of each?
    how many students? how much storage space do you have? etc.

  4. do you need to run your own registry? or can you have students use the normal public Quilt registry (which is free)? https://quiltdata.com/search/?q= If you're running your own registry instance, can we assume there's lots of bandwidth between the students' instance(s) and the registry (e.g. same AWS 'region')?

thanks!
adam

ellisonbg commented on May 14, 2024

asah commented on May 14, 2024

In that case, I think we have good news! A preview:
#286

Obviously, there will be docs and examples showing how to set this up and manage it... but basically, an admin can designate one or more shared directories, which clients access (read-only, via import) by setting an environment variable (QUILT_PACKAGE_DIRS). If a package isn't available in the user's local directory, Quilt checks the shared directories. A shared directory is simply the "local" quilt_packages directory of the admin account on the same (network) file system.
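The lookup order described above (the user's own directory first, then each shared directory) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from #286: the function find_package_dir is hypothetical, and the on-disk layout <store>/<owner>/<name> is assumed purely for the example; Quilt's real store layout may differ.

```python
import os


def find_package_dir(owner, name, local_dir, shared_dirs):
    """Return the first store directory containing owner/name, or None.

    Hypothetical helper: assumes each store keeps a package under
    <store>/<owner>/<name>, which may not match Quilt's actual layout.
    """
    # The user's own store takes priority; shared stores are fallbacks,
    # checked in the order the admin listed them.
    for base in [local_dir] + list(shared_dirs):
        candidate = os.path.join(base, owner, name)
        if os.path.isdir(candidate):
            return candidate
    return None
```

Because the function only reads from the shared directories, they can stay writable by the admin account alone, which is the "safe" arrangement suggested earlier in the thread.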

We have a few other features/changes queued for master & release (to pypi aka pip install) over the next few days, but if you're feeling adventurous, you're welcome to try this right now. We'd love the feedback.

ellisonbg commented on May 14, 2024

akarve commented on May 14, 2024

@ellisonbg We've moved the work to a new PR, #286

akarve commented on May 14, 2024

Resolved in #286 and added to the docs here. Let us know if you run into any hiccups.

ellisonbg commented on May 14, 2024

akarve commented on May 14, 2024

@ellisonbg: @dimaryaz will get back to you with which hash to use on master; these changes aren't on pip yet

akarve commented on May 14, 2024

Keeping issue open until release hits PyPI (better experience for students who install Quilt).

akarve commented on May 14, 2024

Installing Quilt from top of tree master will work for @ellisonbg and students:

[redacted :)]

akarve commented on May 14, 2024

Whoops. The above does not work with the new project structure. Will circle back with installation instructions.

ellisonbg commented on May 14, 2024

dimaryaz commented on May 14, 2024

@ellisonbg: here's the correct command to install from git master:

pip install --user 'git+https://github.com/quiltdata/quilt#subdirectory=compiler'

(The --user flag is there for installing without root; you can drop it if you're installing as root.)

ellisonbg commented on May 14, 2024

akarve commented on May 14, 2024

Docs now sync'd with gitbook at https://docs.quiltdata.com/shared-store.html

akarve commented on May 14, 2024

Merged and released to pip:
https://github.com/quiltdata/quilt/releases/tag/2.9.0
https://pypi.python.org/pypi/quilt

Now it's just pip install quilt for system-wide local storage.

