git-lfs-s3-proxy's Introduction

Git LFS S3 Proxy

This Cloudflare Pages site acts as a Git LFS server backed by any S3-compatible service.

  • Replacing GitHub's default LFS server with an R2 bucket behind this proxy makes LFS uploads and downloads free. GitHub charges $0.0875/GiB for bandwidth beyond 10 GiB/month across all repos and forks. Storage exceeding the free tier costs $0.015/GB-month on R2 instead of $0.07/GB-month on GitHub.
  • On most services, latency is low enough to serve entire websites directly from your LFS server. This also allows you to transparently overcome the 25 MiB Cloudflare Pages file size limit by automatically adding any files over this size to LFS.
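
To make the pricing comparison concrete, here is a rough sketch of the arithmetic using the rates quoted above, which may have changed since this was written. The function names are hypothetical helpers and are not part of the proxy.

```shell
# Hypothetical helpers; the rates are the ones quoted in this README.

# Monthly GitHub LFS bandwidth cost in USD for a given number of GiB
# transferred, assuming the first 10 GiB/month are free.
github_bandwidth_cost() {
    awk -v gib="$1" 'BEGIN {
        over = gib - 10
        if (over < 0) over = 0
        printf "%.2f\n", over * 0.0875
    }'
}

# Monthly cost in USD of storage beyond the free tier.
# $1 = GB over the free tier, $2 = rate in USD per GB-month
storage_overage_cost() {
    awk -v gb="$1" -v rate="$2" 'BEGIN { printf "%.2f\n", gb * rate }'
}

github_bandwidth_cost 50         # prints 3.50 (40 GiB over the free 10)
storage_overage_cost 100 0.07    # prints 7.00 (GitHub rate)
storage_overage_cost 100 0.015   # prints 1.50 (R2 rate)
```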

Usage

Create a bucket

First, create a bucket on an S3-compatible object store to host your LFS assets. In roughly increasing order of cost (as of 2023-08-05), your options include:

We recommend R2 for its generous free tier: your LFS repos can store up to 10 GB and, with unlimited bandwidth, write up to 1 million objects and read up to 10 million objects per month. If serving assets via LFS Client Worker, R2 has the additional benefit of being in the same datacenters as the worker.

Create an access key

Now create an access key with read/write permission to your bucket:

You should now have (to use S3's terminology) two values:

  • An access key ID (example: AKIAIOSFODNN7EXAMPLE)
  • A secret access key (example: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY)

If either value contains non-alphanumeric characters, you may need to URL-encode (percent-encode) it before putting it in the LFS server URL.
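
One way to do the encoding without extra tools is a small shell function. This is a minimal sketch, assuming ASCII-only keys; `urlencode` is a hypothetical helper name and is not something the proxy or Git LFS provides.

```shell
# Hypothetical helper: percent-encode every character except the RFC 3986
# "unreserved" set (letters, digits, '.', '_', '~', '-').
# Assumes ASCII input, which should hold for typical S3 access keys.
urlencode() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}        # everything after the first character
        c=${s%"$rest"}     # the first character
        case $c in
            [A-Za-z0-9._~-]) out=$out$c ;;
            *) out=$out$(printf '%%%02X' "'$c") ;;  # "'c" yields c's numeric code
        esac
        s=$rest
    done
    printf '%s\n' "$out"
}

urlencode 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY'
# prints wJalrXUtnFEMI%2FK7MDENG%2FbPxRfiCYEXAMPLEKEY
```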

Optional: Deploy your own instance of the proxy

A canonical instance of the proxy runs at git-lfs-s3-proxy.pages.dev, which can be used by any project. However, there are a few reasons you might want to run your own:

  • The canonical instance runs under the Cloudflare free tier, so it can "only" handle around 100,000 requests per day.
  • The proxy sees your endpoint URL, bucket name, and access key, so malicious instances could read or modify your LFS content.
  • To stop using the canonical instance, every commit in your repo must be rewritten to update its LFS server URL, or LFS objects referenced in old commits become inaccessible.
    • If you deploy your own instance, you could instead update it to redirect to a new LFS server.
    • If your instance uses your own domain name, you could point it at a self-hosted LFS server in this scenario.

The proxy is stateless, so you can switch instances just by changing your LFS server URL. If the underlying bucket remains the same, the old URL will continue to work.

To host your own instance of the proxy:

  • Fork this repo (milkey-mouse/git-lfs-s3-proxy) to your account.
  • Follow the Cloudflare Pages Get Started guide:
    • Sign up for Cloudflare if you haven't already.
    • Create a new Pages site
      • Add your GitHub account to Pages.
      • Grant access to your fork of milkey-mouse/git-lfs-s3-proxy.
      • Set up your Pages site: set Build command to npm install and leave all other settings on their defaults.
  • If you own a domain name (e.g. example.com), you can add a CNAME record to point a subdomain (e.g. git-lfs-s3-proxy.example.com) at your instance. If you don't own a domain, a pages.dev subdomain will work just as well, except you'll have to change your LFS server URL if you ever stop using the proxy.

Find your LFS server URL

We now have everything we need to build the server URL for Git LFS. The format for the URL is

https://<ACCESS_KEY_ID>:<SECRET_ACCESS_KEY>@<INSTANCE>/<ENDPOINT>/<BUCKET>

where <ACCESS_KEY_ID> and <SECRET_ACCESS_KEY> are the first and second values from Create an access key, <INSTANCE> is the hostname of the proxy instance (the canonical git-lfs-s3-proxy.pages.dev or your own), <ENDPOINT> is the S3-compatible API endpoint for your object store, and <BUCKET> is the name of the bucket from Create a bucket. For example, take a Cloudflare R2 bucket my-site with access key ID ed41437d53a69dfc and secret access key dc49cbe38583b850a7454c89d74fcd51, created by a Cloudflare user with account ID 7795d95f5507a0c89bd1ed3de8b57061. Using the canonical proxy instance git-lfs-s3-proxy.pages.dev, its LFS server URL would be

https://ed41437d53a69dfc:dc49cbe38583b850a7454c89d74fcd51@git-lfs-s3-proxy.pages.dev/7795d95f5507a0c89bd1ed3de8b57061.r2.cloudflarestorage.com/my-site
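
If you prefer to assemble the URL in a script, the format can be sketched as a single printf. `lfs_url` is a hypothetical helper name, and it assumes the access key ID and secret access key have already been URL-encoded where necessary.

```shell
# Hypothetical helper: assemble the LFS server URL from its five parts.
# $1 access key ID, $2 secret access key (both already URL-encoded),
# $3 proxy instance hostname, $4 S3 endpoint hostname, $5 bucket name
lfs_url() {
    printf 'https://%s:%s@%s/%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Rebuilds the example URL from the values given in this section.
lfs_url ed41437d53a69dfc dc49cbe38583b850a7454c89d74fcd51 \
    git-lfs-s3-proxy.pages.dev \
    7795d95f5507a0c89bd1ed3de8b57061.r2.cloudflarestorage.com \
    my-site
```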

Fetch existing LFS objects

If you were already using Git LFS, ensure you have a local copy of any existing LFS objects before you change servers:

git lfs fetch --all

Configure Git to use your LFS server

Git can be told about the new LFS server in two ways, with slightly different tradeoffs.

Public repo

If only certain people with copies of your repo are allowed to write to it, you should create another access key with only read permission for your bucket. Then, create another server URL using the read-only access key. Finally, add the server URL containing the read-only access key to an .lfsconfig file in the root of your repository:

cd "$(git rev-parse --show-toplevel)"  # move to root of repository
git config -f .lfsconfig lfs.url 'https://<RO_ACCESS_KEY_ID>:<RO_SECRET_ACCESS_KEY>@<INSTANCE>/<ENDPOINT>/<BUCKET>'
git add .lfsconfig
git commit -m "Add .lfsconfig"

To allow a clone of this repo to write to Git LFS, add the server URL containing the read/write access key to its .git/config:

git config lfs.url 'https://<RW_ACCESS_KEY_ID>:<RW_SECRET_ACCESS_KEY>@<INSTANCE>/<ENDPOINT>/<BUCKET>'

This config file is not checked into the repository, so the read/write access key remains private.
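
To confirm which URL each file holds, both can be read back with git config. This is a small sketch; `show_lfs_urls` is a hypothetical helper name, and it relies on Git LFS giving settings in Git's own config files precedence over .lfsconfig, so a clone with the read/write URL in .git/config uses that one.

```shell
# Hypothetical helper: print the LFS URL from each config location.
# Run from the repository root. Both URLs contain access keys, so avoid
# running this where the terminal output is logged or shared.
show_lfs_urls() {
    printf 'committed (.lfsconfig): %s\n' "$(git config -f .lfsconfig --get lfs.url)"
    printf 'private (.git/config):  %s\n' "$(git config --local --get lfs.url)"
}
```

Calling show_lfs_urls in a clone that has only the committed .lfsconfig leaves the second line empty, which means that clone is read-only.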

Private repo

If you're working with a private repository where everyone with a clone of the repo already has read/write access, you may want to skip generating another access key and manually adding the read/write key to each clone that needs it. (Even in this case, the public repo approach is marginally more secure, but the tradeoff may be worth it for convenience's sake.) To set the LFS server URL for everyone at once, granting anyone with a copy of the repo read/write access to the LFS server, put the LFS server URL containing the read/write access key in .lfsconfig:

cd "$(git rev-parse --show-toplevel)"  # move to root of repository
git config -f .lfsconfig lfs.url 'https://<RW_ACCESS_KEY_ID>:<RW_SECRET_ACCESS_KEY>@<INSTANCE>/<ENDPOINT>/<BUCKET>'
git add .lfsconfig
git commit -m "Add .lfsconfig"

Upload existing LFS objects

If you were already using Git LFS, ensure any existing LFS objects are uploaded to the new server:

git lfs push --all origin

GitLab only: Disable built-in LFS

GitLab "helpfully" rejects commits containing "missing" LFS objects. After configuring a non-GitLab LFS server, GitLab will consider all new LFS objects "missing" and reject new commits:

remote: GitLab: LFS objects are missing. Ensure LFS is properly set up or try a manual "git lfs push --all".
To gitlab.com:milkey-mouse/lfs-test
 ! [remote rejected] main -> main (pre-receive hook declined)
error: failed to push some refs to 'gitlab.com:milkey-mouse/lfs-test'

To disable this "feature", disable LFS on the GitLab repository. This can be done via the repository's GitLab page with Settings > General > Visibility, project features, permissions (click Expand) > Repository > Git Large File Storage (LFS) (disable, then click Save changes), or via the API:

curl --request PUT --header "PRIVATE-TOKEN: <your-token>" \
 --url "https://gitlab.com/api/v4/projects/<your-project-ID>" \
 --data "lfs_enabled=false"

Using Git LFS

After configuring your LFS server, you can set up and use Git LFS as usual.

Install Git LFS

If you haven't used Git LFS before, you may need to install it. Run the following command:

git lfs version

If your output includes git: 'lfs' is not a git command, then follow the Git LFS installation instructions.

Install smudge and clean filters

Even if the Git LFS binary was already installed, the smudge and clean filters Git LFS relies upon may not be. Ensure they are installed for your user account:

git lfs install

Start using LFS

You're now ready to start using Git LFS. For example:

  • To add any .iso files added in future commits to Git LFS, use git lfs track:

    git lfs track '*.iso'
    git add .gitattributes
    git commit -m "Add .iso files to Git LFS"
    
  • To add all existing .iso files to Git LFS (which rewrites history, so be careful), use git lfs migrate:

    git fetch --all
    git lfs migrate import --everything --include='*.iso'
    git push --all --force-with-lease
    
  • To add all existing files above 25 MiB to Git LFS (which rewrites history, so be careful), use git lfs migrate:

    git fetch --all
    git lfs migrate import --everything --above=25MiB
    git push --all --force-with-lease
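
Before rewriting history, it may help to preview which files would be affected. The sketch below lists files in the working tree larger than 25 MiB (26214400 bytes). `list_large_files` is a hypothetical helper name, and it only sees files currently checked out, so large files that exist only in older commits will not appear.

```shell
# Hypothetical helper: list working-tree files larger than 25 MiB.
# $1 = directory to scan (defaults to the current directory)
list_large_files() {
    # Skip Git's own object store, then print regular files whose size is
    # strictly greater than 25 * 1024 * 1024 = 26214400 bytes.
    find "${1:-.}" -path '*/.git' -prune -o -type f -size +26214400c -print
}

# Usage, from the repository root:
#   list_large_files .
```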
    

git-lfs-s3-proxy's People

Contributors

milkey-mouse, nbc66

git-lfs-s3-proxy's Issues

Add disabling locksverify to README

As seen in #4, even successful pushes/pulls often give this warning:

Remote "origin" does not support the Git LFS locking API. Consider disabling it with:
  $ git config lfs.<unknown>.locksverify false

We probably can't ever support the locking API (not without introducing another service—D1?—to store state), so we should mention disabling this in the README. When the page for #5 generates a list of commands to run including setting the URL it should also include one to disable locksverify.
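
A command along these lines could be what the README ends up suggesting. It is only a sketch: it assumes Git LFS falls back to the unscoped lfs.locksverify key when no lfs.<url>.locksverify entry matches the remote, and that it accepts this key from .lfsconfig.

```shell
# Assumption: the unscoped lfs.locksverify key is honored as a fallback and
# is among the keys Git LFS reads from .lfsconfig.
# Run from the repository root; this writes to ./.lfsconfig.
git config -f .lfsconfig lfs.locksverify false
```

Dropping `-f .lfsconfig` stores the setting in the clone's own .git/config instead, which avoids committing it for everyone.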

[FEATURE REQUEST] Add Storj support

Hello,

This project is really awesome, however the proxy doesn't seem to properly handle every S3-compatible storage. For instance, Storj doesn't seem to be properly supported even though it's possible to generate S3-like credentials.

The main difference I saw during my experiments with other providers is the lack of a user ID, which gives me an endpoint like this: https://access-key:<SECRET_ACCESS_KEY>@<INSTANCE>/gateway.storjshare.io/test-bucket

I'd be glad to help to implement it, so feel free to ask me for more details/tests :)

Remote "origin" does not support the Git LFS locking API

I followed your instructions but ran into a problem, and I would like to ask for your help. Do you know anything about this problem?

Remote "origin" does not support the Git LFS locking API. Consider disabling it with:
  $ git config lfs.<unknown>.locksverify false
Uploading LFS objects:   0% (0/1), 0 B | 0 B/s, done.
batch request: missing protocol: "<unknown>"
error: failed to push some refs to 'github.com:Wsine/test.git'

Authorization error

After following the steps provided in the README, I'm still unable to push. Initially, when it starts, there's a KB of progress, and then it stops. After several retries, it fails and throws an LFS: Authorization error. I'm using an empty Linode Object Storage.

LFS: Authorization error: {URL}
Check that you have proper access to the repository.

Include hash_algo in responses

See git-lfs/git-lfs@3f5fca5:

While SHA-256 is presently considered strong, it might not always be, so we should consider supporting other hash algorithms. We anticipate that, much like Git, exactly one hash algorithm will be in use at a time per repository. To make it easier for the client and server to negotiate a suitable algorithm, let's add a field to designate the hash algorithm in batch requests.

This allows the client to declare to the server the hash algorithm on first upload, and the server can respond with the hash algorithm in use (if it supports multiple) or a 409 response if it cannot handle that. We expect clients and servers to both gracefully handle the absence of this value and assume SHA-256 if not specified.

LFS URL generator

In hindsight the installation instructions are slightly complicated, especially if we include urlencoding the access key as we apparently need to do (see #4). We should create a gh-pages branch for this project which includes the README content as well as a simple LFS URL generator.
