Comments (9)

dobe commented on June 22, 2024

I don't know if you are aware of it, but Crate has built-in BLOB support. See https://crate.io/blog/using-crate-data-as-a-blobstore/ for an introduction and https://crate.io/docs/stable/blob.html for the documentation.

from crate.

martinheidegger commented on June 22, 2024

The internal BLOB support is good and useful. I am talking about an option to flip a switch and have Crate automatically store the blobs on S3 (or another file storage system).

dobe commented on June 22, 2024

Ah, so you mean that Crate should act as a kind of proxy to third-party storage. I think this use case is very rare, so we will probably not devote resources to it. We think that in most cases people will use Crate itself as the BLOB store because it requires no additional setup. Local disks are also cheaper in most cases.

If you really need S3, it would make sense to access it directly, since a Crate proxy would give you no benefit in this case and would always have at least the same latency as S3.

martinheidegger commented on June 22, 2024

At first glance I can see three possible benefits of storing blobs externally:

  1. You can use the same code whether you run against your local test environment or your production environment on S3.
  2. You only need to know the connection string to Crate to access all your data (blobs and text content).
  3. If offered flexibly, it could work as an import/export option (i.e. you could easily import blobs from S3 or export them to it).
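
To make the first point concrete, here is a minimal sketch of what "the same code for both environments" could look like from the application side. This is not Crate's API: `BlobStore`, `InMemoryBlobStore`, `DirectoryBlobStore` and `save_and_reload` are all hypothetical names, and the backends are local stand-ins (an S3-backed class would have to implement the same two methods). Blobs are keyed by their SHA-1 digest, which is how I understand Crate's blob tables to address them.

```python
import hashlib
from pathlib import Path
from typing import Protocol


class BlobStore(Protocol):
    """Hypothetical storage interface; not part of Crate."""

    def put(self, data: bytes) -> str: ...

    def get(self, digest: str) -> bytes: ...


class InMemoryBlobStore:
    """Test backend: keeps blobs in a dict keyed by SHA-1 digest."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha1(data).hexdigest()
        self._blobs[digest] = data
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]


class DirectoryBlobStore:
    """Disk backend: one file per blob, named after its SHA-1 digest."""

    def __init__(self, root: Path) -> None:
        self._root = root
        self._root.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes) -> str:
        digest = hashlib.sha1(data).hexdigest()
        (self._root / digest).write_bytes(data)
        return digest

    def get(self, digest: str) -> bytes:
        return (self._root / digest).read_bytes()


def save_and_reload(store: BlobStore, data: bytes) -> bytes:
    """Application code that does not care which backend it is given."""
    return store.get(store.put(data))
```

The application only ever calls `put` and `get`, so switching between a local store and a remote one becomes a configuration decision rather than a code change.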

martinheidegger commented on June 22, 2024

Amendment: S3 as a backup/seed option would also give me confidence that the system really is fail-safe :)

dobe commented on June 22, 2024

Yep, we have an S3 backup option for blobs, similar to "copy to", on our roadmap.

martinheidegger commented on June 22, 2024

But that is just a backup, right? I just remembered another reason for a blob storage strategy on S3:

  • You will never run out of disk space.

weswam commented on June 22, 2024

I am not a code contributor, but I wanted to briefly chime in. I honestly see zero benefit in having Crate store its blob data in S3. I have actually used Crate and its built-in blob store to move all of our in-house data off S3. Crate allowed us to build an in-house replacement for S3, and I believe that was one of the original intents of the Crate blob store: it lets you store your blobs locally in your own storage cluster.

The added latency from connecting to Crate and then having it connect to S3 is never going to be beneficial. You would be better served storing your metadata in Crate and connecting directly to S3 from your application than having Crate act as a proxy to an external data store such as S3.

It seems that your main concern is not having enough disk space for your data, and that is easily solved by building your own storage cluster. Doing so on VMs is most likely not cost-effective, but dedicated hardware is cheap and plentiful once your project reaches a scale where you need to be seriously concerned about it. Honestly, once you have that amount of data, local disks are far cheaper than S3.
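
The split suggested above (metadata in Crate, blob bytes read straight from S3) might look roughly like the sketch below. Everything here is a stand-in chosen for illustration: `BlobMetadata` and `DirectAccessClient` are hypothetical names, the metadata rows are passed in as a plain list instead of being queried from Crate, and the object store is a plain dict instead of a real S3 client.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlobMetadata:
    """One row of a hypothetical metadata table kept in the database."""

    name: str
    content_type: str
    object_key: str  # where the bytes live in the external object store


class DirectAccessClient:
    """Looks up metadata first, then reads the bytes straight from storage."""

    def __init__(self, metadata_rows, object_store):
        # metadata_rows stands in for the result of a database query;
        # object_store stands in for an S3 client. Both are plain Python
        # containers here so the sketch runs without any services.
        self._metadata = {row.name: row for row in metadata_rows}
        self._object_store = object_store

    def describe(self, name: str) -> BlobMetadata:
        """Metadata lookup only; never touches the blob bytes."""
        return self._metadata[name]

    def read(self, name: str) -> bytes:
        """Resolve the object key from metadata, then fetch the bytes directly."""
        row = self._metadata[name]
        return self._object_store[row.object_key]
```

The database is involved only in the metadata lookup; the blob bytes are fetched with a single request to the object store rather than passing through a proxy first.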

dobe commented on June 22, 2024

I think @weswam is right: using Crate as an S3 proxy is a very rare use case.
