
Where is the uploaded file? (tusd issue, closed)

tus commented on August 17, 2024
Where is the uploaded file?

from tusd.

Comments (10)

kvz commented on August 17, 2024

I would have to disagree with you, tus is designed to be deployed at massive scale. What if two users upload a file called x.jpg?

Should:

  • the first one be overwritten or the second one rejected?
  • we change the filename of the second - but now the filename is no longer a representation of the original one?
  • we create a directory per user? But what if one user uploads a file called screenshot.jpg twice, and the two are different images?
  • we create one dir per file? That seems like a very unwieldy workaround

The intended scenario for x.jpg is that metadata such as the original filename is stored in your storage layer, and your app presents and offers x.jpg by reading that information back, along with, for instance, the MIME type, file size, and perhaps which user uploaded the file.

If your app is not susceptible to the listed concerns and you really must have a folder with the original filenames for your use case, you can still use tusd hooks to rename the file as soon as it has been written to disk. Hooks are separate programs that are handed the metadata and can act on it; they can be written in any language as long as they are executable. But I'd be very careful about the concerns mentioned above if you go that route.

Does this make sense or do you think we overlooked something here?

For reference, the original ticket mentioned is #44


KempWatson commented on August 17, 2024

Thanks kvz for the quick answer. That's a very valuable use case indeed; it seems essentially an object store like OpenStack Swift, Amazon S3, Caringo Swarm, Go's own Minio, or many KV databases capable of large values, but embeddable and extensible.

In our use case, we are uploading dozens to thousands of medical images, each image 100 GB to 1 TB in size, into controlled-access folders. Other applications need to access the images by their original file names and extensions, and there will be no filename conflicts. Some of the other applications are written in ASP.NET, some in Go.

So far, I've embedded Tusd in a Go wrapper that controls the login and target directory. I'm assuming that without rewriting parts of Tusd (I'm not a fan of forking projects, it's the scourge of modern collaborative deb development...), my next step would be to read the .info file, get the original filename, and rename the .bin file, then delete the .info file. Am I on the right track, or might you suggest an easier/better approach? Your mention of hooks above might be obviated since I'm using Go on the backend.
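The read-.info, rename-.bin, delete-.info plan could look roughly like the Go sketch below. It is not tusd code: the `<id>.bin` / `<id>.info` naming is taken from the description above, while the JSON field names (`ID`, `MetaData`, `filename`) and the helpers `safeName` and `restoreName` are assumptions made for illustration, so check them against the .info files your tusd version actually writes.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// uploadInfo mirrors only the fields this sketch needs. The field names
// are assumptions about the .info layout, not a documented contract.
type uploadInfo struct {
	ID       string
	MetaData map[string]string
}

// safeName reduces a client-supplied filename to its final path element
// and rejects values that cannot name a regular file.
func safeName(name string) (string, error) {
	base := filepath.Base(name)
	if base == "." || base == ".." || base == string(filepath.Separator) {
		return "", fmt.Errorf("unusable filename %q", name)
	}
	return base, nil
}

// restoreName renames <dir>/<id>.bin to the filename recorded in
// <dir>/<id>.info and then removes the .info file. It refuses to
// overwrite an existing file; the check is not atomic.
func restoreName(dir, id string) (string, error) {
	infoPath := filepath.Join(dir, id+".info")
	raw, err := os.ReadFile(infoPath)
	if err != nil {
		return "", err
	}
	var info uploadInfo
	if err := json.Unmarshal(raw, &info); err != nil {
		return "", err
	}
	name, err := safeName(info.MetaData["filename"])
	if err != nil {
		return "", err
	}
	dst := filepath.Join(dir, name)
	if _, err := os.Lstat(dst); err == nil {
		return "", fmt.Errorf("%s already exists", dst)
	}
	if err := os.Rename(filepath.Join(dir, id+".bin"), dst); err != nil {
		return "", err
	}
	return dst, os.Remove(infoPath)
}

func main() {
	// Demo against a throwaway directory rather than a real upload folder.
	dir, err := os.MkdirTemp("", "uploads")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer os.RemoveAll(dir)
	os.WriteFile(filepath.Join(dir, "abc123.bin"), []byte("image bytes"), 0o644)
	os.WriteFile(filepath.Join(dir, "abc123.info"),
		[]byte(`{"ID":"abc123","MetaData":{"filename":"scan-001.tiff"}}`), 0o644)

	dst, err := restoreName(dir, "abc123")
	fmt.Println(filepath.Base(dst), err)
}
```

Refusing to overwrite keeps the "there will be no filename conflicts" assumption honest: if a conflict ever does occur, the upload stays under its ID instead of silently replacing another file.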

Ideally, perhaps Tusd could have two modes on upload config, one as now, one with preservation of filenames?

Also, unrelatedly, does Tusd support HTTP/2, and does it use multiple HTTP streams to speed chunk delivery by utilizing more pipe bandwidth? WebSockets for maintained connection? Go's gobs for encoding/decoding speed? Just thoughts if not yet implemented.

Thanks!


Acconut commented on August 17, 2024

Also, unrelatedly, does Tusd support HTTP/2

Sadly, @kvz, the answer is not that easy :) First of all, the tus protocol on its own absolutely supports HTTP/2; for tusd, however, the story is a bit different. Go 1.6 introduced transparent and seamless support for HTTP/2 (see https://golang.org/pkg/net/http/):

The http package has transparent support for the HTTP/2 protocol when using HTTPS.

The tusd binary (the one inside cmd/tusd/) currently has no functionality to use TLS and therefore does not support HTTP/2. The tusd package, however, can be mounted on either HTTP or HTTPS listeners and can therefore speak the new HTTP protocol when configured correctly.
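To illustrate the "mounted on an HTTPS listener" part, here is a small Go sketch using only the standard library. `uploadHandler` is a hypothetical stand-in for whatever `http.Handler` the tusd package gives you in your version, and the throwaway `httptest` server only exists to show which protocol gets negotiated; neither is part of tusd.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// uploadHandler is a hypothetical placeholder for the handler obtained
// from the tusd package; it just echoes the request's protocol.
func uploadHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Proto)
	})
}

// negotiatedProto serves h over TLS on a temporary local server and
// reports which protocol a Go client ends up speaking to it.
func negotiatedProto(h http.Handler) (string, error) {
	srv := httptest.NewUnstartedServer(h)
	srv.EnableHTTP2 = true // httptest keeps HTTP/2 off unless asked
	srv.StartTLS()
	defer srv.Close()

	resp, err := srv.Client().Get(srv.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return resp.Proto, nil
}

func main() {
	proto, err := negotiatedProto(uploadHandler())
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("negotiated:", proto)

	// Outside of a demo, the same handler would be served with a real
	// certificate, for example (file names are placeholders):
	//   http.ListenAndServeTLS(":443", "cert.pem", "key.pem", uploadHandler())
}
```

Nothing in the handler changes between HTTP/1.1 and HTTP/2; the TLS listener is what lets `net/http` negotiate the newer protocol.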


kvz commented on August 17, 2024

Ah, I see. Sorry for getting that part wrong, and thanks for the correction!

Sent from mobile, pardon the brevity.

On 24 Aug. 2016, at 23:14, Marius wrote the reply quoted above.


kvz commented on August 17, 2024

it seems essentially an object store like OpenStack Swift, Amazon S3

It's not an object store itself, but we do offer adapters for S3, google cloud files, etc. tus is really only about the transfer, not the storage per se.

In our use case, we are uploading dozens to thousands of medical images, each image 100 GB to 1 TB in size, into controlled-access folders. Other applications need to access the images by their original file names and extensions, and there will be no filename conflicts. Some of the other applications are written in ASP.NET, some in Go.

Wow that's super interesting. We'd love to cover that in a case study if you're comfortable with that.

Am I on the right track, or might you suggest an easier/better approach?

If you can, I would avoid running a fork as well. I think hooks are the way to fly. That way you can run a release binary, which will prove helpful if you ever run into an issue. It will be harder for the community to replicate failures in custom builds, and it would be easier to dismiss issues too (not cool, but this is due to a human trait that all open source projects have to endure).

With hooks, you'll get your metadata over STDIN as JSON, like so: https://github.com/tus/tusd/blob/master/.hooks/post-finish

You can use any language there to parse the JSON and move the file to a different location, preserving the original filename, without having to run a fork. For authentication etc. I'd probably run tusd on localhost and put HAProxy or some other kind of proxy in front of it. This also solves the problem of having to run tusd as root if you want it listening on a port <1024.
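As a rough idea of what such a hook could look like in Go, the sketch below reads the JSON document from STDIN and works out the name the file would be moved to. The field names (`ID`, `MetaData`, `filename`) and the `parseHookInfo` helper are assumptions for illustration; the payload layout depends on the tusd version, so compare it with what your hook actually receives. The move itself is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// hookInfo holds the two fields this sketch reads. Both names are
// assumptions about the payload, not a documented contract.
type hookInfo struct {
	ID       string
	MetaData map[string]string
}

// parseHookInfo decodes the JSON document handed to the hook and
// rejects payloads that carry no upload ID.
func parseHookInfo(data []byte) (hookInfo, error) {
	var info hookInfo
	if err := json.Unmarshal(data, &info); err != nil {
		return hookInfo{}, fmt.Errorf("decode hook payload: %w", err)
	}
	if info.ID == "" {
		return hookInfo{}, fmt.Errorf("hook payload has no upload ID")
	}
	return info, nil
}

func main() {
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	info, err := parseHookInfo(data)
	if err != nil {
		// This sketch only logs the problem; decide for your setup
		// whether the hook should also signal failure to its caller.
		fmt.Fprintln(os.Stderr, err)
		return
	}
	// filepath.Base drops any directory part a client may have put
	// into the filename.
	target := filepath.Base(info.MetaData["filename"])
	fmt.Printf("upload %s -> %s\n", info.ID, target)
}
```

A quick way to try it is to pipe a sample document into the compiled binary, for example `echo '{"ID":"abc","MetaData":{"filename":"x.jpg"}}' | ./post-finish`.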

Ideally, perhaps Tusd could have two modes on upload config, one as now, one with preservation of filenames?

I'm afraid there is little chance of this happening, since in most cases a filename collision is so likely that it is almost a certainty. That means we'd have to support extra behavior to serve a very small use case, and people who are not aware of the issues around it might pick this more convenient option and then have files destroyed because of it.

Also, unrelatedly, does Tusd support HTTP/2, and does it use multiple HTTP streams to speed chunk delivery by utilizing more pipe bandwidth? WebSockets for maintained connection? Go's gobs for encoding/decoding speed? Just thoughts if not yet implemented.

It is compatible. For chunks I refer to our concat extension. Websockets aren't needed as we'll just open many more connections. We'll likely not support Gob as the protocol is intended to be spoken in an interoperable way across many platforms and languages.


joshuadiezmo commented on August 17, 2024

@kvz how can I get the file extension?


Acconut commented on August 17, 2024

@ReverseFlash28 If the uploader supplies the filename using metadata, you may be able to extract the extension from there, though this requires strict validation and the value cannot be trusted in general. Therefore you may want to detect the file's type by looking for magic numbers (e.g. see the Unix file(1) command) and then choose the corresponding extension based on the result.
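For the magic-number route, Go's standard library can do a basic version of what file(1) does. The sketch below relies on `http.DetectContentType`, which sniffs the first bytes of the content; the `extByType` table and the `extFromContent` helper are hypothetical and deliberately tiny, so extend them with the types your application accepts.

```go
package main

import (
	"fmt"
	"net/http"
)

// extByType is a hypothetical lookup table from sniffed MIME type to
// file extension; it is not part of tusd or the standard library.
var extByType = map[string]string{
	"image/png":       ".png",
	"image/jpeg":      ".jpg",
	"image/gif":       ".gif",
	"application/pdf": ".pdf",
}

// extFromContent sniffs the leading bytes of a file and returns a
// matching extension, or "" when the type is not in the table.
func extFromContent(head []byte) string {
	// DetectContentType considers at most the first 512 bytes.
	return extByType[http.DetectContentType(head)]
}

func main() {
	png := []byte("\x89PNG\r\n\x1a\n")
	fmt.Printf("%q\n", extFromContent(png))
	fmt.Printf("%q\n", extFromContent([]byte{0x00, 0x01, 0x02}))
}
```

Formats the sniffer does not recognise come back as an empty string here, so the caller still has to decide what to do with them.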


heri16 commented on August 17, 2024

Hi @Acconut , could you provide example wrapper code on how to enable HTTP/2 on tusd over TLS?


Acconut commented on August 17, 2024

@heri16 What does your setup look like? Do you use the tusd package in a custom Go application or run the tusd binary behind a proxy (such as nginx or Apache)?


Acconut commented on August 17, 2024

Closing this issue due to a lack of information. Feel free to leave a comment if you want to continue the discussion :)

