Comments (8)

ncw commented on July 21, 2024

Heads up @rclone/support - the "Support Contract" label was applied to this issue.

from rclone.

ncw commented on July 21, 2024

Google docs are handled in a special way in rclone.

We don't know their size until after they are downloaded so we can't use the standard copy routines. Instead we use the internal equivalent of rclone rcat which streams the result to the destination.

In this process rclone loses the knowledge of the source backend.

I've had a go at fixing this here - can you give it a try? Note that this includes the fix for #7845 which is very relevant!

v1.67.0-beta.7962.0681eb1c7.fix-7848-metadata-mapper on branch fix-7848-metadata-mapper (uploaded in 15-30 mins)

chscott commented on July 21, 2024

The SrcFsType now looks good, but I've lost the DstFsType.

Now

"SrcFs": "Source{xrgXa}:Test",
"SrcFsType": "drive",
"DstFs": "//?/C:/Users/Chad/Logs/rclone-spool1942096378",
"DstFsType": "local",

Before

"SrcFs": "memory:",
"SrcFsType": "object.memoryFs",
"DstFs": "Target{VZpyf}:Test",
"DstFsType": "onedrive",

chscott commented on July 21, 2024

Using v1.67.0-beta.7962.0681eb1c7.fix-7848-metadata-mapper, the behavior changes based on whether or not --streaming-upload-cutoff governs the transfer.

Without --streaming-upload-cutoff set

Note that the mapper is invoked twice for a single file.

{
    "level": "debug",
    "msg": "Metadata mapper sent: \n{\n\t\"SrcFs\": \"Source{xrgXa}:Test\",\n\t\"SrcFsType\": \"drive\",\n\t\"DstFs\": \"//?/C:/Users/Chad/Domains/cdsconsulting.co/logs/Copy-GOUserDrives_20240515_160249/rclone-spool259196758\",\n\t\"DstFsType\": \"local\",\n\t\"Remote\": \"Sample doc.docx\",\n\t\"Size\": -1,\n\t\"MimeType\": \"application/vnd.openxmlformats-officedocument.wordprocessingml.document\",\n\t\"ModTime\": \"2024-05-15T15:05:22.457Z\",\n\t\"IsDir\": false,\n\t\"Metadata\": {\n\t\t\"btime\": \"2024-05-15T14:56:11.061Z\",\n\t\t\"content-type\": \"application/vnd.google-apps.document\",\n\t\t\"copy-requires-writer-permission\": \"false\",\n\t\t\"mtime\": \"2024-05-15T15:05:22.457Z\",\n\t\t\"owner\": \"[email protected]\",\n\t\t\"permissions\": \"[{\\\"emailAddress\\\":\\\"[email protected]\\\",\\\"id\\\":\\\"14885772533033484759\\\",\\\"role\\\":\\\"writer\\\",\\\"type\\\":\\\"user\\\"},{\\\"emailAddress\\\":\\\"[email protected]\\\",\\\"id\\\":\\\"09287294999424909072\\\",\\\"role\\\":\\\"owner\\\",\\\"type\\\":\\\"user\\\"}]\",\n\t\t\"starred\": \"false\",\n\t\t\"viewed-by-me\": \"true\",\n\t\t\"writers-can-share\": \"true\"\n\t}\n}\n",
    "source": "fs/metadata.go:123",
    "time": "2024-05-15T16:03:00.789903-05:00"
}
{
    "level": "debug",
    "msg": "Metadata mapper sent: \n{\n\t\"SrcFs\": \"//?/C:/Users/Chad/Domains/cdsconsulting.co/logs/Copy-GOUserDrives_20240515_160249/rclone-spool259196758\",\n\t\"SrcFsType\": \"local\",\n\t\"DstFs\": \"Target{qpiyZ}:Test\",\n\t\"DstFsType\": \"onedrive\",\n\t\"Remote\": \"Sample doc.docx\",\n\t\"Size\": 317900,\n\t\"MimeType\": \"application/vnd.openxmlformats-officedocument.wordprocessingml.document\",\n\t\"ModTime\": \"2024-05-15T10:05:22.457-05:00\",\n\t\"IsDir\": false,\n\t\"Metadata\": {\n\t\t\"atime\": \"2024-05-15T10:05:22.457-05:00\",\n\t\t\"btime\": \"2024-05-15T09:56:11.061-05:00\",\n\t\t\"mode\": \"666\",\n\t\t\"mtime\": \"2024-05-15T10:05:22.457-05:00\"\n\t}\n}\n",
    "source": "fs/metadata.go:123",
    "time": "2024-05-15T16:03:01.167733-05:00"
}

With --streaming-upload-cutoff 100Mi set

Note that the mapper is invoked only once.

{
    "level": "debug",
    "msg": "Metadata mapper sent: \n{\n\t\"SrcFs\": \"Source{xrgXa}:Test\",\n\t\"SrcFsType\": \"drive\",\n\t\"DstFs\": \"Target{qpiyZ}:Test\",\n\t\"DstFsType\": \"onedrive\",\n\t\"Remote\": \"Sample doc.docx\",\n\t\"Size\": 317900,\n\t\"MimeType\": \"application/vnd.openxmlformats-officedocument.wordprocessingml.document\",\n\t\"ModTime\": \"2024-05-15T15:05:22.457Z\",\n\t\"IsDir\": false,\n\t\"Metadata\": {\n\t\t\"btime\": \"2024-05-15T14:56:11.061Z\",\n\t\t\"content-type\": \"application/vnd.google-apps.document\",\n\t\t\"copy-requires-writer-permission\": \"false\",\n\t\t\"mtime\": \"2024-05-15T15:05:22.457Z\",\n\t\t\"owner\": \"[email protected]\",\n\t\t\"permissions\": \"[{\\\"emailAddress\\\":\\\"[email protected]\\\",\\\"id\\\":\\\"14885772533033484759\\\",\\\"role\\\":\\\"writer\\\",\\\"type\\\":\\\"user\\\"},{\\\"emailAddress\\\":\\\"[email protected]\\\",\\\"id\\\":\\\"09287294999424909072\\\",\\\"role\\\":\\\"owner\\\",\\\"type\\\":\\\"user\\\"}]\",\n\t\t\"starred\": \"false\",\n\t\t\"viewed-by-me\": \"true\",\n\t\t\"writers-can-share\": \"true\"\n\t}\n}\n",
    "source": "fs/metadata.go:123",
    "time": "2024-05-15T16:05:05.602204-05:00"
}
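For comparing runs like the two above, a small script can pull the mapper payload out of each debug entry and list the hops per file. This is only a sketch: it assumes the log entries have already been decoded into dicts, and it relies on nothing more than the "Metadata mapper sent: " prefix visible in the `msg` fields above. The helper names (`mapper_payload`, `summarize`, `make_entry`) are made up for illustration and are not part of rclone.

```python
import json

# Prefix seen in the "msg" field of the debug entries above.
PREFIX = "Metadata mapper sent: "


def mapper_payload(entry):
    """Return the decoded mapper input from one log entry, or None."""
    msg = entry.get("msg", "")
    if not msg.startswith(PREFIX):
        return None
    # The JSON blob follows the prefix, surrounded by newlines.
    return json.loads(msg[len(PREFIX):])


def summarize(entries):
    """Group (SrcFsType, DstFsType, Size) hops by Remote."""
    hops = {}
    for entry in entries:
        payload = mapper_payload(entry)
        if payload is None:
            continue
        hop = (payload["SrcFsType"], payload["DstFsType"], payload["Size"])
        hops.setdefault(payload["Remote"], []).append(hop)
    return hops


def make_entry(payload):
    """Build a log entry shaped like the ones in this thread (test helper)."""
    body = json.dumps(payload, indent="\t")
    return {"level": "debug", "msg": PREFIX + "\n" + body + "\n"}


# Trimmed versions of the two entries logged without --streaming-upload-cutoff.
sample = [
    make_entry({"SrcFsType": "drive", "DstFsType": "local",
                "Remote": "Sample doc.docx", "Size": -1}),
    make_entry({"SrcFsType": "local", "DstFsType": "onedrive",
                "Remote": "Sample doc.docx", "Size": 317900}),
]

print(summarize(sample))
```

A file with two hops (one ending in `local`, one starting from `local`) went via the spool directory; a file with a single hop was transferred directly.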

ncw commented on July 21, 2024

It looks like the changes I made are working for --streaming-upload-cutoff 100M.

Some backends (such as onedrive) can't upload a file of unknown length. You can see these backends with StreamUpload = N in the overview table.

Google docs don't have a defined length - you have to download them to find out how long they are. So either we cache them in memory (if size < --streaming-upload-cutoff) or we download them to a local disk first (if size >= --streaming-upload-cutoff).

If we cache them in memory then you will just get the single transfer and the single invocation of the metadata mapper. If we are caching them on disk then you will get the transfer to the local disk (with metadata invocation) followed by the transfer from local disk to destination (with metadata invocation). Rclone takes some care that the metadata is correct when transferring via the local disk.
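The rule described above can be written down as a rough model to check it against the logs. This is a sketch of the description in this comment, not rclone's actual code: `parse_size` and `mapper_hops` are hypothetical names, only a destination that cannot stream is modelled, and the 100Ki figure is what I understand the documented default for --streaming-upload-cutoff to be, so treat it as an assumption for your version.

```python
def parse_size(text):
    """Convert a size such as '100Ki' or '100Mi' to bytes (binary suffixes only)."""
    units = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}
    for suffix, factor in units.items():
        if text.endswith(suffix):
            return int(text[:-len(suffix)]) * factor
    return int(text)


def mapper_hops(size, cutoff, src="drive", dst="onedrive"):
    """Return the (SrcFsType, DstFsType) pairs the mapper would see.

    Models a destination that cannot accept uploads of unknown length.
    """
    if size < cutoff:
        # Small enough to buffer in memory: one transfer, one invocation.
        return [(src, dst)]
    # Otherwise spool to local disk first: two transfers, two invocations.
    return [(src, "local"), ("local", dst)]


doc_size = 317900  # size of "Sample doc.docx" from the logs

# Assumed default cutoff of 100Ki (102400 bytes): the doc is larger.
print(mapper_hops(doc_size, parse_size("100Ki")))

# Cutoff raised to 100Mi (104857600 bytes): the doc fits in memory.
print(mapper_hops(doc_size, parse_size("100Mi")))
```

With 317900 bytes against 102400 the model gives two hops, and against 104857600 it gives one, which matches the two and one mapper invocations reported earlier in the thread.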

In an ideal world we'd change the onedrive backend to accept streaming uploads. I had a quick review of the uploading methods and still think that you can't upload files to onedrive without knowing how big they are (there is a Stack Overflow question which confirms this), but you may have a better idea. This would be the best solution if possible.

In the meantime I'm going to attempt to rejig the Rcat code so that it doesn't use the rclone machinery to copy an object to local disk. This will mean writing a bit more code but it would mean you'd only get one transfer. I don't think it would make the transfer less reliable. It would affect the stats slightly but I suspect currently they are quite confusing!

ncw commented on July 21, 2024

I've had a go at fixing this by re-working the internals of rcat.

I've run the local and onedrive integration tests with this and I think it is looking OK, but it could do with more testing.

v1.67.0-beta.7967.12d4964f2.fix-7848-metadata-mapper on branch fix-7848-metadata-mapper (uploaded in 15-30 mins)

chscott commented on July 21, 2024

> In an ideal world we'd change the onedrive backend to accept streaming uploads. I had a quick review of the uploading methods and still think that you can't upload files without knowing how big they are to onedrive (there is a stack overflow question which confirms this) but you may have a better idea. This would be the best solution if possible.

My research turned up the same. Uploading files of unknown length does not seem possible with OneDrive/SharePoint.

The fix in v1.67.0-beta.7967.12d4964f2.fix-7848-metadata-mapper looks good to me. In particular, when I have --streaming-upload-cutoff unset:

  • The mapper is invoked only once.
  • SrcFsType and DstFsType have the expected values.
{
    "level": "debug",
    "msg": "Metadata mapper sent: \n{\n\t\"SrcFs\": \"Source{xrgXa}:Test\",\n\t\"SrcFsType\": \"drive\",\n\t\"DstFs\": \"Target{qpiyZ}:Test\",\n\t\"DstFsType\": \"onedrive\",\n\t\"Remote\": \"Sample doc.docx\",\n\t\"Size\": 317900,\n\t\"MimeType\": \"application/vnd.openxmlformats-officedocument.wordprocessingml.document\",\n\t\"ModTime\": \"2024-05-15T15:05:22.457Z\",\n\t\"IsDir\": false,\n\t\"Metadata\": {\n\t\t\"btime\": \"2024-05-15T14:56:11.061Z\",\n\t\t\"content-type\": \"application/vnd.google-apps.document\",\n\t\t\"copy-requires-writer-permission\": \"false\",\n\t\t\"mtime\": \"2024-05-15T15:05:22.457Z\",\n\t\t\"owner\": \"[email protected]\",\n\t\t\"permissions\": \"[{\\\"emailAddress\\\":\\\"[email protected]\\\",\\\"id\\\":\\\"14885772533033484759\\\",\\\"role\\\":\\\"writer\\\",\\\"type\\\":\\\"user\\\"},{\\\"emailAddress\\\":\\\"[email protected]\\\",\\\"id\\\":\\\"09287294999424909072\\\",\\\"role\\\":\\\"owner\\\",\\\"type\\\":\\\"user\\\"}]\",\n\t\t\"starred\": \"false\",\n\t\t\"viewed-by-me\": \"true\",\n\t\t\"writers-can-share\": \"true\"\n\t}\n}\n",
    "source": "fs/metadata.go:123",
    "time": "2024-05-16T12:58:07.830524-05:00"
}
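With both type fields reliable, a mapper can branch on the backend pair. Below is a minimal sketch of such a mapper, assuming the documented contract that it receives the JSON blob shown above on stdin and returns a JSON object whose "Metadata" field is used for the destination. The policy (keep only timestamps for drive to onedrive) and the names `map_metadata` and `run` are purely illustrative, not anything from this issue.

```python
import io
import json

# Hypothetical policy: metadata keys worth carrying from drive to onedrive.
KEEP_FOR_ONEDRIVE = ("mtime", "btime")


def map_metadata(blob):
    """Return the mapper reply for one input blob."""
    meta = dict(blob.get("Metadata") or {})
    if blob.get("SrcFsType") == "drive" and blob.get("DstFsType") == "onedrive":
        # Drop drive-specific keys such as "permissions" and "owner".
        meta = {k: v for k, v in meta.items() if k in KEEP_FOR_ONEDRIVE}
    # Any other backend pair passes its metadata through unchanged.
    return {"Metadata": meta}


def run(stdin, stdout):
    """Read one blob from stdin and write the reply to stdout."""
    json.dump(map_metadata(json.load(stdin)), stdout, indent="\t")


# Demo with a trimmed version of the blob logged above.
demo_in = io.StringIO(json.dumps({
    "SrcFsType": "drive",
    "DstFsType": "onedrive",
    "Remote": "Sample doc.docx",
    "Metadata": {
        "btime": "2024-05-15T14:56:11.061Z",
        "mtime": "2024-05-15T15:05:22.457Z",
        "starred": "false",
    },
}))
demo_out = io.StringIO()
run(demo_in, demo_out)
print(demo_out.getvalue())
```

A real script would call `run(sys.stdin, sys.stdout)` and be passed to rclone via --metadata-mapper together with --metadata.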
