Comments (8)
OK, it seems that newer net-protocol
reuses string objects when yielding chunks. Previously, in the code below, yielded chunks were all different ruby objects, and now they're all the same object:
client = Aws::S3::Client.new(...)
client.put_object(bucket: "my-bucket", key: "foo", body: "a" * 100_000)
object_ids = []
client.get_object(bucket: "my-bucket", key: "foo") do |chunk|
object_ids << chunk.object_id
end
object_ids #=>
# net-protocol 0.1.3 – [5100, 5120, 5140, 5160, 5180, 5200, 5220, 5240, 5260, 5280, 5300]
# net-protocol 0.2.0 – [5100, 5100, 5100, 5100, 5100, 5100, 5100, 5100, 5100, 5100]
This doesn't cause issues in aws-sdk-s3 itself, but Shrine S3 storage uses Down::ChunkedIO
from the Down gem to wrap the streaming, which relies on these chunks being different objects.
It does seem like it was an intentional behavior change in net-protocol, to make streaming an HTTP response more memory-efficient, but I would consider it a breaking change. Technically, net-protocol is in the 0.x
phase, so AFAIK minor versions are still allowed to introduce breaking changes.
from shrine.
Nice find. Interesting -- so the same String object, but with mutated contents?
I can see how that would be a lot more performant. My apps deal with streaming very large (~200-500 megabyte) files, via shrine and down, and I have noticed that there can be pretty severe memory and performance (from GC) costs to doing so, which has been disappointing to me.
I guess this change could be towards amelioerating that... depending on what Down::ChunkedIO goes on to do with it, maybe.
I wouldn't necessarily have expected Down::ChunkedIO to rely on the chunk being differnet objects; presumably the net-protocol authors didn't expect anything to either, or they would have warned more clearly about the change. Or not done it -- although the performance improvements really will be welcome. (That a part of the stdlib is still at pre-1.0 and could release breaking changes with no notice is not awesome either :( )
Does Down actually fail it's CI tests when using net-protocol 0.2.0? That'd be a nice signal of reliability there, and make things easier! When trying to fix Down::ChunkedIO, whenever someone gets around to it -- it would be nice to see if there's a way to preserve and continue the performance enhancements here -- ruby really does pay a penalty when creating lots of objects (RAM growth that can't always be reversed; GC time), so net-protocol reducing that would be a boon for streaming large IO -- but not if Down::ChunkedIO just (eg) calls dup
on each chunk to make just as many new objects as before!
from shrine.
The net-protocol
0.2.0 release tag is here https://github.com/ruby/net-protocol/releases/tag/v0.2.0
from shrine.
Which storage are you using? I'm guessing streaming I/O isn't working correctly with the newest net-protocol version.
from shrine.
from shrine.
I was in the middle of working on an update that took my app from net-protocol 0.1.3 to 0.2.0.
Thanks for this report, I noticed and checked it out... it was breaking something for me too, although not something that my automated test build caught (oops; I don't test against real S3) -- something I caught only testing manually, but it didn't raise an error, it caused a mysterious failure where it looked like bytes were getting corrupt somehow or something, without a raise?
I'm not sure exactly what was going on. Reverting to net-protocol 0.1.3 resolved things for me.
Disconcerting, since net-protocol 0.2.0 release notes don't include anything much, def not anything that should be backwards incompat.
Thanks for filing this, it would have been a lot harder for me to figure out my weird error was related without your issue. Very odd, very curious if anyone can figure anything out.
It looks like net-protocol 0.2,0 was just released yesterday, so perhaps it has a bug that has not yet been reported? (I didn't mean to upgrade net-protocol even, it just happened by accident in my attempt to upgrade to Rails 7.). curious if all test on shrine, down, and any other relevant dependencies, will pass with net-protocol 0.2.0, or if the regression could be found that way.
from shrine.
This is the only code-related commit in 0.2.0 – ruby/net-protocol@781e400 – and it does seem to change the logic a lot, so it's very possible it introduced a breaking change. aws-sdk-s3 uses net/http for HTTP requests, so it's possible streaming is not working correctly anymore. Someone would have to isolate the problem, ideally using net/http directly – no Shrine or aws-sdk-s3 or Down – and report it to ruby/net-protocol
.
from shrine.
This issue should be resolved by upgrading to net-protocol 0.2.1.
from shrine.
Related Issues (20)
- [Bug]: Corruption of downloaded files when using `net-protocol v0.2.0` HOT 16
- [Bug]: Fix for bytesize on Array doesn't work when aws-core is installed from debian HOT 5
- [Bug]: remove_attachment error with ruby 3.2 HOT 2
- [Bug]: upload_options plugin moves file from store to cache HOT 2
- [Bug]: NoMethodError (undefined method 'bucket' for nil:NilClass) HOT 1
- [Bug]: no _dump_data is defined for class Proc (Shrine with Rails Cache) HOT 1
- [Bug]: `create_derivatives` uploads files already in the store HOT 2
- Website search broken after migrating to Docusaurus v2
- Support retries with remote_url HOT 1
- Unpermitted parameter: :images with dropzonejs HOT 4
- Specifying host: with plugin :url_options has no effect on the returned URL HOT 5
- Disabling cache is tricky HOT 2
- [Bug] File not promoted on permanent storage if reload the model inside a transaction HOT 4
- Using Shrine with Cloudflare R2 causes error Aws::S3::Errors::NotImplemented - x-amz-tagging HOT 5
- Documentation for `image_url` options HOT 2
- Is it possible to pass storage config dynamically ?
- Shrine 3.5.0 does not support Rack 3 (undefined method `map' for #Rack::Files::Iterator) HOT 3
- WARN: Job arguments to PromoteJob do not serialize to JSON safely. This will raise an error in Sidekiq 7.0. HOT 3
- Resize original file HOT 4
- SignatureDoesNotMatch error when uploading to DigitalOcean Spaces with presigned URL HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from shrine.