Comments (9)
@topac Could you add versions of ruby & msgpack-ruby, and stack trace?
from msgpack-ruby.
@tagomoris You're right.
- ruby 2.2.3p173 (2015-08-18 revision 51636) [x86_64-darwin14]
- MessagePack::VERSION is 0.7.6
test.rb:33:in
readpartial': Invalid argument @ io_getpartial - /tmp/testmsgpack1 (Errno::EINVAL)
from test.rb:33:inload' from test.rb:33:in
'`
from msgpack-ruby.
Thank you. I'll check it, but give me time for it.
from msgpack-ruby.
Ah, got it. It's a kind of this issue: msgpack/msgpack#214
MessagePack supports a string/binary which has a length in 32bit integer (== 2GB in 32bit signed int).
It's the limitation of msgpack spec, not bug of msgpack-ruby.
It might be a problem that error message should be more clear, or #to_msgpack
should show any errors.
from msgpack-ruby.
@tagomoris I think it's related to the fact that is loading from a file. In fact, skipping the part which write string
to file and loading directly it does not trigger the error:
# ......
begin
print "to_msgpack..."
string = {"a" => "foo", "b" => 42, "c" => generate_input_string}.to_msgpack
puts "done"
print "loading..."
MessagePack.load(string) # instead of MessagePack.load(File.open(TMP_FILE_PATH))
puts "done"
ensure
FileUtils.rm_f(TMP_FILE_PATH)
end
from msgpack-ruby.
This bug is still there.
The workaround is MessagePack.load(string) # instead of MessagePack.load(File.open(TMP_FILE_PATH))
, as suggested, and that works.
The issue is not msgpack/msgpack#214, it's that msgpack
tries to do readpartial
on the entire string at once. readpartial
calls the read
system call, which does not allow reading more than INTMAX
bytes, returning the error code EINVAL
as expected, which is what we see.
msgpack
's buffer needs to be engineered to not read more than INTMAX
bytes at a time.
require 'msgpack'
require 'fileutils'
require 'securerandom'
TMP_FILE_PATH = "/tmp/testmsgpack1"
MAX_SIZE = 1_000_000 * 3_000 # 3gb
# note that no error is raised if for example MAX_SIZE = 2mb
def generate_random_string
string = "".force_encoding("BINARY")
loop do
string << (SecureRandom.random_bytes(2048)*100)
return string if string.bytesize >= MAX_SIZE
end
end
big_string = generate_random_string
begin
# Create an hash with 3 keys, the value of the key "c" is a binary string of 3gb
print "to_msgpack..."
string = {"a" => "foo", "b" => 42, "c" => big_string}.to_msgpack
puts "done"
# Write the msgpack to disk
print "write to disk..."
File.open(TMP_FILE_PATH, "wb") do |f|
curr = 0
loop do
buffer = string[curr..curr+20000-1]
break if buffer.to_s.bytesize == 0
curr += f.write(buffer)
end
end
puts "done"
file = File.open(TMP_FILE_PATH)
file.readpartial(MAX_SIZE, big_string) # this is the underlying issue
file.close
ensure
FileUtils.rm_f(TMP_FILE_PATH)
end
from msgpack-ruby.
Seems like some IO uses this buffer size option, which has a sensible default, and some try to read it all
% grep io_partial_read_method **/*.c
ext/msgpack/buffer.c: b->io_buffer = rb_funcall(b->io, b->io_partial_read_method, 1, SIZET2NUM(b->io_buffer_size));
ext/msgpack/buffer.c: VALUE ret = rb_funcall(b->io, b->io_partial_read_method, 2, SIZET2NUM(b->io_buffer_size), b->io_buffer);
ext/msgpack/buffer.c: VALUE ret = rb_funcall(b->io, b->io_partial_read_method, 2, SIZET2NUM(length), string);
ext/msgpack/buffer.c: VALUE ret = rb_funcall(b->io, b->io_partial_read_method, 2, SIZET2NUM(length), b->io_buffer);
ext/msgpack/buffer.c: VALUE ret = rb_funcall(b->io, b->io_partial_read_method, 2, SIZET2NUM(length), b->io_buffer);
ext/msgpack/buffer_class.c: b->io_partial_read_method = get_partial_read_method(io);
from msgpack-ruby.
read_raw_body_begin
to read_raw_body_cont
to msgpack_buffer_read_to_string
to _msgpack_buffer_read_from_io_to_string
, which ignores the buffer size.
from msgpack-ruby.
I think I have a fix - merge #289 before I open it as a PR.
diff --git a/ext/msgpack/buffer.c b/ext/msgpack/buffer.c
index 9cfe0c7..15fc892 100644
--- a/ext/msgpack/buffer.c
+++ b/ext/msgpack/buffer.c
@@ -613,9 +613,11 @@ size_t _msgpack_buffer_feed_from_io(msgpack_buffer_t* b)
size_t _msgpack_buffer_read_from_io_to_string(msgpack_buffer_t* b, VALUE string, size_t length)
{
+#define MIN(x, y) (((x) < (y)) ? (x) : (y))
+
if(RSTRING_LEN(string) == 0) {
/* direct read */
- VALUE ret = rb_funcall(b->io, b->io_partial_read_method, 2, SIZET2NUM(length), string);
+ VALUE ret = rb_funcall(b->io, b->io_partial_read_method, 2, SIZET2NUM(MIN(b->io_buffer_size, length)), string);
if(ret == Qnil) {
return 0;
}
@@ -627,7 +629,7 @@ size_t _msgpack_buffer_read_from_io_to_string(msgpack_buffer_t* b, VALUE string,
b->io_buffer = rb_str_buf_new(0);
}
- VALUE ret = rb_funcall(b->io, b->io_partial_read_method, 2, SIZET2NUM(length), b->io_buffer);
+ VALUE ret = rb_funcall(b->io, b->io_partial_read_method, 2, SIZET2NUM(MIN(b->io_buffer_size, length)), b->io_buffer);
if(ret == Qnil) {
return 0;
}
@@ -635,6 +637,8 @@ size_t _msgpack_buffer_read_from_io_to_string(msgpack_buffer_t* b, VALUE string,
rb_str_buf_cat(string, (const void*)RSTRING_PTR(b->io_buffer), rl);
return rl;
+
+#undef MIN
}
size_t _msgpack_buffer_skip_from_io(msgpack_buffer_t* b, size_t length)
from msgpack-ruby.
Related Issues (20)
- Does msgpack_each support nested calls in design principle? HOT 3
- Unpack "end of buffer reached" error on Windows HOT 13
- Extra bytes when loading string HOT 5
- Seg Fault in #load HOT 3
- msgpack.jar seems to be missing in v1.5.5 HOT 7
- Correct way to pack and unpack `Struct` instances. HOT 4
- Packer can overwrite earlier parts of its buffer in version 1.6.0 HOT 10
- jruby: `MessagePack::Unpacker#buffer` is missing
- [BUG] Failed to free an rmem pointer, memory leak? HOT 8
- Compile fails on older toolsets HOT 5
- undefined method `to_msgpack' for BigDecimal (NoMethodError) HOT 4
- Attempt to GC mark already marked object HOT 22
- Pass empty string into MessagePack::Unpacker.feed will stop further feeding valid string in version >= 1.7.0 HOT 1
- Amazon Linux 2 w Ruby 2.6.10 - msgpack requires Ruby version >= 2.5. HOT 1
- Write msgpack to msgpack HOT 2
- Pack as uint16 HOT 1
- Feature request: Time packer HOT 1
- Fix for platforms without compaction support not complete
- 10 extra bytes after the deserialized object HOT 1
- Is Ruby 3.3.0 supported? HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from msgpack-ruby.