
ruby-xz's People

Contributors

chrisistuff, genail, larsch, quintus

ruby-xz's Issues

Add methods for determining the approx. (de)compressed size

It looks as if liblzma has functions that estimate how large the uncompressed data of a compressed archive is, and how large a compressed archive of a given amount of uncompressed data is going to be. These should be made accessible by this binding.

Invalid archive checksum should raise exception

To be of any use, the flags regarding archive checksums should actually make the archive’s validity accessible somehow rather than just printing a warning about an invalid checksum. Should probably cause an exception as an invalid archive can’t be read anyway.

API breaking, so for 1.0.0.
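One way this could look is sketched below, assuming the binding sees the raw `lzma_ret` code returned by the decoder. The numeric values come from liblzma's `lzma/base.h`; the module name `ChecksumSketch`, the exception classes and the `check!` helper are hypothetical and not part of ruby-xz.

```ruby
# Hypothetical sketch: turn checksum-related liblzma return codes into
# exceptions instead of warnings. None of these names exist in ruby-xz.
module ChecksumSketch
  class Error < StandardError; end
  class NoCheckError < Error; end
  class UnsupportedCheckError < Error; end
  class DataError < Error; end

  # lzma_ret values as defined in liblzma's lzma/base.h
  LZMA_OK                = 0
  LZMA_STREAM_END        = 1
  LZMA_NO_CHECK          = 2
  LZMA_UNSUPPORTED_CHECK = 3
  LZMA_DATA_ERROR        = 9

  # Codes that should abort reading, with the exception to raise.
  FATAL = {
    LZMA_NO_CHECK          => [NoCheckError, "archive has no integrity check"],
    LZMA_UNSUPPORTED_CHECK => [UnsupportedCheckError, "integrity check type is not supported"],
    LZMA_DATA_ERROR        => [DataError, "archive data is corrupt"]
  }.freeze

  # Returns the code unchanged if it is harmless, raises otherwise.
  def self.check!(code)
    klass, message = FATAL[code]
    raise klass, message if klass
    code
  end
end
```

A caller would wrap every decoder call in `ChecksumSketch.check!` and rescue `ChecksumSketch::Error` where an invalid archive should be handled.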

Valete,
Quintus

StreamWriter crashes when writing more than a couple of bytes

This works fine:

$ ruby -rxz -e 'XZ::StreamWriter.open("foo.xz") {|x| x.write "a" }'

But this crashes:

$ ruby -rxz -e 'XZ::StreamWriter.open("foo.xz") {|x| x.write "a"*10000 }'
corrupted size vs. prev_size
Aborted

So does this:

$ ruby -rxz -e 'XZ::StreamWriter.open("foo.xz") {|x| x.write "abcdefghijkl"*10000 }'
/home/u3/.gem/ruby/3.0.0/gems/ruby-xz-1.0.0/lib/xz/stream_writer.rb:237: [BUG] Segmentation fault at 0x0000000000000018
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0006 p:0023 s:0029 e:000025 METHOD /home/u3/.gem/ruby/3.0.0/gems/ruby-xz-1.0.0/lib/xz/stream_writer.rb:237
c:0005 p:0008 s:0022 e:000021 METHOD /home/u3/.gem/ruby/3.0.0/gems/ruby-xz-1.0.0/lib/xz/stream.rb:225
c:0004 p:0010 s:0018 e:000017 RESCUE /home/u3/.gem/ruby/3.0.0/gems/ruby-xz-1.0.0/lib/xz/stream_writer.rb:131
c:0003 p:0047 s:0014 e:000013 METHOD /home/u3/.gem/ruby/3.0.0/gems/ruby-xz-1.0.0/lib/xz/stream_writer.rb:131
c:0002 p:0018 s:0006 e:000005 EVAL   -e:1 [FINISH]
c:0001 p:0000 s:0003 E:001b80 (none) [FINISH]

-- Ruby level backtrace information ----------------------------------------
-e:1:in `<main>'
/home/u3/.gem/ruby/3.0.0/gems/ruby-xz-1.0.0/lib/xz/stream_writer.rb:131:in `open'
/home/u3/.gem/ruby/3.0.0/gems/ruby-xz-1.0.0/lib/xz/stream_writer.rb:131:in `ensure in open'
/home/u3/.gem/ruby/3.0.0/gems/ruby-xz-1.0.0/lib/xz/stream.rb:225:in `close'
/home/u3/.gem/ruby/3.0.0/gems/ruby-xz-1.0.0/lib/xz/stream_writer.rb:0:in `finish'

-- Machine register context ------------------------------------------------
 RIP: 0x00007ff3632a03d0 RBP: 0x00007fffaad080b0 RSP: 0x00007fffaad07ff0
 RAX: 0x000056462ff36780 RBX: 0x00007ff3621acec0 RCX: 0x0000000000000000
 RDX: 0x000056462fc4a0b0 RDI: 0x00007ff3621acec0 RSI: 0x0000000000000000
  R8: 0x000056462ff8fb08  R9: 0x0000564630083500 R10: 0x000056462fc439b8
 R11: 0x00007ff3620ad120 R12: 0x000056462fbf47e0 R13: 0x000056462ff36790
 R14: 0x000056462ff367a8 R15: 0x00007ff3621acec0 EFL: 0x0000000000010246

-- C level backtrace information -------------------------------------------
malloc(): smallbin double linked list corrupted
Aborted

In the above example malloc aborts because of a corrupted heap, but I've also seen SIGSEGVs. The error I get seems pretty random: if I rerun the above examples I get a different error (and sometimes they do work).

I'm using Gentoo Linux with the following Ruby/liblzma versions:

$ ruby --version
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]
$ qfile -v /usr/lib64/liblzma.so 
app-arch/xz-utils-5.2.5-r1: /usr/lib64/liblzma.so

Memory leak

There's a massive memory leak somewhere.
Presumably it is not in the gem's Ruby code, because the object count stays the same, but process memory grows astonishingly quickly.

Here's a measurement of memory and object counts before and after running compression/decompression 10,000 times:

Before - Process Memory: 178.3125 mb
Before - Objects count: 580438
Before - Symbols count: 11443
After - Process Memory: 351.3359375 mb
After - Objects count: 580401
After - Symbols count: 11443

178 MB -> 351 MB, and it never gets GC'ed.

Here's the gist:
https://gist.github.com/xTRiM/afdd9ea25519a479672153637637f0f7

Running on ruby 2.7.2p137
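
For anyone who wants to reproduce this kind of measurement without opening the gist, here is a minimal sketch of how such numbers can be collected. It is not the gist's code; the helper names `rss_mb`, `snapshot` and `measure` are made up for illustration, and reading the resident set size through `ps` is an assumption that only holds on Unix-like systems.

```ruby
# Resident set size of this process in MB, or nil if `ps` is unavailable.
def rss_mb
  kb = `ps -o rss= -p #{Process.pid}`.to_i
  kb.zero? ? nil : kb / 1024.0
rescue Errno::ENOENT
  nil
end

# Collects process memory plus live object and symbol counts after a GC run.
def snapshot
  GC.start
  counts = ObjectSpace.count_objects
  {
    rss_mb:  rss_mb,
    objects: counts[:TOTAL] - counts[:FREE],
    symbols: Symbol.all_symbols.size
  }
end

# Runs the given block `iterations` times and returns both snapshots.
def measure(iterations)
  before = snapshot
  iterations.times { yield }
  { before: before, after: snapshot }
end
```

With the gem loaded, the block passed to `measure` would hold one compress/decompress round trip. If `rss_mb` keeps growing while `objects` stays flat, the memory is most likely being held outside the Ruby heap.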

Question: what constitutes big amounts of data?

This library looks pretty sweet, but I had one question: I saw this comment in the code:

#Don't use this method for big amounts of data--you may run out of
#memory. Use compress_file or compress_stream instead.
def compress(....)

What constitutes a big amount of data? Are we talking megabytes? Hundreds of megabytes? If you don't know then I guess I'll just do some testing and post the results here, but if you already know I can save some time :)

Cheers

ruby-xz breaks Resolv.getaddress in ruby 2.2.x

Reproducible pretty trivially:

Gemfile:

source 'https://rubygems.org'
gem 'ruby-xz'

test.rb:

#!/usr/bin/env ruby
require 'resolv'
require 'xz'
puts Resolv.getaddress('google.com')

$ bundle exec ./test.rb
/Users/ken/.rbenv/versions/2.2.3/lib/ruby/2.2.0/resolv.rb:905:in `initialize': wrong number of arguments (2 for 0..1) (ArgumentError)
    from /Users/ken/.rbenv/versions/2.2.3/lib/ruby/2.2.0/resolv.rb:929:in `new'
    from /Users/ken/.rbenv/versions/2.2.3/lib/ruby/2.2.0/resolv.rb:929:in `parse_resolv_conf'
    from /Users/ken/.rbenv/versions/2.2.3/lib/ruby/2.2.0/resolv.rb:961:in `default_config_hash'
    from /Users/ken/.rbenv/versions/2.2.3/lib/ruby/2.2.0/resolv.rb:982:in `block in lazy_initialize'

It looks like resolv.rb is calling an unqualified "open", which is probably expected to call "File.open"... but instead it ends up calling "DNS.open" (which calls DNS.new), which doesn't accept the same number of arguments.

I suspect something in the io-like gem is being mixed in and monkey-patching something, such that Ruby's method resolution order is changed.

I think the deeper issue is in the io-like gem itself: calling include IO::Like tends to poison the method resolution order for a lot of IO stuff in newer Rubies (their GitHub page says they don't even support 1.9.x yet). I think the fix is to get off io-like altogether.
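
To illustrate the suspected mechanism in isolation, here is a small sketch. It does not use io-like or resolv.rb; `LeakyOpen` and `ConfigParser` are hypothetical stand-ins, and whether io-like really does this is exactly the assumption above.

```ruby
# Hypothetical stand-in for a library that gives classes an `open`
# that simply forwards to `new`.
module LeakyOpen
  def open(*args)
    new(*args)
  end
end

# Hypothetical stand-in for a config parser like the one in resolv.rb.
class ConfigParser
  def initialize(info = nil)
    @info = info
  end

  # Unqualified call: resolved through the class's own ancestors,
  # so anything mixed in ahead of Kernel wins over Kernel#open.
  def self.read_unqualified(path)
    open(path, "r") { |f| f.read }
  end

  # Qualified call: always reaches File.open.
  def self.read_qualified(path)
    File.open(path, "r") { |f| f.read }
  end
end

# Returns [result before the mix-in, result after it, qualified result].
def demonstrate(path)
  before = ConfigParser.read_unqualified(path) # Kernel#open reads the file
  ConfigParser.extend(LeakyOpen)               # simulate the monkey patch
  after = begin
    ConfigParser.read_unqualified(path)        # now ConfigParser.new(path, "r")
  rescue ArgumentError => e
    e.class
  end
  [before, after, ConfigParser.read_qualified(path)]
end
```

After the mix-in, the unqualified call fails with an ArgumentError about the argument count, the same kind of error as in the backtrace, while the qualified `File.open` call keeps working.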

Sync API with Ruby’s Zlib::GzipFile

There are some subtle differences between ruby-xz’s API and Ruby’s own Zlib wrapper, which should be removed so that a uniform interface exists. These differences are:

  • StreamReader#close and StreamWriter#close do not close the underlying IO automatically, Zlib does.
  • StreamReader::new and StreamReader::open behave slightly differently with respect to their arguments. The former should not accept filenames, the latter should not accept IO objects.
  • Likewise for StreamWriter.

These are breaking API changes and require a major release. Given the low number of reported bugs, doing a 1.0.0 release should be doable anyway.
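
For reference, the Zlib behaviour to match can be checked with a short script. This is only a sketch of Ruby's standard Zlib API, not of ruby-xz; the helper names are made up for illustration.

```ruby
require "zlib"
require "stringio"

# Zlib::GzipWriter.new wraps an existing IO object.
# #close also closes that IO, while #finish leaves it open.
def gzip_close_behaviour
  closed_io = StringIO.new
  Zlib::GzipWriter.new(closed_io).tap { |gz| gz.write("hello") }.close

  open_io = StringIO.new
  Zlib::GzipWriter.new(open_io).tap { |gz| gz.write("hello") }.finish

  { after_close: closed_io.closed?, after_finish: open_io.closed? }
end

# Zlib::GzipWriter.open and Zlib::GzipReader.open take a filename instead.
def gzip_roundtrip_via_filename(path, data)
  Zlib::GzipWriter.open(path) { |gz| gz.write(data) }
  Zlib::GzipReader.open(path) { |gz| gz.read }
end
```

Under the proposal, StreamReader and StreamWriter would follow the same split: `new` for IO objects, `open` for filenames, and `close` closing the wrapped IO.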

can't require 'xz' on OS X without adding a symlink

I am on OS X 10.11.3 using the system ruby and got the following error when I tried to require the ruby-xz gem:

$ irb -rxz
/Library/Ruby/Gems/2.0.0/gems/ffi-1.9.10/lib/ffi/library.rb:133:in `block in ffi_lib':LoadError: Could not open library 'lzma.so.5': dlopen(lzma.so.5, 5): image not found.
Could not open library 'liblzma.so.5.dylib': dlopen(liblzma.so.5.dylib, 5): image not found.
Could not open library 'lzma.so': dlopen(lzma.so, 5): image not found.
Could not open library 'liblzma.so.dylib': dlopen(liblzma.so.dylib, 5): image not found.
Could not open library 'lzma': dlopen(lzma, 5): image not found.
Could not open library 'liblzma.dylib': dlopen(liblzma.dylib, 5): image not found

The following files exist:

/usr/lib/liblzma.5.dylib
/usr/lib/liblzma.dylib -> liblzma.5.dylib

I symlinked liblzma.dylib as /usr/local/lib/liblzma.so.dylib and things then work as expected... You may want to add liblzma.5.dylib and/or liblzma.dylib to the search list, since that seems to be how they are named on OS X now.
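
A rough sketch of what a platform-aware search list could look like is below. The helper `liblzma_candidates` is hypothetical, and the exact names and their order are an assumption based on the files listed above, not the gem's actual list.

```ruby
require "rbconfig"

# Hypothetical helper: library names to try, most specific first.
# The result is meant to be handed to FFI as one array of alternatives,
# e.g. `ffi_lib liblzma_candidates` (assumption: FFI tries them in order).
def liblzma_candidates(host_os = RbConfig::CONFIG["host_os"])
  case host_os
  when /darwin/
    # Names as found in /usr/lib on OS X 10.11
    ["liblzma.5.dylib", "liblzma.dylib", "lzma"]
  else
    # Typical ELF naming on Linux and similar systems
    ["liblzma.so.5", "liblzma.so", "lzma"]
  end
end
```

Keeping the bare `lzma` name as the last entry leaves FFI free to apply its own platform naming as a fallback.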

remove initialize in lib_lzma.rb

With whatever FFI is shipping in JRuby, you can't call super with no arguments in subclasses of FFI::Struct; you must call super() explicitly. For the case where you're building an initialized struct, your fields don't match (there is no :next), so I assume that code is never hit, or it would break.

So, overall (I think), the method does nothing except break JRuby at the moment - correct me if I'm wrong!

Works fine for me when I just remove the whole overloaded initialize method.
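
For readers unfamiliar with the distinction, the difference between a bare `super` and `super()` can be shown in plain Ruby. The classes below are hypothetical stand-ins, and `Base` is not FFI::Struct; the sketch only shows what each form passes up.

```ruby
# Stand-in parent that records whatever arguments it receives.
class Base
  attr_reader :args

  def initialize(*args)
    @args = args
  end
end

# Bare `super` forwards the current method's parameters to the parent.
class ImplicitSuper < Base
  def initialize(pointer = nil)
    super
  end
end

# `super()` passes no arguments at all.
class ExplicitSuper < Base
  def initialize(pointer = nil)
    super()
  end
end
```

So `ImplicitSuper.new(1).args` is `[1]`, and even `ImplicitSuper.new.args` is `[nil]` because the defaulted parameter is still forwarded, whereas `ExplicitSuper.new(1).args` is `[]`.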

Stream (`IO`-like) mode would be great

Hi!

String-to-string decompression is great, but sometimes stream mode is very handy, for example to support .tar.xz decompression on the fly. Here is an example for gzip and plain tar files:

require 'zlib'
require 'rubygems/package'

# Pick a reader based on the file extension
stream = if fn.match(/\.gz\z/)
  Zlib::GzipReader.open(fn)
elsif fn.match(/\.tar\z/)
  File.open(fn)
else
  raise "Error: Don't know how to handle '#{fn}', aborting"
end

untar = Gem::Package::TarReader.new(stream)
untar.each do |entry|
  ...
end

Do you think liblzma would allow implementing IO-like streaming in ruby-xz, too?
