lrzip 0.651 test issue about lrzip HOT 3 OPEN

chenrui333 commented on August 16, 2024
lrzip 0.651 test issue

from lrzip.

Comments (3)

chenrui333 commented on August 16, 2024

Here is more information:

$ echo "1 2 3" > data.txt
$ lrzip -vv -o data.txt.lrz data.txt
The following options are in effect for this COMPRESSION.
Threading is ENABLED. Number of CPUs detected: 10
Detected 34359738368 bytes ram
Compression level 7
Nice Value: 19
Show Progress
Max Verbose
Output Filename Specified: data.txt.lrz
Temporary Directory set as: /var/folders/4w/t9h6qm850g395cdb8js574nc0000gn/T/
Compression mode is: LZMA. LZ4 Compressibility testing enabled
Heuristically Computed Compression Window: 218 = 21800MB
Storage time in seconds 1388011776
File size: 6
Succeeded in testing 6 sized mmap for rzip pre-processing
Will take 1 pass
Chunk size: 6
Byte width: 1
Warning, low memory for chosen compression settings
Succeeded in testing 2191720448 sized malloc for back end compression
Using up to 11 threads to compress up to 16384 bytes each.
Beginning rzip pre-processing phase
hashsize = 4194304.  bits = 22. 64MB
0 total hashes
Malloced 11453235200 for checksum ckbuf
Starting thread 0 to compress 10 bytes from stream 0
Starting thread 1 to compress 6 bytes from stream 1
Writing initial chunk bytes value 1 at 24
Writing EOF flag as 1
Writing initial header at 27
Compthread 0 seeking to 3 to store length 1
Compthread 0 seeking to 8 to write header
Thread 0 writing 10 compressed bytes from stream 0
Compthread 0 writing data at 12
Compthread 1 seeking to 7 to store length 1
Compthread 1 seeking to 22 to write header
Thread 1 writing 6 compressed bytes from stream 1
Compthread 1 writing data at 26
MD5: f2b33fb7b3d0eb95090a16060e6a24f9
matches=0 match_bytes=0
literals=2 literal_bytes=6
true_tag_positives=0 false_tag_positives=0
inserts=0 match 0.167
data.txt - Compression Ratio: 0.080. Average Compression Speed:  0.000MB/s.
Total time: 00:00:00.01

$ lrzip -vv -d data.txt.lrz
The following options are in effect for this DECOMPRESSION.
Threading is ENABLED. Number of CPUs detected: 10
Detected 34359738368 bytes ram
Compression level 7
Nice Value: 19
Show Progress
Max Verbose
Temporary Directory set as: /var/folders/4w/t9h6qm850g395cdb8js574nc0000gn/T/
Output filename is: data.txt
Detected lrzip version 0.6 file.
MD5 being used for integrity testing.
Decompressing...
Reading chunk_bytes at 24
Expected size: 6
Chunk byte width: 1
Reading eof flag at 25
EOF: 1
Reading expected chunksize at 26
Chunk size: 0
Reading stream 0 header at 28
Reading stream 1 header at 32
Reading ucomp header at 36
Fill_buffer stream 0 c_len 10 u_len 10 last_head 0
Starting thread 0 to decompress 10 bytes from stream 0
Thread 0 decompressed 10 bytes from stream 0
Taking decompressed data from thread 0
Reading ucomp header at 50
Fill_buffer stream 1 c_len 6 u_len 6 last_head 0
Starting thread 1 to decompress 6 bytes from stream 1
Thread 1 decompressed 6 bytes from stream 1
Taking decompressed data from thread 1
Closing stream at 59, want to seek to 59
Failed to pthread_mutex_lock
No such file or directory
Deleting broken file data.txt
Fatal error - exiting

pete4abw commented on August 16, 2024

WTF? Why on earth are you testing a 6 byte file? What do you think would happen? Please don't waste time here. Granted, lrzip should abandon any foolish attempt like this.

pete4abw commented on August 16, 2024

There are three bugs here.

  1. This is a Mac-related issue. It works fine on x86_64.
  2. When the chunk size is < 4096 bytes, the chunk size stored in the file is 4096 as a minimum.
  3. When chunk bytes is 1, i.e. input is < 256 bytes, the chunk size shows as 0.
    This is because 4,096 is stored in the lrzip file as 00 10 (0x1000, low byte first). When chunk bytes is 1, the 10 is truncated and only 00 remains.
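The truncation in item 3 can be reproduced in a few lines. This is only a sketch, not lrzip's code: it assumes the chunk size is serialized low byte first (as the `00 10 = 0x1000` bytes in the hex dumps suggest) and that only "chunk bytes" bytes of it are written. The helper names `byte_width`, `store_chunk_size` and `load_chunk_size` are hypothetical, and the `byte_width` rule is a guess that merely agrees with the byte widths shown in the logs.

```python
def byte_width(size: int) -> int:
    """Smallest number of bytes that can hold `size` (at least 1).

    Assumption: this matches the "Byte width" values in the logs
    (6 -> 1, 255 -> 1, 511 -> 2, 4095 -> 2, 4096 -> 2).
    """
    return max(1, (size.bit_length() + 7) // 8)


def store_chunk_size(value: int, width: int) -> bytes:
    """Serialize `value` little-endian and keep only the first `width` bytes."""
    return value.to_bytes(8, "little")[:width]


def load_chunk_size(raw: bytes) -> int:
    """Decode the stored bytes back into an integer."""
    return int.from_bytes(raw, "little")


for file_size in (4096, 511, 255, 6):
    width = byte_width(file_size)
    raw = store_chunk_size(4096, width)  # 4096 is the minimum from item 2
    print(file_size, width, raw.hex(" "), load_chunk_size(raw))
```

With a width of 2 the stored bytes are `00 10` and decode to 4096. With a width of 1 only `00` survives, and the chunk size reads back as 0.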

When compressing, lrzip seems to recognize the chunk size. The problem is that what it computes and what it stores do not match for smaller files.

See these examples:

File size: 4096
Succeeded in testing 4096 sized mmap for rzip pre-processing
Will take 1 pass
Chunk size: 4096
Byte width: 2

00000000 4c 52 5a 49 00 06 00 10 00 00 00 00 00 00 00 00 |LRZI............|
00000010 5d 00 00 00 01 01 00 00 02 01 00 10 03 00 00 00 |]...............|
02 = byte width
01 = EOF marker
00 10 = 0x1000 = 4096

File size: 4095
Succeeded in testing 4095 sized mmap for rzip pre-processing
Will take 1 pass
Chunk size: 4095
Byte width: 2

00000000 4c 52 5a 49 00 06 ff 0f 00 00 00 00 00 00 00 00 |LRZI............|
00000010 5d 00 00 00 01 01 00 00 02 01 00 10 03 00 00 00 |]...............|
02 = byte width
01 = EOF marker
00 10 = 0x1000 = 4096

File size: 511
Succeeded in testing 511 sized mmap for rzip pre-processing
Will take 1 pass
Chunk size: 511
Byte width: 2

00000000 4c 52 5a 49 00 06 ff 01 00 00 00 00 00 00 00 00 |LRZI............|
00000010 5d 00 00 00 01 01 00 00 02 01 00 10 03 00 00 00 |]...............|
02 = byte width
01 = EOF marker
00 10 = 0x1000 = 4096

File size: 255
Succeeded in testing 255 sized mmap for rzip pre-processing
Will take 1 pass
Chunk size: 255
Byte width: 1

00000000 4c 52 5a 49 00 06 ff 00 00 00 00 00 00 00 00 00 |LRZI............|
00000010 5d 00 00 00 01 01 00 00 01 01 00 03 00 00 08 03 |]...............|
01 = byte width
01 = EOF marker
00 = zero chunk size! Only one byte written to header
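The annotations above can be checked mechanically against the dumps. The sketch below reads the fields at the offsets named in the decompression log (chunk_bytes at 24, EOF flag at 25, chunk size at 26). Reading the expected size from the 8 bytes at offset 6 is an inference from these dumps, not something confirmed against the lrzip format documentation, and `parse_header` is a hypothetical helper.

```python
def parse_header(data: bytes) -> dict:
    """Pull the fields discussed in this thread out of the first 32 bytes."""
    width = data[24]  # "Reading chunk_bytes at 24"
    return {
        # Assumption: bytes 6..13 hold the expected size, little-endian.
        "expected_size": int.from_bytes(data[6:14], "little"),
        "byte_width": width,
        "eof": data[25],  # "Reading eof flag at 25"
        # "Reading expected chunksize at 26": only `width` bytes are read.
        "chunk_size": int.from_bytes(data[26:26 + width], "little"),
    }


# First 32 bytes of the 4095-byte and 255-byte examples, copied from the dumps.
HDR_4095 = bytes.fromhex(
    "4c 52 5a 49 00 06 ff 0f 00 00 00 00 00 00 00 00"
    "5d 00 00 00 01 01 00 00 02 01 00 10 03 00 00 00"
)
HDR_255 = bytes.fromhex(
    "4c 52 5a 49 00 06 ff 00 00 00 00 00 00 00 00 00"
    "5d 00 00 00 01 01 00 00 01 01 00 03 00 00 08 03"
)

for name, header in (("4095", HDR_4095), ("255", HDR_255)):
    print(name, parse_header(header))
```

For the 4095-byte file this gives a byte width of 2 and a chunk size of 4096. For the 255-byte file it gives a byte width of 1 and a chunk size of 0, matching the "Chunk size: 0" line in the decompression output.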

Interestingly, this does not seem to impact decompression on x86_64:

Decompressing...
Reading chunk_bytes at 24
Expected size: 255
Chunk byte width: 1
Reading eof flag at 25
EOF: 1
Reading expected chunksize at 26
Chunk size: 0
...
Closing stream at 60, want to seek to 60

Average DeCompression Speed:  0.000MB/s
MD5: 6df9012b2b7cb3c55963499a26309bba
Output filename is: data.txt: [OK] - 255 bytes
Total time: 00:00:00.07

I reviewed the code, and the problem seems to occur in the compthread() function in stream.c for the initial thread. It's not yet clear where the 4,096 is hard-coded.
