Comments (3)
Here is more information
$ echo "1 2 3" > data.txt
$ lrzip -vv -o data.txt.lrz data.txt
The following options are in effect for this COMPRESSION.
Threading is ENABLED. Number of CPUs detected: 10
Detected 34359738368 bytes ram
Compression level 7
Nice Value: 19
Show Progress
Max Verbose
Output Filename Specified: data.txt.lrz
Temporary Directory set as: /var/folders/4w/t9h6qm850g395cdb8js574nc0000gn/T/
Compression mode is: LZMA. LZ4 Compressibility testing enabled
Heuristically Computed Compression Window: 218 = 21800MB
Storage time in seconds 1388011776
File size: 6
Succeeded in testing 6 sized mmap for rzip pre-processing
Will take 1 pass
Chunk size: 6
Byte width: 1
Warning, low memory for chosen compression settings
Succeeded in testing 2191720448 sized malloc for back end compression
Using up to 11 threads to compress up to 16384 bytes each.
Beginning rzip pre-processing phase
hashsize = 4194304. bits = 22. 64MB
0 total hashes
Malloced 11453235200 for checksum ckbuf
Starting thread 0 to compress 10 bytes from stream 0
Starting thread 1 to compress 6 bytes from stream 1
Writing initial chunk bytes value 1 at 24
Writing EOF flag as 1
Writing initial header at 27
Compthread 0 seeking to 3 to store length 1
Compthread 0 seeking to 8 to write header
Thread 0 writing 10 compressed bytes from stream 0
Compthread 0 writing data at 12
Compthread 1 seeking to 7 to store length 1
Compthread 1 seeking to 22 to write header
Thread 1 writing 6 compressed bytes from stream 1
Compthread 1 writing data at 26
MD5: f2b33fb7b3d0eb95090a16060e6a24f9
matches=0 match_bytes=0
literals=2 literal_bytes=6
true_tag_positives=0 false_tag_positives=0
inserts=0 match 0.167
data.txt - Compression Ratio: 0.080. Average Compression Speed: 0.000MB/s.
Total time: 00:00:00.01
$ lrzip -vv -d data.txt.lrz
The following options are in effect for this DECOMPRESSION.
Threading is ENABLED. Number of CPUs detected: 10
Detected 34359738368 bytes ram
Compression level 7
Nice Value: 19
Show Progress
Max Verbose
Temporary Directory set as: /var/folders/4w/t9h6qm850g395cdb8js574nc0000gn/T/
Output filename is: data.txt
Detected lrzip version 0.6 file.
MD5 being used for integrity testing.
Decompressing...
Reading chunk_bytes at 24
Expected size: 6
Chunk byte width: 1
Reading eof flag at 25
EOF: 1
Reading expected chunksize at 26
Chunk size: 0
Reading stream 0 header at 28
Reading stream 1 header at 32
Reading ucomp header at 36
Fill_buffer stream 0 c_len 10 u_len 10 last_head 0
Starting thread 0 to decompress 10 bytes from stream 0
Thread 0 decompressed 10 bytes from stream 0
Taking decompressed data from thread 0
Reading ucomp header at 50
Fill_buffer stream 1 c_len 6 u_len 6 last_head 0
Starting thread 1 to decompress 6 bytes from stream 1
Thread 1 decompressed 6 bytes from stream 1
Taking decompressed data from thread 1
Closing stream at 59, want to seek to 59
Failed to pthread_mutex_lock
No such file or directory
Deleting broken file data.txt
Fatal error - exiting
from lrzip.
WTF? Why on earth are you testing a 6-byte file? What do you think would happen? Please don't waste time here. Granted, lrzip should abandon any foolish attempt like this.
from lrzip.
There are three bugs here.
- This is a Mac-related issue. It works fine on x86_64.
- When the chunk size is < 4096 bytes, the chunk size is reported as 4096, the minimum.
- When chunk bytes is 1, i.e. the input is < 256 bytes, the chunk size is reported as 0.
This is because 4,096 is stored in the lrzip file as 00 10. When chunk bytes is 1, the 10 is truncated and only 00 remains.
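The truncation can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, not lrzip's code: it assumes the chunk size is serialized least-significant byte first (as the hex dumps in this thread suggest) and that only the first chunk-bytes bytes reach the file. The function name is hypothetical.

```python
def stored_chunk_size(value: int, chunk_bytes: int) -> int:
    """Return what a reader recovers when only the first `chunk_bytes`
    little-endian bytes of `value` are written (assumed behavior)."""
    full = value.to_bytes(8, "little")   # 4096 -> 00 10 00 00 00 00 00 00
    written = full[:chunk_bytes]         # the bytes that reach the file
    return int.from_bytes(written, "little")


print(stored_chunk_size(4096, 2))  # byte width 2 keeps 00 10
print(stored_chunk_size(4096, 1))  # byte width 1 keeps only 00
```

With a byte width of 2 the value survives as 4096; with a byte width of 1 only the low byte is kept, so the reader sees 0.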
When compressing, lrzip seems to recognize the chunk size. The problem is that, for smaller files, what it computes and what it stores do not match.
See these examples:
File size: 4096
Succeeded in testing 4096 sized mmap for rzip pre-processing
Will take 1 pass
Chunk size: 4096
Byte width: 2
00000000 4c 52 5a 49 00 06 00 10 00 00 00 00 00 00 00 00 |LRZI............|
00000010 5d 00 00 00 01 01 00 00 02 01 00 10 03 00 00 00 |]...............|
02 = byte width
01 = EOF marker
00 10 = 0x1000 = 4096
File size: 4095
Succeeded in testing 4095 sized mmap for rzip pre-processing
Will take 1 pass
Chunk size: 4095
Byte width: 2
00000000 4c 52 5a 49 00 06 ff 0f 00 00 00 00 00 00 00 00 |LRZI............|
00000010 5d 00 00 00 01 01 00 00 02 01 00 10 03 00 00 00 |]...............|
02 = byte width
01 = EOF marker
00 10 = 0x1000 = 4096
File size: 511
Succeeded in testing 511 sized mmap for rzip pre-processing
Will take 1 pass
Chunk size: 511
Byte width: 2
00000000 4c 52 5a 49 00 06 ff 01 00 00 00 00 00 00 00 00 |LRZI............|
00000010 5d 00 00 00 01 01 00 00 02 01 00 10 03 00 00 00 |]...............|
02 = byte width
01 = EOF marker
00 10 = 0x1000 = 4096
File size: 255
Succeeded in testing 255 sized mmap for rzip pre-processing
Will take 1 pass
Chunk size: 255
Byte width: 1
00000000 4c 52 5a 49 00 06 ff 00 00 00 00 00 00 00 00 00 |LRZI............|
00000010 5d 00 00 00 01 01 00 00 01 01 00 03 00 00 08 03 |]...............|
01 = byte width
01 = EOF marker
00 = zero chunk size! Only one byte written to header
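The dumps above can also be decoded mechanically. The following is a rough sketch, not lrzip's parser: the field offsets (chunk bytes at 24, EOF flag at 25, chunk size at 26) are taken from the decompression log in this thread, while the 8-byte little-endian expected size at offset 6 is inferred from the dumps. The function and constant names are hypothetical.

```python
def parse_dump(hex_bytes: str) -> dict:
    """Decode the header fields discussed in this thread from a hex string."""
    data = bytes.fromhex(hex_bytes)
    chunk_bytes = data[24]  # "Reading chunk_bytes at 24"
    return {
        "magic": data[0:4].decode("ascii"),
        "version": (data[4], data[5]),
        # Assumption: expected size is 8 bytes, little-endian, at offset 6.
        "expected_size": int.from_bytes(data[6:14], "little"),
        "chunk_bytes": chunk_bytes,
        "eof": data[25],  # "Reading eof flag at 25"
        # "Reading expected chunksize at 26": only chunk_bytes bytes are read.
        "chunk_size": int.from_bytes(data[26:26 + chunk_bytes], "little"),
    }


# First 32 bytes of the 4095-byte and 255-byte examples above.
DUMP_4095 = (
    "4c 52 5a 49 00 06 ff 0f 00 00 00 00 00 00 00 00 "
    "5d 00 00 00 01 01 00 00 02 01 00 10 03 00 00 00"
)
DUMP_255 = (
    "4c 52 5a 49 00 06 ff 00 00 00 00 00 00 00 00 00 "
    "5d 00 00 00 01 01 00 00 01 01 00 03 00 00 08 03"
)

for name, dump in (("4095", DUMP_4095), ("255", DUMP_255)):
    fields = parse_dump(dump)
    print(name, fields["expected_size"], fields["chunk_bytes"], fields["chunk_size"])
```

For the 4095-byte file the stored chunk size decodes to 4096 rather than 4095, and for the 255-byte file it decodes to 0, which matches the two observations listed earlier.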
Interestingly, this does not seem to impact decompression:
Decompressing...
Reading chunk_bytes at 24
Expected size: 255
Chunk byte width: 1
Reading eof flag at 25
EOF: 1
Reading expected chunksize at 26
Chunk size: 0
...
Closing stream at 60, want to seek to 60
Average DeCompression Speed: 0.000MB/s
MD5: 6df9012b2b7cb3c55963499a26309bba
Output filename is: data.txt: [OK] - 255 bytes
Total time: 00:00:00.07
I reviewed the code and the problem seems to occur in the compthread() function in stream.c for the initial thread. It's not clear right now where the 4,096 is coded in.
from lrzip.