Comments (2)
This is an interesting bug! Here's what's happening:
- We read the first file's blocks successfully, and get to the index of the first file
- When reading the index, we read in big chunks of data: https://github.com/vasi/pixz/blob/master/src/read.c#L328 . This leaves our read buffer with a lot of data in it.
- Reading the index + footer of the first file, and the header and first block of the second file, just consumes from the read buffer. That's fine.
- But when we dispatch the first block of the second file to a decompressor thread, we pass along the entire read buffer, even though it contains more data than that block needs. The next block then never gets its data and blows up.
So this bug only happens when there are small, concatenated files. Fun!
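To make the failure mode concrete, here is a minimal sketch of the buffer handoff described above. It is not pixz's actual code (which is C); the names dispatch_buggy, dispatch_fixed and read_blocks are hypothetical, and the "blocks" are plain byte strings with known sizes rather than real xz blocks.

```python
def dispatch_buggy(read_buffer: bytearray, block_size: int) -> bytes:
    """Hypothetical model of the bug: hand the worker the entire buffer."""
    handed_off = bytes(read_buffer)
    read_buffer.clear()  # bytes belonging to later blocks are gone too
    return handed_off


def dispatch_fixed(read_buffer: bytearray, block_size: int) -> bytes:
    """Hypothetical model of a fix: hand off only this block's bytes."""
    handed_off = bytes(read_buffer[:block_size])
    del read_buffer[:block_size]  # leftover bytes stay for the next block
    return handed_off


def read_blocks(data: bytes, block_sizes: list[int], dispatch) -> list[bytes]:
    """Simulate one big read followed by dispatching each block in turn."""
    read_buffer = bytearray(data)  # the large read filled the buffer at once
    return [dispatch(read_buffer, size) for size in block_sizes]


if __name__ == "__main__":
    data = b"AAAABBB"  # two small "blocks" sitting in one read buffer
    print(read_blocks(data, [4, 3], dispatch_buggy))
    print(read_blocks(data, [4, 3], dispatch_fixed))
```

In the buggy variant the first worker receives both blocks' bytes and the second receives nothing, which mirrors the "next block doesn't get that data" symptom. With large files the buffer rarely holds a whole extra block, which would explain why only small concatenated files trigger it.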
from pixz.
Should be fixed, please give it a try. You can run:
cat f1.xz f2.xz f3.xz | pixz -d > output.txt
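If you don't have suitable inputs at hand, the following sketch builds a small concatenated .xz stream with Python's standard lzma module. The payloads and the helper name make_concatenated_xz are made up for illustration; the only point is to produce several tiny xz streams back to back, which is the case described above.

```python
import lzma


def make_concatenated_xz(payloads: list[bytes]) -> bytes:
    """Compress each payload as its own .xz stream and join the streams."""
    streams = [lzma.compress(p, format=lzma.FORMAT_XZ) for p in payloads]
    return b"".join(streams)


# Hypothetical sample contents standing in for f1, f2 and f3.
payloads = [b"first file\n", b"second file\n", b"third file\n"]
concatenated = make_concatenated_xz(payloads)
expected = b"".join(payloads)

# Python's lzma.decompress handles concatenated streams, so it can serve
# as a reference for what a correct decompressor should output.
roundtrip = lzma.decompress(concatenated)

if __name__ == "__main__":
    print(roundtrip == expected)
```

Writing concatenated to a file and piping it through pixz -d should then give the same bytes as expected, assuming the fix works as intended.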
Note that pixz isn't really better than xz for compressing/decompressing lots of individual small files. We can only really use parallelization with large files (including tarballs that contain lots of small files).