Comments (3)
I wonder how it knows to consume that final empty block without peeking ahead (or does it?)
TBH I haven't looked into it. It could be that it is trying to always have data decompressed, so it decodes a block as soon as there is no more data available.
from compress.
The main issue is that io.ReadFull doesn't read until EOF if the buffer can be filled.
If you change the last part to be...
decomp, err := io.ReadAll(zstdReader)
if err != nil && err != io.ErrUnexpectedEOF {
	panic(err)
}
if sampleLen != len(decomp) {
	panic(fmt.Sprintf("Zstd Length mismatch %d != %d", sampleLen, len(decomp)))
}
if !bytes.Equal(decompGozstd[:sampleLen], decomp[:sampleLen]) {
	panic("Decoded data doesn't match")
}
Everything will be read from the input.
My guess is that "valyala" adds an empty block with a Last_Block identifier. This matches the 3 bytes you see. I try to set this identifier on the actual last block instead, so as not to waste 3 bytes.
Since there is enough data to satisfy your request without decoding more, it will not decode more blocks until the data is actually needed.
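The difference between the two calls can be reproduced with the standard library alone. The sketch below is only an illustration: `lazyDecoder`, `newLazyDecoder`, `trailerLen` and the two `leftoverAfter...` helpers are hypothetical names, and the decoder is a toy that mimics the lazy behaviour described above, not the actual zstd decoder from either package.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// trailerLen is the size of the hypothetical empty final block.
const trailerLen = 3

// lazyDecoder is a hypothetical stand-in for a streaming decompressor.
// It returns payload bytes on demand and consumes the trailer from src
// only when Read is called after the payload has been delivered.
type lazyDecoder struct {
	src       *bytes.Reader // "compressed" input: payload followed by trailer
	remaining int           // payload bytes not yet returned
}

func newLazyDecoder(payload []byte) *lazyDecoder {
	input := append(append([]byte{}, payload...), make([]byte, trailerLen)...)
	return &lazyDecoder{src: bytes.NewReader(input), remaining: len(payload)}
}

func (d *lazyDecoder) Read(p []byte) (int, error) {
	if d.remaining == 0 {
		// Payload is done: consume the trailing block, then report EOF.
		if _, err := io.CopyN(io.Discard, d.src, trailerLen); err != nil && err != io.EOF {
			return 0, err
		}
		return 0, io.EOF
	}
	if len(p) > d.remaining {
		p = p[:d.remaining]
	}
	n, err := d.src.Read(p)
	d.remaining -= n
	return n, err
}

// leftoverAfterReadFull fills a buffer of exactly len(payload) bytes and
// reports how many input bytes were never consumed.
func leftoverAfterReadFull(payload []byte) int {
	d := newLazyDecoder(payload)
	buf := make([]byte, len(payload))
	if _, err := io.ReadFull(d, buf); err != nil {
		panic(err)
	}
	return d.src.Len()
}

// leftoverAfterReadAll reads until EOF and reports the unconsumed input.
func leftoverAfterReadAll(payload []byte) int {
	d := newLazyDecoder(payload)
	if _, err := io.ReadAll(d); err != nil {
		panic(err)
	}
	return d.src.Len()
}

func main() {
	payload := []byte("hello")
	fmt.Println("io.ReadFull leaves unread:", leftoverAfterReadFull(payload)) // 3
	fmt.Println("io.ReadAll leaves unread:", leftoverAfterReadAll(payload))   // 0
}
```

io.ReadFull returns as soon as the buffer is full, so the toy decoder is never asked for more and the trailer stays in the input. io.ReadAll keeps calling Read until it sees io.EOF, which is what triggers the final consume.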
You can "reintroduce" the issue by doing
decomp, err := io.ReadAll(&io.LimitedReader{R: zstdReader, N: int64(sampleLen)})
since the LimitedReader will stop reading upstream once N has been reached.
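The LimitedReader effect can be observed without any compression library. This is a minimal sketch, assuming a plain strings.Reader as the upstream; `eofTracker` and `upstreamSawEOF` are hypothetical helpers that only record whether the upstream reader was ever read to EOF.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// eofTracker wraps a reader and remembers whether it has returned io.EOF.
type eofTracker struct {
	r      io.Reader
	sawEOF bool
}

func (t *eofTracker) Read(p []byte) (int, error) {
	n, err := t.r.Read(p)
	if err == io.EOF {
		t.sawEOF = true
	}
	return n, err
}

// upstreamSawEOF reads everything via io.ReadAll, optionally through an
// io.LimitedReader whose N equals the data length, and reports whether
// the upstream reader was driven all the way to EOF.
func upstreamSawEOF(limit bool) bool {
	const data = "hello"
	up := &eofTracker{r: strings.NewReader(data)}
	var r io.Reader = up
	if limit {
		r = &io.LimitedReader{R: up, N: int64(len(data))}
	}
	if _, err := io.ReadAll(r); err != nil {
		panic(err)
	}
	return up.sawEOF
}

func main() {
	fmt.Println("with LimitedReader, upstream saw EOF:", upstreamSawEOF(true))     // false
	fmt.Println("without LimitedReader, upstream saw EOF:", upstreamSawEOF(false)) // true
}
```

Once N reaches zero, the LimitedReader answers io.EOF itself and never calls the upstream Read again, so anything the upstream would have consumed on that last call stays unread.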
It is, however, not my impression that the ZipDecompressor forces this and creates the problem by itself.
That all makes perfect sense, thanks so much for quickly responding. Since for my application I know the exact upper bound length of the decompressed data, I can easily work around this by just ensuring the read buffer is strictly longer than that upper bound.
Given your explanation, the most surprising thing is that the gozstd-generated io.Reader does consistently consume the input stream to EOF when used with io.ReadFull(). I wonder how it knows to consume that final empty block without peeking ahead (or does it?)
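The workaround of using a read buffer strictly longer than the known upper bound could look roughly like the sketch below. `readBounded`, `readBoundedString` and `errTooLarge` are hypothetical names, and a strings.Reader stands in for the decompressing reader; the only assumption about the real reader is that it reports a plain io.EOF at end of stream.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// errTooLarge is a hypothetical error for data longer than the bound.
var errTooLarge = errors.New("data exceeds the expected upper bound")

// readBounded reads at most upperBound bytes from r. The buffer is one
// byte longer than the bound, so io.ReadFull can never fill it with
// valid data and has to keep reading until r reports EOF.
func readBounded(r io.Reader, upperBound int) ([]byte, error) {
	buf := make([]byte, upperBound+1)
	n, err := io.ReadFull(r, buf)
	switch err {
	case io.EOF, io.ErrUnexpectedEOF:
		// The stream ended before the buffer was full: the expected case.
		return buf[:n], nil
	case nil:
		// The oversized buffer was filled, so the bound was wrong.
		return nil, errTooLarge
	default:
		return nil, err
	}
}

// readBoundedString is a small convenience wrapper used for the demo.
func readBoundedString(s string, upperBound int) (string, error) {
	b, err := readBounded(strings.NewReader(s), upperBound)
	return string(b), err
}

func main() {
	got, err := readBoundedString("hello", 5)
	fmt.Printf("%q %v\n", got, err)

	_, err = readBoundedString("hello!", 5)
	fmt.Println(err)
}
```

Treating io.ErrUnexpectedEOF as success is deliberate here: with an oversized buffer it is the signal that the reader was driven to EOF, which is the whole point of the workaround.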