
compress

This package provides various compression algorithms.

  • zstandard compression and decompression in pure Go (a quick sketch follows this list).
  • S2 is a high performance replacement for Snappy.
  • Optimized deflate packages which can be used as a drop-in replacement for gzip, zip and zlib.
  • snappy is a drop-in replacement for github.com/golang/snappy offering better compression and concurrent streams.
  • huff0 and FSE implementations for raw entropy encoding.
  • gzhttp provides client and server wrappers for handling gzipped requests efficiently.
  • pgzip is a separate package that provides a very fast parallel gzip implementation.
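
As a quick illustration of the zstd package above, here is a minimal sketch of buffer-to-buffer compression and decompression (EncodeAll/DecodeAll are the package's stateless helpers; error handling is reduced to panics for brevity):

	package main

	import (
		"bytes"
		"fmt"

		"github.com/klauspost/compress/zstd"
	)

	func main() {
		// Create an encoder; a nil writer is fine when only EncodeAll is used.
		enc, err := zstd.NewWriter(nil)
		if err != nil {
			panic(err)
		}
		defer enc.Close()
		compressed := enc.EncodeAll([]byte("hello, zstd"), nil)

		// Create a decoder; a nil reader is fine when only DecodeAll is used.
		dec, err := zstd.NewReader(nil)
		if err != nil {
			panic(err)
		}
		defer dec.Close()
		out, err := dec.DecodeAll(compressed, nil)
		if err != nil {
			panic(err)
		}
		fmt.Println(bytes.Equal(out, []byte("hello, zstd"))) // true
	}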


changelog

  • Feb 5th, 2024 - v1.17.6

    • zstd: Fix incorrect repeat coding in best mode #923
    • s2: Fix DecodeConcurrent deadlock on errors #925
  • Jan 26th, 2024 - v1.17.5

    • flate: Fix reset with dictionary on custom window encodes #912
    • zstd: Add Frame header encoding and stripping #908
    • zstd: Limit better/best default window to 8MB #913
    • zstd: Speed improvements by @greatroar in #896 #910
    • s2: Fix callbacks for skippable blocks and disallow 0xfe (Padding) by @Jille in #916 #917 #919 #918
  • Dec 1st, 2023 - v1.17.4

    • huff0: Speed up symbol counting by @greatroar in #887
    • huff0: Remove byteReader by @greatroar in #886
    • gzhttp: Allow overriding decompression on transport #892
    • gzhttp: Clamp compression level #890
    • gzip: Error out if reserved bits are set #891
  • Nov 15th, 2023 - v1.17.3

    • fse: Fix max header size #881
    • zstd: Improve better/best compression #877
    • gzhttp: Fix missing content type on Close #883
  • Oct 22nd, 2023 - v1.17.2

    • zstd: Fix rare CORRUPTION output in "best" mode. See #876
  • Oct 14th, 2023 - v1.17.1

    • s2: Fix S2 "best" dictionary wrong encoding by @klauspost in #871
    • flate: Reduce allocations in decompressor and minor code improvements by @fakefloordiv in #869
    • s2: Fix EstimateBlockSize on 6&7 length input by @klauspost in #867
  • Sept 19th, 2023 - v1.17.0

    • Add experimental dictionary builder #853
    • Add xerial snappy read/writer #838
    • flate: Add limited window compression #843
    • s2: Do 2 overlapping match checks #839
    • flate: Add amd64 assembly matchlen #837
    • gzip: Copy bufio.Reader on Reset by @thatguystone in #860
See changes to v1.16.x
  • July 1st, 2023 - v1.16.7

    • zstd: Fix default level first dictionary encode #829
    • s2: add GetBufferCapacity() method by @GiedriusS in #832
  • June 13, 2023 - v1.16.6

    • zstd: correctly ignore WithEncoderPadding(1) by @ianlancetaylor in #806
    • zstd: Add amd64 match length assembly #824
    • gzhttp: Handle informational headers by @rtribotte in #815
    • s2: Improve Better compression slightly #663
  • Apr 16, 2023 - v1.16.5

    • zstd: readByte needs to use io.ReadFull by @jnoxon in #802
    • gzip: Fix WriterTo after initial read #804
  • Apr 5, 2023 - v1.16.4

    • zstd: Improve zstd best efficiency by @greatroar and @klauspost in #784
    • zstd: Respect WithAllLitEntropyCompression #792
    • zstd: Fix amd64 not always detecting corrupt data #785
    • zstd: Various minor improvements by @greatroar in #788 #794 #795
    • s2: Fix huge block overflow #779
    • s2: Allow CustomEncoder fallback #780
    • gzhttp: Support ResponseWriter Unwrap() in gzhttp handler by @jgimenez in #799
  • Mar 13, 2023 - v1.16.1

  • Feb 26, 2023 - v1.16.0

    • s2: Add Dictionary support. #685
    • s2: Add Compression Size Estimate. #752
    • s2: Add support for custom stream encoder. #755
    • s2: Add LZ4 block converter. #748
    • s2: Support io.ReaderAt in ReadSeeker. #747
    • s2c/s2sx: Use concurrent decoding. #746
See changes to v1.15.x
  • Jan 21st, 2023 (v1.15.15)

    • deflate: Improve level 7-9 by @klauspost in #739
    • zstd: Add delta encoding support by @greatroar in #728
    • zstd: Various speed improvements by @greatroar #741 #734 #736 #744 #743 #745
    • gzhttp: Add SuffixETag() and DropETag() options to prevent ETag collisions on compressed responses by @willbicks in #740
  • Jan 3rd, 2023 (v1.15.14)

    • flate: Improve speed in big stateless blocks #718
    • zstd: Minor speed tweaks by @greatroar in #716 #720
    • export NoGzipResponseWriter for custom ResponseWriter wrappers by @harshavardhana in #722
    • s2: Add example for indexing and existing stream #723
  • Dec 11, 2022 (v1.15.13)

  • Oct 26, 2022 (v1.15.12)

    • zstd: Tweak decoder allocs. #680
    • gzhttp: Always delete HeaderNoCompression #683
  • Sept 26, 2022 (v1.15.11)

    • flate: Improve level 1-3 compression #678
    • zstd: Improve "best" compression by @nightwolfz in #677
    • zstd: Fix+reduce decompression allocations #668
    • zstd: Fix non-effective noescape tag #667
  • Sept 16, 2022 (v1.15.10)

    • zstd: Add WithDecodeAllCapLimit #649
    • Add Go 1.19 - deprecate Go 1.16 #651
    • flate: Improve level 5+6 compression #656
    • zstd: Improve "better" compression #657
    • s2: Improve "best" compression #658
    • s2: Improve "better" compression. #635
    • s2: Slightly faster non-assembly decompression #646
    • Use arrays for constant size copies #659
  • July 21, 2022 (v1.15.9)

    • zstd: Fix decoder crash on amd64 (no BMI) on invalid input #645
    • zstd: Disable decoder extended memory copies (amd64) due to possible crashes #644
    • zstd: Allow single segments up to "max decoded size" by @klauspost in #643
  • July 13, 2022 (v1.15.8)

    • gzip: fix stack exhaustion bug in Reader.Read #641
    • s2: Add Index header trim/restore #638
    • zstd: Optimize seqdeq amd64 asm by @greatroar in #636
    • zstd: Improve decoder memcopy #637
    • huff0: Pass a single bitReader pointer to asm by @greatroar in #634
    • zstd: Branchless getBits for amd64 w/o BMI2 by @greatroar in #640
    • gzhttp: Remove header before writing #639
  • June 29, 2022 (v1.15.7)

    • s2: Fix absolute forward seeks #633
    • zip: Merge upstream #631
    • zip: Re-add zip64 fix #624
    • zstd: translate fseDecoder.buildDtable into asm by @WojciechMula in #598
    • flate: Faster histograms #620
    • deflate: Use compound hcode #622
  • June 3, 2022 (v1.15.6)

    • s2: Improve coding for long, close matches #613
    • s2c: Add Snappy/S2 stream recompression #611
    • zstd: Always use configured block size #605
    • zstd: Fix incorrect hash table placement for dict encoding in default #606
    • zstd: Apply default config to ZipDecompressor without options #608
    • gzhttp: Exclude more common archive formats #612
    • s2: Add ReaderIgnoreCRC #609
    • s2: Remove sanity load on index creation #607
    • snappy: Use dedicated function for scoring #614
    • s2c+s2d: Use official snappy framed extension #610
  • May 25, 2022 (v1.15.5)

    • s2: Add concurrent stream decompression #602
    • s2: Fix final emit oob read crash on amd64 #601
    • huff0: asm implementation of Decompress1X by @WojciechMula #596
    • zstd: Use 1 less goroutine for stream decoding #588
    • zstd: Copy literal in 16 byte blocks when possible #592
    • zstd: Speed up when WithDecoderLowmem(false) #599
    • zstd: faster next state update in BMI2 version of decode by @WojciechMula in #593
    • huff0: Do not check max size when reading table. #586
    • flate: Inplace hashing for level 7-9 by @klauspost in #590
  • May 11, 2022 (v1.15.4)

    • huff0: decompress directly into output by @WojciechMula in #577
    • inflate: Keep dict on stack #581
    • zstd: Faster decoding memcopy in asm #583
    • zstd: Fix ignored crc #580
  • May 5, 2022 (v1.15.3)

    • zstd: Allow to ignore checksum checking by @WojciechMula #572
    • s2: Fix incorrect seek for io.SeekEnd in #575
  • Apr 26, 2022 (v1.15.2)

    • zstd: Add x86-64 assembly for decompression on streams and blocks. Contributed by @WojciechMula. Typically 2x faster. #528 #531 #545 #537
    • zstd: Add options to ZipDecompressor and fixes #539
    • s2: Use sorted search for index #555
    • Minimum version is Go 1.16, added CI test on 1.18.
  • Mar 11, 2022 (v1.15.1)

    • huff0: Add x86 assembly of Decode4X by @WojciechMula in #512
    • zstd: Reuse zip decoders in #514
    • zstd: Detect extra block data and report as corrupted in #520
    • zstd: Handle zero sized frame content size stricter in #521
    • zstd: Add stricter block size checks in #523
  • Mar 3, 2022 (v1.15.0)

    • zstd: Refactor decoder by @klauspost in #498
    • zstd: Add stream encoding without goroutines by @klauspost in #505
    • huff0: Prevent single blocks exceeding 16 bits by @klauspost in#507
    • flate: Inline literal emission by @klauspost in #509
    • gzhttp: Add zstd to transport by @klauspost in #400
    • gzhttp: Make content-type optional by @klauspost in #510

Both compression and decompression now support "synchronous" stream operations. This means that whenever "concurrency" is set to 1, they will operate without spawning goroutines.

Asynchronous stream decompression is now faster, since the goroutine allocation splits the workload much more effectively. Typical streams will fully use 2 cores for decompression. When a stream has finished decoding, no goroutines are left over, so decoders can now safely be pooled and still be garbage collected.

While the release has been extensively tested, it is recommended to test when upgrading.
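
A hedged sketch of the synchronous mode described above, using the existing WithEncoderConcurrency/WithDecoderConcurrency options (the helper name is illustrative):

	// Synchronous zstd stream round-trip: with concurrency set to 1,
	// neither the encoder nor the decoder spawns extra goroutines.
	// Imports assumed: "bytes", "io", "github.com/klauspost/compress/zstd".
	func roundTripSync(data []byte) ([]byte, error) {
		var buf bytes.Buffer
		enc, err := zstd.NewWriter(&buf, zstd.WithEncoderConcurrency(1))
		if err != nil {
			return nil, err
		}
		if _, err := enc.Write(data); err != nil {
			return nil, err
		}
		if err := enc.Close(); err != nil {
			return nil, err
		}

		dec, err := zstd.NewReader(&buf, zstd.WithDecoderConcurrency(1))
		if err != nil {
			return nil, err
		}
		defer dec.Close()
		return io.ReadAll(dec)
	}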

See changes to v1.14.x
  • Feb 22, 2022 (v1.14.4)

    • flate: Fix rare huffman only (-2) corruption. #503
    • zip: Update deprecated CreateHeaderRaw to correctly call CreateRaw by @saracen in #502
    • zip: don't read data descriptor early by @saracen in #501 #501
    • huff0: Use static decompression buffer up to 30% faster by @klauspost in #499 #500
  • Feb 17, 2022 (v1.14.3)

    • flate: Improve fastest levels compression speed ~10% more throughput. #482 #489 #490 #491 #494 #478
    • flate: Faster decompression speed, ~5-10%. #483
    • s2: Faster compression with Go v1.18 and amd64 microarch level 3+. #484 #486
  • Jan 25, 2022 (v1.14.2)

    • zstd: improve header decoder by @dsnet #476
    • zstd: Add bigger default blocks #469
    • zstd: Remove unused decompression buffer #470
    • zstd: Fix logically dead code by @ningmingxiao #472
    • flate: Improve level 7-9 #471 #473
    • zstd: Add noasm tag for xxhash #475
  • Jan 11, 2022 (v1.14.1)

See changes to v1.13.x
  • Aug 30, 2021 (v1.13.5)

    • gz/zlib/flate: Alias stdlib errors #425
    • s2: Add block support to commandline tools #413
    • zstd: pooledZipWriter should return Writers to the same pool #426
    • Removed golang/snappy as external dependency for tests #421
  • Aug 12, 2021 (v1.13.4)

  • Aug 3, 2021 (v1.13.3)

    • zstd: Improve Best compression #404
    • zstd: Fix WriteTo error forwarding #411
    • gzhttp: Return http.HandlerFunc instead of http.Handler. Unlikely breaking change. #406
    • s2sx: Fix max size error #399
    • zstd: Add optional stream content size on reset #401
    • zstd: use SpeedBestCompression for level >= 10 #410
  • Jun 14, 2021 (v1.13.1)

    • s2: Add full Snappy output support #396
    • zstd: Add configurable Decoder window size #394
    • gzhttp: Add header to skip compression #389
    • s2: Improve speed with bigger output margin #395
  • Jun 3, 2021 (v1.13.0)

    • Added gzhttp which allows wrapping HTTP servers and clients with GZIP compressors.
    • zstd: Detect short invalid signatures #382
    • zstd: Spawn decoder goroutine only if needed. #380
See changes to v1.12.x
  • May 25, 2021 (v1.12.3)

    • deflate: Better/faster Huffman encoding #374
    • deflate: Allocate less for history. #375
    • zstd: Forward read errors #373
  • Apr 27, 2021 (v1.12.2)

    • zstd: Improve better/best compression #360 #364 #365
    • zstd: Add helpers to compress/decompress zstd inside zip files #363
    • deflate: Improve level 5+6 compression #367
    • s2: Improve better/best compression #358 #359
    • s2: Load after checking src limit on amd64. #362
    • s2sx: Limit max executable size #368
  • Apr 14, 2021 (v1.12.1)

    • snappy package removed. Upstream added as dependency.
    • s2: Better compression in "best" mode #353
    • s2sx: Add stdin input and detect pre-compressed from signature #352
    • s2c/s2d: Add http as possible input #348
    • s2c/s2d/s2sx: Always truncate when writing files #352
    • zstd: Reduce memory usage further when using WithLowerEncoderMem #346
    • s2: Fix potential problem with amd64 assembly and profilers #349
See changes to v1.11.x
  • Mar 26, 2021 (v1.11.13)

    • zstd: Big speedup on small dictionary encodes #344 #345
    • zstd: Add WithLowerEncoderMem encoder option #336
    • deflate: Improve entropy compression #338
    • s2: Clean up and minor performance improvement in best #341
  • Mar 5, 2021 (v1.11.12)

  • Mar 1, 2021 (v1.11.9)

    • s2: Add ARM64 decompression assembly. Around 2x output speed. #324
    • s2: Improve "better" speed and efficiency. #325
    • s2: Fix binaries.
  • Feb 25, 2021 (v1.11.8)

    • s2: Fixed occasional out-of-bounds write on amd64. Upgrade recommended.
    • s2: Add AMD64 assembly for better mode. 25-50% faster. #315
    • s2: Less upfront decoder allocation. #322
    • zstd: Faster "compression" of incompressible data. #314
    • zip: Fix zip64 headers. #313
  • Jan 14, 2021 (v1.11.7)

    • Use Bytes() interface to get bytes across packages. #309
    • s2: Add 'best' compression option. #310
    • s2: Add ReaderMaxBlockSize, changes s2.NewReader signature to include varargs. #311
    • s2: Fix crash on small better buffers. #308
    • s2: Clean up decoder. #312
  • Jan 7, 2021 (v1.11.6)

    • zstd: Make decoder allocations smaller #306
    • zstd: Free Decoder resources when Reset is called with a nil io.Reader #305
  • Dec 20, 2020 (v1.11.4)

    • zstd: Add Best compression mode #304
    • Add header decoder #299
    • s2: Add uncompressed stream option #297
    • Simplify/speed up small blocks with known max size. #300
    • zstd: Always reset literal dict encoder #303
  • Nov 15, 2020 (v1.11.3)

    • inflate: 10-15% faster decompression #293
    • zstd: Tweak DecodeAll default allocation #295
  • Oct 11, 2020 (v1.11.2)

    • s2: Fix out of bounds read in "better" block compression #291
  • Oct 1, 2020 (v1.11.1)

    • zstd: Set allLitEntropy true in default configuration #286
  • Sept 8, 2020 (v1.11.0)

    • zstd: Add experimental compression dictionaries #281
    • zstd: Fix mixed Write and ReadFrom calls #282
    • inflate/gz: Limit variable shifts, ~5% faster decompression #274
See changes to v1.10.x
  • July 8, 2020 (v1.10.11)

    • zstd: Fix extra block when compressing with ReadFrom. #278
    • huff0: Also populate compression table when reading decoding table. #275
  • June 23, 2020 (v1.10.10)

    • zstd: Skip entropy compression in fastest mode when no matches. #270
  • June 16, 2020 (v1.10.9):

    • zstd: API change for specifying dictionaries. See #268
    • zip: update CreateHeaderRaw to handle zip64 fields. #266
    • Fuzzit tests removed. The service has been purchased and is no longer available.
  • June 5, 2020 (v1.10.8):

    • 1.15x faster zstd block decompression. #265
  • June 1, 2020 (v1.10.7):

    • Added zstd decompression dictionary support
    • Increase zstd decompression speed up to 1.19x. #259
    • Remove internal reset call in zstd compression and reduce allocations. #263
  • May 21, 2020: (v1.10.6)

    • zstd: Reduce allocations while decoding. #258, #252
    • zstd: Stricter decompression checks.
  • April 12, 2020: (v1.10.5)

    • s2-commands: Flush output when receiving SIGINT. #239
  • Apr 8, 2020: (v1.10.4)

  • Mar 11, 2020: (v1.10.3)

    • s2: Use S2 encoder in pure Go mode for Snappy output as well. #245
    • s2: Fix pure Go block encoder. #244
    • zstd: Added "better compression" mode. #240
    • zstd: Improve speed of fastest compression mode by 5-10% #241
    • zstd: Skip creating encoders when not needed. #238
  • Feb 27, 2020: (v1.10.2)

    • Close to 50% speedup in inflate (gzip/zip decompression). #236 #234 #232
    • Reduce deflate level 1-6 memory usage up to 59%. #227
  • Feb 18, 2020: (v1.10.1)

    • Fix zstd crash when resetting multiple times without sending data. #226
    • deflate: Fix dictionary use on level 1-6. #224
    • Remove deflate writer reference when closing. #224
  • Feb 4, 2020: (v1.10.0)

    • Add optional dictionary to stateless deflate. Breaking change, send nil for previous behaviour. #216
    • Fix buffer overflow on repeated small block deflate. #218
    • Allow copying content from an existing ZIP file without decompressing+compressing. #214
    • Added S2 AMD64 assembler and various optimizations. Stream speed >10GB/s. #186
See changes prior to v1.10.0
  • Jan 20, 2020 (v1.9.8) Optimize gzip/deflate with better size estimates and faster table generation. #207 by luyu6056, #206.
  • Jan 11, 2020: S2 Encode/Decode will use provided buffer if capacity is big enough. #204
  • Jan 5, 2020: (v1.9.7) Fix another zstd regression in v1.9.5 - v1.9.6 removed.
  • Jan 4, 2020: (v1.9.6) Regression in v1.9.5 fixed causing corrupt zstd encodes in rare cases.
  • Jan 4, 2020: Faster IO in s2c + s2d commandline tools compression/decompression. #192
  • Dec 29, 2019: Removed v1.9.5 since fuzz tests showed a compatibility problem with the reference zstandard decoder.
  • Dec 29, 2019: (v1.9.5) zstd: 10-20% faster block compression. #199
  • Dec 29, 2019: zip package updated with latest Go features
  • Dec 29, 2019: zstd: Single segment flag conditions tweaked. #197
  • Dec 18, 2019: s2: Faster compression when ReadFrom is used. #198
  • Dec 10, 2019: s2: Fix repeat length output when just above the 16MB limit.
  • Dec 10, 2019: zstd: Add function to get decoder as io.ReadCloser. #191
  • Dec 3, 2019: (v1.9.4) S2: limit max repeat length. #188
  • Dec 3, 2019: Add WithNoEntropyCompression to zstd #187
  • Dec 3, 2019: Reduce memory use for tests. Check for leaked goroutines.
  • Nov 28, 2019 (v1.9.3) Less allocations in stateless deflate.
  • Nov 28, 2019: 5-20% Faster huff0 decode. Impacts zstd as well. #184
  • Nov 12, 2019 (v1.9.2) Added Stateless Compression for gzip/deflate.
  • Nov 12, 2019: Fixed zstd decompression of large single blocks. #180
  • Nov 11, 2019: Set default s2c block size to 4MB.
  • Nov 11, 2019: Reduce inflate memory use by 1KB.
  • Nov 10, 2019: Less allocations in deflate bit writer.
  • Nov 10, 2019: Fix inconsistent error returned by zstd decoder.
  • Oct 28, 2019 (v1.9.1) zstd: Fix crash when compressing blocks. #174
  • Oct 24, 2019 (v1.9.0) zstd: Fix rare data corruption #173
  • Oct 24, 2019 zstd: Fix huff0 out of buffer write #171 and always return errors #172
  • Oct 10, 2019: Big deflate rewrite, 30-40% faster with better compression #105
See changes prior to v1.9.0
  • Oct 10, 2019: (v1.8.6) zstd: Allow partial reads to get flushed data. #169
  • Oct 3, 2019: Fix inconsistent results on broken zstd streams.
  • Sep 25, 2019: Added -rm (remove source files) and -q (no output except errors) to s2c and s2d commands
  • Sep 16, 2019: (v1.8.4) Add s2c and s2d commandline tools.
  • Sep 10, 2019: (v1.8.3) Fix s2 decoder Skip.
  • Sep 7, 2019: zstd: Added WithWindowSize, contributed by ianwilkes.
  • Sep 5, 2019: (v1.8.2) Add WithZeroFrames which adds full zero payload block encoding option.
  • Sep 5, 2019: Lazy initialization of zstandard predefined en/decoder tables.
  • Aug 26, 2019: (v1.8.1) S2: 1-2% compression increase in "better" compression mode.
  • Aug 26, 2019: zstd: Check maximum size of Huffman 1X compressed literals while decoding.
  • Aug 24, 2019: (v1.8.0) Added S2 compression, a high performance replacement for Snappy.
  • Aug 21, 2019: (v1.7.6) Fixed minor issues found by fuzzer. One could lead to zstd not decompressing.
  • Aug 18, 2019: Add fuzzit continuous fuzzing.
  • Aug 14, 2019: zstd: Skip incompressible data 2x faster. #147
  • Aug 4, 2019 (v1.7.5): Better literal compression. #146
  • Aug 4, 2019: Faster zstd compression. #143 #144
  • Aug 4, 2019: Faster zstd decompression. #145 #143 #142
  • July 15, 2019 (v1.7.4): Fix double EOF block in rare cases on zstd encoder.
  • July 15, 2019 (v1.7.3): Minor speedup/compression increase in default zstd encoder.
  • July 14, 2019: zstd decoder: Fix decompression error on multiple uses with mixed content.
  • July 7, 2019 (v1.7.2): Snappy update, zstd decoder potential race fix.
  • June 17, 2019: zstd decompression bugfix.
  • June 17, 2019: fix 32 bit builds.
  • June 17, 2019: Easier use in modules (less dependencies).
  • June 9, 2019: New stronger "default" zstd compression mode. Matches zstd default compression ratio.
  • June 5, 2019: 20-40% throughput in zstandard compression and better compression.
  • June 5, 2019: deflate/gzip compression: Reduce memory usage of lower compression levels.
  • June 2, 2019: Added zstandard compression!
  • May 25, 2019: deflate/gzip: 10% faster bit writer, mostly visible in lower levels.
  • Apr 22, 2019: zstd decompression added.
  • Aug 1, 2018: Added huff0 README.
  • Jul 8, 2018: Added Performance Update 2018 below.
  • Jun 23, 2018: Merged Go 1.11 inflate optimizations. Go 1.9 is now required. Backwards compatible version tagged with v1.3.0.
  • Apr 2, 2018: Added huff0 en/decoder. Experimental for now, API may change.
  • Mar 4, 2018: Added FSE Entropy en/decoder. Experimental for now, API may change.
  • Nov 3, 2017: Add compression Estimate function.
  • May 28, 2017: Reduce allocations when resetting decoder.
  • Apr 02, 2017: Change back to official crc32, since changes were merged in Go 1.7.
  • Jan 14, 2017: Reduce stack pressure due to array copies. See Issue #18625.
  • Oct 25, 2016: Level 2-4 have been rewritten and now offers significantly better performance than before.
  • Oct 20, 2016: Port zlib changes from Go 1.7 to fix zlib writer issue. Please update.
  • Oct 16, 2016: Go 1.7 changes merged. Apples to apples this package is a few percent faster, but has a significantly better balance between speed and compression per level.
  • Mar 24, 2016: Always attempt Huffman encoding on level 4-7. This improves base 64 encoded data compression.
  • Mar 24, 2016: Small speedup for level 1-3.
  • Feb 19, 2016: Faster bit writer, level -2 is 15% faster, level 1 is 4% faster.
  • Feb 19, 2016: Handle small payloads faster in level 1-3.
  • Feb 19, 2016: Added faster level 2 + 3 compression modes.
  • Feb 19, 2016: Rebalanced compression levels, so there is a more even progression in terms of compression. New default level is 5.
  • Feb 14, 2016: Snappy: Merge upstream changes.
  • Feb 14, 2016: Snappy: Fix aggressive skipping.
  • Feb 14, 2016: Snappy: Update benchmark.
  • Feb 13, 2016: Deflate: Fixed assembler problem that could lead to sub-optimal compression.
  • Feb 12, 2016: Snappy: Added AMD64 SSE 4.2 optimizations to matching, which makes easy to compress material run faster. Typical speedup is around 25%.
  • Feb 9, 2016: Added Snappy package fork. This version is 5-7% faster, much more on hard to compress content.
  • Jan 30, 2016: Optimize level 1 to 3 by not considering static dictionary or storing uncompressed. ~4-5% speedup.
  • Jan 16, 2016: Optimization on deflate level 1,2,3 compression.
  • Jan 8 2016: Merge CL 18317: fix reading, writing of zip64 archives.
  • Dec 8 2015: Make level 1 and -2 deterministic even if write size differs.
  • Dec 8 2015: Split encoding functions, so hashing and matching can potentially be inlined. 1-3% faster on AMD64. 5% faster on other platforms.
  • Dec 8 2015: Fixed rare one byte out-of-bounds read. Please update!
  • Nov 23 2015: Optimization on token writer. ~2-4% faster. Contributed by @dsnet.
  • Nov 20 2015: Small optimization to bit writer on 64 bit systems.
  • Nov 17 2015: Fixed out-of-bound errors if the underlying Writer returned an error. See #15.
  • Nov 12 2015: Added io.WriterTo support to gzip/inflate.
  • Nov 11 2015: Merged CL 16669: archive/zip: enable overriding (de)compressors per file
  • Oct 15 2015: Added skipping on uncompressible data. Random data speed up >5x.

deflate usage

The packages are drop-in replacements for standard libraries. Simply replace the import path to use them:

old import        new import                             Documentation
compress/gzip     github.com/klauspost/compress/gzip     gzip
compress/zlib     github.com/klauspost/compress/zlib     zlib
archive/zip       github.com/klauspost/compress/zip      zip
compress/flate    github.com/klauspost/compress/flate    flate

You may also be interested in pgzip, a drop-in replacement for gzip that supports multithreaded compression of big files, and the optimized crc32 package used by these packages.

The packages contain the same functionality as the standard library, so you can use their godoc for reference: gzip, zip, zlib, flate.
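
A small sketch of the drop-in usage; compared to the standard library only the import path changes (the helper name is illustrative):

	// Imports assumed: "bytes", "io", "github.com/klauspost/compress/gzip".
	func gzipRoundTrip(data []byte) ([]byte, error) {
		var buf bytes.Buffer
		w := gzip.NewWriter(&buf)
		if _, err := w.Write(data); err != nil {
			return nil, err
		}
		if err := w.Close(); err != nil {
			return nil, err
		}
		r, err := gzip.NewReader(&buf)
		if err != nil {
			return nil, err
		}
		defer r.Close()
		return io.ReadAll(r)
	}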

Currently there is only minor speedup on decompression (mostly CRC32 calculation).

Memory usage is typically 1MB for a Writer; stdlib is in the same range. If you expect to have a lot of concurrently allocated Writers, consider using the stateless compression described below.

For compression performance, see: this spreadsheet.

To disable all assembly add -tags=noasm. This works across all packages.
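
For example, a regular build with all assembly disabled:

	go build -tags=noasm ./...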

Stateless compression

This package offers stateless compression as a special option for gzip/deflate. It will do compression but without maintaining any state between Write calls.

This means there will be no memory kept between Write calls, but compression and speed will be suboptimal.

This is only relevant in cases where you expect to run many thousands of compressors concurrently, but with very little activity. This is not intended for regular web servers serving individual requests.

Because of this, the size of actual Write calls will affect output size.

In gzip, specify level -3 / gzip.StatelessCompression to enable.

For direct deflate use, NewStatelessWriter and StatelessDeflate are available. See the documentation.
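
A minimal sketch of direct stateless deflate, assuming the NewStatelessWriter signature documented in the flate package (the helper name is illustrative):

	// Compress one block with no state retained between Write calls.
	// Imports assumed: "bytes", "github.com/klauspost/compress/flate".
	func statelessBlock(data []byte) ([]byte, error) {
		var buf bytes.Buffer
		w := flate.NewStatelessWriter(&buf)
		if _, err := w.Write(data); err != nil {
			return nil, err
		}
		if err := w.Close(); err != nil {
			return nil, err
		}
		return buf.Bytes(), nil
	}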

A bufio.Writer can of course be used to control write sizes. For example, to use a 4KB buffer:

	// replace 'ioutil.Discard' with your output.
	gzw, err := gzip.NewWriterLevel(ioutil.Discard, gzip.StatelessCompression)
	if err != nil {
		return err
	}
	defer gzw.Close()

	w := bufio.NewWriterSize(gzw, 4096)
	defer w.Flush()
	
	// Write to 'w' 

This will only use up to 4KB in memory when the writer is idle.

Compression is almost always worse than the fastest compression level and each write will allocate (a little) memory.

Performance Update 2018

It has been a while since we have been looking at the speed of this package compared to the standard library, so I thought I would re-do my tests and give some overall recommendations based on the current state. All benchmarks have been performed with Go 1.10 on my Desktop Intel(R) Core(TM) i7-2600 CPU @3.40GHz. Since I last ran the tests, I have gotten more RAM, which means tests with big files are no longer limited by my SSD.

The raw results are in my updated spreadsheet. Due to cgo changes and upstream updates I could not get the cgo version of gzip to compile. Instead I included the zstd cgo implementation. If I get cgo gzip to work again, I might replace the results in the sheet.

The columns to take note of are:

  • MB/s - the throughput.
  • Reduction - the data size reduction in percent of the original.
  • Rel Speed - relative speed compared to the standard library at the same level.
  • Smaller - how many percent smaller is the compressed output compared to stdlib. Negative means the output was bigger.
  • Loss - the loss (or gain) in compression as a percentage difference of the input.

The gzstd (standard library gzip) and gzkp (this package's gzip) only use one CPU core. pgzip and bgzf use all 4 cores. zstd uses one core, and is a beast (but not Go, yet).

Overall differences.

There appears to be a roughly 5-10% speed advantage over the standard library when comparing at similar compression levels.

The biggest difference you will see is the result of re-balancing the compression levels. I wanted my library to give a smoother transition between the compression levels than the standard library.

This package attempts to provide a smoother transition, where "1" takes a lot of shortcuts, "5" is the reasonable trade-off, "9" is "give me the best compression", and the values in between give something reasonable in between. The standard library has big differences between levels 1-4, while levels 5-9 show no significant gains, often spending a lot more time than can be justified by the achieved compression.

There are links to all the test data in the spreadsheet in the top left field on each tab.

Web Content

This test set aims to emulate typical use in a web server. The test-set is 4GB data in 53k files, and is a mixture of (mostly) HTML, JS, CSS.

Since levels 1 and 9 use close to the same code in both libraries, their results are quite close. But looking at the levels in between, the differences are quite big.

Looking at level 6, this package is 88% faster, but will output about 6% more data. For a web server, this means you can serve 88% more data, but have to pay for 6% more bandwidth. You can draw your own conclusions on what would be the most expensive for your case.

Object files

This test is for typical data files stored on a server. In this case it is a collection of Go precompiled objects. They are very compressible.

The picture is similar to the web content, but with small differences since this data is very compressible. Levels 2-3 offer good speed, but sacrifice quite a bit of compression.

The standard library seems suboptimal on levels 3 and 4, offering both worse compression and lower speed than levels 6 and 7 of this package, respectively.

Highly Compressible File

This is a JSON file with very high redundancy. The reduction starts at 95% on level 1, so in real life terms we are dealing with something like a highly redundant stream of data, etc.

It is definitely visible that we are dealing with specialized content here, so the results are very scattered. This package does not do very well at levels 1-4, but picks up significantly at level 5, with levels 7 and 8 offering great speed for the achieved compression.

So if you know your content is extremely compressible you might want to go slightly higher than the defaults. The standard library has a huge gap between levels 3 and 4 in terms of speed (2.75x slowdown), so it offers little "middle ground".

Medium-High Compressible

This is a pretty common test corpus: enwik9. It contains the first 10^9 bytes of the English Wikipedia dump on Mar. 3, 2006. This is a very good test of typical text based compression and more data heavy streams.

We see a similar picture here as in "Web Content". On equal levels some compression is sacrificed for more speed. Level 5 seems to be the best trade-off between speed and size, beating stdlib level 3 in both.

Medium Compressible

I will combine two test sets, one 10GB file set and a VM disk image (~8GB). Both contain different data types and represent a typical backup scenario.

The most notable thing is how quickly the standard library drops to very low compression speeds around level 5-6 without any big gains in compression. Since this type of data is fairly common, this does not seem like good behavior.

Un-compressible Content

This is mainly a test of how good the algorithms are at detecting un-compressible input. The standard library only offers this feature with very conservative settings at level 1. Obviously there is no reason for the algorithms to try to compress input that cannot be compressed. The only downside is that it might skip some compressible data on false detections.

Huffman only compression

This compression library adds a special compression level, named HuffmanOnly, which allows near linear time compression. This is done by completely disabling matching of previous data, and only reducing the number of bits used to represent each character.

This means that frequently used characters, like 'e' and ' ' (space) in text, use the fewest bits to represent, while rare characters like '¤' take more bits. For more information see wikipedia or this nice video.

Since this type of compression has much less variance, the compression speed is mostly unaffected by the input data, and is usually more than 180MB/s for a single core.

The downside is that the compression ratio is usually considerably worse than even the fastest conventional compression. The compression ratio can never be better than 8:1 (12.5%).

The linear time compression can be used as a "better than nothing" mode, where you cannot risk the encoder slowing down on some content. For comparison, the size of the "Twain" text is 233460 bytes (+29% vs. level 1) and encode speed is 144MB/s (4.5x level 1). So in this case you trade a 30% size increase for a 4 times speedup.

For more information see my blog post on Fast Linear Time Compression.

This is implemented in Go 1.7 as "Huffman Only" mode, though not exposed for gzip.
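
A hedged sketch of enabling it through this package's flate, using the HuffmanOnly level constant named above (the helper name is illustrative):

	// Near linear time compression: Huffman coding only, no match searching.
	// Imports assumed: "bytes", "github.com/klauspost/compress/flate".
	func huffmanOnly(data []byte) ([]byte, error) {
		var buf bytes.Buffer
		w, err := flate.NewWriter(&buf, flate.HuffmanOnly)
		if err != nil {
			return nil, err
		}
		if _, err := w.Write(data); err != nil {
			return nil, err
		}
		if err := w.Close(); err != nil {
			return nil, err
		}
		return buf.Bytes(), nil
	}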

Other packages

Here are other packages of good quality and pure Go (no cgo wrappers or autoconverted code):

license

This code is licensed under the same conditions as the original Go code. See LICENSE file.


compress's Issues

Int overflow on 32 bits arches

Golang 1.12.6 on i686 and armv7:

Testing    in: /builddir/build/BUILD/compress-1.7.0/_build/src
         PATH: /builddir/build/BUILD/compress-1.7.0/_build/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/sbin
       GOPATH: /builddir/build/BUILD/compress-1.7.0/_build:/usr/share/gocode
  GO111MODULE: off
      command: go test -buildmode pie -compiler gc -ldflags "-X github.com/klauspost/compress/version=1.7.0 -extldflags '-Wl,-z,relro -Wl,--as-needed  -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld '"
      testing: github.com/klauspost/compress
github.com/klauspost/compress
testing: warning: no tests to run
PASS
ok  	github.com/klauspost/compress	0.004s
github.com/klauspost/compress/flate
PASS
ok  	github.com/klauspost/compress/flate	21.662s
github.com/klauspost/compress/fse
FAIL	github.com/klauspost/compress/fse [build failed]
BUILDSTDERR: # github.com/klauspost/compress/fse [github.com/klauspost/compress/fse.test]
BUILDSTDERR: ./compress.go:22:13: constant 2147483648 overflows int
BUILDSTDERR: ./fse.go:130:25: constant 2147483648 overflows int

can't go get this package

Hi,
after I reinstalled go revel, I got this error:
go get github.com/revel/revel

cd e:\go\datafirst\src\github.com\klauspost\compress; git pull --ff-only

fatal: Not a git repository (or any of the parent directories): .git
package github.com/klauspost/compress/gzip: exit status 128
package github.com/klauspost/compress/zlib: cannot find package "github.com/klauspost/compress/zlib" in any of:
E:\go1.6\src\github.com\klauspost\compress\zlib (from $GOROOT)
e:\go\datafirst\src\github.com\klauspost\compress\zlib (from $GOPATH)
e:\go\gopath\src\github.com\klauspost\compress\zlib
Can you check this for me?
Thanks.

zstd: Decoder.Reset deadlock

Starting with commit 50006fb, my simple zstd test program started deadlocking. The bytes.Buffer I pass to Decoder.Reset is less than 1MB, which is what that commit optimizes for.

Thanks for your efforts on a pure-go zstd implementation!

fatal error: all goroutines are asleep - deadlock!

goroutine 1 [chan send]:
github.com/klauspost/compress/zstd.(*Decoder).Reset(0xc0000f8000, 0x114cc60, 0xc00066c030, 0x7dd3b, 0x10c52)
	/Users/aaronb/go/src/github.com/klauspost/compress/zstd/decoder.go:171 +0x1f6
main.zstdCompress(0xc000128000, 0x10c52, 0x1fe00, 0xc00009bbb0, 0x105adfc, 0x114cc80)
	/Users/aaronb/zstd.go:168 +0x1c7
main.compress(0xc000128000, 0x10c52, 0x1fe00, 0xc000016d80, 0x1, 0x1, 0x0, 0x0, 0x1)
	/Users/aaronb/zstd.go:102 +0x6f
main.main()
	/Users/aaronb/zstd.go:57 +0x23f

goroutine 4 [chan receive]:
github.com/klauspost/compress/zstd.(*blockDec).startDecoder(0xc0000922c0)
	/Users/aaronb/go/src/github.com/klauspost/compress/zstd/blockdec.go:188 +0x120
created by github.com/klauspost/compress/zstd.newBlockDec
	/Users/aaronb/go/src/github.com/klauspost/compress/zstd/blockdec.go:106 +0x155

goroutine 7 [chan receive]:
github.com/klauspost/compress/zstd.(*Decoder).startStreamDecoder(0xc0000f8000, 0xc00005a360)
	/Users/aaronb/go/src/github.com/klauspost/compress/zstd/decoder.go:379 +0x272
created by github.com/klauspost/compress/zstd.(*Decoder).Reset
	/Users/aaronb/go/src/github.com/klauspost/compress/zstd/decoder.go:154 +0x422
exit status 2

Here is a snippet of my simple test program. zstdCompress is called in a loop on different inputs. The deadlock is occurring on the line with zstdReader.Reset(buf).

type compressedResult struct {
	size           int
	compressTime   time.Duration
	decompressTime time.Duration
}

var zstdWriter, _ = zstd.NewWriter(nil, zstd.WithEncoderConcurrency(1))
var zstdReader, _ = zstd.NewReader(nil, zstd.WithDecoderConcurrency(1))

func zstdCompress(msg []byte) compressedResult {
	var r compressedResult
	buf := &bytes.Buffer{}
	t1 := time.Now()
	zstdWriter.Reset(buf)
	if _, err := zstdWriter.Write(msg); err != nil {
		panic(err)
	}
	if err := zstdWriter.Close(); err != nil {
		panic(err)
	}
	r.compressTime = time.Since(t1)
	r.size = len(buf.Bytes())

	t2 := time.Now()
	if err := zstdReader.Reset(buf); err != nil {
		panic(err)
	}
	out, err := ioutil.ReadAll(zstdReader)
	if err != nil {
		panic(err)
	}
	r.decompressTime = time.Since(t2)

	if !bytes.Equal(msg, out) {
		fmt.Println("bad decompress")
	}

	return r
}

multiple int overflows on 32 bit arches in s2_test.go

While running tests on Fedora rawhide (32), I get the following test failures:

./s2_test.go:39:4: constant 4294967285 overflows int
./s2_test.go:39:22: constant 4294967295 overflows int
./s2_test.go:40:47: constant 4294967290 overflows int
./s2_test.go:40:65: constant 4294967295 overflows int
./s2_test.go:40:83: constant 4294967295 overflows int
./s2_test.go:41:23: constant 4294967286 overflows int
./s2_test.go:42:23: constant 4294967287 overflows int
./s2_test.go:43:23: constant 4294967288 overflows int
./s2_test.go:44:23: constant 4294967289 overflows int
./s2_test.go:45:23: constant 4294967290 overflows int
./s2_test.go:45:23: too many errors

It looks similar to #133.

flate: regression on efficiency for very short strings

Using: 9d711f4

Example link: https://play.golang.org/p/3N7YRHAmGO

When compressing very short strings, the KP version of flate outputs strings larger than what the standard library did, which itself outputted strings larger than what zlib did.

Compressing the string "a" on level 6, outputs the following:

zlib: 4b0400
std:  4a04040000ffff
kp:   04c08100000000009056ff13180000ffff

Where zlib is the C library, std is the Go1.6 standard library, and kp is your library. It seems that the KP version uses a dynamic block, rather than a fixed block. If we address this change, we may want to avoid the [final, last, empty block] we currently emit (the 0x0000ffff bytes at the end). That will allow us to produce shorter outputs (like what zlib can produce).

Avoiding the [final, last, empty block] will be beneficial to https://go-review.googlesource.com/#/c/21290/

zlib decode is not "All heap memory allocations eliminated"

It only has one alloc left when I reuse the reader object:

tracealloc(0xc4205fe030, 0x10, adler32.digest)
goroutine 5 [running]:
runtime.mallocgc(0x10, 0x1162cc0, 0x1, 0x0)
	/usr/local/go/src/runtime/malloc.go:783 +0x4d3 fp=0xc4205ebc38 sp=0xc4205ebb90 pc=0x100fa63
runtime.newobject(0x1162cc0, 0x126c6a0)
	/usr/local/go/src/runtime/malloc.go:840 +0x38 fp=0xc4205ebc68 sp=0xc4205ebc38 pc=0x100ffc8
hash/adler32.New(...)
	/usr/local/go/src/hash/adler32/adler32.go:38
github.com/bronze1man/kmg/vendor/github.com/klauspost/compress/zlib.(*reader).Reset(0xc420074190, 0x126c6a0, 0xc420074140, 0x0, 0x0, 0x0, 0xfd8dd378fc5066be, 0xc4200602f8)
	/xxx/src/github.com/bronze1man/kmg/vendor/github.com/klauspost/compress/zlib/reader.go:176 +0x3df fp=0xc4205ebda0 sp=0xc4205ebc68 pc=0x113bc6f

Bug in flate?

Hello,

I am trying to use your library here https://github.com/gen2brain/raylib-go/tree/master/rres/cmd/rrem , to embed game resources in file. I have an issue only with DEFLATE (LZ4, XZ, BZIP2 etc. are ok) and both with your library and official compress/flate, not sure how they are related.

There is no issue with .wav data, for example; that one is OK after compress/uncompress. The problem is only with the image.Pix data array that I am compressing. This is what I get after uncompress: http://imagizer.imageshack.com/img924/9867/EZEAmY.png, and it should look like this: http://imagizer.imageshack.com/img924/3088/8TwKUa.png. Unfortunately, I don't have a small example to reproduce this behaviour.

Integrate golang/go#11030

Hi there,

I just found your project, and also skimmed over pgzip. Is zlib support planned, or is it just not worth it? After checking zlib's source code, it's just 150 lines of extensive use of compress/flate and hash/adler32.

P.S.: Thanks for your work! Having golang/go#11030 resolved in an external library compatible with 1.4 is awesome!

Does Go 1.7 now match this library's speed?

Hi Klaus – Go 1.7 was released last week and the release notes say that compress/flate compression speed at the default compression level has doubled.

Is this because your code was incorporated into 1.7? Is your library still faster than 1.7?

Relatedly, I only ever see DefaultCompression encoded as -1. When and how does it become the level 6 that it is supposed to be?

Thanks,

Joe

Missing BSD license

It looks like some of the files are licensed under BSD:

// Copyright 2011 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.

BSD says the following:

  1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

You are not respecting this condition, as there is no BSD license in this repository.

Depending on your intentions, you may want to do the following:

  • relicense klauspost/compress under BSD
  • Include the full license on all headers
  • update the headers and mention a LICENSE.bsd file instead, and then include LICENSE.bsd at the root of the repository.

Improve small payload performance

Noticed that after pulling master, the performance of gzip compression is now lower than the native Go implementation.

package lib

import (
    "testing"
    "bytes"
    "compress/gzip"
    ogzip "github.com/klauspost/compress/gzip"
    "fmt"
)

var bidReq = []byte(`{"id":"50215d10a41d474f77591bff601f6ade","imp":[{"id":"86df3bc6-7bd4-44d9-64e2-584a69790229","native":{"request":"{\"ver\":\"1.0\",\"plcmtcnt\":1,\"assets\":[{\"id\":1,\"data\":{\"type\":12}},{\"id\":2,\"required\":1,\"title\":{\"len\":50}},{\"id\":3,\"required\":1,\"img\":{\"type\":1,\"w\":80,\"h\":80}},{\"id\":4,\"required\":1,\"img\":{\"type\":3,\"w\":1200,\"h\":627}},{\"id\":5,\"data\":{\"type\":3}},{\"id\":6,\"required\":1,\"data\":{\"type\":2,\"len\":100}}]}","ver":"1.0"},"tagid":"1","bidfloor":0.6,"bidfloorcur":"USD"}],"site":{"id":"1012864","domain":"www.abc.com","cat":["IAB3"],"mobile":1,"keywords":"apps,games,discovery,recommendation"},"device":{"dnt":1,"ua":"Mozilla/5.0 (Linux; U; Android 4.2.2; km-kh; SHV-E120S Build/JZO54K) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30","ip":"175.100.59.170","geo":{"lat":11.5625,"lon":104.916,"country":"KHM","region":"12","city":"Phnom Penh","type":2},"carrier":"Viettel (cambodia) Pte., Ltd.","language":"km","model":"android","os":"Android","osv":"4.2.2","connectiontype":2,"devicetype":1},"user":{"id":"325a32d3-1dba-5ffc-82f2-1df428520728"},"at":2,"tmax":100,"wseat":["74","17","30","142","167","177","153","7","90","140","148","164","104","71","19","187","139","63","88","160","222","205","46"],"cur":["USD"]}`)

func BenchmarkNativeGzip(b *testing.B) {
    fmt.Println("BenchmarkNativeGzip")
    for i := 0; i < b.N; i++ {
        b := bytes.NewBuffer(nil)
        w := gzip.NewWriter(b)
        w.Write(bidReq)
        w.Close()
    }
}

func BenchmarkKlauspostGzip(b *testing.B) {
    fmt.Println("BenchmarkKlauspostGzip")
    for i := 0; i < b.N; i++ {
        b := bytes.NewBuffer(nil)
        w := ogzip.NewWriter(b)
        w.Write(bidReq)
        w.Close()
    }
}
/usr/local/Cellar/go/1.7.1/libexec/bin/go test -v github.com/kostyantyn/compressiontest/lib -bench "^BenchmarkNativeGzip|BenchmarkKlauspostGzip$" -run ^$
BenchmarkNativeGzip
    3000        387628 ns/op
BenchmarkKlauspostGzip
    3000        429190 ns/op
PASS
ok      github.com/pubnative/ad_server/lib  2.556s

zstd: Don't copy bytes when encoding

Since encoders operate on a single slice with history+input, we can avoid copying input by writing directly to the input slice of the encoder.

This means that encoderState.filling will be a slice of encoder.hist. We might need to make other changes to ensure that the input remains available even if the input is shifted down, so async block encoding still has it available.

Block encodes could just directly pass in the input, since we will not need any additional history.

Using KP Compress outside of Golang contexts

Hi Klaus – Given your recent performance update, is Compress competitive with its C-based counterparts, like zlib and libdeflate? It would be interesting to see benchmarks against them instead of the standard Go library, since zlib is the standard benchmark and libdeflate is a faster, modern reimplementation of zlib. If Compress beats them, how would we use it, for example with nginx?

Building for arm: constant 2147483648 overflows int

When building a project that uses this as a dependency, I get this error:

$ CGO_ENABLED=1 CC=arm-linux-musleabihf-gcc GOOS=linux GOARCH=arm go build
# github.com/klauspost/compress/fse
../../../../klauspost/compress/fse/compress.go:22:13: constant 2147483648 overflows int
../../../../klauspost/compress/fse/fse.go:130:25: constant 2147483648 overflows int

Similarly in zstd/frameenc.go:

../../../../klauspost/compress/zstd/frameenc.go:108:11: constant 4294967295 overflows int

snappy comparison with upstream needs updating

You are obviously free to respond however you want, but FYI the upstream github.com/golang/snappy encoder now implements an asm version of what you call matchLenSSE4, and it (call this "new") now compares favorably with the github.com/klauspost/compress/snappy version (call this "old"):

benchmark                     old MB/s     new MB/s     speedup
BenchmarkWordsEncode1e1-8     153.77       673.38       4.38x
BenchmarkWordsEncode1e2-8     217.81       428.78       1.97x
BenchmarkWordsEncode1e3-8     282.31       446.89       1.58x
BenchmarkWordsEncode1e4-8     225.73       315.17       1.40x
BenchmarkWordsEncode1e5-8     158.92       267.72       1.68x
BenchmarkWordsEncode1e6-8     206.50       311.30       1.51x
BenchmarkRandomEncode-8       4055.50      14507.66     3.58x
Benchmark_ZFlat0-8            481.82       791.69       1.64x
Benchmark_ZFlat1-8            190.36       434.39       2.28x
Benchmark_ZFlat2-8            6436.37      16301.77     2.53x
Benchmark_ZFlat3-8            368.55       632.13       1.72x
Benchmark_ZFlat4-8            3257.82      7990.39      2.45x
Benchmark_ZFlat5-8            474.40       764.96       1.61x
Benchmark_ZFlat6-8            183.83       280.09       1.52x
Benchmark_ZFlat7-8            170.28       262.54       1.54x
Benchmark_ZFlat8-8            190.70       298.19       1.56x
Benchmark_ZFlat9-8            158.43       247.14       1.56x
Benchmark_ZFlat10-8           581.40       1028.24      1.77x
Benchmark_ZFlat11-8           310.57       408.89       1.32x

For the record, here's the -tags=noasm comparison. The numbers are worse for small inputs but better for large inputs, which I'd argue is still a net improvement:

benchmark                     old MB/s     new MB/s     speedup
BenchmarkWordsEncode1e1-8     140.02       677.54       4.84x
BenchmarkWordsEncode1e2-8     224.74       86.86        0.39x
BenchmarkWordsEncode1e3-8     274.82       258.34       0.94x
BenchmarkWordsEncode1e4-8     189.95       244.60       1.29x
BenchmarkWordsEncode1e5-8     140.10       185.91       1.33x
BenchmarkWordsEncode1e6-8     169.03       211.16       1.25x
BenchmarkRandomEncode-8       3746.11      13192.30     3.52x
Benchmark_ZFlat0-8            357.12       430.88       1.21x
Benchmark_ZFlat1-8            181.27       276.50       1.53x
Benchmark_ZFlat2-8            5959.15      14075.70     2.36x
Benchmark_ZFlat3-8            312.09       171.85       0.55x
Benchmark_ZFlat4-8            2008.62      3111.51      1.55x
Benchmark_ZFlat5-8            357.46       425.45       1.19x
Benchmark_ZFlat6-8            155.59       189.98       1.22x
Benchmark_ZFlat7-8            149.70       182.01       1.22x
Benchmark_ZFlat8-8            160.04       199.81       1.25x
Benchmark_ZFlat9-8            140.87       175.73       1.25x
Benchmark_ZFlat10-8           415.88       509.88       1.23x
Benchmark_ZFlat11-8           236.50       274.77       1.16x

In any case, the regular case (without -tags=noasm) seems always faster with upstream snappy, on this limited set of benchmarks.

zstd: Custom Dictionary Compression Support

First off, thanks for writing a pure Go implementation! My team has wanted to use zstd in our project for a long time now, but we have been trying to avoid having any cgo dependencies.

We have one use-case in particular that would really benefit from the ability to train and use custom dictionaries on the fly.

Is that feature on your roadmap anytime soon? And if not, how challenging do you think it would be for me to try to upstream it? I'm happy to contribute some engineering work.

Cheers,
Richie

how to create an empty zlib.Reader?

I wrote a memory object pool to reuse some zlib.Reader objects,
but I cannot find an API to create an empty zlib.Reader.
If I pass an empty io.Reader to zlib.NewReader, it returns unexpected EOF.
Right now I can only work around this by using lazy initialization, or by passing a simple zlib-compressed output to it.

timeout in zstd/encoder_test.go:100 on ARM Cortex-A9

The whole zstd test takes over 120s total on an ARM Cortex-A9:

ok  	github.com/klauspost/compress/zstd	122.090s

I experimented with setting different timeout values in zstd/encoder_test.go:100 and found that 35 seconds is enough for this particular hardware, but you might want to increase it further to have some headroom for slower hardware.

zstd: Use single buffer for encodes, but copy data

The simplification of using a single buffer, instead of holding on to the previous one and switching between them, seems to be considerably faster.

This also makes longer windows much more feasible.

before/after:
file	out	level	insize	outsize	millis	mb/s
enwik9	zskp	1	1000000000	348027537	7499	127.16
enwik9	zskp	1	1000000000	343933099	5897	161.72

10gb.tar	zskp	1	10065157632	5001038195	58193	164.95
10gb.tar	zskp	1	10065157632	4888194207	45787	209.64

Compression rate is different at the same level old vs new

Hi, I ran some local benchmarks with different types of data and found that at the same compression level, the compression ratio differs between the old library and this one. In particular, with the old library I could achieve a given compression ratio at level 2, while with this one I need to use level 5. This makes this library actually slower than the old one if we are targeting the same compression ratio. Is this a known issue, or maybe I'm testing it wrong?

brotli support

It would be nice to have some wrapper or native implementation for golang.

Cannot be used with Modules enabled?

First, I'd like to thank you very much for your work!

I'm unable to use this library when Go Modules are enabled (using Go 1.12):

$ export GOPATH=$(mktemp -d)
$ export GO111MODULE=on

$ cat main.go
package main

import (
	. "github.com/klauspost/compress/zstd"
)

func main() {
}

$ go mod init github.com/fd0/zstdtest

$ go get github.com/klauspost/compress
go: finding github.com/klauspost/compress v1.7.0
go: downloading github.com/klauspost/compress v1.7.0
go: extracting github.com/klauspost/compress v1.7.0

$ go build
go: finding github.com/cespare/xxhash v1.1.0
go: downloading github.com/cespare/xxhash v1.1.0
go: extracting github.com/cespare/xxhash v1.1.0
go: finding github.com/OneOfOne/xxhash v1.2.2
go: finding github.com/spaolacci/murmur3 v0.0.0-20180118202830-f09979ecbc72
# github.com/klauspost/compress/zstd
/tmp/tmp.66p2ytn1dg/pkg/mod/github.com/klauspost/compress@v1.7.0/zstd/enc_fast.go:32:15: undefined: xxhash.Digest

As far as I can see the reason is that while your package hasn't adopted Go Modules yet, the xxhash package has and is already at v2.0.0 and the zstd package depends on functionality not available in versions < v2.0.0.

So the following happens:

  • In non-Go-Module mode (GOPATH-mode), the master branch of the xxhash package is used, which is at v2.0.0, so the code compiles
  • In Module mode, my program gets compress at version v1.7.0. The toolchain detects that it depends on github.com/cespare/xxhash. That library has opted in to Go Modules, under which the import path github.com/cespare/xxhash is only valid for versions v0.x.x and v1.x.x. So the toolchain selects v1.1.0. But the API used by the zstd package is only available in v2.0.0, and the build fails.

The only solution (as far as I can see) is to opt into Go Modules:

  • In Module mode, the toolchain sees the requirement for v2.0.0 of xxhash, gets it and the build just works
  • In non-Module mode, the older versions of Go (starting at 1.9.7 and 1.10.3) have been patched to support dropping the /v2 at the end of the import path, so the toolchain will just get the master branch of the xxhash package and it also works. I've verified this using Go 1.12 and Go 1.9.7. Building with older versions (< 1.9.7 or < 1.10.3) fails though, although this could maybe be fixed by the author of the xxhash package.

Please let me know if you'd like me to submit this as a pull request :)

Get rid of the ebook

Hey,

Would you mind getting rid of the ebook? (testdata/Mark.Twain-Tom.Sawyer.txt)

I am packaging compress for Debian and I don't think this book is free software ;). Maybe replace it with gibberish?

panic on Writer.Close, runtime error: index out of range

Unfortunately I don't have a test to reproduce this, but over hundreds of thousands of calls we started seeing a few of these panics:

runtime error: index out of range
goroutine 148112966 [running]:
net/http.func·011()
        /usr/lib/go/src/net/http/server.go:1130 +0xbb
github.com/klauspost/compress/flate.(*compressor).deflateNoSkip(0xc20e431600)
        _vendor/src/github.com/klauspost/compress/flate/deflate.go:594 +0xc68
github.com/klauspost/compress/flate.(*compressor).close(0xc20e431600, 0x0, 0x0)
        _vendor/src/github.com/klauspost/compress/flate/deflate.go:773 +0x49
github.com/klauspost/compress/flate.(*Writer).Close(0xc20e431600, 0x0, 0x0)
        _vendor/src/github.com/klauspost/compress/flate/deflate.go:854 

We never saw this before I updated our snapshot on Nov 2nd. The previous update was Sep 8th.

zstd: Make new frames start concurrently

Currently the start of a new frame requires the previous one to finish decoding.

Since a new frame isn't dependent on the previous one, it could be advantageous to start decoding it right away.

decompression for streaming zlib data

Hi,

May I ask whether the zlib package here supports streaming decompression of zlib data?
Imagine zlib data streaming over a TCP socket; I would like to use the zlib package to decompress the data incrementally as each read from the socket arrives. Pseudo code is below, and a runnable sketch follows it.

for {
   ...
   n, err := socketconn.Read(data)
   ...
   zlib.NewReader(...) // incrementally decompress as new data flows in
}
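
A minimal sketch of the usual pattern, assuming the connection carries a single continuous zlib stream (the address and buffer size are illustrative):

package main

import (
    "io"
    "log"
    "net"

    "github.com/klauspost/compress/zlib"
)

func main() {
    conn, err := net.Dial("tcp", "example.com:9000") // illustrative address
    if err != nil {
        log.Fatal(err)
    }
    defer conn.Close()

    // Wrap the connection once; the reader pulls compressed bytes from the
    // socket as needed and decompresses incrementally.
    zr, err := zlib.NewReader(conn)
    if err != nil {
        log.Fatal(err)
    }
    defer zr.Close()

    buf := make([]byte, 4096)
    for {
        n, err := zr.Read(buf)
        if n > 0 {
            // process buf[:n] here
        }
        if err == io.EOF {
            break
        }
        if err != nil {
            log.Fatal(err)
        }
    }
}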

Thank you !

can't compile compress/fse on 32-bit architectures

steps to reproduce

create a Go program.

package main

import _ "github.com/klauspost/compress/fse"

func main() {
}

then build it for a 32-bit architecture.

$ GOARCH=386 go build main.go 
go: finding github.com/klauspost/compress/fse latest
go: downloading github.com/klauspost/compress v1.7.0
go: extracting github.com/klauspost/compress v1.7.0
# github.com/klauspost/compress/fse
../../pkg/mod/github.com/klauspost/compress@v1.7.0/fse/compress.go:22:13: constant 2147483648 overflows int
../../pkg/mod/github.com/klauspost/compress@v1.7.0/fse/fse.go:130:25: constant 2147483648 overflows int
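
The error comes from a Go language rule: on a 32-bit GOARCH the predeclared int type is 32 bits wide, so an untyped constant of 2147483648 (1 << 31) cannot be used where an int is expected. A minimal illustration, with an explicitly sized type as one possible way around it:

package main

import "fmt"

// On GOARCH=386 this declaration fails to compile with
// "constant 2147483648 overflows int":
//
//    var bad int = 1 << 31
//
// An explicitly sized type keeps the constant in range on all architectures:
var ok uint32 = 1 << 31

func main() {
    fmt.Println(ok)
}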

SIGILL: illegal instruction

Environment:

$ uname -a
Linux test01 2.6.32-504.16.2.el6.x86_64 #1 SMP Wed Apr 22 06:48:29 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

go1.4.2

SIGILL: illegal instruction
PC=0x5fd7f2

goroutine 24 [running]:
github.com/klauspost/crc32.ieeeSSE42(0xffffffff, 0xc208123000, 0x1000, 0x1000, 0xc208187a58, 0x0, 0x0, 0x1000, 0xc208123000, 0x1000, ...)
    /home/darren/testproj/Godeps/_workspace/src/github.com/klauspost/crc32/crc32_amd64.s:122 +0x52 fp=0xc2081c2a78 sp=0xc2081c2a70
github.com/klauspost/crc32.updateIEEE(0x0, 0xc208123000, 0x1000, 0x1000, 0x7000)
    /home/darren/testproj/Godeps/_workspace/src/github.com/klauspost/crc32/crc32_amd64x.go:39 +0x99 fp=0xc2081c2ad8 sp=0xc2081c2a78
github.com/klauspost/crc32.Update(0xc200000000, 0xc20802c000, 0xc208123000, 0x1000, 0x1000, 0x1000)
    /home/darren/testproj/Godeps/_workspace/src/github.com/klauspost/crc32/crc32.go:115 +0x85 fp=0xc2081c2b10 sp=0xc2081c2ad8
github.com/klauspost/crc32.(*digest).Write(0xc20813e000, 0xc208123000, 0x1000, 0x1000, 0x1000, 0x0, 0x0)
    /home/darren/testproj/Godeps/_workspace/src/github.com/klauspost/crc32/crc32.go:121 +0x62 fp=0xc2081c2b48 sp=0xc2081c2b10
github.com/klauspost/compress/gzip.(*Reader).Read(0xc2081a8000, 0xc208123000, 0x1000, 0x1000, 0x1000, 0x0, 0x0)
    /home/darren/testproj/Godeps/_workspace/src/github.com/klauspost/compress/gzip/gunzip.go:251 +0x191 fp=0xc2081c2c18 sp=0xc2081c2b48
bufio.(*Scanner).Scan(0xc2081a4080, 0xc208152008)
    /usr/local/go/src/bufio/scan.go:180 +0x688 fp=0xc2081c2d90 sp=0xc2081c2c18
main.parseLogFile(0xc2080f0210, 0x30, 0xc2080e82c0, 0x0, 0x0)
    /home/darren/testproj/main.go:162 +0x426 fp=0xc2081c2f98 sp=0xc2081c2d90
main.func·006(0xc2080f0210, 0x30, 0xc2080e82c0)
    /home/darren/testproj/main.go:298 +0x5a fp=0xc2081c2fc8 sp=0xc2081c2f98
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:2232 +0x1 fp=0xc2081c2fd0 sp=0xc2081c2fc8
created by main.ProcessLogs
    /home/darren/testproj/main.go:299 +0x37d
...

goroutine 83 [runnable]:
main.func·003()
    /home/darren/testproj/main.go:153
created by main.parseLogFile
    /home/darren/testproj/main.go:159 +0x3eb

rax     0x1000
rbx     0xffffffff
rcx     0xfc0
rdx     0xc208123000
rdi     0x1000
rsi     0xc208123040
rbp     0xc20802c000
rsp     0xc2081c2a70
r8      0x18
r9      0x8000
r10     0x18
r11     0xc2081c4000
r12     0xc2081cbfe8
r13     0x1e
r14     0x0
r15     0x3
rip     0x5fd7f2
rflags  0x10202
cs      0x33
fs      0x0
gs      0x0

flate: incorrect encoding for level==2 and Flush calls

This program works with compress/flate but not with "github.com/klauspost/compress/flate":

package main

import (
    "bytes"
    "io/ioutil"
    "log"

    "github.com/klauspost/compress/flate"
)

func main() {
    buf := new(bytes.Buffer)

    w, err := flate.NewWriter(buf, 2)
    if err != nil {
        log.Fatal(err)
    }
    defer w.Close()

    abc := make([]byte, 128)
    for i := range abc {
        abc[i] = byte(i)
    }

    bs := [][]byte{
        bytes.Repeat(abc, 65536/len(abc)),
        abc,
    }
    for _, b := range bs {
        w.Write(b)
        w.Flush()
    }
    w.Close()

    r := flate.NewReader(buf)
    defer r.Close()
    got, err := ioutil.ReadAll(r)
    if err != nil {
        log.Fatal(err)
    }

    want := bytes.Join(bs, nil)
    if bytes.Equal(got, want) {
        return
    }
    if len(got) != len(want) {
        log.Fatalf("length: got %d, want %d", len(got), len(want))
    }
    for i, g := range got {
        if w := want[i]; g != w {
            log.Fatalf("byte #%d: got %#02x, want %#02x", i, g, w)
        }
    }
}

compress/flate: how to optimize memory allocations

I'm trying to use compress/flate and compress/gzip in the HTTP compression middleware
from gorilla/handlers:
https://github.com/gorilla/handlers/blob/master/compress.go

The payloads sent/received are relatively small, 300-500 bytes.
When I profile my code I see:

(pprof) top
Showing nodes accounting for 2920.49MB, 95.58% of 3055.68MB total
Dropped 223 nodes (cum <= 15.28MB)
Showing top 10 nodes out of 76
      flat  flat%   sum%        cum   cum%
 1298.35MB 42.49% 42.49%  2358.66MB 77.19%  compress/flate.NewWriter
  658.62MB 21.55% 64.04%  1060.31MB 34.70%  compress/flate.(*compressor).init
  451.80MB 14.79% 78.83%   451.80MB 14.79%  regexp.(*bitState).reset
  391.68MB 12.82% 91.65%   391.68MB 12.82%  compress/flate.newDeflateFast

Is it possible to minimize memory allocations/usage for this kind of payload?
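
One common pattern for this situation (not specific to this library) is to reuse writers across requests with sync.Pool and Reset instead of calling NewWriter for every response. A minimal sketch, assuming plain flate and an illustrative payload:

package main

import (
    "bytes"
    "fmt"
    "log"
    "sync"

    "github.com/klauspost/compress/flate"
)

// writerPool reuses *flate.Writer values so the large buffers allocated by
// NewWriter are not re-allocated for every small payload.
var writerPool = sync.Pool{
    New: func() interface{} {
        w, err := flate.NewWriter(nil, flate.BestSpeed)
        if err != nil {
            panic(err)
        }
        return w
    },
}

func compress(dst *bytes.Buffer, payload []byte) error {
    w := writerPool.Get().(*flate.Writer)
    defer writerPool.Put(w)

    w.Reset(dst) // point the reused writer at the new destination
    if _, err := w.Write(payload); err != nil {
        return err
    }
    return w.Close()
}

func main() {
    var buf bytes.Buffer
    if err := compress(&buf, []byte("a small 300-500 byte payload goes here")); err != nil {
        log.Fatal(err)
    }
    fmt.Println("compressed size:", buf.Len())
}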

please tag and version this project

Hello,

Can you please tag and version this project?

I am the Debian Maintainer for compress and versioning would help Debian keep up with development.

Merge into standard library

Good work with these optimizations! Is there a reason that these improvements shouldn't/wouldn't be merged into the standard library?

zstd write: panic: runtime error: slice bounds out of range

 panic: runtime error: slice bounds out of range
 
 goroutine 1830 [running]:
 github.com/klauspost/compress/zstd.(*Encoder).Write(0xc0277fdc00, 0xc03051c000, 0x9dc4, 0xf6eb, 0xc000034078, 0xc27ba0, 0xe8c7c8)
 	/src/github.com/klauspost/compress/zstd/encoder.go:128 +0x4d4

snappy/decode.go length can overflow

snappy/decode.go has this code:

x = uint(src[s-4]) | uint(src[s-3])<<8 | uint(src[s-2])<<16 | uint(src[s-1])<<24
}
// length always > 0
length = int(x + 1)

That comment isn't always true when int is 32 bits, so the subsequent "src[s:s+length]" slice expression can panic.
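
A small illustration of the wrap-around, using explicitly 32-bit types so it behaves the same on any architecture:

package main

import "fmt"

func main() {
    // The 4-byte length encoding can hold values up to 0xFFFFFFFF.
    var x uint32 = 0xFFFFFFFF

    // With 32-bit int semantics, int(x + 1) wraps to 0, so the
    // "length always > 0" comment does not hold and src[s:s+length]
    // can slice out of range.
    fmt.Println(int32(x + 1)) // prints 0

    // Values above MaxInt32 become negative after conversion.
    x = 0x80000000
    fmt.Println(int32(x)) // prints -2147483648
}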
