
Zstandard - Fast real-time compression algorithm

Home Page: http://www.zstd.net

License: Other



Zstandard

Zstandard, or zstd for short, is a fast lossless compression algorithm, targeting real-time compression scenarios at zlib-level and better compression ratios. It is backed by a very fast entropy stage, provided by the Huff0 and FSE library.

Zstandard's format is stable and documented in RFC8878. Multiple independent implementations are already available. This repository represents the reference implementation, provided as an open-source dual BSD OR GPLv2 licensed C library, and a command line utility producing and decoding .zst, .gz, .xz and .lz4 files. Should your project require another programming language, a list of known ports and bindings is provided on the Zstandard homepage.
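As a quick illustration of the CLI described above (file names are hypothetical, and handling of .gz/.xz/.lz4 requires zstd to have been built with the corresponding library support):

```shell
# Produce and decode .zst files
zstd -f file.txt -o file.txt.zst
zstd -d -f file.txt.zst -o file.out

# Decode a gzip file, if zlib support was compiled in
zstd -d -f archive.gz -o archive.out
```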


Benchmarks

For reference, several fast compression algorithms were tested and compared on a desktop featuring a Core i7-9700K CPU @ 4.9GHz and running Ubuntu 20.04 (Linux ubu20 5.15.0-101-generic), using lzbench, an open-source in-memory benchmark by @inikep compiled with gcc 9.4.0, on the Silesia compression corpus.

Compressor name Ratio Compression Decompress.
zstd 1.5.6 -1 2.887 510 MB/s 1580 MB/s
zlib 1.2.11 -1 2.743 95 MB/s 400 MB/s
brotli 1.0.9 -0 2.702 395 MB/s 430 MB/s
zstd 1.5.6 --fast=1 2.437 545 MB/s 1890 MB/s
zstd 1.5.6 --fast=3 2.239 650 MB/s 2000 MB/s
quicklz 1.5.0 -1 2.238 525 MB/s 750 MB/s
lzo1x 2.10 -1 2.106 650 MB/s 825 MB/s
lz4 1.9.4 2.101 700 MB/s 4000 MB/s
lzf 3.6 -1 2.077 420 MB/s 830 MB/s
snappy 1.1.9 2.073 530 MB/s 1660 MB/s

The negative compression levels, specified with --fast=#, offer faster compression and decompression speed at the cost of compression ratio.

Zstd can also offer stronger compression ratios at the cost of compression speed. Speed vs Compression trade-off is configurable by small increments. Decompression speed is preserved and remains roughly the same at all settings, a property shared by most LZ compression algorithms, such as zlib or lzma.
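For example, the trade-off is selected with the level flags of the CLI (file names here are placeholders, and the levels are arbitrary picks, not tied to the benchmark above):

```shell
zstd --fast=3 -f data.bin -o data.fast.zst   # negative level: speed over ratio
zstd -3 -f data.bin -o data.default.zst      # default level is 3
zstd -19 -f data.bin -o data.high.zst        # stronger ratio, slower compression
zstd -d -f data.high.zst -o data.out         # decompression speed stays roughly constant
```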

The following tests were run on a server running Linux Debian (Linux version 4.14.0-3-amd64) with a Core i7-6700K CPU @ 4.0GHz, using lzbench, an open-source in-memory benchmark by @inikep compiled with gcc 7.3.0, on the Silesia compression corpus.

[Graphs: Compression Speed vs Ratio, and Decompression Speed]

A few other algorithms can produce higher compression ratios at slower speeds, falling outside of the graph. For a larger picture including slow modes, click on this link.

The case for Small Data compression

Previous charts provide results applicable to typical file and stream scenarios (several MB). Small data comes with different perspectives.

The smaller the amount of data to compress, the more difficult it is to compress. This problem is common to all compression algorithms: they learn from past data how to compress future data, but at the beginning of a new data set, there is no "past" to build upon.

To solve this situation, Zstd offers a training mode, which can be used to tune the algorithm for a selected type of data. Training Zstandard is achieved by providing it with a few samples (one file per sample). The result of this training is stored in a file called "dictionary", which must be loaded before compression and decompression. Using this dictionary, the compression ratio achievable on small data improves dramatically.

The following example uses the github-users sample set, created from the GitHub public API. It consists of roughly 10K records weighing about 1KB each.

[Graphs: Compression Ratio, Compression Speed, and Decompression Speed]

These compression gains are achieved while simultaneously providing faster compression and decompression speeds.

Training works if there is some correlation in a family of small data samples. The more data-specific a dictionary is, the more efficient it is (there is no universal dictionary). Hence, deploying one dictionary per type of data will provide the greatest benefits. Dictionary gains are mostly effective in the first few KB. Then, the compression algorithm will gradually use previously decoded content to better compress the rest of the file.

Dictionary compression How To:

  1. Create the dictionary

    zstd --train FullPathToTrainingSet/* -o dictionaryName

  2. Compress with dictionary

    zstd -D dictionaryName FILE

  3. Decompress with dictionary

    zstd -D dictionaryName --decompress FILE.zst
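Putting the three steps together, a complete round trip might look like the sketch below (the samples/ directory and file names are hypothetical; any directory of many small, similar files will do):

```shell
# 'samples/' holds many small files of the same type
zstd --train samples/* -o my.dict                    # 1. train a dictionary
zstd -D my.dict -f samples/one.json                  # 2. compress with the dictionary
zstd -D my.dict -d -f samples/one.json.zst -o one.decoded.json   # 3. decompress
cmp samples/one.json one.decoded.json                # verify the round trip
```

Note that the same dictionary must be available on both sides: a file compressed with -D can only be decompressed with that dictionary.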

Build instructions

make is the officially maintained build system of this project. All other build systems are "compatible" and 3rd-party maintained; they may feature small differences in advanced options. When your system allows it, prefer using make to build zstd and libzstd.

Makefile

If your system is compatible with standard make (or gmake), invoking make in the root directory will generate the zstd CLI in the root directory. It will also create libzstd in lib/.

Other available options include:

  • make install : create and install zstd cli, library and man pages
  • make check : create and run zstd, and test its behavior on the local platform

The Makefile follows the GNU Standard Makefile conventions, allowing staged install, standard flags, directory variables and command variables.
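For instance, a staged install under those conventions could look like the following sketch (directory names are examples, and exact variable support is defined by the project's Makefile):

```shell
make                                                # build zstd and libzstd
sudo make install                                   # installs under /usr/local by default
make install PREFIX=/usr/local DESTDIR=/tmp/stage   # staged install, e.g. for packaging
# the staged tree then mirrors the final layout:
# /tmp/stage/usr/local/bin/zstd, /tmp/stage/usr/local/lib/..., etc.
```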

For advanced use cases, specialized compilation flags which control binary generation are documented in lib/README.md for the libzstd library and in programs/README.md for the zstd CLI.

cmake

A cmake project generator is provided within build/cmake. It can generate Makefiles or other build scripts to create the zstd binary, and the libzstd dynamic and static libraries.

By default, CMAKE_BUILD_TYPE is set to Release.
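A typical out-of-source build from the repository root might then look like this (the build-cmake directory name is arbitrary):

```shell
cmake -S build/cmake -B build-cmake -DCMAKE_BUILD_TYPE=Release
cmake --build build-cmake
sudo cmake --install build-cmake   # optional system-wide install
```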

Support for Fat (Universal2) Output

zstd can be built and installed with support for both Apple Silicon (M1/M2) and Intel by using CMake's Universal2 support. To perform a Fat/Universal2 build and install, use the following commands:

cmake -B build-cmake-debug -S build/cmake -G Ninja -DCMAKE_OSX_ARCHITECTURES="x86_64;x86_64h;arm64"
cd build-cmake-debug
ninja
sudo ninja install

Meson

A Meson project is provided within build/meson. Follow build instructions in that directory.

You can also take a look at the .travis.yml file for an example of how Meson is used to build this project.

Note that default build type is release.
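The usual Meson invocation from the repository root would be something like the following (builddir is a placeholder name):

```shell
meson setup builddir build/meson   # configure; build type defaults to release here
meson compile -C builddir
meson install -C builddir          # optional
```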

VCPKG

You can build and install zstd using the vcpkg dependency manager:

git clone https://github.com/Microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.sh
./vcpkg integrate install
./vcpkg install zstd

The zstd port in vcpkg is kept up to date by Microsoft team members and community contributors. If the version is out of date, please create an issue or pull request on the vcpkg repository.

Visual Studio (Windows)

Going into the build directory, you will find additional possibilities:

  • Projects for Visual Studio 2005, 2008 and 2010.
    • VS2010 project is compatible with VS2012, VS2013, VS2015 and VS2017.
  • Automated build scripts for Visual compiler by @KrzysFR, in build/VS_scripts, which will build zstd cli and libzstd library without any need to open Visual Studio solution.

Buck

You can build the zstd binary via buck by executing: buck build programs:zstd from the root of the repo. The output binary will be in buck-out/gen/programs/.

Bazel

You can easily integrate zstd into your Bazel project by using the module hosted on the Bazel Central Repository.

Testing

You can run quick local smoke tests by running make check. If you can't use make, execute the playTest.sh script from the src/tests directory. Two environment variables, $ZSTD_BIN and $DATAGEN_BIN, are needed for the test script to locate the zstd and datagen binaries. For information on CI testing, please refer to TESTING.md.
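For example, when running the script directly, the invocation might look like this (the paths below are placeholders for wherever the two binaries were built):

```shell
export ZSTD_BIN=/path/to/zstd
export DATAGEN_BIN=/path/to/datagen
cd src/tests && sh playTest.sh
```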

Status

Zstandard is currently deployed within Facebook and many other large cloud infrastructures. It is run continuously to compress large amounts of data in multiple formats and use cases. Zstandard is considered safe for production environments.

License

Zstandard is dual-licensed under BSD OR GPLv2.

Contributing

The dev branch is the one where all contributions are merged before reaching release. If you plan to propose a patch, please commit into the dev branch, or its own feature branch. Direct commits to release are not permitted. For more information, please read CONTRIBUTING.


zstd's Issues

Doesn't compile with GCC 4.4

Error due to type redefinition.
Original discussion : #24

Note : it's not an issue for later versions of GCC and clang, because the type redefinition defines exactly the same type. But earlier GCC versions nonetheless consider it an error.

Segfault during decompression

Hello! Thanks for a great library!
I've recently encountered a problem while decoding data generated by this script:

#!/usr/bin/env python

import sys

n = int(sys.argv[1])

for i in range(0, n):
    sys.stdout.write(chr(i + ord('a')) * (2**i))

When I run:

  ./generate.py 20 > input 
  ./zstd -f input compressed  # Compressed 1048575 bytes into 235 bytes ==> 0.02%
  ./zstd -f -d compressed decompressed # Segmentation fault

I get segmentation fault.

If I replace 20 with 19 I get:

  ./generate.py 19 > input 
  ./zstd -f input compressed  # Compressed 524287 bytes into 159 bytes ==> 0.03%
  ./zstd -f -d compressed decompressed # Decoded 262144 bytes

Which is weird because decompressed input size doesn't match original input size.

I faced this situation on OSX 10.10 and on Ubuntu 12.04 with both Clang and GCC 4.9

Standard names for compression levels

I'm currently updating my wrapper for the new API introduced in 0.4.x, and I'm merging all the methods with and without compression levels into a single set of methods.

Asking the caller to pass an untyped integer value for the compression level seems a bit dangerous: most people will probably remember zlib, and expect a 1 - 9 scale with default at 5. That's why I would like to have an enum with well known names, and a clear default "if you don't know better use that one" level.

Quick questions:

  • Is the default level still intended to be level 1? or is the whole zstd vs zstd_hc a thing of the past?
  • What about level 0? The comment in zstd_static.h says that level 0 is "never used", but quick tests show that using compression level 0 works fine (and compresses about the same as level 1)
  • Are there plans to have standardized names for some compression levels, like "fast", "high", "ultra" and so on? And if yes, what would be the values?
  • Right now max level is 20 but it seems that it can change. If yes, do you plan to have some way to, at runtime, probe for the range of supported compression levels?

exit nonzero upon write failure

Report by Jim Meyering :

Please make it diagnose and exit nonzero upon write failure. A good way to demonstrate the problem is to use linux's /dev/full device:
$ echo foo | programs/zstd > /dev/full; echo $?

lost 20% of decompression speed in v0.4

It seems that you have lost 20% of decompression speed in v0.4:

Compressor name Compression Decompress. Compr. size Ratio
zstd_HC v0.3.6 level 1 250 MB/s 529 MB/s 51230550 48.86
zstd_HC v0.3.6 level 2 186 MB/s 498 MB/s 49678572 47.38
zstd_HC v0.3.6 level 3 90 MB/s 484 MB/s 48838293 46.58
zstd_HC v0.3.6 level 4 75 MB/s 474 MB/s 48423913 46.18
zstd_HC v0.3.6 level 5 61 MB/s 467 MB/s 46480999 44.33
zstd_HC v0.3.6 level 6 40 MB/s 477 MB/s 45723093 43.60
zstd_HC v0.3.6 level 7 28 MB/s 480 MB/s 44803941 42.73
zstd_HC v0.3.6 level 8 21 MB/s 475 MB/s 44511976 42.45
zstd_HC v0.3.6 level 9 15 MB/s 497 MB/s 43899996 41.87
zstd_HC v0.3.6 level 10 16 MB/s 493 MB/s 43845344 41.81
zstd_HC v0.3.6 level 11 15 MB/s 491 MB/s 42506862 40.54
zstd_HC v0.3.6 level 12 11 MB/s 493 MB/s 42402232 40.44
zstd v0.4 level 1 244 MB/s 492 MB/s 51160301 48.79
zstd v0.4 level 2 176 MB/s 443 MB/s 49719335 47.42
zstd v0.4 level 3 88 MB/s 422 MB/s 48749022 46.49
zstd v0.4 level 4 74 MB/s 402 MB/s 48352259 46.11
zstd v0.4 level 5 69 MB/s 387 MB/s 46389082 44.24
zstd v0.4 level 6 36 MB/s 387 MB/s 45525313 43.42
zstd v0.4 level 7 29 MB/s 390 MB/s 44805120 42.73
zstd v0.4 level 8 23 MB/s 389 MB/s 44509894 42.45
zstd v0.4 level 9 16 MB/s 402 MB/s 43892280 41.86
zstd v0.4 level 10 18 MB/s 407 MB/s 43807530 41.78
zstd v0.4 level 11 15 MB/s 417 MB/s 42498160 40.53
zstd v0.4 level 12 11 MB/s 406 MB/s 42394424 40.43

Question around ZSTD_decompressContinue

Hi @Cyan4973

If my understanding is correct, currently ZSTD_decompressContinue expects src to be the [compressed block + block header of next block]. This is a problem in scenarios where we want to decompress blocks independently using the framing format, e.g. we don't have the next block available yet.

Wouldn't it make more sense that ZSTD_decompressContinue takes src as [block header + compressed block] ? This way the current block can be decompressed without needing the header of the next block?

searchLength=7 aliases searchLength=4

searchLength=7 in ZSTD_parameters is permitted, but it seems to give exactly the same compression and speed results as searchLength=4. Possibly this is because of the switches on matchLengthSearch in the code that have cases for 4-6 but not 7.

Cmake support

I can prepare and push CMakeLists.txt file for generating solution or make file on different platforms. I don't know if this feature will be useful or not?

Debian Package

Hi Yann,

I'm just posting this as a courtesy message to say that I intend to package the Zstd library for Debian. A quick question for you, is the preferred name "zstd", or "zstandard"?

I'll post back with progress as it occurs.

Cheers,
Kevin

Idea: adaptive compression

I need to send large core dumps from an embedded device to a server, and do it in minimal time. I've found zstd to be the optimal solution because it is very fast and has a good compression ratio. I'm using it like this:

zstd  -c | curl -T - $url

where kernel fills stdin of zstd

However, if the user has narrow upload bandwidth, it could be beneficial to switch from fast compression to high compression, which in my case is 30% more effective. For example, if in the first 10 seconds zstd detects that compression throughput is N times larger than the write speed (in this case to stdout), it automatically switches to high compression and uses it up to the end of the input stream.

Would such feature make sense?

btlazy2 strategy is incredibly slow on highly repetitive data

For example, on a file containing 10,000 repetitions of "All work and no play makes Jack a dull boy.\n" (440,000 bytes total), zstd -b15 gives about 23 MB/s on my laptop while zstd -b16 and higher give about 0.02 MB/s. I had to add another digit to the speed output to see anything but 0.0. I assume the switch to the btlazy2 strategy is what makes the difference.

ZSTD_GENERIC_ERROR when compressing large binary files

When compressing a large 6GB binary file a compression error happens. After some investigation it appears to happen when there are more symbols than the max symbol limit. The line is here

https://github.com/Cyan4973/zstd/blob/master/lib/fse.c#L1458

A simple temporary fix is just deleting this line, but my guess is that this isn't a good solution. Increasing the max symbol limit didn't seem to work, but I'm not that familiar with the code base so I'm sure I missed something.

"Stack cookie instrumentation code detected a stack-based buffer overrun" in HUF_fillDTableX4Level2

tl;dr: this is a codegen issue with VS2013 in Release x64, see discussion below.

I have a test that attempts to compress and decompress vectors of raw data, and some of them (1 in 100+?) consistently crash during decompression. The crash is reproducible and deterministic, always on the same vectors.

The test program is written in .NET 4.6 and is using PInvoke to call into a version of zstd build as an x64 dll, compiled from the 0.2.1 release (9e61835).

The crash message is Stack cookie instrumentation code detected a stack-based buffer overrun and looks like the stack was overwritten by garbage during decoding.

When I try to decompress the data using zstd.exe from the command line, it works fine. But whenever I try to decompress it from my code, it crashes. I tried either calling ZSTD_compress(...) directly, or reimplementing the same logic in fileio.c (using a DCTX and calling repeatedly ZSTD_nextSrcSizeToDecompress and ZSTD_decompressContinue) and both fail exactly the same way (and also both work perfectly fine with the other 99% vectors).

The crash occurs during the first call to ZSTD_decompressContinue() that has actual compressed data (ie: first call with the frame header returns 0, then next call with the first chunk of compressed data crashes).

I was able to create a pair of one file that decompress fine, and another file that systematically crashes, and can reproduce the issue with the test (.NET code). The same code works perfectly with the previous 0.1.x branch.

The original files (both are a highly compressed vector of integer values) can be found here: https://github.com/KrzysFR/frqsspslt/blob/76e8e799c936096819bb7b97bfb13d764949d115/attachments/zstd/sample_data.zip?raw=true

  • original_pass.bin: compress/decompress ok
  • original_fail.bin: compresses ok, decompresses ok with the zstd CLI, but fails when decompressing from .NET:

Test program:

var files = new[] { "original_pass.bin", "original_fail.bin" };

foreach (var file in files)
{
    Trace.WriteLine("## " + file);

    var original = new ArraySegment<byte>(File.ReadAllBytes(Path.Combine(@"..\..", file)));
    ulong h1 = XxHash64.FromBytes(original);
    Trace.WriteLine($"> Original    : {original.Count,10:N0} bytes (hash: 0x{h1:x16})");

    var compressed = ZStd.CompressBuffer(original);
    Trace.WriteLine($"> Compressed  : {compressed.Count,10:N0} bytes (hash: 0x{XxHash64.FromBytes(compressed):x16})");
    using (var fs = File.Create(Path.Combine(@"..\..", file + ".zst")))
    { // save to disk (for reference)
        fs.Write(compressed.Array, compressed.Offset, compressed.Count);
    }

    var decompressed = ZStd.DecompressBuffer(compressed, originalSize: original.Count);
    var h2 = XxHash64.FromBytes(decompressed);
    Trace.WriteLine($"> Decompressed: {decompressed.Count,10:N0} bytes (hash: 0x{h2:x16})");

    if (h1 != h2)
    {
        Trace.WriteLine("> FAILED! hashes to not match!");
        Trace.WriteLine(HexaDump.Versus(original, compressed));
    }
    else
    {
        Trace.WriteLine("> PASS");
    }
}

Outputs:

## original_pass.bin
> Original    :  1,469,465 bytes (hash: 0x514cf5e26f0c9054)
> Compressed  :      2,855 bytes (hash: 0xa3a91b9ecdfaed20)
> Decompressed:  1,469,465 bytes (hash: 0x514cf5e26f0c9054)
> PASS
## original_fail.bin
> Original    :  1,958,527 bytes (hash: 0xd6172d7d482e5460)
> Compressed  :    177,267 bytes (hash: 0x452a6dce4b2dfb04)
> Decompressed:  >CRASH< (StackOverflow?)
Unhandled exception at 0x00007FF964CDC798 (zstd_x64.dll) in Test.exe: Stack cookie instrumentation code detected a stack-based buffer overrun.

Attaching a debugger, I get the following stacktrace:

CallStack:
    zstd_x64.dll!__report_gsfailure(unsigned __int64 StackCookie) Line 151  C
    zstd_x64.dll!__GSHandlerCheck(_EXCEPTION_RECORD * ExceptionRecord, void * EstablisherFrame, _CONTEXT * ContextRecord, _DISPATCHER_CONTEXT * DispatcherContext) Line 91  C
    ntdll.dll!RtlpExecuteHandlerForException()  Unknown
    ntdll.dll!RtlDispatchException()    Unknown
    ntdll.dll!KiUserExceptionDispatch() Unknown
>   zstd_x64.dll!HUF_fillDTableX4Level2(HUF_DEltX4 * DTable, unsigned int sizeLog, const unsigned int consumed, const unsigned int * rankValOrigin, const int minWeight, const sortedSymbol_t * sortedSymbols, const unsigned int sortedListSize, unsigned int nbBitsBaseline, unsigned short baseSeq) Line 893 C
    zstd_x64.dll!HUF_fillDTableX4(HUF_DEltX4 * DTable, const unsigned int targetLog, const sortedSymbol_t * sortedList, const unsigned int sortedListSize, const unsigned int * rankStart, unsigned int[17] * rankValOrigin, const unsigned int maxWeight, const unsigned int nbBitsBaseline) Line 951  C
    zstd_x64.dll!HUF_readDTableX4(unsigned int * DTable, const void * src, unsigned __int64 srcSize) Line 1041  C
    zstd_x64.dll!HUF_decompress4X4(void * dst, unsigned __int64 dstSize, const void * cSrc, unsigned __int64 cSrcSize) Line 1255    C
    // everything before that in the callstack looks like garbage (more like data, and not actual method pointers)
    0109000701090007()  Unknown
    0109000701090007()  Unknown
    0109000701090007()  Unknown
    //... this garbage address is repeated about a thousand times, and is the same value in cSrcSize/dstSize below, which confirms that the stack has been overwritten
    0109000701090007()  Unknown
    0109000701090007()  Unknown
    0000002078746341()  Unknown
    000030b400000001()  Unknown
    00000000000000dc()  Unknown
    0000000000000020()  Unknown
    0000000100000014()  Unknown
    0000003400000007()  Unknown
    000000010000017c()  Unknown

HUF_fillDTableX4Level2() locals:

        baseSeq             0x0007  unsigned short
        consumed            0x7ef2b8e0  const unsigned int
+       DElt                {sequence=0x0007 nbBits=0x09 '\t' length=0x01 '\x1' }   HUF_DEltX4
+       DTable              0x0000006f7edd9fb0 {sequence=0x0405 nbBits=0x06 '\x6' length=0x04 '\x4' }   HUF_DEltX4 *
        minWeight           0x00000007  const int
        nbBitsBaseline      0x0000000a  unsigned int
+       rankVal             0x0000006f7edd95c0 {0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x5ea64b14, 0x00007ff9, ...}    unsigned int[0x00000011]
+       rankValOrigin       0x00007ff95ea037fd {Inside clr.dll!EEHeapFreeInProcessHeap(void)} {0x48c0b60f}  const unsigned int *
        sizeLog             0x00000000  unsigned int
        sortedListSize      0x00000003  const unsigned int
+       sortedSymbols       0x0000006f7edd9fba {symbol=0x00 '\0' weight=0x07 '\a' } const sortedSymbol_t *

HUF_fillDTableX4() locals:

+       DElt                {sequence=0xb8e0 nbBits=0xf2 'ò' length=0x7e '~' } HUF_DEltX4
+       DTable              0x0000000000008918 {sequence=??? nbBits=??? length=??? }    HUF_DEltX4 *
        maxWeight           0x00000007  const unsigned int
        nbBitsBaseline      0x0000000a  const unsigned int
+       rankStart           0x0000006f7edd97e0 {0x00000000} const unsigned int *
+       rankVal             0x0000006f7edd96f0 {0x00000000, 0x00000000, 0x000007b0, 0x000007c0, 0x000007c0, 0x00000880, 0x00000a00, ...}    unsigned int[0x00000011]
+       rankValOrigin       0x0000006f7edd9880 {0x00000000, 0x00000000, 0x000007b0, 0x000007c0, 0x000007c0, 0x00000880, 0x00000a00, ...}    unsigned int[0x00000011] *
+       sortedList          0x0000006f19f628c8 {symbol=0x00 '\0' weight=0x00 '\0' } const sortedSymbol_t *
        sortedListSize      0x00000008  const unsigned int
        targetLog           0x7edd9890  const unsigned int

HUF_readDTableX4() locals:

+       DTable              0x0000006f7eeeddb0 {0x5f0d2408} unsigned int *
        nbSymbols           0x00000100  unsigned int
+       rankStart0          0x0000006f7edd97e0 {0x00000000, 0x00000000, 0x000000f6, 0x000000f7, 0x000000f7, 0x000000fa, 0x000000fd, ...}    unsigned int[0x00000012]
+       rankStats           0x0000006f7edd9830 {0x00000000, 0x000000f6, 0x00000001, 0x00000000, 0x00000003, 0x00000003, 0x00000000, ...}    unsigned int[0x00000011]
+       rankVal             0x0000006f7edd9880 {0x0000006f7edd9880 {0x00000000, 0x00000000, 0x000007b0, 0x000007c0, 0x000007c0, ...}, ...}  unsigned int[0x00000010][0x00000011]
+       sortedSymbol        0x0000006f7edd9dc0 {{symbol=0x07 '\a' weight=0x01 '\x1' }, {symbol=0x08 '\b' weight=0x01 '\x1' }, {symbol=...}, ...}    sortedSymbol_t[0x00000100]
        src                 mscorlib.ni.dll!0x00007ff95d0c0220 (load symbols for additional information)    const void *
        srcSize             0x0000000000008918  unsigned __int64
        tableLog            0x00000009  unsigned int
+       weightList          0x0000006f7edd9cc0 "\a\x5\x5\x5\x4\x4\x4\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x2\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\a\a\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1\x1...    unsigned char[0x00000100]

HUF_decompress4X4() locals:

        cSrc                0x0109000701090007  const void *
        cSrcSize            0x0109000701090007  unsigned __int64
        dst                 0x0109000701090007  void *
        dstSize             0x0109000701090007  unsigned __int64
+       DTable              0x0000006f7edda040 {0x0000000c, 0x01090007, 0x01090007, 0x01090007, 0x01090007, 0x01090007, 0x01090007, ...}    unsigned int[0x00001001]

My guess is that DTable which is allocated on the stack, was overwritten somewhere, which makes it impossible for the debugger to unwind the stack properly.

zstd crashes on decoding invalid archives

Hi. It seems that zstd will read illegal pointers and crash when presented with mangled archives. Here's one such example file (GitHub doesn't allow binary attachments, so I'm providing a hex dump):

0000000    fd  2f  b5  1c  00  00  1c  40  00  12  31  32  31  31  31  31
0000020    31  31  31  31  32  32  32  32  32  32  32  0a  10  98  00  ff
0000040    7f  00  84  c0  00  00

Here's what gdb has to say about this problem:

(gdb) run -d <example.zst >example
Starting program: zstd -d <example.zst >example

Program received signal SIGSEGV, Segmentation fault.
0x0000000000410965 in ZSTD_decompressBlock (srcSize=28, src=0x801011000, maxDstSize=524288, dst=0x801032000, ctx=0x801006000) at lib/zstd.c:1533
(gdb) bt
#0  0x0000000000410965 in ZSTD_decompressBlock (srcSize=28, src=0x801011000, maxDstSize=524288, dst=0x801032000, ctx=0x801006000) at lib/zstd.c:1533
#1  ZSTD_decompressContinue (dctx=0x801006000, dst=0x801032000, maxDstSize=524288, src=0x801011000, srcSize=31) at lib/zstd.c:1680
#2  0x0000000000408681 in FIO_decompressFilename (output_filename=0x410f65 "-",input_filename=0x410f65 "-") at programs/fileio.c:363
#3  0x0000000000401a4d in main (argc=2, argv=0x7fffffffd9d0) at programs/zstdcli.c:314

This is with zstd as of commit 00f9507; the crash is located over here. The problem is that ZSTD_decompressBlock does not validate how big matchLength can get; in this case it is equal to 8650883, while the maxDstSize is only 524288 bytes, which results in an attempt to copy past the end of the output buffer.

Tag error enum

The enum in error_public.h is anonymous. It would be nice if you included a tag or a typedef, if only to help make -Wswitch / -Wswitch-enum usable. AFAIK there is currently no way to get the compiler to warn if a switch does not include a case for every enum value. I would like to be able to do something like

size_t res = …;
if (ZSTD_isError(res)) {
  switch ((ZSTD_ErrorCode) -res) {
    case ZSTD_error_No_Error:
      /* … */
      break;
  }
}

Note the cast in the controlling expression; without it the type will just be size_t, and the compiler doesn't realize that there is any association to the enum so it will not emit a warning when there is no case for a particular enum value.

FWIW, I still think it would be easier to do something like

typedef enum {
  ZSTD_error_No_Error = 0,
  ZSTD_error_GENERIC,
  ZSTD_error_prefix_uknown,
  …
} ZSTD_ErrorCode;

ZSTD_ErrorCode ZSTD_getError(size_t code);

ZSTD_getError would work just like ZSTD_isError does now; you could still do if (ZSTD_getError(code)) { … } since ZSTD_error_No_Error would be 0 and errors would be positive integers, but it would make the API a bit easier to understand since you hide the rather weird detail of negating an unsigned type.

Another thing that would be nice is a consistency check. For example, note that each of the values I wrote in the enum above (chosen because they're the first three entries in the enum in zstd, not to make this point) have different capitalization conventions.

Test data fails round-trip

I've got another failure case. This time, the data crashes during decompression. I've uploaded a gzipped version of the bad2.bin at the gist https://gist.github.com/mwiebe/c54c790288b8e16a7970.

C:\Dev>dir bad2.bin
 Volume in drive C is Windows
 Volume Serial Number is DA14-224A

 Directory of C:\Dev

2015-03-25  04:51 PM         6,434,440 bad2.bin
               1 File(s)      6,434,440 bytes
               0 Dir(s)  266,752,405,504 bytes free

C:\Dev>"C:\Dev\zstd\visual\2012\x64\Release\zstd.exe" bad2.bin out2.bin
Compressed 6434440 bytes into 3003788 bytes ==> 46.68%

C:\Dev>"C:\Dev\zstd\visual\2012\x64\Release\zstd.exe" -d out2.bin bad2.roundtrip.bin
**CRASH**

Visual Studio 2012 Release broken

Visual Studio 2012, Release mode: decompressing data (which was generated in HC mode) is broken again... Debug is OK :)
Seems like a bug similar to the one in the 0.2 version...

Compressing individual documents with a dictionary produced from a sampled batch to get better compression ratio

I'm very excited to see work being done on dictionary support in the API, because this is something that could greatly help me solve a pressing problem.

Context

In the context of a Document Store, we are storing a set of JSON-like documents that share the same schema. Each document can be created, read, or updated individually, in a random fashion. We would like to compress the documents on disk, but there is very little redundancy within each document, which yields a very poor compression ratio (maybe 10-20%).
When compressing batches of tens or hundreds of documents, the compression ratio gets really good (10x, 50x, or sometimes even more), because there is a lot of redundancy between documents, from:

  • the structure of the JSON itself which has a lot of ": ", or ": true, or {[[...]],[[...]]} symbols.
  • the names of the JSON fields: "Id", "Name", "Label", "SomeVeryLongFieldNameThatIsPresentOnlyOncePerDocument", etc..
  • frequent values like constants (true, "Red", "Administrator", ...), keywords, dates that start with 2015-12-14T.... for the next 24h, and even well-known or frequently used GUIDs that are shared by documents (Product Category, Tag Id, hugely popular nodes in graph databases, ...)

In the past, I used femtozip (https://github.com/gtoubassi/femtozip), which is intended precisely for this use case. It includes a dictionary training step (performed on a sample batch of documents); the dictionary is then used to compress and decompress single documents, with the same compression ratio as if they were a batch. Using real-life data, compressing 1000 documents individually gave the same compression ratio as compressing all 1000 documents in a single batch with gzip -5.

The dictionary training part of femtozip can take very long: the more samples, the better the final compression ratio, but you need tons of RAM to train it.

Also, I realized that femtozip would sometimes cancel out the size differences between formats like JSON/BSON/JSONB/ProtoBuf and other binary formats, because it would pick up the "grammar" of the format (text or binary) in the dictionary, and only deal with the "meat" of the documents (guids, integers, doubles, natural text) when compressing. This means I can use a format like JSONB (used by Postgres) which is less compact, but faster to decode at runtime than JSON text.

Goal

I would like to be able to do something similar with Zstandard. I don't really care about building the most efficient dictionary (though it would be nice), but I'd at least like to exploit the fact that FSE builds a list of tokens sorted by frequency. Extracting this list of tokens may help in building a dictionary containing the most common tokens of the training batch.

The goal would be:

  • For each new or modified document D, compress it AS IF we were compressing SAMPLES[cur_gen] + D.json, and only storing the bits produced by the D.json part.
  • When reading document D, decompress it AS IF we had the complete compressed version of SAMPLES[D.gen] + D.compressed, and only keeping the last decoded bits that make up D.

Since it would be impractical to change the compression code to know which compressed bits come from D and which from the batch, we could approximate this by computing a DICTIONARY[gen] that would be used to initialize the compressor and decompressor.

Idea
  • Start by serializing an empty object into JSON (we would get the json structure and all the field names, but no values)
  • Use this as the initial "gen 0" dictionary for the first batch of documents (when starting with an empty database)
  • After N documents, sample k random documents and compress them to produce a "generation 1" dictionary.
  • Compress each new or updated document (individually) with this new dictionary
  • After another N documents, or if some heuristic shows that compression ratio starts declining, then start a new generation of dictionary.

The Document Store would durably store each generation of dictionaries, and use them to decompress older entries. Periodically, it could recycle the entire store by recompressing everything with the most recent dictionary.
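The generation bookkeeping described above can be sketched as follows (hypothetical names and layout, not a zstd API; each record remembers the generation that compressed it, so older entries stay readable):

```c
#include <stddef.h>

#define MAX_GENS 16

/* One dictionary per generation. */
typedef struct {
  const unsigned char *dict;
  size_t dictSize;
} Generation;

/* Hypothetical store: new writes use generation `cur`, reads look up
 * the generation recorded alongside each compressed document. */
typedef struct {
  Generation gens[MAX_GENS];
  size_t cur;
} DictStore;

/* Return the dictionary needed to decompress a record of generation `gen`,
 * or NULL if the generation is unknown. */
static const Generation *dict_for_record(const DictStore *s, size_t gen) {
  return gen < MAX_GENS ? &s->gens[gen] : NULL;
}
```

Recompacting the store then amounts to walking all records, decompressing each with `dict_for_record`, and recompressing with generation `cur`.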

Concrete example:

Training set:

  • { "id": 123, "label": "Hello", "enabled": true, "uuid": "9ad51b87-d627-4e04-85c2-d6cb77415981" }
  • { "id": 126, "label": "Hell", "enabled": false, "uuid": "0c8e13a5-cdc8-4e1f-8e80-4fee025ee59c" }
  • { "id": 129, "label": "Help", "enabled": true, "uuid": "fe6db321-cddd-4e7f-b3d6-6b38365b3e2a" }

Looking at it, we can extract the following repeating segments: { "id": 12.., "label": "Hel... ", "enabled": ... e, "uuid": " ... " }, which could be condensed into:

  • { "id": 12, "label": "Hel", "enabled": e, "uuid":"" } (53 bytes shared by all docs)

The unique part of each document would be:

  • ...3...lo...tru...9ad51b87-d627-4e04-85c2-d6cb77415981 (42 bytes)
  • ...6...l...fals...0c8e13a5-cdc8-4e1f-8e80-4fee025ee59c (42 bytes)
  • ...9......tru...fe6db321-cddd-4e7f-b3d6-6b38365b3e2a (40 bytes)

Zstd would only have to work on 42 bytes per doc, instead of 85 bytes. More realistic documents will have a lot more in common than this example.

What I've tested so far
  • create "gen0" dictionary with "hollow" JSON: { "id": , "foo": "", "bar": "", ....} produced by removing all values from the JSON document.
  • using ZSTD_compress_insertDictionary, compressing { "id": 123, "foo": "Hello", "bar": "World", ...} is indeed smaller than without a dictionary.
  • looking at cctx->litStart, I can see a buffer with 123HelloWorld which is exactly the content specific to the document itself that got removed when producing the gen0 dict.

Maybe one way to construct a better dictionary would be:

  • compress the batch of random and complete document (with values)
  • take K first symbols ordered by frequency descending
  • create the dictionary by outputting symbol K-1, then K-2, down to 0 (I guess that if the most frequent symbol is at the end of the dictionary, offsets to it would be smaller?)
  • maybe one could ask for a target dictionary size, and K would be the number of symbols needed to fill the dictionary?
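A toy version of this frequency-ordered construction, using single-byte counts as a stand-in for real tokens (an illustration of the ordering idea only, not zstd's actual dictionary builder):

```c
#include <stddef.h>

/* Build a tiny dictionary from the K most frequent bytes of the samples,
 * written so the MOST frequent byte lands LAST (closest to the data),
 * keeping match offsets into the dictionary small. Assumes k does not
 * exceed the number of distinct bytes in the samples. */
static size_t build_toy_dict(const unsigned char *samples, size_t n,
                             unsigned char *dict, size_t k) {
  size_t freq[256] = {0};
  for (size_t i = 0; i < n; i++) freq[samples[i]]++;
  for (size_t out = k; out-- > 0; ) {     /* fill from the end */
    size_t best = 0;
    for (size_t b = 1; b < 256; b++)
      if (freq[b] > freq[best]) best = b;
    dict[out] = (unsigned char)best;
    freq[best] = 0;                       /* consume this symbol */
  }
  return k;
}
```

A real token-based version would count multi-byte tokens instead of single bytes, but the "most frequent material goes at the end of the dictionary" layout stays the same.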
What I'm not sure about
  • I don't know how having a dictionary would help with larger documents above 128KB or 256KB. Currently I'm only inserting the dictionary for the first block. Would I need to reuse the same dictionary for each 128KB block?
  • What is the best size for this dictionary? 16KB? 64KB? 128KB?
  • ZSTD_decompress_insertDictionary branches off into different implementations for lazy, greedy, and so on. I'm not sure whether all compression strategies can be used to produce such a dictionary?

Again, I don't care about producing the ideal dictionary that yields the smallest possible result, only something that gives me a noticeably better compression ratio while still being able to handle documents in isolation.

ZSTD_ERROR_GENERIC failure on sample data

I've run into a bit of data which fails to compress. The failure case is a binary file in the following gist:

https://gist.github.com/mwiebe/c54c790288b8e16a7970

C:\Dev>dir bad.bin
 Volume in drive C is Windows
 Volume Serial Number is DA14-224A

 Directory of C:\Dev

2015-03-23  02:17 PM             2,469 bad.bin
               1 File(s)          2,469 bytes
               0 Dir(s)  273,715,097,600 bytes free

C:\Dev>"C:\Dev\zstd\visual\2012\x64\Release\zstd.exe" bad.bin out.bin
Error 24 : Compression error : ZSTD_ERROR_GENERIC

Segmentation fault during compression inside FSE_normalizeCount

Hi Yann,

I get a segmentation fault during compression of a specific data buffer, inside FSE_normalizeCount.
The specific line is the call to FSE_adjustNormSlow at https://github.com/Cyan4973/zstd/blob/dev/lib/fse.c#L696

The frame stack is all screwed up when examining the core file, but I managed to get the following output by running through valgrind:

==32677== Invalid write of size 8
==32677==    at 0x64DD4E7: FSE_normalizeCount (fse.c:696)
==32677==    by 0x157: ???
==32677==    by 0xFFEFF33AF: ???
==32677==    by 0x9000000FE: ???
==32677==    by 0xFFEFF31AF: ???
==32677==    by 0x40012FFFFFFFF: ???
==32677==    by 0x100FFFFFF9C: ???
==32677==  Address 0xd5 is not stack'd, malloc'd or (recently) free'd

I can reliably reproduce this fault, so I might be able to provide a test case for you, given some time.

Out of bounds heap read in ZSTD_copy8

This file
https://crashes.fuzzing-project.org/zstd-oob-heap-ZSTD_copy8
causes an out of bounds heap read access in zstd. This can be seen with either address sanitizer or valgrind.

This was found with american fuzzy lop.

Address Sanitizer output:

==12888==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7fc173c8104b at pc 0x0000004e939f bp 0x7ffe115e1a50 sp 0x7ffe115e1a48
READ of size 8 at 0x7fc173c8104b thread T0
    #0 0x4e939e in ZSTD_copy8 /f/zst/zstd/programs/../lib/zstd.c:158:56
    #1 0x4e939e in ZSTD_wildcopy /f/zst/zstd/programs/../lib/zstd.c:168
    #2 0x4e939e in ZSTD_execSequence /f/zst/zstd/programs/../lib/zstd.c:1337
    #3 0x4e939e in ZSTD_decompressSequences /f/zst/zstd/programs/../lib/zstd.c:1436
    #4 0x4e939e in ZSTD_decompressBlock /f/zst/zstd/programs/../lib/zstd.c:1473
    #5 0x4e68b2 in ZSTD_decompressContinue /f/zst/zstd/programs/../lib/zstd.c:1622:21
    #6 0x52dcbf in FIO_decompressFrame /f/zst/zstd/programs/fileio.c:396:23
    #7 0x52e721 in FIO_decompressFilename /f/zst/zstd/programs/fileio.c:492:21
    #8 0x530fe3 in main /f/zst/zstd/programs/zstdcli.c:352:9
    #9 0x7fc172bf4f9f in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.20-r2/work/glibc-2.20/csu/libc-start.c:289
    #10 0x4377c6 in _start (/mnt/ram/zstd/zstd+0x4377c6)

0x7fc173c8104b is located 3 bytes to the right of 141384-byte region [0x7fc173c5e800,0x7fc173c81048)
allocated by thread T0 here:
    #0 0x4be792 in __interceptor_malloc (/mnt/ram/zstd/zstd+0x4be792)
    #1 0x4e627c in ZSTD_createDCtx /f/zst/zstd/programs/../lib/zstd.c:1560:35
    #2 0x530fe3 in main /f/zst/zstd/programs/zstdcli.c:352:9

SUMMARY: AddressSanitizer: heap-buffer-overflow /f/zst/zstd/programs/../lib/zstd.c:158 ZSTD_copy8
Shadow bytes around the buggy address:
  0x0ff8ae7881b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff8ae7881c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff8ae7881d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff8ae7881e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ff8ae7881f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0ff8ae788200: 00 00 00 00 00 00 00 00 00[fa]fa fa fa fa fa fa
  0x0ff8ae788210: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff8ae788220: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff8ae788230: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff8ae788240: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0ff8ae788250: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==12888==ABORTING
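For context, ZSTD_wildcopy deliberately copies in 8-byte strides and relies on margins past the end of the buffers, which is why a malformed stream can push the read out of bounds. A defensively bounded variant (a sketch for illustration, not zstd's actual code) would look like:

```c
#include <string.h>
#include <stddef.h>

/* Sketch: copy in 8-byte chunks, but never touch memory at or beyond
 * `limit`. zstd's real wildcopy instead relies on guaranteed margins
 * for speed, which is what the fuzzer tripped over here. */
static void bounded_wildcopy(unsigned char *dst, const unsigned char *src,
                             size_t length, const unsigned char *limit) {
  while (length >= 8 && dst + 8 <= limit) {
    memcpy(dst, src, 8);
    dst += 8; src += 8; length -= 8;
  }
  while (length-- && dst < limit)   /* byte-by-byte tail */
    *dst++ = *src++;
}
```

The trade-off is a branch per chunk; the margin-based approach avoids that branch but requires the decoder to validate sequence lengths before copying.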

Types of objects it can compress

Hi,
is this an alternative to Gzip?
Also, what kind of files/objects can it compress - just base HTML or also JS, CSS, Images etc?
Lastly, can it be deployed on any OS / web server (Apache, IIS)?
Also, if I need to 'activate' this on an existing website, what steps do I need to follow?

Thread safety

Hi,

I have tried to run some tests in parallel and noticed them failing non-deterministically:

  • ZSTD_decompressContinue fails sometimes when executed in parallel with ZSTD_compress
  • ZSTD_compressContinue sometimes gives wrong results when executed in parallel with ZSTD_decompressContinue
  • and the other way around: ZSTD_decompressContinue fails when executed in parallel with ZSTD_compressContinue.

When I say "fails", it is with a "corruption_detected" error. When I say "non-deterministically", it means that a subsequent run with the exact same inputs succeeds. I suspected my own code was not thread safe, so I ran in parallel different classes that don't share any code except the libzstd binary (like the above cases) to rule out my own faults. One additional observation: decompressContinue fails only if the original size exceeds some threshold, e.g. 1M for levels 1, 3, 6; 2M for level 9. The parallel ZSTD_compress is running with random small buffers (0-32k) when decompressContinue fails.

So some questions:

  • is there anything that should be synchronized?
  • I am still not sure that I am not doing something stupid, so any advice on where I should be careful is welcome.
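For what it's worth, the usual assumption with compression contexts like these is that a single context must never be used from two threads at once, while separate contexts in separate threads are fine. If a context really must be shared, every call into it needs serializing; a sketch of that pattern with pthreads (a placeholder struct stands in for the real context, since this doesn't link against libzstd):

```c
#include <pthread.h>

/* Placeholder for a shared compression context. The `counter` field is a
 * stand-in for the mutable state a real ZSTD_CCtx would hold. */
typedef struct {
  pthread_mutex_t lock;
  long counter;
} SharedCtx;

/* Serialize every use of the shared context; per-thread contexts would
 * avoid this lock entirely and are the preferred design. */
static void ctx_call(SharedCtx *s) {
  pthread_mutex_lock(&s->lock);
  s->counter++;             /* stand-in for a compress/decompress call */
  pthread_mutex_unlock(&s->lock);
}
```

In practice the per-thread-context design is both simpler and faster, since compression calls are long enough that a shared lock would dominate.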

Regards,
luben

Very slow compression speed for level 14+ on some specific datasets

I'm currently doing some tests and benchmarks on the best way to compress vectors of numerical data using zstd (and various filters to help compression, such as delta or shuffle).

I found a few oddities while trying various sample sets, and I'm not sure if this is expected or not.

note: All the tests below are done with the 0.5.1 release.

The most obvious anomaly I found was when compressing vectors of 64-bit floats produced by starting from 100.0 and randomly subtracting a tiny amount at each step (a few tens of % overall), while keeping the full precision (i.e. 17 decimals). The sample set looks like this (with each value encoded as an IEEE 754 64-bit float):

XS: [
  99.9996492024556, 99.9996492024556, 99.9996492024556, 99.9996492024556, 99.9996492024556,
  99.9959685493382, 99.9934385250828, 99.9915028228664, 99.9913987684419, 99.9876946097741,
  99.9832594119447, 99.9832594119447, 99.9827409006435, 99.9827409006435, 99.9792033732376,
  99.9792033732376, 99.9792033732376, 99.9792033732376, 99.9792033732376, 99.9779770870381,
  ...,
  63.9185913211865, 63.9185913211865, 63.9185913211865, 63.9183349283913, 63.9173471531838,
  63.9129878772782, 63.9129878772782, 63.9129878772782, 63.9129878772782, 63.910876784327,
  63.910876784327, 63.9107775813923, 63.9107775813923, 63.9107775813923, 63.9107775813923,
  63.9107775813923, 63.9077360819731, 63.9077360819731, 63.9059456902976, 63.9059456902976
]

I'm also compressing the delta-encoded vector (0 = no change):

DELTA(XS): [
  99.9996492024556, 0, 0, 0, 0,
  -0.00368065311744203, -0.00253002425540672, -0.00193570221631489, -0.000104054424497235, -0.00370415866780149,
  -0.00443519782946566, 0, -0.000518511301152103, 0, -0.00353752740591062,
  0, 0, 0, 0, -0.00122628619951115,
  ...,
  -0.000363898205272051, 0, 0, -0.000256392795243698, -0.000987775207491381,
  -0.00435927590558549, 0, 0, 0, -0.00211109295120337,
  0, -9.92029346988943E-05, 0, 0, 0,
  0, -0.00304149941915455, 0, -0.00179039167556283, 0
]

Now my benchmark compresses both vectors (N=28,800), encoded as 4-byte or 8-byte elements (115KB / 230KB), using multiple codecs (lz4, zstd, zlib) and also filtering (none, blosc-like shuffle), and measures both the ratio and the time to encode.
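The delta and shuffle filters mentioned above can be sketched as follows (a toy illustration with my own naming, not the benchmark's actual filter code; note that floating-point deltas are not guaranteed to round-trip exactly):

```c
#include <stddef.h>

/* Delta filter: keep the first value, then store successive differences
 * in place (0 means "no change", as in the DELTA(XS) vector above). */
static void delta_encode(double *xs, size_t n) {
  for (size_t i = n; i-- > 1; ) xs[i] -= xs[i-1];
}
static void delta_decode(double *xs, size_t n) {
  for (size_t i = 1; i < n; i++) xs[i] += xs[i-1];
}

/* Blosc-like byte shuffle: gather byte k of every element together, so
 * that similar exponent/high-mantissa bytes form long compressible runs. */
static void shuffle(const unsigned char *in, unsigned char *out,
                    size_t n, size_t elem) {
  for (size_t i = 0; i < n; i++)
    for (size_t b = 0; b < elem; b++)
      out[b*n + i] = in[i*elem + b];
}
```

Unshuffling is the same transpose with `in` and `out` index roles swapped.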

The full benchmark results can be found here: https://gist.github.com/KrzysFR/0f6835c7a8d0f19dbdc3 (warning: lots of data and ASCII charts!)

The combination of delta-encoding + shuffling on 64-bit floats with full precision induces a very visible slowdown for levels 14 up to 21, going from 15ms at level 13 to 280ms at level 14 (and up to 21). The bump is still there, but less visible, when rounding all the numbers to keep only 3 digits.

Below are some charts that show the results.

Comparison of compression time
top: original data set with full precision, bottom: rounded to 3 decimals
[charts omitted]

The yellow line is clearly causing trouble for levels 14 and up. We can also see that zlib has some issues with it from level 7 up.

Here it is again, but using a log scale for the time:
[chart omitted]

Comparison of ratios
top: original data set with full precision, bottom: rounded to 3 decimals
[charts omitted]

Full results
[chart omitted]

Unaligned accesses

When compiling with -fsanitize=undefined:

/buffer/zstd/zstd: /home/nemequ/local/src/squash/plugins/zstd/zstd/lib/zstd.c:918:54: runtime error: load of misaligned address 0x000000401b49 for type 'const void', which requires 8 byte alignment
0x000000401b49: note: pointer points here
 00 00 00  4c 6f 72 65 6d 20 69 70  73 75 6d 20 64 6f 6c 6f  72 20 73 69 74 20 61 6d  65 74 2c 20 63
              ^ 
/home/nemequ/local/src/squash/plugins/zstd/zstd/lib/zstd.c:183:44: runtime error: load of misaligned address 0x000000401b49 for type 'const void', which requires 4 byte alignment
0x000000401b49: note: pointer points here
 00 00 00  4c 6f 72 65 6d 20 69 70  73 75 6d 20 64 6f 6c 6f  72 20 73 69 74 20 61 6d  65 74 2c 20 63
              ^ 
/home/nemequ/local/src/squash/plugins/zstd/zstd/lib/zstd.c:185:47: runtime error: load of misaligned address 0x000000401b51 for type 'const void', which requires 8 byte alignment
0x000000401b51: note: pointer points here
 20 69 70  73 75 6d 20 64 6f 6c 6f  72 20 73 69 74 20 61 6d  65 74 2c 20 63 6f 6e 73  65 63 74 65 74
              ^ 
OK
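The usual portable remedy for misaligned-load reports like these is to route unaligned reads through memcpy, which compilers lower to a single load on targets that allow unaligned access, while staying defined behavior everywhere else. A sketch (not necessarily what zstd does in this version):

```c
#include <string.h>
#include <stdint.h>

/* Portable unaligned 64-bit read: no pointer cast, so -fsanitize=undefined
 * has nothing to complain about, and the compiler still emits one load on
 * x86/ARM64. */
static uint64_t read64(const void *p) {
  uint64_t v;
  memcpy(&v, p, sizeof v);
  return v;
}
```

The same pattern works for 16- and 32-bit reads and for the corresponding writes.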

out of bounds stack read in function HUF_readStats on malformed input

This input file
https://crashes.fuzzing-project.org/zstd-oob-stack-HUF_readStats
causes an out of bounds stack read access in zstd. To see this one needs to compile zstd with address sanitizer (-fsanitize=address in CFLAGS).

Issue was found with the help of american fuzzy lop.

This is the output from address sanitizer:

==19506==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffef5269784 at pc 0x0000004faff7 bp 0x7ffef5269520 sp 0x7ffef5269518
READ of size 4 at 0x7ffef5269784 thread T0
    #0 0x4faff6 in HUF_readStats /f/zst/zstd/programs/../lib/huff0.c:612:9
    #1 0x4fa03d in HUF_readDTableX2 /f/zst/zstd/programs/../lib/huff0.c:644:13
    #2 0x501080 in HUF_decompress4X2 /f/zst/zstd/programs/../lib/huff0.c:859:17
    #3 0x5138d2 in HUF_decompress /f/zst/zstd/programs/../lib/huff0.c:1701:23
    #4 0x4e4b39 in ZSTD_decompressLiterals /f/zst/zstd/programs/../lib/zstd.c:1078:21
    #5 0x4e4b39 in ZSTD_decodeLiteralsBlock /f/zst/zstd/programs/../lib/zstd.c:1102
    #6 0x4e6bb5 in ZSTD_decompressBlock /f/zst/zstd/programs/../lib/zstd.c:1468:23
    #7 0x4e68b2 in ZSTD_decompressContinue /f/zst/zstd/programs/../lib/zstd.c:1622:21
    #8 0x52dcbf in FIO_decompressFrame /f/zst/zstd/programs/fileio.c:396:23
    #9 0x52e721 in FIO_decompressFilename /f/zst/zstd/programs/fileio.c:492:21
    #10 0x530fe3 in main /f/zst/zstd/programs/zstdcli.c:352:9
    #11 0x7f76943b6f9f in __libc_start_main /var/tmp/portage/sys-libs/glibc-2.20-r2/work/glibc-2.20/csu/libc-start.c:289
    #12 0x4377c6 in _start (/mnt/ram/zstd/zstd+0x4377c6)

Address 0x7ffef5269784 is located in stack of thread T0 at offset 420 in frame
    #0 0x4f9edf in HUF_readDTableX2 /f/zst/zstd/programs/../lib/huff0.c:630

  This frame has 4 object(s):
    [32, 288) 'huffWeight'
    [352, 420) 'rankVal' <== Memory access at offset 420 overflows this variable
    [464, 468) 'tableLog'
    [480, 484) 'nbSymbols'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /f/zst/zstd/programs/../lib/huff0.c:612 HUF_readStats
Shadow bytes around the buggy address:
  0x10005ea452a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10005ea452b0: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
  0x10005ea452c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10005ea452d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10005ea452e0: f2 f2 f2 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 00
=>0x10005ea452f0:[04]f2 f2 f2 f2 f2 04 f2 04 f3 f3 f3 00 00 00 00
  0x10005ea45300: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
  0x10005ea45310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10005ea45320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10005ea45330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10005ea45340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==19506==ABORTING

Stack buffer overflow in v043

VS2013 and Xcode 7.1.1 with ASan detect a stack buffer overflow in the released v0.4.3 with specific input data.

How to repro

Use the following PVRTC4-compressed sample image with the fullbench app, invoked without additional parameters.
https://www.dropbox.com/s/tlgr7lxpmtiq4yw/sample.pvr?dl=0

Output with VS2013-update5, Windows7:
*** Zstandard speed analyzer  32-bits, by Yann Collet (Dec 10 2015) ***
 D:\work\zstd\release-v043\visual\2012\sample.pvr :
 1- ZSTD_compress                  :     3.4 MB/s  (  2120226)
11- ZSTD_decompress                :     7.4 MB/s  (  2796340)
31- ZSTD_decodeLiteralsBlock       :    11.8 MB/s  (    74273)
 1- ZSTD_decodeSeqHeaders          :


Run-Time Check Failure #2 - Stack around the variable 'DTableOffb' was corrupted.


>   fullbench.exe!local_ZSTD_decodeSeqHeaders(void * dst, unsigned int dstSize, void * buff2, const void * src, unsigned int srcSize) Line 244  C
    fullbench.exe!benchMem(void * src, unsigned int srcSize, unsigned int benchNb) Line 358 C
    fullbench.exe!benchFiles(char * * fileNamesTable, int nbFiles, unsigned int benchNb) Line 468   C
    fullbench.exe!main(int argc, char * * argv) Line 584    C
    [External Code] 
    [Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]  
Output with Xcode711, Clang700.1.76, enabled ASan, iOS9.1:
AddressSanitizer debugger support is active. Memory error breakpoint has been installed and you can now use the 'memory history' command.
2015-12-10 12:48:24.980 TestLZ4[2018:627417] Started
*** Zstandard speed analyzer  64-bits, by Yann Collet (Dec 10 2015) ***
Loading /var/mobile/Containers/Bundle/Application/1FA3B3BC-B5AD-486D-BAD2-144B631187D1/TestLZ4.app/sample.pvr...       


 /var/mobile/Containers/Bundle/Application/1FA3B3BC-B5AD-486D-BAD2-144B631187D1/TestLZ4.app/sample.pvr : 
 1- ZSTD_compress                  : 
 1- ZSTD_compress                  :     4.1 MB/s  (  2120226)
 2- ZSTD_compress                  : 
 2- ZSTD_compress                  :     4.1 MB/s  (  2120226)
 3- ZSTD_compress                  : 
 3- ZSTD_compress                  :     4.1 MB/s  (  2120226)
 4- ZSTD_compress                  : 
 4- ZSTD_compress                  :     4.1 MB/s  (  2120226)
 5- ZSTD_compress                  : 
 5- ZSTD_compress                  :     4.1 MB/s  (  2120226)
 6- ZSTD_compress                  : 
 6- ZSTD_compress                  :     4.1 MB/s  (  2120226)
 1- ZSTD_compress                  :     4.1 MB/s  (  2120226)
 1- ZSTD_decompress                : 
 1- ZSTD_decompress                :    15.0 MB/s  (  2796340)
 2- ZSTD_decompress                : 
 2- ZSTD_decompress                :    15.0 MB/s  (  2796340)
 3- ZSTD_decompress                : 
 3- ZSTD_decompress                :    15.0 MB/s  (  2796340)
 4- ZSTD_decompress                : 
 4- ZSTD_decompress                :    15.0 MB/s  (  2796340)
 5- ZSTD_decompress                : 
 5- ZSTD_decompress                :    15.0 MB/s  (  2796340)
 6- ZSTD_decompress                : 
 6- ZSTD_decompress                :    15.0 MB/s  (  2796340)
11- ZSTD_decompress                :    15.0 MB/s  (  2796340)
 1- ZSTD_decodeLiteralsBlock       : 
 1- ZSTD_decodeLiteralsBlock       :    26.1 MB/s  (    74273)
 2- ZSTD_decodeLiteralsBlock       : 
 2- ZSTD_decodeLiteralsBlock       :    26.1 MB/s  (    74273)
 3- ZSTD_decodeLiteralsBlock       : 
 3- ZSTD_decodeLiteralsBlock       :    26.1 MB/s  (    74273)
 4- ZSTD_decodeLiteralsBlock       : 
 4- ZSTD_decodeLiteralsBlock       :    26.1 MB/s  (    74273)
 5- ZSTD_decodeLiteralsBlock       : 
 5- ZSTD_decodeLiteralsBlock       :    26.1 MB/s  (    74273)
 6- ZSTD_decodeLiteralsBlock       : 
 6- ZSTD_decodeLiteralsBlock       :    26.1 MB/s  (    74273)
31- ZSTD_decodeLiteralsBlock       :    26.1 MB/s  (    74273)
 1- ZSTD_decodeSeqHeaders          : 
=================================================================
==2018==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x00016fd84ee2 at pc 0x0001000c29ec bp 0x00016fd81430 sp 0x00016fd81428
WRITE of size 1 at 0x00016fd84ee2 thread T0
    #0 0x1000c29eb in FSE_buildDTable (/var/mobile/Containers/Bundle/Application/1FA3B3BC-B5AD-486D-BAD2-144B631187D1/TestLZ4.app/TestLZ4+0x10004a9eb)
    #1 0x10009ff7f in ZSTD_decodeSeqHeaders (/var/mobile/Containers/Bundle/Application/1FA3B3BC-B5AD-486D-BAD2-144B631187D1/TestLZ4.app/TestLZ4+0x100027f7f)
    #2 0x1000bcdaf in local_ZSTD_decodeSeqHeaders (/var/mobile/Containers/Bundle/Application/1FA3B3BC-B5AD-486D-BAD2-144B631187D1/TestLZ4.app/TestLZ4+0x100044daf)
    #3 0x1000bd683 in benchMem (/var/mobile/Containers/Bundle/Application/1FA3B3BC-B5AD-486D-BAD2-144B631187D1/TestLZ4.app/TestLZ4+0x100045683)
    #4 0x1000be247 in benchFiles (/var/mobile/Containers/Bundle/Application/1FA3B3BC-B5AD-486D-BAD2-144B631187D1/TestLZ4.app/TestLZ4+0x100046247)
    #5 0x1000bee27 in zstd_start_benchmark (/var/mobile/Containers/Bundle/Application/1FA3B3BC-B5AD-486D-BAD2-144B631187D1/TestLZ4.app/TestLZ4+0x100046e27)
    #6 0x10009d15b in -[ViewController doAsyncTestButton:] (/var/mobile/Containers/Bundle/Application/1FA3B3BC-B5AD-486D-BAD2-144B631187D1/TestLZ4.app/TestLZ4+0x10002515b)
    #7 0x189ca7cfb in <redacted> (/System/Library/Frameworks/UIKit.framework/UIKit+0x4fcfb)
    #8 0x189ca7c77 in <redacted> (/System/Library/Frameworks/UIKit.framework/UIKit+0x4fc77)
    #9 0x189c8f92f in <redacted> (/System/Library/Frameworks/UIKit.framework/UIKit+0x3792f)
    #10 0x189cb03cb in <redacted> (/System/Library/Frameworks/UIKit.framework/UIKit+0x583cb)
    #11 0x189ca7013 in <redacted> (/System/Library/Frameworks/UIKit.framework/UIKit+0x4f013)
    #12 0x189c9fcdb in <redacted> (/System/Library/Frameworks/UIKit.framework/UIKit+0x47cdb)
    #13 0x189c704a3 in <redacted> (/System/Library/Frameworks/UIKit.framework/UIKit+0x184a3)
    #14 0x189c6e76b in <redacted> (/System/Library/Frameworks/UIKit.framework/UIKit+0x1676b)
    #15 0x184694543 in <redacted> (/System/Library/Frameworks/CoreFoundation.framework/CoreFoundation+0xdc543)
    #16 0x184693fd7 in <redacted> (/System/Library/Frameworks/CoreFoundation.framework/CoreFoundation+0xdbfd7)
    #17 0x184691cd7 in <redacted> (/System/Library/Frameworks/CoreFoundation.framework/CoreFoundation+0xd9cd7)
    #18 0x1845c0c9f in CFRunLoopRunSpecific (/System/Library/Frameworks/CoreFoundation.framework/CoreFoundation+0x8c9f)
    #19 0x18f7fc087 in GSEventRunModal (/System/Library/PrivateFrameworks/GraphicsServices.framework/GraphicsServices+0xc087)
    #20 0x189cd8ffb in UIApplicationMain (/System/Library/Frameworks/UIKit.framework/UIKit+0x80ffb)
    #21 0x1000f724f in main (/var/mobile/Containers/Bundle/Application/1FA3B3BC-B5AD-486D-BAD2-144B631187D1/TestLZ4.app/TestLZ4+0x10007f24f)
    #22 0x199ade8b7 in <redacted> (/usr/lib/system/libdyld.dylib+0x28b7)

Address 0x00016fd84ee2 is located in stack of thread T0 at offset 12578 in frame
    #0 0x1000bcb27 in local_ZSTD_decodeSeqHeaders (/var/mobile/Containers/Bundle/Application/1FA3B3BC-B5AD-486D-BAD2-144B631187D1/TestLZ4.app/TestLZ4+0x100044b27)

  This frame has 6 object(s):
    [32, 8224) 'DTableML'
    [8480, 12576) 'DTableLL' <== Memory access at offset 12578 overflows this variable
    [12704, 14752) 'DTableOffb'
    [14880, 14888) 'dumps'
    [14912, 14920) 'length'
    [14944, 14948) 'nbSeq'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow ??:0 FSE_buildDTable
Shadow bytes around the buggy address:
  0x00014e1b0980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00014e1b0990: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00014e1b09a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00014e1b09b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00014e1b09c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x00014e1b09d0: 00 00 00 00 00 00 00 00 00 00 00 00[f2]f2 f2 f2
  0x00014e1b09e0: f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 f2 00 00 00 00
  0x00014e1b09f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00014e1b0a00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00014e1b0a10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x00014e1b0a20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Heap right redzone:      fb
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack partial redzone:   f4
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==2018==ABORTING
AddressSanitizer report breakpoint hit. Use 'thread info -s' to get extended information about the report.

(lldb) bt
* thread #1: tid = 0x992d9, 0x00000001001650c4 libclang_rt.asan_ios_dynamic.dylib`__asan::AsanDie(), queue = 'com.apple.main-thread', stop reason = Stack buffer overflow detected
    frame #0: 0x00000001001650c4 libclang_rt.asan_ios_dynamic.dylib`__asan::AsanDie()
    frame #1: 0x0000000100168b80 libclang_rt.asan_ios_dynamic.dylib`__sanitizer::Die() + 44
    frame #2: 0x0000000100163ed4 libclang_rt.asan_ios_dynamic.dylib`__asan::ScopedInErrorReport::~ScopedInErrorReport() + 336
    frame #3: 0x0000000100163c6c libclang_rt.asan_ios_dynamic.dylib`__asan::ScopedInErrorReport::~ScopedInErrorReport() + 12
    frame #4: 0x00000001001637e8 libclang_rt.asan_ios_dynamic.dylib`__asan_report_error + 3216
    frame #5: 0x00000001001641f8 libclang_rt.asan_ios_dynamic.dylib`__asan_report_store1 + 44
  * frame #6: 0x00000001000c29ec TestLZ4`FSE_buildDTable(dt=0x000000016fd83ee0, normalizedCounter=0x000000016fd818f0, maxSymbolValue=63, tableLog=10) + 856 at fse.c:373
    frame #7: 0x000000010009ff80 TestLZ4`ZSTD_decodeSeqHeaders(nbSeq=0x000000016fd85820, dumpsPtr=0x000000016fd857e0, dumpsLengthPtr=0x000000016fd85800, DTableLL=0x000000016fd83ee0, DTableML=0x000000016fd81de0, DTableOffb=0x000000016fd84f60, src=0x000000010a574800, srcSize=11133) + 3100 at zstd_decompress.c:377
    frame #8: 0x00000001000bcdb0 TestLZ4`local_ZSTD_decodeSeqHeaders(dst=0x000000010a2bc800, dstSize=2818710, buff2=0x000000010a574800, src=0x0000000108404800, srcSize=131072) + 664 at zstd_fullbench.c:243
    frame #9: 0x00000001000bd684 TestLZ4`benchMem(src=0x0000000108404800, srcSize=131072, benchNb=32) + 2116 at zstd_fullbench.c:358
    frame #10: 0x00000001000be248 TestLZ4`benchFiles(fileNamesTable=0x000000016fd85e98, nbFiles=1, benchNb=32) + 1100 at zstd_fullbench.c:468
    frame #11: 0x00000001000bee28 TestLZ4`zstd_start_benchmark(argc=2, argv=0x000000016fd85e90) + 2188 at zstd_fullbench.c:584
    frame #12: 0x000000010009d15c TestLZ4`-[ViewController doAsyncTestButton:](self=0x0000000107507880, _cmd="doAsyncTestButton:", sender=<unavailable>) + 792 at ViewController.mm:55
    frame #13: 0x0000000189ca7cfc UIKit`-[UIApplication sendAction:to:from:forEvent:] + 100
    frame #14: 0x0000000189ca7c78 UIKit`-[UIControl sendAction:to:forEvent:] + 80
    frame #15: 0x0000000189c8f930 UIKit`-[UIControl _sendActionsForEvents:withEvent:] + 416
    frame #16: 0x0000000189cb03cc UIKit`-[UIControl touchesBegan:withEvent:] + 268
    frame #17: 0x0000000189ca7014 UIKit`-[UIWindow _sendTouchesForEvent:] + 376
    frame #18: 0x0000000189c9fcdc UIKit`-[UIWindow sendEvent:] + 784
    frame #19: 0x0000000189c704a4 UIKit`-[UIApplication sendEvent:] + 248
    frame #20: 0x0000000189c6e76c UIKit`_UIApplicationHandleEventQueue + 5528
    frame #21: 0x0000000184694544 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 24
    frame #22: 0x0000000184693fd8 CoreFoundation`__CFRunLoopDoSources0 + 540
    frame #23: 0x0000000184691cd8 CoreFoundation`__CFRunLoopRun + 724
    frame #24: 0x00000001845c0ca0 CoreFoundation`CFRunLoopRunSpecific + 384
    frame #25: 0x000000018f7fc088 GraphicsServices`GSEventRunModal + 180
    frame #26: 0x0000000189cd8ffc UIKit`UIApplicationMain + 204
    frame #27: 0x00000001000f7250 TestLZ4`main(argc=1, argv=0x000000016fd87a90) + 124 at main.m:16
    frame #28: 0x0000000199ade8b8 libdyld.dylib`start + 4
(lldb) 
Possible consequences

In an existing, fairly large iOS app, the invocation of ZSTD_decompress (this function itself, not its internals) may crash after several dozen successful calls; it looks like stack corruption. I haven't managed to reproduce the problem in an isolated sample, only the fullbench issue above. I'm still not certain whether both issues share the same root cause, but they may.

Unexpected error for huge data

I got an unexpected error for huge (>4 GiB?) data generated by xorshift. Here is a test case: https://gist.github.com/t-mat/a7e93d4767b991e191ea
It produces the same error on both Ubuntu 14.04 (x64) / gcc 4.8.2 and Windows 7 SP1 (x64) / MSVC++ 2013.

source data       : 4295000064 bytes
zstd compressed   : 1350556646 bytes
zstd decompressed : 4295000064 bytes
Data error : offset @0x100003c41

HC_compress issues

Hi,

I am running into issues decompressing some of the results of ZSTD_HC_compress, using the very simple test case pasted below.

When decompressing the result of compressing a 0-byte buffer with level > 1, I get "ZSTD_error_corruption_detected"; with larger buffers I get "ZSTD_error_srcSize_wrong". Once the size passes a certain threshold (15 in this case, though it depends on the payload), everything starts to work correctly.

#include <stdio.h>
#include <stdlib.h>
#include <zstd.h>
#include <zstdhc.h>

int main(int argc, char **argv ) {
    char raw[20] = {1,1,1,1,1, 1,1,1,1,1, 1,1,1,1,1, 1,1,1,1,1};
    char compressed[50];
    char decompressed[50];
    size_t ccode, dcode;
    size_t size = 2;
    int level = 2;

    ccode = ZSTD_HC_compress(compressed, 50, raw, size, level);
    printf("Compression code %zu\n", ccode);   /* %zu: ccode is a size_t */
    if (ZSTD_isError(ccode)) {
        printf("Compression error %s\n", ZSTD_getErrorName(ccode));
        return 1;   /* don't feed an error code to the decompressor */
    }

    dcode = ZSTD_decompress(decompressed, 50, compressed, ccode);
    printf("Decompression code %zu\n", dcode);
    if (ZSTD_isError(dcode)) {
        printf("Decompression error %s\n", ZSTD_getErrorName(dcode));
        return 1;
    }
    return 0;
}

test failures on big endian systems

Hi Yann,

The Debian build daemons have found some issues with the test suite on mips, powerpc and s390x systems. If you consider this a bug, the build logs are here. Otherwise, if you consider these architectures unsupported, I can remove them from the list of architectures that zstd will be built on.

Cheers,
Kevin

compilation problems with v0.4.0

https://github.com/Cyan4973/zstd/archive/zstd-0.4.0.zip

Changed only:

#define ZSTD_LEGACY_SUPPORT 0

GCC emits dozens of errors (mainly undeclared functions):

gcc.exe -Wno-unknown-pragmas -Wno-sign-compare -Wno-conversion -fomit-frame-pointer -fstrict-aliasing -fforce-addr -ffast-math -O3 -DNDEBUG -DFREEARC_WIN -D__x86_64__ -D__SSE2__ -I. -DFREEARC_INTEL_BYTE_ORDER -D_UNICODE -DUNICODE -HAVE_CONFIG_H zstd/zstd.c -std=c99 -c -o zstd/zstd.o
zstd/zstd.c:608:8: error: conflicting types for 'ZSTD_compressBegin'
 size_t ZSTD_compressBegin(ZSTD_CCtx* ctx, void* dst, size_t maxDstSize)
        ^
In file included from zstd/zstd.c:70:0:
zstd/zstd_static.h:124:8: note: previous declaration of 'ZSTD_compressBegin' was here
 size_t ZSTD_compressBegin(ZSTD_CCtx* cctx, void* dst, size_t maxDstSize, int compressionLevel);
        ^
zstd/zstd.c: In function 'ZSTD_compressBegin':
zstd/zstd.c:617:24: error: 'ZSTD_magicNumber' undeclared (first use in this function)
     MEM_writeLE32(dst, ZSTD_magicNumber);
                        ^
zstd/zstd.c:617:24: note: each undeclared identifier is reported only once for each function it appears in
zstd/zstd.c: At top level:
zstd/zstd.c:774:8: error: conflicting types for 'ZSTD_compressCCtx'
 size_t ZSTD_compressCCtx(ZSTD_CCtx* ctx, void* dst, size_t maxDstSize, const void* src, size_t srcSize)

v0.5 dictionary question

So it seems that zstd can now help with serving small amounts of JSON (say, from a NoSQL DB), assuming the data is similar (e.g. common object names)?

cSize overflow in ZSTD_compressSequences

This is with zstd as of commit 765207c
In our case (TokuDB), ZSTD_isError sometimes returns true, because:

#1  0x0000000000ca8836 in ZSTD_compressSequences (dst=0x2aaac4c0623a "\034", maxDstSize=11084, seqStorePtr=Unhandled dwarf expression opcode 0xf3
)
...
Breakpoint 8, ZSTD_isError (code=18446744073709551615)

My inputs were:

srcLen=10486
dstLen=11092

I guess it's a bug in zstd that causes cSize to be 2^64 - 1.

Buffer bounds

I understand that the project is in an experimental stage and I don't expect it to be bug-free. So here is one bug.

Sometimes the destination buffer bounds are not checked properly (when compressing/decompressing), and overflows happen that could lead to all sorts of nasty things. Here is code that demonstrates it:

#include <stdio.h>
#include <stdlib.h>
#include <zstd.h>

int main(int argc, char **argv ) {
    char *raw = (char *)malloc(1000);
    char compressed[20];     /* deliberately undersized destination */
    char decompressed[900];  /* deliberately smaller than the original 1000 bytes */
    size_t i, ccode, dcode;

    // fill it with ones
    for (i=0; i<1000; i++) raw[i] = 1;

    ccode = ZSTD_compress(compressed, 20, raw, 1000);
    printf("Compression code %zu\n", ccode);   /* %zu: ccode is a size_t */
    dcode = ZSTD_decompress(decompressed, 900, compressed, ccode);
    printf("Decompression code %zu\n", dcode);
    free(raw);
    return 0;
}

Regards
