
fpc's Introduction

FPC - Fast Prefix Coder

FPC is a fast entropy coder that uses prefix (Huffman) codes and achieves a higher compression ratio than many AC or ANS implementations for non-skewed probability distributions.

Features

  • Advanced adaptive block subdivision
  • Optimal length-limited prefix codes using a fast implementation of the package-merge algorithm
  • Optimised for both in-order and out-of-order CPUs
  • Support for both big- and little-endian CPUs
  • License: ISC
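
To make "length-limited prefix codes" concrete: a set of code lengths can form a prefix code only if it satisfies the Kraft inequality (the sum of 2^-len over all used symbols is at most 1), and length limiting additionally caps every length at a maximum, 11 bits by default according to the configuration options. The check below is a minimal sketch of that condition; `lengths_are_valid` is a hypothetical helper for illustration, not part of FPC's API, and it does not implement package-merge itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of FPC's API.
 * Returns 1 if the given code lengths can form a prefix code whose
 * longest code is at most max_len bits, 0 otherwise.
 * A length of 0 means "symbol unused". */
int lengths_are_valid(const uint8_t *len, size_t n, unsigned max_len)
{
    assert(max_len >= 1 && max_len <= 31);

    /* Work in fixed point: 2^max_len represents 1.0. */
    uint64_t budget = (uint64_t)1 << max_len;
    uint64_t used = 0;

    for (size_t i = 0; i < n; i++) {
        if (len[i] == 0)
            continue;                 /* unused symbol costs nothing */
        if (len[i] > max_len)
            return 0;                 /* violates the length limit */
        used += (uint64_t)1 << (max_len - len[i]);  /* 2^-len, scaled */
        if (used > budget)
            return 0;                 /* violates the Kraft inequality */
    }
    return 1;
}
```

For example, the lengths {1, 2, 3, 3} pass with a limit of 11, while {1, 1, 2} fail because they oversubscribe the code space.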

Configuration options

  • Number of streams (default 3)
  • Max bit length (default 11)
  • Adaptive step (default 2048). Lower values increase the compression ratio but are slower.

Benchmark

Tested using TurboBench

silesia.tar

Core i5-4460 CPU @ 3.20GHz

      C Size  ratio%     C MB/s     D MB/s   Name            File
   124509536    58.7      25.32      32.61   fpaq0p_sh                        silesia.tar
   128317932    60.5     162.82     813.92   fpc 0                            silesia.tar
   129091908    60.9      35.89      22.03   subotin                          silesia.tar
   129654526    61.2     476.96     821.30   fpc 16                           silesia.tar
   129979639    61.3     345.70     483.55   fse                              silesia.tar
   129989013    61.3     248.55     243.08   zlibh                            silesia.tar
   130059551    61.4     295.89     591.30   rans_static16                    silesia.tar
   130134503    61.4     574.79     715.36   fsehuf                           silesia.tar
   130170979    61.4     548.16     897.28   fpc 32                           silesia.tar
   130373605    61.5     234.33     229.23   fsc                              silesia.tar
   130731327    61.7     143.04      44.31   FastAC                           silesia.tar
   131948580    62.3     104.39      75.04   FastHF                           silesia.tar
   211957977   100.0    6030.96    6103.20   memcpy                           silesia.tar

enwik8

Orange Pi PC Plus (Allwinner H3)

      C Size  ratio%     C MB/s     D MB/s   Name            File
    61251140    61.3       3.62       5.41   fpaq0p_sh                        enwik8
    62677385    62.7       5.14       3.19   subotin                          enwik8
    62782152    62.8      13.30      85.07   fpc 0                            enwik8
    63155812    63.2      61.85      86.17   fpc 16                           enwik8
    63188022    63.2      19.96       7.13   FastAC                           enwik8
    63193639    63.2      40.10      62.90   zlibh                            enwik8
    63202025    63.2      40.61      62.91   fse                              enwik8
    63287917    63.3      29.37      62.32   rans_static16                    enwik8
    63327425    63.3      20.31      29.96   fsc                              enwik8
    63415358    63.4      66.47      90.70   fpc 32                           enwik8
    63420890    63.4      64.20      61.11   fsehuf                           enwik8
    63648861    63.6      21.80      15.27   FastHF                           enwik8
   100000004   100.0     635.93    1099.78   memcpy                           enwik8
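
The ratio% column in both tables appears to be the compressed size as a percentage of the original size, so lower is better and the memcpy row reads 100.0. A small sketch of that arithmetic follows; `ratio_percent` is a hypothetical name, and the exact formula TurboBench uses is an assumption inferred from the numbers above.

```c
#include <stdint.h>

/* Hypothetical helper: compressed size as a percentage of the original
 * size, which is how the ratio% column appears to be defined. */
double ratio_percent(uint64_t compressed, uint64_t original)
{
    if (original == 0)
        return 0.0;   /* avoid dividing by zero on empty input */
    return 100.0 * (double)compressed / (double)original;
}
```

Calling `ratio_percent(128317932, 211957977)` with the "fpc 0" and memcpy sizes from the silesia.tar table gives a value that rounds to the 60.5 shown there.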

Compile

make

fpc's People

Contributors

g0xa52a2a, kagiannis

Forkers

clbr

fpc's Issues

c++ compile (g++) fails

Linking fails with undefined references, and there are also some warnings. I don't understand why linking fails, because the header file has extern "C" linkage.

When I include the C file directly (#include "fpc.c") in my project, it builds.

Below is the compile log. I edited the Makefile to use the C++ compiler and changed the -std option.

g++ -Wall -O2 -std=c++11 -DNDEBUG fpc.c cli.c -o fpc
fpc.c: In function ‘void byte_count(U8*, int, U32*, int)’:
fpc.c:412:14: warning: comparison of integer expressions of different signedness: ‘U32’ {aka ‘unsigned int’} and ‘int’ [-Wsign-compare]
412 | for(a = 0;a < (len & (~7));a += 8){
| ~~^~~~~~~~~~~~~~
fpc.c:431:9: warning: comparison of integer expressions of different signedness: ‘U32’ {aka ‘unsigned int’} and ‘int’ [-Wsign-compare]
431 | for(;a < len;)
| ~~^~~~~
fpc.c:433:14: warning: comparison of integer expressions of different signedness: ‘U32’ {aka ‘unsigned int’} and ‘int’ [-Wsign-compare]
433 | for(a = 0;a < sym_num;a++)
| ~~^~~~~~~~~
fpc.c: In function ‘void sort_inc(Fsym*, int)’:
fpc.c:442:14: warning: comparison of integer expressions of different signedness: ‘U32’ {aka ‘unsigned int’} and ‘int’ [-Wsign-compare]
442 | for(a = 0;a < num;a++){
| ~~^~~~~
fpc.c:460:14: warning: comparison of integer expressions of different signedness: ‘U32’ {aka ‘unsigned int’} and ‘int’ [-Wsign-compare]
460 | for(a = 0;a < num;a++){
| ~~^~~~~
fpc.c:466:14: warning: comparison of integer expressions of different signedness: ‘U32’ {aka ‘unsigned int’} and ‘int’ [-Wsign-compare]
466 | for(a = 0;a < num;a++){
| ~~^~~~~
fpc.c: In function ‘int construct_dec_table(U8*, Dnode*, int)’:
fpc.c:490:14: warning: comparison of integer expressions of different signedness: ‘U32’ {aka ‘unsigned int’} and ‘int’ [-Wsign-compare]
490 | for(a = 0;a < sym_num;a++){
| ~~^~~~~~~~~
fpc.c:505:14: warning: comparison of integer expressions of different signedness: ‘U32’ {aka ‘unsigned int’} and ‘int’ [-Wsign-compare]
505 | for(a = 0;a < sym_num;a++){
| ~~^~~~~~~~~
fpc.c: In function ‘void construct_enc_table(Enode*, Fsym*, int)’:
fpc.c:539:14: warning: comparison of integer expressions of different signedness: ‘U32’ {aka ‘unsigned int’} and ‘int’ [-Wsign-compare]
539 | for(a = 0;a < num;a++){
| ~~^~~~~
fpc.c:548:14: warning: comparison of integer expressions of different signedness: ‘U32’ {aka ‘unsigned int’} and ‘int’ [-Wsign-compare]
548 | for(a = 0;a < num;a++){
| ~~^~~~~
fpc.c: In function ‘U32 write_prefix_descr(Enode*, U8*, int)’:
fpc.c:676:14: warning: comparison of integer expressions of different signedness: ‘U32’ {aka ‘unsigned int’} and ‘int’ [-Wsign-compare]
676 | for(a = 0;a < sym_num;a++){
| ~~^~~~~~~~~
fpc.c:680:13: warning: comparison of integer expressions of different signedness: ‘U32’ {aka ‘unsigned int’} and ‘int’ [-Wsign-compare]
680 | while(a+1 < sym_num && previous == lookup[a+1].len)
| ~~~~^~~~~~~~~
fpc.c: In function ‘int FPC_compress_block(void*, const void*, int, int)’:
fpc.c:913:14: warning: comparison of integer expressions of different signedness: ‘U32’ {aka ‘unsigned int’} and ‘int’ [-Wsign-compare]
913 | for(a = 0;a < sym_num;a++)
| ~~^~~~~~~~~
fpc.c:924:14: warning: comparison of integer expressions of different signedness: ‘U32’ {aka ‘unsigned int’} and ‘int’ [-Wsign-compare]
924 | for(a = 0;a < sym_num && s[a].freq == 0;a++);
| ^~~~~~~~~
fpc.c:951:21: warning: comparison of integer expressions of different signedness: ‘U32’ {aka ‘unsigned int’} and ‘int’ [-Wsign-compare]
951 | if(compressed_size >= size){
| ~~~~~~~~~~~~~~~~^~~~~~~
fpc.c: In function ‘size_t comp_adaptive(void*, void*, size_t)’:
fpc.c:1038:41: warning: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Wsign-compare]
1038 | for(int c = 1;c <= MBLOCK && (c*STEP) <= inlen - a;c++){
| ~~~~~~~~~^~~~~~~~~~~~
fpc.c: In function ‘size_t FPC_compress(void*, void*, size_t, int)’:
fpc.c:77:23: warning: comparison of integer expressions of different signedness: ‘size_t’ {aka ‘long unsigned int’} and ‘int’ [-Wsign-compare]
77 | #define MIN(A,B) ((A) < (B)?(A):(B))
| ~~~~^~~~~
fpc.c:1095:14: note: in expansion of macro ‘MIN’
1095 | U32 step = MIN(inlen,bsize);
| ^

/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: /tmp/ccwLzlL2.o: in function `dec_file(_IO_FILE*, _IO_FILE*) [clone .part.0]':
cli.c:(.text+0xae): undefined reference to `FPC_decompress_block'
/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: /tmp/ccwLzlL2.o: in function `comp_file(_IO_FILE*, _IO_FILE*, int)':
cli.c:(.text+0x2cc): undefined reference to `FPC_compress'
/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: /tmp/ccwLzlL2.o: in function `bench_file(_IO_FILE*, unsigned int, int)':
cli.c:(.text+0x528): undefined reference to `FPC_compress'
/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: cli.c:(.text+0x543): undefined reference to `FPC_compress'
/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: cli.c:(.text+0x55e): undefined reference to `FPC_compress'
/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: cli.c:(.text+0x577): undefined reference to `FPC_compress'
/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: cli.c:(.text+0x5d5): undefined reference to `FPC_decompress'
/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: cli.c:(.text+0x5f0): undefined reference to `FPC_decompress'
/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: cli.c:(.text+0x60b): undefined reference to `FPC_decompress'
/usr/lib64/gcc/x86_64-suse-linux/9/../../../../x86_64-suse-linux/bin/ld: cli.c:(.text+0x624): undefined reference to `FPC_decompress'
collect2: error: ld returned 1 exit status
make: *** [Makefile:5: fpc] Error 1
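
One plausible explanation, not confirmed against the FPC sources: g++ treats files ending in .c as C++, so if fpc.c does not itself see the extern "C" declarations from the header, its definitions are emitted with C++-mangled names, while cli.c (which does include the header) asks the linker for the unmangled ones. The sketch below shows the usual guard pattern with the declaration visible to the definition; `demo_compress` is a hypothetical function used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* --- what would normally live in the header --- */
#ifdef __cplusplus
extern "C" {
#endif

size_t demo_compress(void *dst, const void *src, size_t len);

#ifdef __cplusplus
}
#endif

/* --- what would normally live in the .c file --- */
/* Because the declaration above is visible here, a C++ compiler gives
 * this definition C linkage as well, so the symbol is not mangled and
 * matches what other translation units reference. */
size_t demo_compress(void *dst, const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);   /* placeholder body: just copies the input */
    return len;
}
```

The same file compiles as C and as C++. Under this assumption, either including the header from the implementation file or building fpc.c with a C compiler and only linking with g++ would avoid the mismatch.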

Bug in adaptive blocksize

#include <stdio.h>
#include "fpc.h"

static unsigned char degen[] = {
0x1f, 0x00, 0x01, 0x00, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
0xff, 0xff, 0xff, 0x3b, 0x1f, 0x02, 0xcf, 0x1c, 0xff, 0xff, 0xff, 0xff,
0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
0xd7, 0x50, 0x00, 0x00, 0x00, 0x00, 0x00,
};

int main() {

        unsigned char dst[16384];

        printf("%zu compressed to %zu\n",
                sizeof(degen),
                FPC_compress(dst, degen, sizeof(degen), 0));

        return 0;
}

valgrind ./test

==8103== Memcheck, a memory error detector
==8103== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==8103== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==8103== Command: ./test
==8103==
==8103== Conditional jump or move depends on uninitialised value(s)
==8103==    at 0x40171E: FPC_compress_block (fpc.c:911)
==8103==    by 0x402052: comp_adaptive (fpc.c:1000)
==8103==    by 0x4021BB: FPC_compress (fpc.c:1095)
==8103==    by 0x4021EA: main (main.c:18)
==8103==
==8103== Conditional jump or move depends on uninitialised value(s)
==8103==    at 0x400646: byte_count (fpc.c:415)
==8103==    by 0x40173A: FPC_compress_block (fpc.c:914)
==8103==    by 0x402052: comp_adaptive (fpc.c:1000)
==8103==    by 0x4021BB: FPC_compress (fpc.c:1095)
==8103==    by 0x4021EA: main (main.c:18)
==8103==
==8103== Conditional jump or move depends on uninitialised value(s)
==8103==    at 0x4006BB: byte_count (fpc.c:415)
==8103==    by 0x40173A: FPC_compress_block (fpc.c:914)
==8103==    by 0x402052: comp_adaptive (fpc.c:1000)
==8103==    by 0x4021BB: FPC_compress (fpc.c:1095)
==8103==    by 0x4021EA: main (main.c:18)
==8103==
==8103== Conditional jump or move depends on uninitialised value(s)
==8103==    at 0x4006C0: byte_count (fpc.c:434)
==8103==    by 0x40173A: FPC_compress_block (fpc.c:914)
==8103==    by 0x402052: comp_adaptive (fpc.c:1000)
==8103==    by 0x4021BB: FPC_compress (fpc.c:1095)
==8103==    by 0x4021EA: main (main.c:18)
==8103==
==8103== Conditional jump or move depends on uninitialised value(s)
==8103==    at 0x4006D3: byte_count (fpc.c:434)
==8103==    by 0x40173A: FPC_compress_block (fpc.c:914)
==8103==    by 0x402052: comp_adaptive (fpc.c:1000)
==8103==    by 0x4021BB: FPC_compress (fpc.c:1095)
==8103==    by 0x4021EA: main (main.c:18)
==8103==
==8103== Conditional jump or move depends on uninitialised value(s)
==8103==    at 0x401923: FPC_compress_block (fpc.c:921)
==8103==    by 0x402052: comp_adaptive (fpc.c:1000)
==8103==    by 0x4021BB: FPC_compress (fpc.c:1095)
==8103==    by 0x4021EA: main (main.c:18)
==8103==
==8103== Conditional jump or move depends on uninitialised value(s)
==8103==    at 0x401B30: FPC_compress_block (fpc.c:763)
==8103==    by 0x402052: comp_adaptive (fpc.c:1000)
==8103==    by 0x4021BB: FPC_compress (fpc.c:1095)
==8103==    by 0x4021EA: main (main.c:18)
==8103==
==8103== Conditional jump or move depends on uninitialised value(s)
==8103==    at 0x401B89: FPC_compress_block (fpc.c:763)
==8103==    by 0x402052: comp_adaptive (fpc.c:1000)
==8103==    by 0x4021BB: FPC_compress (fpc.c:1095)
==8103==    by 0x4021EA: main (main.c:18)
==8103==
==8103== Conditional jump or move depends on uninitialised value(s)
==8103==    at 0x401BAA: FPC_compress_block (fpc.c:954)
==8103==    by 0x402052: comp_adaptive (fpc.c:1000)
==8103==    by 0x4021BB: FPC_compress (fpc.c:1095)
==8103==    by 0x4021EA: main (main.c:18)
==8103==
==8103== Conditional jump or move depends on uninitialised value(s)
==8103==    at 0x40207A: comp_adaptive (fpc.c:1080)
==8103==    by 0x4021BB: FPC_compress (fpc.c:1095)
==8103==    by 0x4021EA: main (main.c:18)
==8103==
79 compressed to 41
==8103==
==8103== HEAP SUMMARY:
==8103==     in use at exit: 0 bytes in 0 blocks
==8103==   total heap usage: 1 allocs, 1 frees, 1 bytes allocated
==8103==
==8103== All heap blocks were freed -- no leaks are possible
==8103==
==8103== For counts of detected and suppressed errors, rerun with: -v
==8103== Use --track-origins=yes to see where uninitialised values come from
==8103== ERROR SUMMARY: 102 errors from 10 contexts (suppressed: 5 from 5)

I noticed this when some results were wildly out of line (source buffer < 100 bytes, compressed result >15000 bytes).

Sparse files not compressing well

Thanks for making this. I tested it on a set of binary files, and while it beat FSE on 1/8 of them, for the rest FSE did better. Seems the ones where FSE won by a large margin are sparse.

I configured FPC for 1 stream and an adaptive step of 128.

Here are some sample files, if you're interested. Released to public domain.
https://files.catbox.moe/pbqhrb.tgz

PS: FPC did get some large wins over FSE too. By trying both, brute force style, and using the better one, the total file size for the set decreased 1.6%.
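
The brute-force approach from the postscript can be sketched as follows: run every candidate coder on the same input, keep the smallest output, and store a one-byte tag so a decoder knows which coder produced it. The `coder_fn` signature, the tag byte, and the two toy coders are assumptions for illustration only; the real FPC and FSE entry points have their own signatures and would need thin wrappers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical coder signature: writes at most cap bytes to dst and
 * returns the compressed size, or 0 on failure. */
typedef size_t (*coder_fn)(unsigned char *dst, size_t cap,
                           const unsigned char *src, size_t len);

/* Toy coder: stores the input unchanged. */
size_t store_coder(unsigned char *dst, size_t cap,
                   const unsigned char *src, size_t len)
{
    if (len == 0 || len > cap)
        return 0;
    memcpy(dst, src, len);
    return len;
}

/* Toy coder: byte-wise run-length encoding as (count, value) pairs. */
size_t rle_coder(unsigned char *dst, size_t cap,
                 const unsigned char *src, size_t len)
{
    size_t out = 0;
    for (size_t i = 0; i < len; ) {
        size_t run = 1;
        while (i + run < len && run < 255 && src[i + run] == src[i])
            run++;
        if (out + 2 > cap)
            return 0;                  /* output would not fit */
        dst[out++] = (unsigned char)run;
        dst[out++] = src[i];
        i += run;
    }
    return out;
}

/* Tries every coder and keeps the smallest result.
 * dst[0] receives the index of the winning coder, the payload follows.
 * dst and scratch must each hold at least cap bytes.
 * Returns the total number of bytes written, or 0 if every coder failed. */
size_t compress_best(unsigned char *dst, unsigned char *scratch, size_t cap,
                     const unsigned char *src, size_t len,
                     const coder_fn *coders, size_t ncoders)
{
    size_t best = 0;                   /* 0 means no coder succeeded yet */
    if (cap < 2 || ncoders > 256)
        return 0;
    for (size_t i = 0; i < ncoders; i++) {
        size_t n = coders[i](scratch, cap - 1, src, len);
        if (n != 0 && (best == 0 || n < best)) {
            dst[0] = (unsigned char)i; /* tag identifies the coder */
            memcpy(dst + 1, scratch, n);
            best = n;
        }
    }
    return best ? best + 1 : 0;
}
```

The cost is one extra byte per buffer plus the time to run every coder, which matches the trade-off described above: compression time roughly adds up across coders, while the output is never larger than the best single coder plus the tag.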
