powturbo / turbo-range-coder

TurboRC - Fastest Range Coder + Arithmetic Coding / Fastest Asymmetric Numeral Systems

License: GNU General Public License v3.0

C 94.84% Makefile 0.51% C++ 4.65%
range-coder entropy-coding arithmetic-coding entropy-coder arithmetic-coder encoding decoding huffman-coding asymmetric-numeral-systems data-compression

turbo-range-coder's Issues

Two iota

TRC is lovely.

  1. These encoders can write past the end of the output buffer. Is there a maximum overrun?

  2. This seems to fix the overruns:
    #define OVERFLOW(in, inlen) if(op >= out+inlen-16) { memcpy(out, in, inlen); return inlen; }

  3. It seems cleaner to pass outlen as a parameter (inlen for decompression).
    The caller could then reduce outlen for application reasons, and the coder could reduce it internally by the maximum overrun.
    Without the memcpy():
    #define OVERFLOW(in, inlen) if(op >= outbound)return 0;

Thanks!
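
To make item 3 concrete, here is a minimal sketch of an encoder loop that takes outlen as a parameter and returns 0 instead of overrunning. Everything in it is an assumption for illustration: toy_encode_bounded is a hypothetical stand-in, not a TurboRC function, and MAX_OVERRUN reuses the 16 from the macro above only because the real worst-case overrun is the open question in item 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed worst-case bytes written by one encoding step (hypothetical value). */
#define MAX_OVERRUN 16

/* Toy encoder: copies bytes and doubles every 0xFF, so output can grow.
 * Returns the encoded length, or 0 if the output buffer is too small. */
size_t toy_encode_bounded(const uint8_t *in, size_t inlen,
                          uint8_t *out, size_t outlen) {
    if (outlen < MAX_OVERRUN) return 0;

    uint8_t *op = out;
    /* Stop early enough that one more step can never pass out + outlen. */
    const uint8_t *outbound = out + outlen - MAX_OVERRUN;

    for (size_t i = 0; i < inlen; i++) {
        if (op >= outbound) return 0;      /* same test as OVERFLOW in item 3 */
        *op++ = in[i];
        if (in[i] == 0xFF) *op++ = 0xFF;   /* this step writes at most 2 bytes */
    }
    return (size_t)(op - out);
}
```

A caller that gets 0 back can fall back to storing the input uncompressed, which is what the memcpy() variant in item 2 does inside the macro.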

Benchmark: Turbo-Range-Coder - i9-13900KS, DDR5-7800MHz

File enwik8 (uniform distribution)

turborc -e0 enwik8

      size   ratio     E MB/s   D MB/s function prdid='s(5)'
  61250092  61.25%      95.08    75.61  1:rc        o0                                  
  45924756  45.92%      97.13    76.44  2:rcc       o1                                  
  36861388  36.86%      87.43    65.74  3:rcc2      o2                                  
  44876688  44.88%      88.30    68.08  4:rcx       o8b =o1 context slide               
  36829776  36.83%      74.57    58.56  5:rcx2      o16b=o2 context slide               
  45655424  45.66%      42.86    30.10  9:rcms      o1 mixer/sse                        
  35905340  35.91%      33.53    26.71 10:rcm2      o2 mixer/sse                        
  49386012  49.39%      36.93    26.86 11:rcmr      o2 8b mixer/sse run                 
  49813368  49.81%      35.81    25.64 12:rcmrr     o2 8b mixer/sse run > 2             
  61488516  61.49%      90.75    73.53 13:rcrle     RLE o0                              
  45280544  45.28%      90.07    70.55 14:rcrle1    RLE o1                              
  61542888  61.54%      80.60    66.28 17:rcu3      varint8 3/5/8 bits                  
  57813600  57.81%      43.44    44.13 18:rcqlfc    QLFC                                
  65292242  65.29%      53.71    54.82 18:bec       Bit EC                              
  21009342  21.01%      29.78    65.58 20:bwt                                           
  71659272  71.66%      65.51    56.15 26:rcg-8     gamma                               
  84138152  84.14%      66.53    52.24 27:rcgz-8    gamma zigzag                        
  74083592  74.08%      83.35    62.92 28:rcr-8     rice                                
  86625880  86.63%      75.46    48.54 29:rcrz-8    rice zigzag                         
  63541692  63.54%     604.88    77.71 42:cdfsb     static/decode search                
  63541692  63.54%     605.85    82.23 43:cdfsv     static/decode division              
  63948046  63.95%     468.79    72.88 44:cdfsm     static/decode division lut          
  63541700  63.54%     711.95    74.37 45:cdfsb     static interlv/dec. search          
  61250312  61.25%     254.42    89.73 46:cdf       byte   adaptive                     
  61250320  61.25%     248.56    89.34 47:cdfi      byte   adaptive interleaved         
  69876372  69.88%     183.35    70.31 48:cdf-8     vnibble                             
  69876376  69.88%     179.62    71.14 49:cdfi-8    vnibble interleaved                 
  61342502  61.34%     171.38   253.53 56:ans       rANS interleaved                                      
 100000000 100.00%   22959.30 23141.54 59:memcpy                                        

Level 25 fails decompression on enwik8

./turborc -25 /tmp/enwik8 /tmp/enwik8.rc
ll /tmp/enwik8.rc
-rw-rw-r-- 1 fred fred 58290080 Mar 21 13:03 /tmp/enwik8.rc
rm /tmp/enwik8.rc.bak
./turborc -d /tmp/enwik8.rc /tmp/enwik8.rc.bak
ll /tmp/enwik8.rc.bak
-rw-rw-r-- 1 fred fred 100000000 Mar 21 13:04 /tmp/enwik8.rc.bak
md5sum /tmp/enwik8.rc.bak /tmp/enwik8
0f86d7c5a6180cf9584c1d21144d85b0 /tmp/enwik8.rc.bak
a1fa5ffddb56f4953e226637dabbb36a /tmp/enwik8
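
The same round-trip check can be done in memory, which makes it easier to isolate a failing level. The harness below is only a sketch: codec_fn, roundtrip_ok, and the toy codecs are hypothetical names, and the function-pointer signature is an assumption rather than the TurboRC API.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed codec shape: reads srclen bytes, writes to dst, returns bytes written. */
typedef size_t (*codec_fn)(const uint8_t *src, size_t srclen, uint8_t *dst);

/* Returns 1 if dec(enc(in)) reproduces in exactly, 0 if not, -1 if out of memory. */
int roundtrip_ok(codec_fn enc, codec_fn dec, const uint8_t *in, size_t inlen) {
    uint8_t *packed = malloc(inlen * 2 + 64);   /* generous slack for expansion */
    uint8_t *unpacked = malloc(inlen + 64);
    int ok = -1;
    if (packed && unpacked) {
        size_t plen = enc(in, inlen, packed);
        size_t ulen = dec(packed, plen, unpacked);
        ok = (ulen == inlen && memcmp(in, unpacked, inlen) == 0);
    }
    free(packed);
    free(unpacked);
    return ok;
}

/* Toy codec used only to exercise the harness: XOR is its own inverse. */
size_t toy_xor(const uint8_t *src, size_t srclen, uint8_t *dst) {
    for (size_t i = 0; i < srclen; i++) dst[i] = src[i] ^ 0x5A;
    return srclen;
}

/* Deliberately broken decoder: flips one bit in the last byte. */
size_t toy_bad_dec(const uint8_t *src, size_t srclen, uint8_t *dst) {
    size_t n = toy_xor(src, srclen, dst);
    if (n) dst[n - 1] ^= 1;
    return n;
}
```

A decoder that returns the right length but different bytes, as in the md5sum mismatch above, is reported as a failure by the memcmp() test.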

Issues compiling for arm64 macOS

I'm trying to compile it for my machine and having some issues:

  • Some per-source file flags in the makefile seem tied to Intel:
 $(L)anscdfs.o: $(L)anscdf.c $(L)anscdf_.h
-       $(CC) -c -O3 $(CFLAGS) -march=corei7-avx -mtune=corei7-avx -mno-aes -falign-loops=32 $(L)anscdf.c -o $(L)anscdfs.o  
+       $(CC) -c -O3 $(CFLAGS) -mno-aes -falign-loops=32 $(L)anscdf.c -o $(L)anscdfs.o  
 
 $(L)anscdfx.o: $(L)anscdf.c $(L)anscdf_.h
-       $(CC) -c -O3 $(CFLAGS) -march=haswell -falign-loops=32 $(L)anscdf.c -o $(L)anscdfx.o 
+       $(CC) -c -O3 $(CFLAGS) -falign-loops=32 $(L)anscdf.c -o $(L)anscdfx.o 
  • Architecture detection seems to look for aarch64 but not arm64, which is what the machine returns for uname -m
  • Frame pointers shouldn't be omitted for any macOS binaries
  • Many system headers needed to be included in various files; the alternative is to add CFLAGS+=-Wno-implicit-function-declaration
  • Issue with sse_neon.h:
./include_/sse_neon.h:232:85: error: invalid conversion between vector type 'uint64x2_t' (vector of 2 'uint64_t' values) and 'uint8x8_t' (vector of 8 'uint8_t' values) of different size
static ALWAYS_INLINE uint64_t  mm_movemask4_epu8(__m128i v) { return vgetq_lane_u64((uint64x2_t)vshrn_n_u16((uint8x16_t)v, 4), 0); } //uint8x16_t
  • Issue of using __m128i for arm64:
./include_/bitutil_.h:128:77: error: returning 'int' from a function with incompatible result type 'uint32x4_t' (vector of 4 'uint32_t' values)
static ALWAYS_INLINE __m128i mm_delta_epi64(__m128i v, __m128i sv) { return _mm_sub_epi64(v, _mm_alignr_epi8(v, sv,  8)); }

e.g. here:

#if defined(__SSSE3__) || defined(__ARM_NEON)
#define mm_srai_epi64_63(_v_, _s_) _mm_srai_epi32(_mm_shuffle_epi32(_v_, _MM_SHUFFLE(3, 3, 1, 1)), 31)

static ALWAYS_INLINE __m128i mm_zzage_epi16(__m128i v) { return _mm_xor_si128( mm_slli_epi16(v,1),  mm_srai_epi16(   v,15)); }

Those last two issues are the main ones preventing me from compiling it; the others were easy to patch.
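
On the sse_neon.h error: vshrn_n_u16 returns a 64-bit uint8x8_t, so a C-style cast to the 128-bit uint64x2_t is rejected. One way to express the same nibble-mask idea with matching sizes is through the vreinterpret intrinsics, as sketched below. This is not the project's fix; nibble_mask_u8 is a hypothetical name, and the scalar branch is included only so the sketch also builds on non-ARM machines.

```c
#include <stdint.h>
#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Packs 16 bytes into a 64-bit mask, 4 bits per input byte.
 * For bytes that are all 0x00 or 0xFF (typical compare results),
 * nibble i of the result is 0x0 or 0xF respectively. */
static inline uint64_t nibble_mask_u8(const uint8_t bytes[16]) {
#if defined(__ARM_NEON)
    uint8x16_t v = vld1q_u8(bytes);
    /* Shift each 16-bit lane right by 4 and narrow to 8 bits: 128 -> 64 bits. */
    uint8x8_t narrowed = vshrn_n_u16(vreinterpretq_u16_u8(v), 4);
    /* Reinterpret the 64-bit vector as one uint64 lane, then extract it. */
    return vget_lane_u64(vreinterpret_u64_u8(narrowed), 0);
#else
    /* Scalar equivalent: even bytes contribute their high nibble,
     * odd bytes their low nibble, matching the shift-and-narrow above. */
    uint64_t m = 0;
    for (int i = 0; i < 16; i++) {
        uint64_t nib = (i & 1) ? (uint64_t)(bytes[i] & 0x0F)
                               : (uint64_t)(bytes[i] >> 4);
        m |= nib << (4 * i);
    }
    return m;
#endif
}
```

The key difference from the failing line is that the 64-bit result stays a 64-bit vector type (uint8x8_t, then uint64x1_t) and is read with vget_lane_u64 rather than vgetq_lane_u64.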

Does not compile on MacOSX

Hi, this does not compile on MacOSX for me.

OSX: 13.2 (22D49)
Apple clang version 14.0.3 (clang-1403.0.22.14.1)
Target: arm64-apple-darwin22.3.0
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

I have tried to fix it, but some issues remain.

If you like, I can fork the repository and open a pull request. The fork contains some fixes that get the build further, mostly missing headers (for example, malloc was not declared). Even with those fixes, some errors remain that I could not resolve:

turborc.c:390:1: error: call to undeclared function 'rcrzsenc8'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
RCGEN2(rcrz, 8)
^
turborc.c:339:31: note: expanded from macro 'RCGEN2'
case RC_PRD_S : return p##senc##s( in, inlen, out);
^
<scratch space>:246:1: note: expanded from here
rcrzsenc8
^
turborc.c:390:1: error: call to undeclared function 'rcrzssenc8'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
turborc.c:340:31: note: expanded from macro 'RCGEN2'
case RC_PRD_SS: return p##ssenc##s(in, inlen, out, prm1,prm2);
^
<scratch space>:248:1: note: expanded from here
rcrzssenc8
^
turborc.c:497:70: error: call to undeclared function 'rccdfsmenc'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
case 44: TM("44:cdfsm static/decode division lut ",l=rccdfsmenc(in,n,out,cdf,m+1),n,l, CCPY:(m<16?rccdfsmldec(out,n,cpy, cdf, m+1):rccdfsmbdec(out,n,cpy, cdf, m+1))); break;
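
These errors come from names built by token pasting (p##senc##s becomes rcrzsenc8), so no prototype is visible unless something declares the pasted name before the macro expands. Whether the real cause is a missing header or functions that are not built on this target is not clear from the log. The sketch below only illustrates the mechanism, with hypothetical names (DECLARE_ENC, GEN_DISPATCH, demosenc8) and an assumed signature modeled on the call in the error output.

```c
#include <stddef.h>

/* Declares the pasted name, e.g. DECLARE_ENC(demo, 8) -> demosenc8(...). */
#define DECLARE_ENC(p, s) \
    size_t p##senc##s(unsigned char *in, size_t inlen, unsigned char *out);

/* Dispatcher in the style of RCGEN2: calls the function whose name is pasted
 * together from a prefix and a size suffix. */
#define GEN_DISPATCH(p, s)                                                      \
    size_t p##enc_dispatch##s(unsigned char *in, size_t inlen,                  \
                              unsigned char *out) {                             \
        return p##senc##s(in, inlen, out);                                      \
    }

DECLARE_ENC(demo, 8)    /* prototype must be visible before the first use */
GEN_DISPATCH(demo, 8)   /* expands to a call to demosenc8 */

/* Toy definition so the example links; it just copies the input. */
size_t demosenc8(unsigned char *in, size_t inlen, unsigned char *out) {
    for (size_t i = 0; i < inlen; i++) out[i] = in[i];
    return inlen;
}
```

Removing the DECLARE_ENC line reproduces the same class of diagnostic under C99 and later, since implicit function declarations are no longer allowed.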
