turbotranspose's Introduction

Turbo Transpose compressor filter for binary data

  • Fastest transpose/shuffle
    • Byte/Nibble transpose/shuffle for improving compression of binary data (ex. floating point data)
    • ✨ Scalar/SIMD Transpose/Shuffle 8,16,32,64,... bits
    • 👍 Dynamic CPU detection and JIT scalar/sse/avx2 switching
    • 100% C (C++ headers), usage as simple as memcpy
  • Byte Transpose
    • Fastest byte transpose
  • Nibble Transpose
    • nearly as fast as byte transpose
    • compresses better on most binary data files and is up to 6 times faster than Bitshuffle
    • more robust than Bitshuffle in the worst case
  • Scalar and SIMD Transform
    • Delta encoding for sorted lists
    • Zigzag encoding for unsorted lists
    • Xor encoding
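
As a rough illustration of the three transforms listed above, here is a minimal scalar sketch for 32-bit values. The function names (`delta_enc32`, `zigzag_delta_enc32`, `xor_enc32`, ...) are hypothetical and are not part of the TurboTranspose API; the library's own scalar/SIMD routines may differ in naming, layout and start value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Delta: store each value as the difference to its predecessor.
   Unsigned arithmetic wraps modulo 2^32, so decoding always inverts it. */
static void delta_enc32(const uint32_t *in, size_t n, uint32_t *out) {
    uint32_t prev = 0;
    for (size_t i = 0; i < n; i++) { out[i] = in[i] - prev; prev = in[i]; }
}

static void delta_dec32(const uint32_t *in, size_t n, uint32_t *out) {
    uint32_t prev = 0;
    for (size_t i = 0; i < n; i++) { prev += in[i]; out[i] = prev; }
}

/* Zigzag: map signed values to unsigned so small magnitudes stay small:
   0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ... */
static uint32_t zigzag32(int32_t x) {
    uint32_t u = (uint32_t)x;
    return (u << 1) ^ (0u - (u >> 31));
}

static int32_t unzigzag32(uint32_t u) {
    return (int32_t)((u >> 1) ^ (0u - (u & 1u)));
}

/* Zigzag-coded deltas: useful when the list is not sorted,
   because the deltas can be negative. */
static void zigzag_delta_enc32(const uint32_t *in, size_t n, uint32_t *out) {
    uint32_t prev = 0;
    for (size_t i = 0; i < n; i++) {
        out[i] = zigzag32((int32_t)(in[i] - prev));
        prev = in[i];
    }
}

static void zigzag_delta_dec32(const uint32_t *in, size_t n, uint32_t *out) {
    uint32_t prev = 0;
    for (size_t i = 0; i < n; i++) {
        prev += (uint32_t)unzigzag32(in[i]);
        out[i] = prev;
    }
}

/* Xor: store each value xor-ed with its predecessor. */
static void xor_enc32(const uint32_t *in, size_t n, uint32_t *out) {
    uint32_t prev = 0;
    for (size_t i = 0; i < n; i++) { out[i] = in[i] ^ prev; prev = in[i]; }
}

static void xor_dec32(const uint32_t *in, size_t n, uint32_t *out) {
    uint32_t prev = 0;
    for (size_t i = 0; i < n; i++) { prev ^= in[i]; out[i] = prev; }
}

/* Round-trip check for all three transforms; returns 1 on success. */
static int transform_demo(void) {
    uint32_t enc[4], dec[4];

    const uint32_t sorted[4] = {10, 12, 15, 20};
    delta_enc32(sorted, 4, enc);              /* 10, 2, 3, 5 */
    assert(enc[0] == 10 && enc[1] == 2 && enc[2] == 3 && enc[3] == 5);
    delta_dec32(enc, 4, dec);
    assert(dec[0] == 10 && dec[3] == 20);

    const uint32_t unsorted[4] = {100, 103, 101, 110};
    zigzag_delta_enc32(unsorted, 4, enc);     /* 200, 6, 3, 18 */
    assert(enc[0] == 200 && enc[1] == 6 && enc[2] == 3 && enc[3] == 18);
    zigzag_delta_dec32(enc, 4, dec);
    assert(dec[2] == 101 && dec[3] == 110);

    const uint32_t similar[4] = {0xF0, 0xF1, 0xF3, 0xF3};
    xor_enc32(similar, 4, enc);               /* 0xF0, 0x01, 0x02, 0x00 */
    assert(enc[1] == 0x01 && enc[2] == 0x02 && enc[3] == 0x00);
    xor_dec32(enc, 4, dec);
    assert(dec[1] == 0xF1 && dec[3] == 0xF3);
    return 1;
}
```

In each case the transform turns values that are close to their neighbours into small numbers with many zero high bytes, which a byte or nibble transpose can then group together.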

Transpose Benchmark:

  • CPU: Skylake i7-6700 3.4GHz gcc 6.2 single thread

- Speed test

Benchmark w/ 16k buffer

BOLD = pareto frontier.
c/t: cycles per 1000 bytes. E:Encode, D:Decode
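
To compare the c/t figures with the MB/s figures of the large-file benchmark, the unit can be converted as sketched below. The helper `ct_to_mbps` is hypothetical, and the conversion assumes the CPU runs at its nominal 3.4 GHz clock, which the benchmark does not state.

```c
#include <assert.h>

/* Convert a c/t figure (cycles per 1000 bytes) into MB/s,
   with 1 MB = 1,000,000 bytes.
   clock_hz is an assumption: the actual frequency during the run is unknown. */
static double ct_to_mbps(double cycles_per_1000_bytes, double clock_hz) {
    double cycles_per_byte = cycles_per_1000_bytes / 1000.0;
    return clock_hz / cycles_per_byte / 1e6;
}

/* Example: 199 c/t at an assumed 3.4 GHz is roughly 17085 MB/s. */
static int ct_demo(void) {
    double mbps = ct_to_mbps(199.0, 3.4e9);
    assert(mbps > 17085.0 && mbps < 17086.0);
    return 1;
}
```

A 16K buffer fits in cache, so figures converted this way are expected to be higher than the large-file MB/s numbers.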

    ./tpbench -s# file -B16K   (# = 8,4,2)
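| Size | E Time (c/t) | D Time (c/t) | Transpose 64 bits AVX2 |
|------|--------------|--------------|------------------------|
| 16.000 | 199 | 134 | tpbyte 8 |
| 16.000 | 326 | 201 | Blosc_shuffle 8 |
| 16.000 | 394 | 260 | tpnibble 8 |
| 16.000 | 848 | 478 | Bitshuffle 8 |

| Size | E Time (c/t) | D Time (c/t) | Transpose 32 bits AVX2 |
|------|--------------|--------------|------------------------|
| 16.000 | 121 | 102 | tpbyte 4 |
| 16.000 | 451 | 139 | Blosc_shuffle 4 |
| 16.000 | 345 | 229 | tpnibble 4 |
| 16.000 | 773 | 476 | Bitshuffle 4 |

| Size | E Time (c/t) | D Time (c/t) | Transpose 16 bits AVX2 |
|------|--------------|--------------|------------------------|
| 16.000 | 95 | 71 | tpbyte 2 |
| 16.000 | 640 | 108 | Blosc_shuffle 2 |
| 16.000 | 329 | 198 | tpnibble 2 |
| 16.000 | 758 | 1177 | Bitshuffle 2 |
| 16.000 | 67 | 67 | memcpy |

Transpose/Shuffle benchmark w/ large files.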

MB/s: 1.000.000 bytes/second

    ./tpbench -s# file  (# = 8,4,2)
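
| Size | E Time (MB/s) | D Time (MB/s) | Transpose 64 bits AVX2 |
|------|---------------|---------------|------------------------|
| 100.000.000 | 8387 | 9408 | tpbyte 8 |
| 100.000.000 | 8134 | 8598 | Blosc_shuffle 8 |
| 100.000.000 | 7797 | 9145 | tpnibble 8 |
| 100.000.000 | 3548 | 3459 | Bitshuffle 8 |
| 100.000.000 | 13366 | 13366 | memcpy |

| Size | E Time (MB/s) | D Time (MB/s) | Transpose 32 bits AVX2 |
|------|---------------|---------------|------------------------|
| 100.000.000 | 8398 | 9533 | tpbyte 4 |
| 100.000.000 | 8198 | 9307 | tpnibble 4 |
| 100.000.000 | 8193 | 8796 | Blosc_shuffle 4 |
| 100.000.000 | 3679 | 3666 | Bitshuffle 4 |

| Size | E Time (MB/s) | D Time (MB/s) | Transpose 16 bits AVX2 |
|------|---------------|---------------|------------------------|
| 100.000.000 | 7878 | 9542 | tpbyte 2 |
| 100.000.000 | 8987 | 9412 | tpnibble 2 |
| 100.000.000 | 7739 | 9404 | Blosc_shuffle 2 |
| 100.000.000 | 3879 | 2547 | Bitshuffle 2 |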

- Compression test (transpose/shuffle+lz4)

| File | File size | lz4 only % | TpByte % | TpNibble % | Bitshuffle % |
|------|-----------|------------|----------|------------|--------------|
| msg_bt | 266.389.432 | 94.5 | 77.2 | 76.5 | 81.6 |
| msg_lu | 194.118.968 | 100.4 | 82.7 | 81.0 | 83.7 |
| msg_sp | 290.105.856 | 100.4 | 79.2 | 77.5 | 80.2 |
| msg_sppm | 278.995.864 | 18.9 | 14.5 | 14.9 | 19.5 |
| msg_sweep3d | 125.731.224 | 98.7 | 50.7 | 36.7 | 80.4 |
| num_brain | 141.840.000 | 100.4 | 82.6 | 81.1 | 84.5 |
| num_comet | 107.347.968 | 92.8 | 83.3 | 78.8 | 76.3 |
| num_control | 159.504.744 | 99.6 | 92.2 | 90.9 | 89.4 |
| num_plasma | 35.089.600 | 75.2 | 0.7 | 0.7 | 84.5 |
| obs_error | 62.160.816 | 78.7 | 81.0 | 77.5 | 84.4 |
| obs_info | 18.930.528 | 92.3 | 75.4 | 70.6 | 82.4 |
| obs_spitzer | 198.180.864 | 95.4 | 93.2 | 93.7 | 86.4 |
| obs_temp | 39.934.272 | 100.4 | 93.1 | 93.8 | 91.7 |

Compile:

    git clone https://github.com/powturbo/TurboTranspose.git
    cd TurboTranspose

Linux + Windows MinGW

    make
  or
    make AVX2=1

Windows Visual C++

    nmake /f makefile.vs
  or
    nmake AVX2=1 /f makefile.vs

  • benchmark with other libraries
    download or clone bitshuffle or blosc and type

      make AVX2=1 BLOSC=1
      or
      make AVX2=1 BITSHUFFLE=1
    

Testing:

  • benchmark "transpose" functions

    ./tpbench [-s#] [-z] file
    -s# = element size, # = 2,4,8,16,... (default 4)
    -z = only lz77 compression benchmark (bitshuffle package mandatory)
    

Function usage:

Byte transpose:

void tpenc( unsigned char *in, unsigned n, unsigned char *out, unsigned esize);
void tpdec( unsigned char *in, unsigned n, unsigned char *out, unsigned esize);

in : input buffer
n : number of bytes
out : output buffer
esize : element size in bytes (2,4,8,...)
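
To show what a byte transpose does to the data, here is a minimal scalar sketch with the same parameter order as `tpenc`/`tpdec`. The names `ref_tpenc` and `ref_tpdec` are hypothetical stand-ins, not library functions, and the handling of trailing bytes (copied unchanged) and of `esize == 0` is an assumption of this sketch. The real functions use SIMD code paths and may lay out edge cases differently.

```c
#include <assert.h>
#include <string.h>

/* Byte transpose: gather byte 0 of every element, then byte 1, and so on.
   Assumption of this sketch: bytes beyond the last whole element are
   copied unchanged. in and out must not overlap. */
static void ref_tpenc(const unsigned char *in, unsigned n,
                      unsigned char *out, unsigned esize) {
    if (esize == 0) { memcpy(out, in, n); return; }
    unsigned count = n / esize;                 /* number of whole elements */
    for (unsigned i = 0; i < count; i++)
        for (unsigned b = 0; b < esize; b++)
            out[b * count + i] = in[i * esize + b];
    memcpy(out + count * esize, in + count * esize, n - count * esize);
}

/* Inverse of ref_tpenc: scatter the byte planes back into elements. */
static void ref_tpdec(const unsigned char *in, unsigned n,
                      unsigned char *out, unsigned esize) {
    if (esize == 0) { memcpy(out, in, n); return; }
    unsigned count = n / esize;
    for (unsigned i = 0; i < count; i++)
        for (unsigned b = 0; b < esize; b++)
            out[i * esize + b] = in[b * count + i];
    memcpy(out + count * esize, in + count * esize, n - count * esize);
}

/* Round trip on three 4-byte elements; returns 1 on success. */
static int byte_transpose_demo(void) {
    const unsigned char in[12] = {
        0x01, 0x02, 0x03, 0x04,
        0x11, 0x12, 0x13, 0x14,
        0x21, 0x22, 0x23, 0x24
    };
    const unsigned char expect[12] = {
        0x01, 0x11, 0x21,   /* byte 0 of each element */
        0x02, 0x12, 0x22,   /* byte 1 */
        0x03, 0x13, 0x23,   /* byte 2 */
        0x04, 0x14, 0x24    /* byte 3 */
    };
    unsigned char enc[12], dec[12];

    ref_tpenc(in, 12, enc, 4);
    assert(memcmp(enc, expect, 12) == 0);
    ref_tpdec(enc, 12, dec, 4);
    assert(memcmp(dec, in, 12) == 0);
    return 1;
}
```

With the library itself, the calls would have the same shape: `tpenc(in, n, out, esize)` before compression and `tpdec(in, n, out, esize)` after decompression, with `out` at least `n` bytes long.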

Nibble transpose:

void tp4enc( unsigned char *in, unsigned n, unsigned char *out, unsigned esize);
void tp4dec( unsigned char *in, unsigned n, unsigned char *out, unsigned esize);

in : input buffer
n : number of bytes
out : output buffer
esize : element size in bytes (2,4,8,...)
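
The idea behind a nibble transpose can be sketched in the same way. The layout below is an assumption made for illustration (for each byte position, one plane of packed low nibbles followed by one plane of packed high nibbles); the actual output layout of `tp4enc` is not documented here and may differ. `ref_tp4enc` and `ref_tp4dec` are hypothetical names, and this sketch only accepts `n` that is a multiple of `2 * esize`.

```c
#include <assert.h>
#include <string.h>

/* Nibble transpose sketch. For byte position b of the elements, write
   a plane of low nibbles and a plane of high nibbles, two nibbles per byte.
   Returns 0 on success, -1 if n is not a multiple of 2 * esize. */
static int ref_tp4enc(const unsigned char *in, unsigned n,
                      unsigned char *out, unsigned esize) {
    if (esize == 0 || n % (2 * esize) != 0) return -1;
    unsigned half = (n / esize) / 2;            /* bytes per nibble plane */
    for (unsigned b = 0; b < esize; b++) {
        unsigned char *lo = out + (2 * b) * half;
        unsigned char *hi = lo + half;
        for (unsigned i = 0; i < half; i++) {
            unsigned char x = in[(2 * i) * esize + b];      /* element 2i   */
            unsigned char y = in[(2 * i + 1) * esize + b];  /* element 2i+1 */
            lo[i] = (unsigned char)((x & 0x0F) | ((y & 0x0F) << 4));
            hi[i] = (unsigned char)((x >> 4) | (y & 0xF0));
        }
    }
    return 0;
}

/* Inverse of ref_tp4enc. */
static int ref_tp4dec(const unsigned char *in, unsigned n,
                      unsigned char *out, unsigned esize) {
    if (esize == 0 || n % (2 * esize) != 0) return -1;
    unsigned half = (n / esize) / 2;
    for (unsigned b = 0; b < esize; b++) {
        const unsigned char *lo = in + (2 * b) * half;
        const unsigned char *hi = lo + half;
        for (unsigned i = 0; i < half; i++) {
            unsigned char l = lo[i], h = hi[i];
            out[(2 * i) * esize + b]     = (unsigned char)((l & 0x0F) | ((h & 0x0F) << 4));
            out[(2 * i + 1) * esize + b] = (unsigned char)((l >> 4) | (h & 0xF0));
        }
    }
    return 0;
}

/* Round trip on four 2-byte elements; returns 1 on success. */
static int nibble_transpose_demo(void) {
    /* Only the low nibble of the first byte changes between elements. */
    const unsigned char in[8] = {0x12, 0x34, 0x13, 0x34, 0x14, 0x34, 0x15, 0x34};
    const unsigned char expect[8] = {
        0x32, 0x54,   /* low nibbles of byte 0  */
        0x11, 0x11,   /* high nibbles of byte 0 */
        0x44, 0x44,   /* low nibbles of byte 1  */
        0x33, 0x33    /* high nibbles of byte 1 */
    };
    unsigned char enc[8], dec[8];

    assert(ref_tp4enc(in, 8, enc, 2) == 0);
    assert(memcmp(enc, expect, 8) == 0);
    assert(ref_tp4dec(enc, 8, dec, 2) == 0);
    assert(memcmp(dec, in, 8) == 0);
    return 1;
}
```

In the example only the first plane varies; the other three planes consist of repeated bytes, which is the kind of redundancy a following compressor such as lz4 can exploit.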

Environment:

OS/Compiler (64 bits):
  • Linux: GNU GCC (>=4.6)
  • clang (>=3.2)
  • Windows: MinGW-w64
  • Windows: Visual C++ (>=VS2008)
Multithreading:
  • All TurboTranspose functions are thread safe

Last update: 01 JUL 2017

Contributors: powturbo