Absolut

Absolut stands for "Autogenerated Bytewise SIMD-Optimized Look-Up Tables". The following is a breakdown of this jargon:

  • Bytewise Lookup Table: One-to-one mappings between sets of bytes.
  • SIMD-Optimized: Said lookup tables are implemented using SIMD (Single Instruction Multiple Data) instructions, such as PSHUFB on x86_64 and TBL on AArch64.
  • Autogenerated: This crate utilizes procedural macros to generate (if possible) SIMD lookup tables given a human-readable byte-to-byte mapping.

Why?

SIMD instructions allow for greater data parallelism when performing table lookups on bytes. This has proved incredibly useful for high-performance data processing.

Unfortunately, SIMD table lookup instructions (or byte shuffling instructions) operate on tables too small to cover the entire 8-bit integer space. These tables typically have a size of 16 on x86_64, while on AArch64 tables of up to 64 elements are supported.
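To make the size limit concrete, here is a scalar model of a 16-byte shuffle in the style of PSHUFB. The name `pshufb_model` is made up for this sketch. It only mirrors the documented per-byte behavior of the instruction (the low four bits of each index select a table entry, and a set top bit yields zero) and is not part of Absolut.

```rust
/// Scalar model of one 128-bit PSHUFB: each index byte selects one of the
/// 16 table bytes using only its low 4 bits; a set top bit produces 0.
/// (`pshufb_model` is a hypothetical helper used for illustration only.)
fn pshufb_model(table: [u8; 16], indices: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for (o, &idx) in out.iter_mut().zip(indices.iter()) {
        *o = if idx & 0x80 != 0 {
            0
        } else {
            table[(idx & 0x0F) as usize]
        };
    }
    out
}
```

Because only four index bits are used, the bytes 0x2C and 0x3C select the same table entry, so a single shuffle cannot tell them apart. Covering all 256 byte values takes an additional scheme on top, such as combining several lookups.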

This library facilitates the generation of SIMD lookup tables from high-level descriptions of byte-to-byte mappings. The goal is to avoid the need to hardcode manually-computed SIMD lookup tables, thus enabling a wider audience to utilize these techniques more easily.

How?

Absolut is essentially a set of procedural macros that accept byte-to-byte mapping descriptions in the form of Rust enums:

#[absolut::one_hot]
pub enum JsonTable {
    #[matches(b',')]
    Comma,
    #[matches(b':')]
    Colon,
    #[matches(b'[', b']', b'{', b'}')]
    Brackets,
    #[matches(b'\r', b'\n', b'\t')]
    Control,
    #[matches(b' ')]
    Space,
    #[wildcard]
    Other,
}

The above JsonTable enum encodes the following one-to-one mapping:

| Input                  | Output   |
|------------------------|----------|
| 0x2C                   | Comma    |
| 0x3A                   | Colon    |
| 0x5B, 0x5D, 0x7B, 0x7D | Brackets |
| 0x0D, 0x0A, 0x09       | Control  |
| 0x20                   | Space    |
| *                      | Other    |

Where * denotes all other bytes not explicitly mapped.

Mapping results needn't be explicitly defined as Absolut will solve for them automatically. In the previous code snippet, the expression JsonTable::Space as u8 evaluates to the output byte when performing a table lookup on 0x20.

Absolut supports multiple techniques, called algorithms, for constructing SIMD lookup tables. Each algorithm is implemented as a procedural macro that accepts a byte-to-byte mapping described as an enum with attribute-annotated variants, as illustrated above with the absolut::one_hot algorithm.
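As a rough illustration of what a one-hot style table can look like, the sketch below hand-builds two 16-entry tables for the JSON mapping above: one indexed by the low nibble of the input byte and one by the high nibble, with the two results ANDed together. This is an assumption about the general technique, not the output of absolut::one_hot. The bit assignments, the value 0 for the wildcard, and the names `LO`, `HI`, `classify`, and `classify_reference` are all hypothetical.

```rust
// One bit per class (one-hot). The bit assignment and the value 0 for the
// wildcard are assumptions of this sketch, not values produced by Absolut.
const COMMA: u8 = 1 << 0;
const COLON: u8 = 1 << 1;
const BRACKETS: u8 = 1 << 2;
const CONTROL: u8 = 1 << 3;
const SPACE: u8 = 1 << 4;
const OTHER: u8 = 0;

/// Indexed by the low nibble: OR of every class that contains a byte
/// with that low nibble.
const LO: [u8; 16] = [
    SPACE, 0, 0, 0, 0, 0, 0, 0,
    0, CONTROL, COLON | CONTROL, BRACKETS, COMMA, BRACKETS | CONTROL, 0, 0,
];

/// Indexed by the high nibble: OR of every class that contains a byte
/// with that high nibble.
const HI: [u8; 16] = [
    CONTROL, 0, COMMA | SPACE, COLON, 0, BRACKETS, 0, BRACKETS,
    0, 0, 0, 0, 0, 0, 0, 0,
];

/// Two 16-entry lookups combined with AND; a class bit survives only if
/// both nibbles of the byte are compatible with that class.
fn classify(byte: u8) -> u8 {
    LO[(byte & 0x0F) as usize] & HI[(byte >> 4) as usize]
}

/// Straightforward reference used to check the tables.
fn classify_reference(byte: u8) -> u8 {
    match byte {
        b',' => COMMA,
        b':' => COLON,
        b'[' | b']' | b'{' | b'}' => BRACKETS,
        b'\r' | b'\n' | b'\t' => CONTROL,
        b' ' => SPACE,
        _ => OTHER,
    }
}
```

For this particular mapping, `classify` agrees with `classify_reference` on all 256 byte values, which is what makes the two small tables a valid replacement for a full 256-entry table.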

Known issues

Error messages

If a byte-to-byte mapping cannot be implemented using a given Absolut algorithm (i.e. the table is unsatisfiable), the resulting error messages do not explain why the algorithm failed to solve for the table. Unless the user is at least vaguely familiar with how the algorithm in question works, it is difficult to figure out how to change the mapping so that it becomes satisfiable and stays useful for their purposes.
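To give a feel for what unsatisfiable can mean, the sketch below assumes a simple scheme with one bit per class, where a low-nibble table and a high-nibble table are ANDed together. It builds both tables for a list of byte classes and reports the first byte that would be misclassified. Absolut's actual solvers may work differently; `build_one_hot` is a hypothetical name and not part of the crate.

```rust
/// Builds low-nibble and high-nibble tables with one bit per class, then
/// checks all 256 bytes. Returns Err(byte) for the first byte whose
/// lookup result differs from its intended class bits.
/// (Hypothetical helper under the assumed nibble-AND scheme.)
fn build_one_hot(classes: &[&[u8]]) -> Result<([u8; 16], [u8; 16]), u8> {
    assert!(classes.len() <= 8, "one bit per class, at most 8 classes");
    let mut lo = [0u8; 16];
    let mut hi = [0u8; 16];
    for (bit, class) in classes.iter().enumerate() {
        for &b in class.iter() {
            lo[(b & 0x0F) as usize] |= 1u8 << bit;
            hi[(b >> 4) as usize] |= 1u8 << bit;
        }
    }
    for b in 0..=255u8 {
        let got = lo[(b & 0x0F) as usize] & hi[(b >> 4) as usize];
        let mut want = 0u8;
        for (bit, class) in classes.iter().enumerate() {
            if class.contains(&b) {
                want |= 1u8 << bit;
            }
        }
        if got != want {
            return Err(b);
        }
    }
    Ok((lo, hi))
}
```

Under this scheme, a single class containing 0x12 and 0x34 is rejected: the low nibbles 2 and 4 and the high nibbles 1 and 3 all carry the class bit, so 0x14 and 0x32 would match as well. Splitting such a class into two variants is one way to make the table solvable again.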

SIMD lookup routines

Absolut currently does not provide SIMD implementations of lookup routines for the generated lookup tables. However, the library tests contain lookup routines for SSSE3 and NEON.
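For readers who want to write such a routine themselves, here is a minimal sketch of a 16-byte SSSE3 lookup using `std::arch`, with a scalar fallback. It assumes the table comes as two 16-entry arrays (low nibble and high nibble) whose lookups are ANDed together; that layout, and the names `lookup`, `lookup_ssse3`, and `lookup_scalar`, are assumptions of this sketch and are not taken from Absolut's tests.

```rust
/// Scalar reference: AND of the two nibble-indexed table entries.
fn lookup_scalar(lo: &[u8; 16], hi: &[u8; 16], input: &[u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    for i in 0..16 {
        out[i] = lo[(input[i] & 0x0F) as usize] & hi[(input[i] >> 4) as usize];
    }
    out
}

/// Same computation on 16 bytes at once using PSHUFB (`_mm_shuffle_epi8`).
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "ssse3")]
unsafe fn lookup_ssse3(lo: &[u8; 16], hi: &[u8; 16], input: &[u8; 16]) -> [u8; 16] {
    use std::arch::x86_64::*;
    let lo_tbl = _mm_loadu_si128(lo.as_ptr() as *const __m128i);
    let hi_tbl = _mm_loadu_si128(hi.as_ptr() as *const __m128i);
    let bytes = _mm_loadu_si128(input.as_ptr() as *const __m128i);
    let nibble_mask = _mm_set1_epi8(0x0F);
    let lo_idx = _mm_and_si128(bytes, nibble_mask);
    // SSE has no 8-bit shift: shift 16-bit lanes, then mask away the bits
    // that leaked in from the neighbouring byte.
    let hi_idx = _mm_and_si128(_mm_srli_epi16(bytes, 4), nibble_mask);
    let result = _mm_and_si128(
        _mm_shuffle_epi8(lo_tbl, lo_idx),
        _mm_shuffle_epi8(hi_tbl, hi_idx),
    );
    let mut out = [0u8; 16];
    _mm_storeu_si128(out.as_mut_ptr() as *mut __m128i, result);
    out
}

/// Uses SSSE3 when the CPU supports it, otherwise the scalar version.
fn lookup(lo: &[u8; 16], hi: &[u8; 16], input: &[u8; 16]) -> [u8; 16] {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("ssse3") {
            // Safe to call: the required CPU feature was just detected.
            return unsafe { lookup_ssse3(lo, hi, input) };
        }
    }
    lookup_scalar(lo, hi, input)
}
```

Both index vectors stay in the range 0 to 15, so the zeroing behavior PSHUFB applies to indices with the top bit set never triggers here. A NEON version would follow the same shape with a TBL-based intrinsic in place of the shuffle.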

License

Absolut is open-source software licensed under the terms of the MIT License.
