cgbur / arcode-rs

Arcode

An arithmetic coder for Rust.

About

This crate provides an efficient implementation of an arithmetic encoder/decoder. It is based on the paper that describes arithmetic coding found here. This implementation features many readability and performance improvements, especially on the decoding side.

The goal of this project is not to provide an out-of-the-box compression solution. Arithmetic coding (a form of entropy encoding) is the backbone of almost every modern-day compression scheme. This crate is meant to be included in future projects that rely on an efficient entropy coder, e.g. PPM, LZ77/LZ78, H.265/HEVC.

Core components

The crate exposes many structs, but the average user only needs a few of them.

  • Model models the probability of symbols. Its counts can be updated as symbols are encoded to improve compression.
  • Encoder (ArithmeticEncoder) encodes symbols into a bitstream given a source model.
  • Decoder (ArithmeticDecoder) decodes symbols given a source model and a bitstream.
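
To make the idea of adjustable counts concrete, here is a minimal sketch of an adaptive frequency model using only the standard library. `ToyModel` and its methods are hypothetical names chosen for illustration; this is not the crate's `Model` and makes no claim about how arcode stores its counts.

```rust
/// A toy adaptive frequency model (hypothetical, not arcode's `Model`).
struct ToyModel {
    counts: Vec<u32>,
    total: u32,
}

impl ToyModel {
    /// Start every symbol at count 1 so no symbol has zero probability.
    fn new(num_symbols: usize) -> Self {
        ToyModel {
            counts: vec![1; num_symbols],
            total: num_symbols as u32,
        }
    }

    /// Cumulative range [low, high) of `symbol`, out of `self.total`.
    fn range(&self, symbol: usize) -> (u32, u32) {
        let low: u32 = self.counts[..symbol].iter().sum();
        (low, low + self.counts[symbol])
    }

    /// Estimated probability of `symbol`.
    fn probability(&self, symbol: usize) -> f64 {
        self.counts[symbol] as f64 / self.total as f64
    }

    /// Record one more occurrence of `symbol`.
    fn update(&mut self, symbol: usize) {
        self.counts[symbol] += 1;
        self.total += 1;
    }
}
```

Each call to `update` widens the range of the symbol that was just seen, so frequent symbols end up with larger probabilities and cost fewer bits to encode.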

Examples

The git repository contains an old_complex.rs example that switches context on a per-character basis. A simpler example can be found in new_simple.rs.

Input and output bitstreams

Arithmetic coding reads and writes streams one bit at a time (the decoder's input and the encoder's output). Because of this, bitbit (re-exported as arcode::bitbit) is required. Wrapping your underlying input or output in a buffered reader or writer should greatly improve performance.

Using bitbit to create input and output streams:

use arcode::bitbit::{BitReader, MSB, BitWriter};
use std::io::Cursor;

fn read_example() {
  // normally you would have a Read type with a BufReader
  let mut source = Cursor::new(vec![0u8; 4]);
  let input: BitReader<_, MSB> = BitReader::new(&mut source);
}

fn out_example() {
  // again, normally this would be a Write type wrapped in a BufWriter
  let compressed = Cursor::new(vec![]);
  let mut compressed_writer = BitWriter::new(compressed);
}
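
To show what reading and writing "a bit at a time" means, the following standard-library-only sketch packs bits into bytes most significant bit first and reads them back. `ToyBitReader` and `ToyBitWriter` are hypothetical illustrations, not bitbit's implementation; in real code use the bitbit types shown above.

```rust
/// Reads bits from a byte slice, most significant bit first.
struct ToyBitReader<'a> {
    bytes: &'a [u8],
    pos: usize, // index of the next bit to read
}

impl<'a> ToyBitReader<'a> {
    fn new(bytes: &'a [u8]) -> Self {
        ToyBitReader { bytes, pos: 0 }
    }

    /// Returns the next bit, or None once the input is exhausted.
    fn read_bit(&mut self) -> Option<bool> {
        let byte = *self.bytes.get(self.pos / 8)?;
        let shift = 7 - (self.pos % 8);
        self.pos += 1;
        Some((byte >> shift) & 1 == 1)
    }
}

/// Collects bits into bytes, most significant bit first.
struct ToyBitWriter {
    bytes: Vec<u8>,
    current: u8, // bits gathered for the byte in progress
    filled: u8,  // how many bits of `current` are in use
}

impl ToyBitWriter {
    fn new() -> Self {
        ToyBitWriter { bytes: Vec::new(), current: 0, filled: 0 }
    }

    fn write_bit(&mut self, bit: bool) {
        self.current = (self.current << 1) | bit as u8;
        self.filled += 1;
        if self.filled == 8 {
            self.bytes.push(self.current);
            self.current = 0;
            self.filled = 0;
        }
    }

    /// Pads the final partial byte with zero bits and returns the bytes.
    fn finish(mut self) -> Vec<u8> {
        while self.filled != 0 {
            self.write_bit(false);
        }
        self.bytes
    }
}
```

Writing the four bits 1, 0, 1, 1 and finishing yields the single byte 0b1011_0000; the trailing zeros are padding.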

Source Model(s)

Depending on your application you could have one or many source models. Both the encoder and the decoder rely on the source model, so the decoder must apply exactly the same model updates, in the same order, as the encoder. If the decoder ever falls out of phase with the encoder, you will be decoding nonsense.
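
The phase requirement can be illustrated with a small standard-library sketch. `Counts` and `replay` are hypothetical stand-ins for an adaptive model and for the update loop on either side; the example only shows that a single skipped update leaves the two sides with different statistics.

```rust
/// Hypothetical stand-in for an adaptive model: one count per symbol.
#[derive(Clone, Debug, PartialEq)]
struct Counts(Vec<u32>);

impl Counts {
    fn new(num_symbols: usize) -> Self {
        Counts(vec![1; num_symbols])
    }

    fn update(&mut self, symbol: usize) {
        self.0[symbol] += 1;
    }
}

/// Replays `symbols` into a fresh 4-symbol model, optionally skipping one
/// update to simulate a decoder that falls out of phase.
fn replay(symbols: &[usize], skip_index: Option<usize>) -> Counts {
    let mut model = Counts::new(4);
    for (i, &sym) in symbols.iter().enumerate() {
        if Some(i) != skip_index {
            model.update(sym);
        }
    }
    model
}
```

Comparing `replay(&symbols, None)` with `replay(&symbols, Some(1))` gives unequal models. From that point on the two sides would assign different ranges to the same symbol, which is why every later symbol would decode incorrectly.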

model::Builder

To make a source model, use the model::Builder struct, which is obtained through Model::builder().

use arcode::{EOFKind, Model};

fn source_model_example() {
  // create a new model that has symbols 0-256
  // 8 bit values + one EOF marker
  let mut model_with_eof = Model::builder()
    .num_symbols(256)
    .eof(EOFKind::EndAddOne)
    .build();
  
  // model for 8-bit values 0-255; if we aren't using
  // the EOF flag we can set it to EOFKind::None or let it
  // default to none, as in the second example below.
  let model_without_eof = Model::builder()
    .num_symbols(256)
    .eof(EOFKind::None)
    .build();
  let model_without_eof = Model::builder().num_symbols(256).build();

  // we can also create a model for 0-255 using num_bits
  let model_8_bit = Model::builder().num_bits(8).build();
  
  // update the probability of symbol 4.
  model_with_eof.update_symbol(4);
}

Encode

Encoding some simple input:

use arcode::bitbit::BitWriter;
use arcode::{ArithmeticEncoder, EOFKind, Model};
use std::io::{Cursor, Result};

/// Encodes bytes and returns the compressed form
fn encode(data: &[u8]) -> Result<Vec<u8>> {
  let mut model = Model::builder()
    .num_bits(8)
    .eof(EOFKind::EndAddOne)
    .build();

  // make a stream to collect the compressed data
  let compressed = Cursor::new(vec![]);
  let mut compressed_writer = BitWriter::new(compressed);
  
  let mut encoder = ArithmeticEncoder::new(48);

  for &sym in data {
    encoder.encode(sym as u32, &model, &mut compressed_writer)?;
    model.update_symbol(sym as u32);
  }

  encoder.encode(model.eof(), &model, &mut compressed_writer)?;
  encoder.finish_encode(&mut compressed_writer)?;
  compressed_writer.pad_to_byte()?;

  // retrieves the bytes from the writer. This will
  // be cleaner when bitbit updates. Not necessary if
  // using files or a stream
  Ok(compressed_writer.get_ref().get_ref().clone())
}

Decode

use arcode::bitbit::{BitReader, MSB};
use arcode::{ArithmeticDecoder, EOFKind, Model};
use std::io::{Cursor, Result};

/// Decompresses the data
fn decode(data: &[u8]) -> Result<Vec<u8>> {
  let mut model = Model::builder()
    .num_bits(8)
    .eof(EOFKind::EndAddOne)
    .build();

  let mut input_reader = BitReader::<_, MSB>::new(data);
  let mut decoder = ArithmeticDecoder::new(48);
  let mut decompressed_data = vec![];

  while !decoder.finished() {
    let sym = decoder.decode(&model, &mut input_reader)?;
    model.update_symbol(sym);
    decompressed_data.push(sym as u8);
  }

  decompressed_data.pop(); // remove the EOF

  Ok(decompressed_data)
}
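
For intuition about what the encoder and decoder above are doing, here is a toy floating-point sketch of the interval-narrowing idea behind arithmetic coding. It is an illustration under simplifying assumptions, not this crate's algorithm: the probabilities are fixed, the message length is passed to the decoder instead of an EOF symbol, and `f64` limits it to short messages. The function names are hypothetical.

```rust
/// Cumulative bounds [low, high) of `symbol` under fixed probabilities.
fn bounds(probs: &[f64], symbol: usize) -> (f64, f64) {
    let low: f64 = probs[..symbol].iter().sum();
    (low, low + probs[symbol])
}

/// Narrows [0, 1) once per symbol and returns a value inside the
/// final interval (its midpoint).
fn toy_encode(probs: &[f64], message: &[usize]) -> f64 {
    let (mut low, mut high) = (0.0_f64, 1.0_f64);
    for &sym in message {
        let (c_low, c_high) = bounds(probs, sym);
        let width = high - low;
        high = low + width * c_high;
        low = low + width * c_low;
    }
    (low + high) / 2.0
}

/// Recovers `len` symbols by repeating the same narrowing steps.
fn toy_decode(probs: &[f64], value: f64, len: usize) -> Vec<usize> {
    let (mut low, mut high) = (0.0_f64, 1.0_f64);
    let mut out = Vec::with_capacity(len);
    for _ in 0..len {
        let width = high - low;
        // Find the first symbol whose sub-interval contains `value`.
        let sym = (0..probs.len())
            .find(|&s| value < low + width * bounds(probs, s).1)
            .unwrap_or(probs.len() - 1);
        let (c_low, c_high) = bounds(probs, sym);
        high = low + width * c_high;
        low = low + width * c_low;
        out.push(sym);
    }
    out
}
```

With probabilities `[0.5, 0.25, 0.25]`, encoding `[0, 2, 1, 0, 0, 2]` and decoding the resulting value with length 6 returns the original message. A practical coder such as this crate works on bitstreams with an adaptive model rather than on a single floating-point number.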
