lz77-compressor's Introduction

A Python LZ77-Compressor

A simplified implementation of the LZ77 compression algorithm in Python.

Implementation

The compressor follows the standard LZ77 compression algorithm. At each position, it searches a fixed-size window of previously seen data for the longest match with the contents of the lookahead buffer. If a match is found, the matched substring is replaced with a pointer (the distance back to the start of the match) and the length of the match.
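The search described above can be sketched as follows. This is a minimal illustration, not the repository's code: the names find_longest_match and tokenize, the lookahead_size default of 15, and the minimum match length of 2 are all assumptions made for this sketch.

```python
def find_longest_match(data: bytes, pos: int, window_size: int = 20,
                       lookahead_size: int = 15):
    """Return (distance, length) of the longest match for data[pos:] inside
    the preceding window, or None if no match of at least 2 bytes exists.
    The 2-byte minimum is an assumption of this sketch."""
    window_start = max(0, pos - window_size)
    max_len = min(lookahead_size, len(data) - pos)
    best = None
    for start in range(window_start, pos):
        length = 0
        # The match may run past pos (overlap), which lets LZ77 encode runs.
        while length < max_len and data[start + length] == data[pos + length]:
            length += 1
        if length >= 2 and (best is None or length > best[1]):
            best = (pos - start, length)
    return best


def tokenize(data: bytes, window_size: int = 20):
    """Turn data into a list of (0, literal) and (1, distance, length) tokens."""
    tokens = []
    pos = 0
    while pos < len(data):
        match = find_longest_match(data, pos, window_size)
        if match:
            distance, length = match
            tokens.append((1, distance, length))
            pos += length
        else:
            tokens.append((0, data[pos:pos + 1]))
            pos += 1
    return tokens


print(find_longest_match(b"hello hello", 6))                 # (6, 5)
print(find_longest_match(b"hello hello", 6, window_size=3))  # None
```

With a window of 3, the earlier "hello" lies outside the history, so no match is found. This is the trade-off behind window_size: a larger window can find more matches but costs more search time.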

Setup and Dependencies

First, you will need to clone the repository:

  git clone https://github.com/manassra/LZ77-Compressor.git

The LZ77Compressor's only dependency is the bitarray Python module.

This dependency can be installed using pip, the Python package manager, as follows:

  pip install -r requirements.txt

Usage

  from LZ77 import LZ77Compressor
  
  compressor = LZ77Compressor(window_size=20) # window_size is optional

Options

window_size: an optional integer specifying the length of the history window. The default is 20.

Compressing Files

  input_file_path = '/Users/manassra/...'
  output_file_path = '/Users/manassra/...'
  
  # compress the input file and write it as binary into the output file
  compressor.compress(input_file_path, output_file_path)
  
  # or omit the output path and assign the compressed data to a variable
  compressed_data = compressor.compress(input_file_path)

Options

verbose: if True, a description of the compression is printed to standard output.

Example: for a file containing "hello hello", verbose prints the following description: <0, h> <0, e> <0, l> <0, l> <0, o> <0, > <1, 6, 5>
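One way to read this description, assuming <0, c> denotes a literal character and <1, d, n> denotes a match of length n starting d positions back, is to replay the tokens. The decode function and the tuple representation below are hypothetical and exist only for this sketch; they do not reflect the library's binary output format.

```python
def decode(tokens):
    """Rebuild the original bytes from (0, literal) and (1, distance, length)
    tokens. The token layout is an assumption based on the verbose output."""
    out = bytearray()
    for token in tokens:
        if token[0] == 0:
            out += token[1]
        else:
            _, distance, length = token
            # Copy one byte at a time so a match may overlap its own output.
            for _ in range(length):
                out.append(out[-distance])
    return bytes(out)


tokens = [(0, b"h"), (0, b"e"), (0, b"l"), (0, b"l"), (0, b"o"), (0, b" "),
          (1, 6, 5)]
print(decode(tokens))  # b'hello hello'
```

The final token copies 5 bytes starting 6 positions back, which reproduces the second "hello".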

Decompressing Files

  input_file_path = '/Users/manassra/...'
  output_file_path = '/Users/manassra/...'
  
  # decompress the input file and write it as binary into the output file
  compressor.decompress(input_file_path, output_file_path)
  
  # or omit the output path and assign the decompressed data to a variable
  decompressed_data = compressor.decompress(input_file_path)

Examples

The /examples folder contains the following files:

input.txt, a file containing some text (size: 231 bytes)

compressed_window_100.txt, the compressed output of input.txt using a window size of 100 (size: 71 bytes, about 31% of the original size)

decompressed.txt, the decompressed file back to its original form (same as input.txt, size: 231 bytes)
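The round trip shown by these files can be checked programmatically. The helper below is a sketch that assumes only the compress(input_path, output_path) and decompress(input_path, output_path) calls shown in the usage sections. The name verify_round_trip and the temporary file names are hypothetical.

```python
import filecmp
import os
import tempfile


def verify_round_trip(compressor, input_path):
    """Compress and decompress input_path through temporary files.

    Returns (same, ratio): whether the restored bytes equal the input, and
    compressed size divided by original size. Assumes a non-empty input file.
    """
    with tempfile.TemporaryDirectory() as tmp:
        compressed_path = os.path.join(tmp, "compressed.bin")
        restored_path = os.path.join(tmp, "restored.bin")
        compressor.compress(input_path, compressed_path)
        compressor.decompress(compressed_path, restored_path)
        ratio = os.path.getsize(compressed_path) / os.path.getsize(input_path)
        # shallow=False forces a byte-by-byte comparison.
        same = filecmp.cmp(input_path, restored_path, shallow=False)
    return same, ratio
```

For the example files above, the expected ratio would be 71 / 231, roughly 0.31.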

lz77-compressor's People

Contributors

manassra, tsattolo

lz77-compressor's Issues

Is there a mistake in README?

The README says the verbose output is as follows:

Example: if a file has "hello hello", verbose will print the following description: <0, h> <0, e> <0, l> <0, l> <0, o> <0, > <1, 6, 5>

But I think the correct output should be <0, h> <0, e> <0, l> <1, 1, 1> <0, o> <0, > <1, 6, 5>, because the character 'l' appears twice. Is that right?
