zopfli's People

Contributors

adamreichold, akhilles, alextmjugador, carols10cents, dwbuiten, fhanau, hello71, jayxon, jthlim, khernyo, kornelski, lvandeve, megabyte, mmstick, mqudsi, mrkrzych00, mtb0x1, pr0methean, razrfalcon, renovate[bot], rossy, scop, shepmaster, shssoichiro

zopfli's Issues

Possible to cause infinite loop?

I was just looking at the latest fields of Options, and it seems compression could loop forever if both iteration_count and iterations_without_improvement are None. Sure, you would be silly to deliberately set them both to None, but I'm concerned it could quite easily happen unintentionally if you did something like this:

Options {
    iteration_count: NonZeroU64::new(get_value_from_args()),
    ..Default::default()
}

If get_value_from_args() returned 0 then you might have a problem.

Previously, iteration_count wasn't wrapped in an Option, so you were required to unwrap the NonZeroU64 yourself. I think that was safer: you could always set it to NonZeroU64::MAX if you wanted to rely solely on iterations_without_improvement.
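To make the concern concrete, here is a minimal sketch. The Options struct below is a hypothetical stand-in that models only the two fields named in this issue, and its defaults are assumed for illustration; has_stop_condition, options_from_raw and checked_options_from_raw are made-up helper names, not part of the crate.

```rust
use std::num::NonZeroU64;

// Hypothetical stand-in for the crate's `Options`; only the two fields
// discussed in this issue are modelled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Options {
    iteration_count: Option<NonZeroU64>,
    iterations_without_improvement: Option<NonZeroU64>,
}

impl Default for Options {
    fn default() -> Self {
        // Assumed defaults, chosen for illustration only.
        Options {
            iteration_count: NonZeroU64::new(15),
            iterations_without_improvement: None,
        }
    }
}

impl Options {
    /// True when at least one stopping condition is set.
    fn has_stop_condition(&self) -> bool {
        self.iteration_count.is_some() || self.iterations_without_improvement.is_some()
    }
}

/// Builds options from a raw count. `NonZeroU64::new(0)` is `None`,
/// so a raw value of 0 silently removes the iteration limit.
fn options_from_raw(raw: u64) -> Options {
    Options {
        iteration_count: NonZeroU64::new(raw),
        ..Default::default()
    }
}

/// Same as `options_from_raw`, but rejects a result with no stopping condition.
fn checked_options_from_raw(raw: u64) -> Result<Options, String> {
    let options = options_from_raw(raw);
    if options.has_stop_condition() {
        Ok(options)
    } else {
        Err(format!("raw iteration count {raw} leaves no stopping condition"))
    }
}

fn main() {
    println!("{:?}", options_from_raw(0));
    println!("{:?}", checked_options_from_raw(0));
}
```

A check like this could live either in the caller or in the library; which side should own it is the open question of this issue.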

Implement Write

I'm trying to integrate Zopfli into https://crates.io/crates/zip_next, but an obstacle to this is that it's designed to let users write the file to be compressed incrementally. Thus, I can't provide a Read implementation to https://docs.rs/zopfli/latest/zopfli/fn.compress.html (even in a separate thread) except by buffering the entire file in memory; otherwise, the reader might catch up to the writer, read 0 bytes, and conclude EOF prematurely. Since the crate targets stable Rust 1.66, coroutines (where the reader would block instead of reading 0 bytes, until the actual EOF) aren't an option either. A compressing implementation of Write would solve this problem.
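As a rough illustration of what a push-style interface could look like, the sketch below implements std::io::Write for a hypothetical BlockWriter that collects bytes into fixed-size blocks and hands each full block to a callback. The callback stands in for a per-block compressor; nothing here is the crate's actual API, and the block size and names are assumptions.

```rust
use std::io::{self, Write};

/// Hypothetical adapter: buffers written bytes and emits them block by block.
/// `emit(block, is_last)` stands in for "compress this block".
struct BlockWriter<F> {
    block_size: usize,
    buffer: Vec<u8>,
    emit: F,
}

impl<F> BlockWriter<F>
where
    F: FnMut(&[u8], bool) -> io::Result<()>,
{
    fn new(block_size: usize, emit: F) -> Self {
        assert!(block_size > 0, "block size must be non-zero");
        BlockWriter {
            block_size,
            buffer: Vec::with_capacity(block_size),
            emit,
        }
    }

    /// Emits whatever is still buffered as the final block.
    /// The writer decides when the stream ends; no reader has to guess EOF.
    fn finish(mut self) -> io::Result<()> {
        (self.emit)(&self.buffer, true)
    }
}

impl<F> Write for BlockWriter<F>
where
    F: FnMut(&[u8], bool) -> io::Result<()>,
{
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Accept only as much as fits into the current block.
        let room = self.block_size - self.buffer.len();
        let take = room.min(buf.len());
        self.buffer.extend_from_slice(&buf[..take]);
        if self.buffer.len() == self.block_size {
            (self.emit)(&self.buffer, false)?;
            self.buffer.clear();
        }
        Ok(take)
    }

    fn flush(&mut self) -> io::Result<()> {
        // Nothing to do: partial blocks are only emitted by `finish`.
        Ok(())
    }
}

fn main() -> io::Result<()> {
    let mut writer = BlockWriter::new(4, |block: &[u8], is_last: bool| {
        println!("block of {} bytes, last = {}", block.len(), is_last);
        Ok(())
    });
    writer.write_all(b"abcdefghij")?;
    writer.finish()
}
```

The point of the design is that memory use is bounded by the block size rather than by the file size, and that end of input is signalled explicitly by finish.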

Mildly suspicious owner

Bejolithic is an owner of this crate, as of #1.

I just happened to notice that all 3 other crates they own are copies of other crates, renamed, stripped of attribution and relicensed:

https://crates.io/crates/forage -> https://crates.io/crates/maimo: FuzzrNet/Forage#6
https://crates.io/crates/wasmpng -> https://crates.io/crates/wasimage: datatrash/wasm-png#1
https://crates.io/crates/bbcli -> https://crates.io/crates/wingcli: losfair/blueboat#90

Probably innocuous, but thought it might be worth raising to nip in the bud potential for a supply-chain attack on users of this crate.

Make ZOPFLI_MASTER_BLOCK_SIZE an option

I have a program that creates large PNGs (up to 48 MiB of raw pixel data, but usually 12 or 16 MiB) and uses Zopfli as the deflater for PNG encoding. At these sizes, compression ratios are usually between 99.4% and 99.9%. I suspect the compression would go much faster if I could specify a larger ZOPFLI_MASTER_BLOCK_SIZE, e.g. 4 MiB, so that compression operated on fewer blocks. I'm currently using only about 20 GiB of RAM on a c7g.16xlarge (which has 124 GiB).
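For a sense of scale, the sketch below counts how many master blocks each input size would be split into. It assumes the current ZOPFLI_MASTER_BLOCK_SIZE is 1,000,000 bytes; that figure is an assumption to check against the source, and master_block_count is a made-up helper.

```rust
const MIB: usize = 1024 * 1024;

// Assumed current value of ZOPFLI_MASTER_BLOCK_SIZE; verify against the crate source.
const ASSUMED_DEFAULT_BLOCK_SIZE: usize = 1_000_000;

/// Number of master blocks needed to cover `input_len` bytes (ceiling division).
fn master_block_count(input_len: usize, master_block_size: usize) -> usize {
    assert!(master_block_size > 0, "block size must be non-zero");
    (input_len + master_block_size - 1) / master_block_size
}

fn main() {
    for size_mib in [12, 16, 48] {
        let len = size_mib * MIB;
        println!(
            "{size_mib} MiB: {} blocks at the assumed default, {} blocks at 4 MiB",
            master_block_count(len, ASSUMED_DEFAULT_BLOCK_SIZE),
            master_block_count(len, 4 * MIB),
        );
    }
}
```

Fewer blocks does not by itself guarantee faster compression; the sketch only shows how much the block count would drop.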

DeflateEncoder should own its Options

I'm having difficulty integrating DeflateEncoder into https://github.com/Pr0methean/zip-next/ because of its lifetime parameter and the fact that it borrows Options. ZipWriter and its compression-algorithm-specific delegate GenericZipWriter both own all their members, so there isn't a natural choice of lifetime. Why not just let DeflateEncoder own the Options?
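The difference between the two designs can be sketched as follows. All types here are hypothetical stand-ins (the real Options has other fields, and Container merely plays the role of ZipWriter); the sketch only shows why an owned, Copy-able Options removes the lifetime parameter.

```rust
// Hypothetical stand-in for the crate's `Options`.
#[derive(Debug, Clone, Copy, Default)]
struct Options {
    iteration_count: u64,
}

// Borrowing design: the lifetime `'a` must appear in every type that
// stores this encoder, which is awkward when the container owns everything.
#[allow(dead_code)]
struct BorrowingEncoder<'a, W> {
    options: &'a Options,
    sink: W,
}

// Owning design: `Options` is small and `Copy`, so storing it by value
// costs little and removes the lifetime parameter.
struct OwningEncoder<W> {
    options: Options,
    sink: W,
}

impl<W> OwningEncoder<W> {
    fn new(options: Options, sink: W) -> Self {
        OwningEncoder { options, sink }
    }

    fn into_inner(self) -> W {
        self.sink
    }
}

// Plays the role of a writer that owns all of its members.
struct Container {
    encoder: OwningEncoder<Vec<u8>>,
}

fn make_container() -> Container {
    // `options` is a local; with the borrowing design it could not be
    // stored in the returned `Container` without a longer-lived owner.
    let options = Options { iteration_count: 5 };
    Container {
        encoder: OwningEncoder::new(options, Vec::new()),
    }
}

fn main() {
    let container = make_container();
    println!("{:?}", container.encoder.options);
    println!("{} bytes written", container.encoder.into_inner().len());
}
```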

panic at reading file

thread 'main' panicked at 'ERROR: Failed to read file!: TextDecode("Found invalid encoding")', src/helpers/gen_funcs.rs:16:10

Expose more Zopfli algorithm options

The Options struct, which client code uses to tell Zopfli how to compress data, currently keeps most of its fields private:

zopfli/src/lib.rs

Lines 29 to 46 in 29b9589

/// Options used throughout the program.
pub struct Options {
    /* Whether to print output */
    pub verbose: bool,
    /* Whether to print more detailed output */
    verbose_more: bool,
    /*
    Maximum amount of times to rerun forward and backward pass to optimize LZ77
    compression cost. Good values: 10, 15 for small files, 5 for files over
    several MB in size or it will be too slow.
    */
    numiterations: i32,
    /*
    Maximum amount of blocks to split into (0 for unlimited, but this can give
    extreme results that hurt compression on some files). Default value: 15.
    */
    blocksplittingmax: i32,
}

However, library users may find it useful to change the default values of these private options. For example, lowering numiterations is badly needed when dealing with big files, because otherwise the optimization can take far too long.

I've been using a patch that just makes these fields pub for some time without problems, but to upstream this improvement it would be nice to add some range checks: it does not make sense to set numiterations or blocksplittingmax to too high or negative values, for example.

@shssoichiro may be interested in this improvement, as tweaking these parameters may improve performance and/or compression in OxiPNG. For example, ZopfliPNG chooses the number of iterations like this:

  options.numiterations = insize < 200000
      ? png_options->num_iterations : png_options->num_iterations_large;
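The range checks suggested above, combined with a size-dependent iteration count in the style of the ZopfliPNG snippet, might look like the sketch below. TuningOptions, OptionsError, iterations_for_size and the MAX_ITERATIONS ceiling are all hypothetical; only the 200000-byte threshold and the meaning of blocksplittingmax = 0 come from the text quoted in this issue.

```rust
/// Reasons a pair of tuning values is rejected.
#[derive(Debug, PartialEq, Eq)]
enum OptionsError {
    IterationsOutOfRange(i32),
    NegativeBlockSplittingMax(i32),
}

// Arbitrary illustrative ceiling, not a value taken from Zopfli.
const MAX_ITERATIONS: i32 = 10_000;

// Size threshold quoted from the ZopfliPNG snippet.
const LARGE_INPUT_THRESHOLD: usize = 200_000;

/// Hypothetical validated counterpart of the two private fields.
#[derive(Debug, PartialEq, Eq)]
struct TuningOptions {
    numiterations: i32,
    blocksplittingmax: i32,
}

impl TuningOptions {
    fn new(numiterations: i32, blocksplittingmax: i32) -> Result<Self, OptionsError> {
        if !(1..=MAX_ITERATIONS).contains(&numiterations) {
            return Err(OptionsError::IterationsOutOfRange(numiterations));
        }
        // 0 is allowed: the field's comment documents it as "unlimited".
        if blocksplittingmax < 0 {
            return Err(OptionsError::NegativeBlockSplittingMax(blocksplittingmax));
        }
        Ok(TuningOptions {
            numiterations,
            blocksplittingmax,
        })
    }
}

/// Picks the iteration count by input size, as ZopfliPNG does.
fn iterations_for_size(insize: usize, small: i32, large: i32) -> i32 {
    if insize < LARGE_INPUT_THRESHOLD {
        small
    } else {
        large
    }
}

fn main() {
    let iterations = iterations_for_size(5_000_000, 15, 5);
    println!("{:?}", TuningOptions::new(iterations, 15));
    println!("{:?}", TuningOptions::new(-3, 15));
}
```

Using unsigned or NonZero field types instead of i32 would push part of this validation into the type system; the constructor form is shown only because it keeps the original field types.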

Early stopping

I suspect that without too much difficulty, theorems along the lines of the following could be proven mathematically for the algorithm and debug_asserted for the implementation:

  • When an iteration of Zopfli hasn't reduced the file size, subsequent iterations won't do so either.
  • If a file's uncompressed size is N bytes, the minimum compressed size will be found within cN + d iterations for some small constants c and d (probably c < 10 and d < 10).

It would be helpful to have these applied to limit the number of iterations for small blocks, which would help with fuzz testing (where a very large iteration count and a very small file can be properties of a corner case that needs to be tested, even if having them happen in production would indicate a wrong assumption), especially given cargo fuzz's bias toward very small Vec<u8>s.
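A minimal sketch of the kind of loop this would enable is shown below. It does not prove or rely on either conjecture: it simply stops after a configurable number of consecutive iterations without improvement. The step closure is a hypothetical stand-in for one Zopfli iteration returning the resulting size, and patience is a made-up parameter name.

```rust
/// Runs `step` up to `max_iterations` times, stopping early once `patience`
/// consecutive iterations fail to lower the best cost seen so far.
/// Returns the best cost (if any iteration ran) and the number of iterations run.
fn optimize_with_early_stop<F>(max_iterations: u64, patience: u64, mut step: F) -> (Option<u64>, u64)
where
    F: FnMut(u64) -> u64,
{
    let mut best: Option<u64> = None;
    let mut stale = 0u64;
    let mut ran = 0u64;

    for i in 0..max_iterations {
        let cost = step(i);
        ran += 1;
        match best {
            // No improvement: count it and stop once patience is exhausted.
            Some(b) if cost >= b => {
                stale += 1;
                if stale >= patience {
                    break;
                }
            }
            // First result or a strict improvement: record it and reset the counter.
            _ => {
                best = Some(cost);
                stale = 0;
            }
        }
    }
    (best, ran)
}

fn main() {
    // Made-up cost sequence for illustration; it plateaus at 90.
    let costs = [100u64, 90, 90, 90, 80];
    let result = optimize_with_early_stop(1_000_000, 2, |i| {
        costs[(i as usize).min(costs.len() - 1)]
    });
    println!("{:?}", result);
}
```

If the first conjecture were proven, a patience of 1 would lose nothing; the made-up sequence above shows what is at stake otherwise, since the later improvement to 80 is never reached.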

panics when compressing empty data

Hi! Attempting to compress empty data (my example uses compress_seekable but I have tested a file also) causes a panic.

fn main() {
    let cursor = std::io::Cursor::new(&[]);
    let mut out = Vec::new();
    zopfli::compress_seekable(
        &zopfli::Options::default(),
        &zopfli::Format::Gzip,
        cursor,
        &mut out,
    );
}

thread 'main' panicked at 'attempt to subtract with overflow', /home/sky/git/zopfli/src/deflate.rs:311:19
stack backtrace:
   0: rust_begin_unwind
             at /rustc/45e2c2881d11324d610815bfff097e25c412199e/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/45e2c2881d11324d610815bfff097e25c412199e/library/core/src/panicking.rs:142:14
   2: core::panicking::panic
             at /rustc/45e2c2881d11324d610815bfff097e25c412199e/library/core/src/panicking.rs:48:5
   3: zopfli::deflate::calculate_block_symbol_size_small
             at /home/sky/git/zopfli/src/deflate.rs:311:19
   4: zopfli::deflate::calculate_block_symbol_size_given_counts
             at /home/sky/git/zopfli/src/deflate.rs:345:9
   5: zopfli::deflate::try_optimize_huffman_for_rle
             at /home/sky/git/zopfli/src/deflate.rs:856:20
   6: zopfli::deflate::get_dynamic_lengths
             at /home/sky/git/zopfli/src/deflate.rs:905:5
   7: zopfli::deflate::calculate_block_size
             at /home/sky/git/zopfli/src/deflate.rs:836:31
   8: zopfli::squeeze::lz77_optimal
             at /home/sky/git/zopfli/src/squeeze.rs:506:20
   9: zopfli::deflate::blocksplit_attempt
             at /home/sky/git/zopfli/src/deflate.rs:1147:17
  10: zopfli::deflate::deflate_part
             at /home/sky/git/zopfli/src/deflate.rs:164:31
  11: zopfli::deflate::deflate
             at /home/sky/git/zopfli/src/deflate.rs:104:9
  12: zopfli::gzip::gzip_compress
             at /home/sky/git/zopfli/src/gzip.rs:49:5
  13: zopfli::compress
             at /home/sky/git/zopfli/src/lib.rs:98:25
  14: zopfli::compress_seekable
             at /home/sky/git/zopfli/src/lib.rs:83:5
  15: scratch::main
             at ./src/main.rs:4:5
  16: core::ops::function::FnOnce::call_once
             at /rustc/45e2c2881d11324d610815bfff097e25c412199e/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
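The panic message is the one Rust produces in debug builds when an unsigned subtraction underflows. The sketch below shows that general failure mode and one way to guard against it; last_index is a hypothetical helper, and what deflate.rs:311 actually computes is not shown in this report.

```rust
/// Hypothetical helper: index of the last element, or `None` for empty input.
fn last_index(len: usize) -> Option<usize> {
    // Writing `len - 1` here would panic in debug builds when `len == 0`
    // with "attempt to subtract with overflow"; `checked_sub` reports the
    // underflow as `None` instead.
    len.checked_sub(1)
}

fn main() {
    let empty: &[u8] = &[];
    let data: &[u8] = b"abc";
    println!("{:?}", last_index(empty.len()));
    println!("{:?}", last_index(data.len()));
}
```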

Maintenance of this crate

Hi @kornelski, @shssoichiro, @sagacity, @mqudsi, @dfrankland, @bejolithic, and @AlexTMjugador,

I've invited you all to be admins on this repo today because you've all either authored crates that depend on zopfli or made significant contributions to the zopfli crate.

I started the Rust reimplementation as an experiment, and I should have acknowledged a long time ago that I wasn't really up for maintaining it beyond that.

However, you all are depending on this code! So now it's yours and you don't have to wait on me to review your PRs or fix bugs or cut new releases (crates.io invites are coming momentarily).

Please feel free to decide amongst yourselves what to do with this repo. I deliberately created a new repo rather than a fork of https://github.com/carols10cents/zopfli so that I can leave that as the experiment's archive, and this can be the canonical location for future development. I've already updated the URLs on crates.io.

I added a note to this repo's README explaining its origin. It'd be nice if that gets left in there, but if you feel like taking it out, I understand :)

Enjoy your new puppy!!!!!! 🐶 🐶 🐶

Dependency Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

This repository currently has no open or pending branches.

Detected dependencies

cargo
Cargo.toml
  • crc32fast 1.3.2
  • simd-adler32 0.3.7
  • typed-arena 2.0.2
  • log 0.4.20
  • proptest 1.4.0
  • proptest-derive 0.4.0
  • miniz_oxide 0.7.1
github-actions
.github/workflows/ci.yml
  • actions/checkout v4
  • dtolnay/rust-toolchain v1
  • Swatinem/rust-cache v2

v0.7.0 release

Some work has been done in tidying up and improving the crate API lately:

  • A crash when dealing with empty files was fixed (#3).
  • The public API functions no longer require the byte source to implement Seek, or to provide the input data size beforehand (#7; this is a breaking change).
  • Some useful, but previously private Zopfli algorithm options were exposed in the API (c789bc4). In addition, the API is better documented now.
  • The crate now uses log macros to print miscellaneous debug information that previously was conditionally printed to some standard stream, which is much more flexible and suitable for both library and binary dependent crates.
  • The MSRV was documented and tested with cargo-msrv.
  • A GitHub Actions workflow for CI was defined. It now runs the golden master tests on each commit, which increases our confidence in things working fine after every change.
  • Clippy lints were fixed, and rustfmt was run through the codebase.
  • The dependency declarations in Cargo.toml were tweaked to have precise versions. This has pros and cons (see this and this), but I think that this is the best approach overall when combined with not-so-frequent dependency upgrade automation, so the rest of the ecosystem has a reasonable time window to keep up.

I think that these changes are stable and relevant enough to justify a new release, but I'd also like to know what other maintainers think about it. Should we do it? 😄