zopfli-rs / zopfli
A Rust implementation of the Zopfli compression algorithm.
License: Apache License 2.0
I was just looking at the latest properties for Options, and it seems it could loop forever if both iteration_count and iterations_without_improvement are None. Sure, you would be silly to deliberately set them both to None, but I'm concerned it could quite easily happen unintentionally if you did something like this:
Options {
    iteration_count: NonZeroU64::new(get_value_from_args()),
    ..Default::default()
}
If get_value_from_args() returned 0, NonZeroU64::new would yield None and you might have a problem.
Previously iteration_count wasn't an Option, so you were required to unwrap the NonZero. I think it was safer like that: you could always set it to NonZeroU64::MAX if you wanted to rely solely on iterations_without_improvement.
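One way to rule out the accidental None is to reject a zero count up front instead of letting NonZeroU64::new turn it into None silently. A minimal sketch, assuming a stand-in Options struct reduced to the two fields discussed here (the real struct has more) and two hypothetical helpers, options_from_arg and has_stop_condition, which are not part of the crate:

```rust
use std::num::NonZeroU64;

/// Stand-in for the crate's `Options`, reduced to the two fields discussed above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Options {
    pub iteration_count: Option<NonZeroU64>,
    pub iterations_without_improvement: Option<NonZeroU64>,
}

/// Returns true if at least one stopping condition is set,
/// i.e. the optimization loop is guaranteed to be bounded.
pub fn has_stop_condition(options: &Options) -> bool {
    options.iteration_count.is_some() || options.iterations_without_improvement.is_some()
}

/// Hypothetical helper: builds options from a raw count and rejects 0
/// instead of silently mapping it to `None`.
pub fn options_from_arg(raw: u64) -> Result<Options, String> {
    let count = NonZeroU64::new(raw)
        .ok_or_else(|| String::from("iteration count must be at least 1"))?;
    Ok(Options {
        iteration_count: Some(count),
        iterations_without_improvement: None,
    })
}
```

A check like has_stop_condition could also be run as a debug assertion at the start of compression, so that the both-None case fails fast rather than spinning.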
I'm trying to integrate Zopfli into https://crates.io/crates/zip_next, but an obstacle to this is that it's designed to let users write the file to be compressed incrementally. Thus, I can't provide a Read implementation to https://docs.rs/zopfli/latest/zopfli/fn.compress.html (even in a separate thread) except by buffering the entire file in memory; otherwise, the reader might catch up to the writer, read 0 bytes, and conclude EOF prematurely. Since the crate targets stable Rust 1.66, coroutines (where the reader would block instead of reading 0 bytes, until the actual EOF) aren't an option either. A compressing implementation of Write would solve this problem.
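For illustration, the premature-EOF problem comes from read returning Ok(0) whenever no data is available yet. A possible stopgap on stable Rust is a reader that blocks on a channel and only reports EOF once every sender has been dropped. This is only a sketch: ChannelReader is a hypothetical name, not part of zopfli or zip_next, and it still needs a second thread and per-chunk buffering, which a Write-based encoder would avoid.

```rust
use std::io::{self, Read};
use std::sync::mpsc::Receiver;

/// Hypothetical adapter: a `Read` that blocks until the writer side sends
/// more bytes, and reports EOF only after all senders have been dropped.
pub struct ChannelReader {
    rx: Receiver<Vec<u8>>,
    pending: Vec<u8>, // chunk currently being drained
    pos: usize,       // read offset into `pending`
}

impl ChannelReader {
    pub fn new(rx: Receiver<Vec<u8>>) -> Self {
        ChannelReader { rx, pending: Vec::new(), pos: 0 }
    }
}

impl Read for ChannelReader {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if buf.is_empty() {
            return Ok(0);
        }
        // Block (instead of returning 0) while no unread bytes are buffered.
        // Empty chunks are skipped so they are never mistaken for EOF.
        while self.pos == self.pending.len() {
            match self.rx.recv() {
                Ok(chunk) => {
                    self.pending = chunk;
                    self.pos = 0;
                }
                // All senders are gone: this is the real end of the stream.
                Err(_) => return Ok(0),
            }
        }
        let n = buf.len().min(self.pending.len() - self.pos);
        buf[..n].copy_from_slice(&self.pending[self.pos..self.pos + n]);
        self.pos += n;
        Ok(n)
    }
}
```

The writer side would send each chunk through the matching Sender and drop it when the file is finished; the compressor thread would read from the ChannelReader as from any other Read.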
This SIMD accelerated crc library may be able to improve performance: https://github.com/srijs/rust-crc32fast
Not sure if this is related to the performance difference between this and zopfli-rs?
Bejolithic is an owner of this crate, as of #1.
I just happened to notice that all 3 other crates they own are copies of other crates, renamed, stripped of attribution and relicensed:
https://crates.io/crates/forage -> https://crates.io/crates/maimo: FuzzrNet/Forage#6
https://crates.io/crates/wasmpng -> https://crates.io/crates/wasimage: datatrash/wasm-png#1
https://crates.io/crates/bbcli -> https://crates.io/crates/wingcli: losfair/blueboat#90
Probably innocuous, but thought it might be worth raising to nip in the bud potential for a supply-chain attack on users of this crate.
I have a program that creates large PNGs (up to 48 MiB of raw pixel data, but usually 12 or 16 MiB) and uses Zopfli as the deflater for PNG encoding. At these sizes, compression ratios are usually between 99.4% and 99.9%. I suspect the compression would go much faster if I could specify a larger ZOPFLI_MASTER_BLOCK_SIZE, e.g. 4 MiB, so that compression operated on fewer blocks. I'm currently using only about 20GiB RAM on a c7g.16xlarge (which has 124GiB).
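To put rough numbers on this, the sketch below counts how many master blocks each input size would be split into. It assumes the 1,000,000-byte ZOPFLI_MASTER_BLOCK_SIZE of the C reference implementation also applies here; master_block_count is a hypothetical helper, and whether fewer blocks actually compress faster remains the hypothesis stated above, not something this code shows.

```rust
/// One mebibyte in bytes.
pub const MIB: u64 = 1024 * 1024;

/// Master block size of the C reference implementation (assumed to apply here too).
pub const REFERENCE_MASTER_BLOCK_SIZE: u64 = 1_000_000;

/// Hypothetical helper: number of master blocks needed to cover `input_len`
/// bytes, computed as a ceiling division.
pub fn master_block_count(input_len: u64, master_block_size: u64) -> u64 {
    assert!(master_block_size > 0, "block size must be non-zero");
    (input_len + master_block_size - 1) / master_block_size
}
```

Under these assumptions, a 48 MiB input is cut into 51 master blocks at the reference size and into 12 at 4 MiB; a 16 MiB input goes from 17 blocks to 4.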
I'm having difficulty integrating DeflateEncoder into https://github.com/Pr0methean/zip-next/ because of its lifetime parameter and the fact that it borrows Options. ZipWriter and its compression-algorithm-specific delegate GenericZipWriter both own all their members, so there isn't a natural choice of lifetime. Why not just let DeflateEncoder own the Options?
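A sketch of the owning design being asked for, under stated assumptions: Options here is a small stand-in that is Copy, OwningEncoder and ArchiveWriter are hypothetical names, and the encoder simply forwards bytes because real compression is beside the point of the example.

```rust
use std::io::{self, Write};

/// Stand-in for the crate's `Options`; assumed here to be small and `Copy`.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct Options {
    pub iterations: u64,
}

/// Hypothetical encoder that owns its options, so it needs no lifetime parameter.
/// It forwards bytes unchanged; compression is out of scope for this sketch.
pub struct OwningEncoder<W: Write> {
    options: Options,
    sink: W,
}

impl<W: Write> OwningEncoder<W> {
    pub fn new(options: Options, sink: W) -> Self {
        OwningEncoder { options, sink }
    }

    pub fn options(&self) -> Options {
        self.options
    }

    pub fn into_inner(self) -> W {
        self.sink
    }
}

impl<W: Write> Write for OwningEncoder<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.sink.write(buf)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.sink.flush()
    }
}

/// A fully owning container in the spirit of ZipWriter: no lifetime needed.
pub struct ArchiveWriter {
    pub encoder: OwningEncoder<Vec<u8>>,
}

pub fn make_writer() -> ArchiveWriter {
    // `options` is a local, yet the encoder can outlive this function
    // because it stores its own copy rather than a reference.
    let options = Options { iterations: 5 };
    ArchiveWriter { encoder: OwningEncoder::new(options, Vec::new()) }
}
```

With a borrowed &Options, make_writer could not return the encoder at all, which is the same obstacle ZipWriter and GenericZipWriter run into.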
thread 'main' panicked at 'ERROR: Failed to read file!: TextDecode("Found invalid encoding")', src/helpers/gen_funcs.rs:16:10
The Options struct, which client code can use to tell Zopfli how to compress data, currently has most of its fields private:
Lines 29 to 46 in 29b9589
However, library users may find it useful to change the default values for these private options. For example, lowering numiterations is badly needed when dealing with big files, because otherwise the optimization can take a very long time.
I've been using a patch that just makes these fields pub for some time without problems, but to upstream this improvement it would be nice to add some range checks: it does not make sense to set numiterations or blocksplittingmax to negative or excessively high values, for example.
@shssoichiro may be interested in this improvement, as tweaking these parameters may improve performance and/or compression in OxiPNG. For example, ZopfliPNG chooses the number of iterations like this:
options.numiterations = insize < 200000
? png_options->num_iterations : png_options->num_iterations_large;
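The same selection, plus the kind of range check suggested above, could look roughly like this in Rust. This is a sketch only: pick_iterations and checked_iterations are hypothetical helpers, the 200000-byte threshold is copied from the ZopfliPNG snippet, and the upper bound passed to checked_iterations is an arbitrary illustration rather than a limit the crate defines.

```rust
use std::num::NonZeroU64;

/// Threshold from the ZopfliPNG snippet: inputs at or above this size
/// use the "large" iteration count.
pub const LARGE_INPUT_THRESHOLD: usize = 200_000;

/// Picks the iteration count based on the input size, mirroring the
/// ternary expression in the ZopfliPNG snippet.
pub fn pick_iterations(
    insize: usize,
    num_iterations: NonZeroU64,
    num_iterations_large: NonZeroU64,
) -> NonZeroU64 {
    if insize < LARGE_INPUT_THRESHOLD {
        num_iterations
    } else {
        num_iterations_large
    }
}

/// Hypothetical range check: rejects negative, zero, and too-large values.
/// `max` is chosen by the caller; no particular limit is implied.
pub fn checked_iterations(requested: i64, max: u64) -> Result<NonZeroU64, String> {
    if requested < 1 {
        return Err(format!("iteration count must be at least 1, got {}", requested));
    }
    let value = requested as u64; // safe: `requested` is positive here
    if value > max {
        return Err(format!("iteration count {} exceeds the maximum of {}", value, max));
    }
    NonZeroU64::new(value).ok_or_else(|| String::from("iteration count must be non-zero"))
}
```

For example, with illustrative counts of 15 for small inputs and 5 for large ones, a 199999-byte input would get 15 iterations and a 200000-byte input would get 5.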
I suspect that without too much difficulty, theorems along the lines of the following could be proven mathematically for the algorithm and debug_asserted for the implementation:
It would be helpful to have these applied to limit the number of iterations for small blocks, which would help with fuzz testing (where a very large iteration count and a very small file can be properties of a corner case that needs to be tested, even if having them happen in production would indicate a wrong assumption), especially given cargo fuzz's bias toward very small Vec<u8>s.
Hi! Attempting to compress empty data causes a panic. My example uses compress_seekable, but I have also reproduced it with an empty file.
fn main() {
    let cursor = std::io::Cursor::new(&[]);
    let mut out = Vec::new();
    zopfli::compress_seekable(
        &zopfli::Options::default(),
        &zopfli::Format::Gzip,
        cursor,
        &mut out,
    );
}
thread 'main' panicked at 'attempt to subtract with overflow', /home/sky/git/zopfli/src/deflate.rs:311:19
stack backtrace:
0: rust_begin_unwind
at /rustc/45e2c2881d11324d610815bfff097e25c412199e/library/std/src/panicking.rs:584:5
1: core::panicking::panic_fmt
at /rustc/45e2c2881d11324d610815bfff097e25c412199e/library/core/src/panicking.rs:142:14
2: core::panicking::panic
at /rustc/45e2c2881d11324d610815bfff097e25c412199e/library/core/src/panicking.rs:48:5
3: zopfli::deflate::calculate_block_symbol_size_small
at /home/sky/git/zopfli/src/deflate.rs:311:19
4: zopfli::deflate::calculate_block_symbol_size_given_counts
at /home/sky/git/zopfli/src/deflate.rs:345:9
5: zopfli::deflate::try_optimize_huffman_for_rle
at /home/sky/git/zopfli/src/deflate.rs:856:20
6: zopfli::deflate::get_dynamic_lengths
at /home/sky/git/zopfli/src/deflate.rs:905:5
7: zopfli::deflate::calculate_block_size
at /home/sky/git/zopfli/src/deflate.rs:836:31
8: zopfli::squeeze::lz77_optimal
at /home/sky/git/zopfli/src/squeeze.rs:506:20
9: zopfli::deflate::blocksplit_attempt
at /home/sky/git/zopfli/src/deflate.rs:1147:17
10: zopfli::deflate::deflate_part
at /home/sky/git/zopfli/src/deflate.rs:164:31
11: zopfli::deflate::deflate
at /home/sky/git/zopfli/src/deflate.rs:104:9
12: zopfli::gzip::gzip_compress
at /home/sky/git/zopfli/src/gzip.rs:49:5
13: zopfli::compress
at /home/sky/git/zopfli/src/lib.rs:98:25
14: zopfli::compress_seekable
at /home/sky/git/zopfli/src/lib.rs:83:5
15: scratch::main
at ./src/main.rs:4:5
16: core::ops::function::FnOnce::call_once
at /rustc/45e2c2881d11324d610815bfff097e25c412199e/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
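"attempt to subtract with overflow" is the message debug builds print when an unsigned subtraction would go below zero, which is a common outcome when a length of 0 reaches code written for non-empty input. The following is not the code at deflate.rs:311, only a generic sketch of the failure mode and two ways to guard against it, with hypothetical helper names:

```rust
/// Index of the last element, or `None` for an empty slice.
/// Writing `data.len() - 1` directly would panic in debug builds
/// when `data` is empty, because `usize` cannot represent -1.
pub fn last_index(data: &[u8]) -> Option<usize> {
    data.len().checked_sub(1)
}

/// Number of bytes left after `consumed`, clamped at zero instead of
/// overflowing when `consumed` is larger than `total`.
pub fn remaining(total: usize, consumed: usize) -> usize {
    total.saturating_sub(consumed)
}
```

Which guard fits depends on whether the empty case should be handled explicitly (checked_sub) or can safely be treated as zero (saturating_sub).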
Hi @kornelski, @shssoichiro, @sagacity, @mqudsi, @dfrankland, @bejolithic, and @AlexTMjugador,
I've invited you all to be admins on this repo today because you've all authored crates that depend on zopfli or made significant contributions to the zopfli crate.
I started the Rust reimplementation as an experiment, and I should have acknowledged a long time ago that I wasn't really up for maintaining it beyond that.
However, you all are depending on this code! So now it's yours and you don't have to wait on me to review your PRs or fix bugs or cut new releases (crates.io invites are coming momentarily).
Please feel free to decide amongst yourselves what to do with this repo. I deliberately created a new repo rather than a fork of https://github.com/carols10cents/zopfli so that I can leave that as the experiment's archive, and this can be the canonical location for future development. I've already updated the URLs on crates.io.
I added a note to this repo's README explaining its origin; it'd be nice if that gets left in there, but if you feel like taking it out, I understand :)
Enjoy your new puppy!!!!!! 🐶 🐶 🐶
This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.
This repository currently has no open or pending branches.
Cargo.toml
    crc32fast 1.3.2
    simd-adler32 0.3.7
    typed-arena 2.0.2
    log 0.4.20
    proptest 1.4.0
    proptest-derive 0.4.0
    miniz_oxide 0.7.1
.github/workflows/ci.yml
    actions/checkout v4
    dtolnay/rust-toolchain v1
    Swatinem/rust-cache v2
Some work has been done in tidying up and improving the crate API lately:
- [...] Seek, or to provide the input data size beforehand (#7; this is a breaking change).
- [...] log macros to print miscellaneous debug information that previously was conditionally printed to some standard stream, which is much more flexible and suitable for both library and binary dependent crates.
- [...] cargo-msrv.
- [...] Cargo.toml were tweaked to have precise versions. This has pros and cons (see this and this), but I think that this is the best approach overall when combined with not-so-frequent dependency upgrade automation, so the rest of the ecosystem has a reasonable time window to keep up.
I think that these changes are stable and relevant enough to justify a new release, but I'd also like to know what other maintainers think about it. Should we do it?