Compared to the zip command-line tool, or to the standard OpenJDK implementation (which is equivalent), zip-rs performs poorly in both senses: it is slow (about 3x slower than the standard implementations at an equivalent compression level), and its compression ratio is only about as good as the weakest deflate setting available with zip.
Since compression can be a rate-limiting operation (in my case it is), this is a rather significant drawback. I realize that a significant part of this is due to deficiencies in libflate, but it is still a substantial negative for this crate. If a native implementation that performs acceptably is too tricky to create, a feature wrapping libzip would render the crate usable.
If you take some file of a dozen MB or so and name it speed.log, the following will test the performance:
#[derive(Debug, Clone, Copy)]
pub struct CompressionResult {
    /// Elapsed wall-clock time in seconds.
    pub dt: f64,
    /// Throughput in MiB of input per second.
    pub MBps: f64,
    /// Compressed size divided by original size (smaller is better).
    pub shrink: f64
}
impl CompressionResult {
    pub fn new(dt: f64, MBps: f64, shrink: f64) -> CompressionResult {
        CompressionResult { dt, MBps, shrink }
    }
    /// Derives throughput and shrink factor from the raw timing and sizes,
    /// rounding each value for display.
    pub fn from(dt: f64, old_size: u64, new_size: u64) -> CompressionResult {
        let MBps = ((old_size as f64)/(1024.0*1024.0))/dt;
        let shrink = (new_size as f64)/(old_size as f64);
        CompressionResult {
            dt: (dt*1000.).round()/1000.,
            MBps: (MBps*100.).round()/100.,
            shrink: (shrink*1000.).round()/1000.
        }
    }
}
pub fn compress_the_target(cs: &str, ct: &str) -> CompressionResult {
    use std::*;
    use std::io::Write;
    let source_p = path::Path::new(cs);
    let target_p = path::Path::new(ct);
    // Remove any archive left over from a previous run.
    if target_p.exists() { fs::remove_file(target_p).unwrap(); }
    // Read the input up front so that only compression and writing are timed.
    let data = fs::read(source_p).unwrap();
    let t0 = time::Instant::now();
    {
        let target = fs::File::create(target_p).unwrap();
        let buffer = io::BufWriter::with_capacity(65536, target);
        let mut zw = zip::ZipWriter::new(buffer);
        zw.start_file(
            source_p.file_name().unwrap().to_str().unwrap(),
            zip::write::FileOptions::default().compression_method(
                zip::CompressionMethod::Deflated
            )
        ).unwrap();
        // write_all rather than write: a bare write may stop after a partial write.
        zw.write_all(data.as_ref()).unwrap();
        zw.finish().unwrap();
        // The writer is dropped here, so the buffer is flushed before timing stops.
    }
    let elapsed = {
        let d = t0.elapsed();
        (d.as_secs() as f64) + (d.subsec_nanos() as f64)/1e9
    };
    let old_size = fs::metadata(source_p).unwrap().len();
    let new_size = fs::metadata(target_p).unwrap().len();
    CompressionResult::from(elapsed, old_size, new_size)
}
fn main() {
    let compression_source = "speed.log";
    let compression_target = "speed-rust.zip";
    println!("{:?}", compress_the_target(compression_source, compression_target));
}
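A single timed run can be noisy (disk cache, other processes), so it may help to repeat the measurement and keep the fastest run. The following is only a sketch using the standard library; `best_of` and `throughput_mbps` are hypothetical helper names, not part of zip-rs.

```rust
use std::time::Instant;

/// Runs `work` `runs` times and returns the shortest wall-clock time in seconds.
/// Hypothetical helper for illustration; returns infinity if `runs` is 0.
pub fn best_of<F: FnMut()>(runs: usize, mut work: F) -> f64 {
    let mut best = f64::INFINITY;
    for _ in 0..runs {
        let t0 = Instant::now();
        work();
        let dt = t0.elapsed().as_secs_f64();
        if dt < best {
            best = dt;
        }
    }
    best
}

/// Throughput in MiB/s, the same formula `CompressionResult::from` uses.
pub fn throughput_mbps(bytes: u64, seconds: f64) -> f64 {
    (bytes as f64 / (1024.0 * 1024.0)) / seconds
}
```

For example, `best_of(5, || { compress_the_target("speed.log", "speed-rust.zip"); })` would time five runs of the function above and return the fastest one.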
You can get the same report for the command-line version by calling it as an external process, e.g. here in Python:
import time
import subprocess
import os

cmdline_compression_source = 'speed.log'
cmdline_compression_target = 'speed-cmdline.zip'

def compress_the_target():
    # Remove any archive left over from a previous run.
    if os.path.isfile(cmdline_compression_target):
        os.remove(cmdline_compression_target)
    t0 = time.time()
    subprocess.run(['zip', '-7', cmdline_compression_target, cmdline_compression_source])
    elapsed = time.time() - t0
    log_size = os.stat(cmdline_compression_source).st_size
    zip_size = os.stat(cmdline_compression_target).st_size
    rate = (log_size/(1024*1024.0))/elapsed
    factor = zip_size/log_size
    os.remove(cmdline_compression_target)
    # (seconds, MB/s, compressed size / original size)
    return (round(elapsed*1000)/1000, round(rate*100)/100, round(factor*1000)/1000)

print(compress_the_target())
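If you would rather keep both measurements in one Rust harness, the external zip call can also be timed with `std::process::Command`. This is a rough sketch, assuming a `zip` binary is on the PATH; `time_command` and `zip_cli_report` are hypothetical names chosen for illustration.

```rust
use std::fs;
use std::io;
use std::process::Command;
use std::time::Instant;

/// Times one run of an external program and returns the elapsed seconds.
/// Fails if the program cannot be started or exits with a non-zero status.
pub fn time_command(program: &str, args: &[&str]) -> io::Result<f64> {
    let t0 = Instant::now();
    let status = Command::new(program).args(args).status()?;
    let dt = t0.elapsed().as_secs_f64();
    if !status.success() {
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("{} exited with {}", program, status),
        ));
    }
    Ok(dt)
}

/// Returns (seconds, MiB/s, compressed size / original size) for `zip -7`,
/// the same three numbers the Python script reports (unrounded).
pub fn zip_cli_report(source: &str, target: &str) -> io::Result<(f64, f64, f64)> {
    // zip updates an existing archive, so remove any leftover file first.
    if fs::metadata(target).is_ok() {
        fs::remove_file(target)?;
    }
    let dt = time_command("zip", &["-7", target, source])?;
    let old_size = fs::metadata(source)?.len() as f64;
    let new_size = fs::metadata(target)?.len() as f64;
    Ok((dt, old_size / (1024.0 * 1024.0) / dt, new_size / old_size))
}
```

Calling `zip_cli_report("speed.log", "speed-cmdline.zip")` should then give numbers directly comparable to the `CompressionResult` printed by the zip-rs test, although process start-up time is included in the external measurement.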