A reimplementation of the Zopfli compression tool in Rust.

Overview

Zopfli in Rust

This is a reimplementation of the Zopfli compression tool in Rust.

Carol Nichols started the Rust implementation as an experiment in incrementally rewriting a C library in Rust, keeping the project compiling at every step. For more information about that experiment, see the slides for a talk she gave about it and the repo as it was for the experiment.

How to build

To build the code, run:

$ cargo build --release

and the executable will be in target/release/zopfli.

This should work on stable or beta Rust.

You can also run make zopfli, which runs cargo build and then symlinks target/release/zopfli to zopfli in the project root. The C library does the same, and keeping the command and resulting artifacts identical was useful for scripting during the rewrite.

Running the tests

There are some unit tests, mostly around the boundary package merge algorithm implementation in katajainen.rs, that can be run with:

$ cargo test

Golden master tests, to check that compressed files are exactly the same as the C implementation would generate, can be run using:

$ ./test/run.sh

and then checking that git reports no changes to the files in test/results.

Or you can run make test, which runs cargo test, then ./test/run.sh, and then fails if git reports any changed files. Note that if you have uncommitted changes when you run this, they will cause the command to fail even if the tests themselves passed.
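
The golden master check boils down to a byte-for-byte comparison between freshly compressed output and a stored reference. A minimal sketch of that idea in Rust follows; first_mismatch is a hypothetical helper written for illustration and is not part of this repository, which relies on ./test/run.sh and git for the comparison.

```rust
/// Returns the offset of the first byte where `actual` and `expected` differ,
/// or `None` when both slices are byte-for-byte identical.
/// Hypothetical helper for illustration; the repository itself uses git for this.
fn first_mismatch(actual: &[u8], expected: &[u8]) -> Option<usize> {
    // Compare the common prefix first.
    let common = actual.len().min(expected.len());
    if let Some(pos) = (0..common).find(|&i| actual[i] != expected[i]) {
        return Some(pos);
    }
    // Identical prefix but different lengths: the mismatch starts
    // where the shorter slice ends.
    if actual.len() != expected.len() {
        Some(common)
    } else {
        None
    }
}
```

Reporting the offset, rather than a plain yes/no answer, makes it easier to see where an output started to diverge from the reference.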

Comments
  • Maintenance of this crate

    Hi @kornelski, @shssoichiro, @sagacity, @mqudsi, @dfrankland, @bejolithic, and @AlexTMjugador,

    I've invited you all to be admins on this repo today because you've all authored crates that depend on zopfli or made significant contributions to the zopfli crate.

    I started the Rust reimplementation as an experiment, and I should have acknowledged a long time ago that I wasn't really up for maintaining it beyond that.

    However, you all are depending on this code! So now it's yours and you don't have to wait on me to review your PRs or fix bugs or cut new releases (crates.io invites are coming momentarily).

    Please feel free to decide amongst yourselves what to do with this repo. I deliberately created a new repo rather than a fork of https://github.com/carols10cents/zopfli so that I can leave that as the experiment's archive, and this can be the canonical location for future development. I've already updated the URLs on crates.io.

    I added a note to this repo's README explaining its origin. It'd be nice if that gets left in there, but if you feel like taking it out, I understand :)

    Enjoy your new puppy!!!!!! 🐶 🐶 🐶

    opened by carols10cents 6
  • Drop `Seek` or input size knowledge requirements

    Requiring input byte sources to implement the Seek trait is onerous for end users, as most programs that generate DEFLATE streams do not impose seekability requirements. In the Unix world, it's common to pipe the output of one program into a compression program, and a pipe is a non-seekable data source. In addition, making the input seekable is often impractical in network scenarios due to buffering resource requirements and other reasons.

    By being smarter about how we use a sliding window, we can drop seekability and exact input size knowledge requirements from the API exposed by this crate, making it readily applicable for even more usage scenarios, without affecting compression. A high-level overview of the new technique is given through code comments.

    A downside I can see of this change is that it requires a sliding window ZOPFLI_MASTER_BLOCK_SIZE bytes (≈ 1 MiB) bigger than before due to the need to temporarily store an additional uncompressed master block in memory. However, I think that the better API design makes this trade-off worth it: the additional memory usage is insignificant for the kind of computers that are most likely to run a compression algorithm as demanding as Zopfli anyway.

    This is also a breaking change, as the compress_seekable function was removed from the public API. As a minor note, the sliding window for the deflate function is no longer allocated on the stack, which avoids running out of stack memory in practical scenarios I've encountered: allocating ZOPFLI_MASTER_BLOCK_SIZE bytes on the stack is a lot, and the cost of calling the memory allocator pales in comparison to actually compressing data with Zopfli.

    enhancement 
    opened by AlexTMjugador 4
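
The idea in the pull request above can be illustrated with a small sketch that walks a non-seekable Read source block by block while keeping a trailing window of earlier bytes. Everything here is an assumption made for illustration: the names read_up_to and for_each_block, the callback shape, and the one-block lookahead used to detect the final block are not taken from the crate, whose real implementation may differ.

```rust
use std::io::{self, Read};

/// Fills `buf` as far as possible and returns the number of bytes read.
/// Unlike `read_exact`, reaching end of input early is not an error.
fn read_up_to<R: Read>(src: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match src.read(&mut buf[filled..]) {
            Ok(0) => break, // end of input
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Calls `on_block(window, block_start, is_final)` once per block.
/// `window[block_start..]` is the new block; `window[..block_start]` holds
/// up to `window_size` bytes of preceding data. Hypothetical sketch only.
fn for_each_block<R, F>(
    mut src: R,
    block_size: usize,
    window_size: usize,
    mut on_block: F,
) -> io::Result<()>
where
    R: Read,
    F: FnMut(&[u8], usize, bool),
{
    assert!(block_size > 0, "block_size must be non-zero");
    // Heap-allocated buffers: history plus current block, and one lookahead block.
    let mut window: Vec<u8> = Vec::new();
    let mut lookahead = vec![0u8; block_size];
    let mut lookahead_len = read_up_to(&mut src, &mut lookahead)?;
    loop {
        // Promote the lookahead block to be the current block.
        let block_start = window.len();
        window.extend_from_slice(&lookahead[..lookahead_len]);
        // Reading one block ahead is how this sketch learns that the current
        // block is the last one without knowing the total input size.
        lookahead_len = read_up_to(&mut src, &mut lookahead)?;
        let is_final = lookahead_len == 0;
        on_block(&window, block_start, is_final);
        if is_final {
            return Ok(());
        }
        // Keep only the last `window_size` bytes as history for the next block.
        let excess = window.len().saturating_sub(window_size);
        window.drain(..excess);
    }
}
```

In this sketch an empty input still yields exactly one (empty) final block, and the lookahead buffer is the price paid for not needing Seek or an up-front size. Passing std::io::stdin().lock() as src would let it consume piped input.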
  • Mildly suspicious owner

    Bejolithic is an owner of this crate, as of https://github.com/zopfli-rs/zopfli/issues/1.

    I just happened to notice that all 3 other crates they own are copies of other crates, renamed, stripped of attribution and relicensed:

    • https://crates.io/crates/forage -> https://crates.io/crates/maimo: https://github.com/FuzzrNet/Forage/issues/6
    • https://crates.io/crates/wasmpng -> https://crates.io/crates/wasimage: https://github.com/datatrash/wasm-png/issues/1
    • https://crates.io/crates/bbcli -> https://crates.io/crates/wingcli: https://github.com/losfair/blueboat/issues/90

    Probably innocuous, but thought it might be worth raising to nip in the bud potential for a supply-chain attack on users of this crate.

    opened by alecmocatta 3
  • Prevent zopfli from crashing on a single zero-byte file

    Correct some logic that was not taking into account the fact that the final block could also be the first block (with a block length of 0).

    This technically closes #2 and prevents zopfli from crashing when given a single zero-length file as an input or as one of the inputs, except that I've manually verified a crash still occurs if explicitly using BlockType::Uncompressed or BlockType::Fixed. The patch works fine if using the default BlockType::Dynamic. I think those are separate bugs and would have no bearing on the correctness of this patch, but I'm not certain since this is my first time looking at the code.

    (Come to think of it, I don't think the other block types are currently tested at all in any of the tests?)

    opened by mqudsi 3
  • v0.7.0 release

    Some work has been done in tidying up and improving the crate API lately:

    • A crash when dealing with empty files was fixed (#3).
    • The public API functions no longer require the byte source to implement Seek, or to provide the input data size beforehand (#7; this is a breaking change).
    • Some useful, but previously private Zopfli algorithm options were exposed in the API (c789bc4f). In addition, the API is better documented now.
    • The crate now uses log macros to print miscellaneous debug information that previously was conditionally printed to some standard stream, which is much more flexible and suitable for both library and binary dependent crates.
    • The MSRV was documented and tested with cargo-msrv.
    • A GitHub Actions workflow for CI was defined. It now runs the golden master tests on each commit, which increases our confidence in things working fine after every change.
    • Clippy lints were fixed, and rustfmt was run through the codebase.
    • The dependency declarations in Cargo.toml were tweaked to have precise versions. This has pros and cons (see this and this), but I think that this is the best approach overall when combined with not-so-frequent dependency upgrade automation, so the rest of the ecosystem has a reasonable time window to keep up.

    I think that these changes are stable and relevant enough to justify a new release, but I'd also like to know what other maintainers think about it. Should we do it? 😄

    opened by AlexTMjugador 1
  • Expose more Zopfli algorithm options

    The Options struct, which client code can use to tell Zopfli how to compress data, currently has most of its fields private:

    https://github.com/zopfli-rs/zopfli/blob/29b9589d459905af24aecc61c1406b3d0d3c724a/src/lib.rs#L29-L46

    However, library users may find it useful to change the default values for these private options. For example, lowering numiterations is badly needed when dealing with big files, because otherwise the optimization can take far too long.

    I've been using a patch that just makes these fields pub for some time without problems, but to upstream this improvement it would be nice to add some range checks: it does not make sense to set numiterations or blocksplittingmax to too high or negative values, for example.

    @shssoichiro may be interested in this improvement, as tweaking these parameters may improve performance and/or compression in OxiPNG. For example, ZopfliPNG chooses the number of iterations like this:

      options.numiterations = insize < 200000
          ? png_options->num_iterations : png_options->num_iterations_large;
    
    enhancement 
    opened by AlexTMjugador 1
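
The range checks and the size-based iteration choice discussed in the issue above could look roughly like the following Rust sketch. SketchOptions, MAX_ITERATIONS, and choose_iterations are hypothetical names invented for this example, and the upper bound is an arbitrary assumption; none of this is the crate's actual Options definition.

```rust
/// Hypothetical upper bound, chosen for this sketch only.
const MAX_ITERATIONS: u32 = 1_000;

/// Hypothetical options type. The field names mirror those mentioned in the
/// issue, but this is not the crate's real `Options` struct.
#[derive(Debug, PartialEq)]
struct SketchOptions {
    numiterations: u32,
    // Unsigned, so negative values cannot be expressed at all.
    blocksplittingmax: u32,
}

impl SketchOptions {
    /// Builds validated options, rejecting zero or excessive iteration counts.
    fn new(numiterations: u32, blocksplittingmax: u32) -> Result<Self, String> {
        if numiterations == 0 || numiterations > MAX_ITERATIONS {
            return Err(format!(
                "numiterations must be between 1 and {}, got {}",
                MAX_ITERATIONS, numiterations
            ));
        }
        Ok(Self {
            numiterations,
            blocksplittingmax,
        })
    }
}

/// Mirrors the ZopfliPNG snippet quoted above: inputs smaller than
/// 200000 bytes get `num_iterations`, larger ones `num_iterations_large`.
fn choose_iterations(insize: usize, num_iterations: u32, num_iterations_large: u32) -> u32 {
    if insize < 200_000 {
        num_iterations
    } else {
        num_iterations_large
    }
}
```

Using unsigned field types takes care of the "negative values" concern at compile time, so only the upper bound and the zero case need a runtime check in this sketch.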
  • panics when compressing empty data

    Hi! Attempting to compress empty data (my example uses compress_seekable but I have tested a file also) causes a panic.

    fn main() {
        let cursor = std::io::Cursor::new(&[]);
        let mut out = Vec::new();
        zopfli::compress_seekable(
            &zopfli::Options::default(),
            &zopfli::Format::Gzip,
            cursor,
            &mut out,
        );
    }
    
    thread 'main' panicked at 'attempt to subtract with overflow', /home/sky/git/zopfli/src/deflate.rs:311:19
    stack backtrace:
       0: rust_begin_unwind
                 at /rustc/45e2c2881d11324d610815bfff097e25c412199e/library/std/src/panicking.rs:584:5
       1: core::panicking::panic_fmt
                 at /rustc/45e2c2881d11324d610815bfff097e25c412199e/library/core/src/panicking.rs:142:14
       2: core::panicking::panic
                 at /rustc/45e2c2881d11324d610815bfff097e25c412199e/library/core/src/panicking.rs:48:5
       3: zopfli::deflate::calculate_block_symbol_size_small
                 at /home/sky/git/zopfli/src/deflate.rs:311:19
       4: zopfli::deflate::calculate_block_symbol_size_given_counts
                 at /home/sky/git/zopfli/src/deflate.rs:345:9
       5: zopfli::deflate::try_optimize_huffman_for_rle
                 at /home/sky/git/zopfli/src/deflate.rs:856:20
       6: zopfli::deflate::get_dynamic_lengths
                 at /home/sky/git/zopfli/src/deflate.rs:905:5
       7: zopfli::deflate::calculate_block_size
                 at /home/sky/git/zopfli/src/deflate.rs:836:31
       8: zopfli::squeeze::lz77_optimal
                 at /home/sky/git/zopfli/src/squeeze.rs:506:20
       9: zopfli::deflate::blocksplit_attempt
                 at /home/sky/git/zopfli/src/deflate.rs:1147:17
      10: zopfli::deflate::deflate_part
                 at /home/sky/git/zopfli/src/deflate.rs:164:31
      11: zopfli::deflate::deflate
                 at /home/sky/git/zopfli/src/deflate.rs:104:9
      12: zopfli::gzip::gzip_compress
                 at /home/sky/git/zopfli/src/gzip.rs:49:5
      13: zopfli::compress
                 at /home/sky/git/zopfli/src/lib.rs:98:25
      14: zopfli::compress_seekable
                 at /home/sky/git/zopfli/src/lib.rs:83:5
      15: scratch::main
                 at ./src/main.rs:4:5
      16: core::ops::function::FnOnce::call_once
                 at /rustc/45e2c2881d11324d610815bfff097e25c412199e/library/core/src/ops/function.rs:227:5
    note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
    
    opened by tcheinen 1
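
The "attempt to subtract with overflow" message in the report above is what Rust prints when an unsigned subtraction underflows in a build with overflow checks enabled (the default for debug builds). The sketch below shows one common way to make a zero-length case explicit instead. last_index and block_cost are hypothetical stand-ins written for this example; they are not the crate's code and not the fix that was actually merged.

```rust
/// Hypothetical stand-in for code that needs the index of the last item.
/// With `len == 0`, a plain `len - 1` on `usize` would panic with
/// "attempt to subtract with overflow" when overflow checks are enabled.
/// `checked_sub` turns that case into `None` instead.
fn last_index(len: usize) -> Option<usize> {
    len.checked_sub(1)
}

/// Sums per-item costs up to and including the last index,
/// treating an empty block as costing nothing.
fn block_cost(costs: &[u32]) -> u32 {
    match last_index(costs.len()) {
        Some(last) => costs[..=last].iter().sum(),
        None => 0, // empty block: nothing to sum, and no subtraction performed
    }
}
```

Handling the empty case through an Option forces every caller to decide what a zero-length block means, rather than discovering it through a panic.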
  • Configure Renovate

    Welcome to Renovate! This is an onboarding PR to help you understand and configure settings before regular Pull Requests begin.

    🚦 To activate Renovate, merge this Pull Request. To disable Renovate, simply close this Pull Request unmerged.


    Detected Package Files

    • Cargo.toml (cargo)
    • .github/workflows/ci.yml (github-actions)

    Configuration

    🔡 Renovate has detected a custom config for this PR. Feel free to ask for help if you have any doubts and would like it reviewed.

    Important: Now that this branch is edited, Renovate can't rebase it from the base branch any more. If you make changes to the base branch that could impact this onboarding PR, please merge them manually.

    What to Expect

    It looks like your repository dependencies are already up-to-date and no Pull Requests will be necessary right away.


    ❓ Got questions? Check out Renovate's Docs, particularly the Getting Started section. If you need any further assistance then you can also request help here.


    This PR has been generated by Mend Renovate. View repository job log here.

    opened by renovate[bot] 0
  • Performance improvement ideas

    This SIMD-accelerated CRC library may be able to improve performance: https://github.com/srijs/rust-crc32fast

    Not sure if this is related to the performance difference between this and zopfli-rs?

    opened by andrews05 10
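
For context on the checksum mentioned above: the gzip format stores a CRC-32 of the uncompressed data, and that is the kind of checksum crc32fast computes. A minimal bit-at-a-time reference sketch is shown below. The function name crc32 is chosen for this example, and this version is far slower than table-driven or SIMD implementations; it only shows what is being computed, not how much of the total compression time it accounts for.

```rust
/// Bit-at-a-time CRC-32 using the reflected polynomial 0xEDB88320,
/// the variant used by gzip. Reference sketch only, not optimized.
fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32; // standard initial value
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // `mask` is all ones when the low bit is set, zero otherwise,
            // so the polynomial is XORed in only when needed.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc // final bit inversion
}
```

The standard check value for this CRC variant is 0xCBF43926 for the ASCII input "123456789", which is a convenient way to validate any faster replacement against the reference.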
  • Dependency Dashboard

    This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

    This repository currently has no open or pending branches.

    Detected dependencies

    cargo
    Cargo.toml
    • crc32fast 1.3.2
    • simd-adler32 0.3.4
    • byteorder 1.4.3
    • typed-arena 2.0.1
    • log 0.4.17
    github-actions
    .github/workflows/ci.yml
    • actions/checkout v3
    • dtolnay/rust-toolchain v1
    • Swatinem/rust-cache v2

    opened by renovate[bot] 0
Releases(v0.7.1)
A Rust implementation of the Zopfli compression algorithm.

Zopfli in Rust This is a reimplementation of the Zopfli compression tool in Rust. I have totally ignored zopflipng. More info about why and how I did

Carol (Nichols || Goulding) 76 Oct 20, 2022
Like pigz, but rust - a cross platform, fast, compression and decompression tool.

crabz Like pigz, but rust. A cross platform, fast, compression and decompression tool. Synopsis This is currently a proof of concept CLI tool using

Seth 232 Jan 2, 2023
Obvious Unified Compression Helper is a CLI tool to help you compress and decompress files of several formats

Ouch! ouch stands for Obvious Unified Compression Helper and is a CLI tool to help you compress and decompress files of several formats. Features Usag

null 734 Dec 30, 2022
libbz2 (bzip2 compression) bindings for Rust

bzip2 Documentation A streaming compression/decompression library for rust with bindings to libbz2. # Cargo.toml [dependencies] bzip2 = "0.4" License

Alex Crichton 67 Dec 27, 2022
lzlib (lzip compression) bindings for Rust

lzip Documentation A streaming compression/decompression library for rust with bindings to lzlib. # Cargo.toml [dependencies] lzip = "0.1" License Lic

Firas Khalil Khana 8 Sep 20, 2022
Basic (and naïve) LZW and Huffman compression algorithms in Rust.

Naive implementation of the LZW and Huffman compression algorithms. To run, install the Rust toolchain. Cargo may be used to compile the source. Examp

Luiz Felipe Gonçalves 9 May 22, 2023
(WIP) Taking the pain away from file (de)compression

Ouch! ouch loosely stands for Obvious Unified Compression files Helper and aims to be an easy and intuitive way of compressing and decompressing files

Vinícius Miguel 734 Dec 30, 2022
gzp - Multi-threaded Compression

gzp - Multi-threaded Compression

Seth 123 Dec 28, 2022
Fastest Snappy compression library in Node.js

snappy !!! For [email protected] and below, please go to node-snappy. More background about the 6-7 changes, please read this, Thanks @kesla . Help me to

LongYinan 103 Jan 2, 2023
Michael's Compression Algorithm

mca This repository contains a compression algorithm written by me (Michael Grigoryan). The algorithm is only able to compress and decompress text fil

Michael Grigoryan 1 Dec 19, 2022
A simple rust library to read and write Zip archives, which is also my pet project for learning Rust

rust-zip A simple rust library to read and write Zip archives, which is also my pet project for learning Rust. At the moment you can list the files in

Kang Seonghoon 2 Jan 5, 2022
A Brotli implementation in pure and safe Rust

Brotli-rs - Brotli decompression in pure, safe Rust Documentation Compression provides a <Read>-struct to wrap a Brotli-compressed stream. A consumer

Thomas Pickert 59 Oct 7, 2022
Brotli compressor and decompressor written in rust that optionally avoids the stdlib

rust-brotli What's new in 3.2 into_inner conversions for both Reader and Writer classes What's new in 3.0 A fully compatible FFI for drop-in compatibi

Dropbox 659 Dec 29, 2022
DEFLATE, gzip, and zlib bindings for Rust

flate2 A streaming compression/decompression library DEFLATE-based streams in Rust. This crate by default uses the miniz_oxide crate, a port of miniz.

The Rust Programming Language 619 Jan 8, 2023
Snappy bindings for Rust

Snappy [ Originally forked from https://github.com/thestinger/rust-snappy ] Documentation Usage Add this to your Cargo.toml: [dependencies] snappy = "

Jeff Belgum 14 Jan 21, 2022
Tar file reading/writing for Rust

tar-rs Documentation A tar archive reading/writing library for Rust. # Cargo.toml [dependencies] tar = "0.4" Reading an archive extern crate tar; use

Alex Crichton 490 Dec 30, 2022
Zip implementation in Rust

zip-rs Documentation Info A zip library for rust which supports reading and writing of simple ZIP files. Supported compression formats: stored (i.e. n

null 549 Jan 4, 2023
Pure Rust bzip2 decoder

bzip2-rs Pure Rust 100% safe bzip2 decompressor. Features Default features: Rust >= 1.34.2 is supported rustc_1_37: bump MSRV to 1.37, enable more opt

Paolo Barbolini 36 Jan 6, 2023
Ribzip2 - A bzip2 implementation in pure Rust.

ribzip2 - a comprehensible bzip2 implementation ribzip2 is command line utility providing bzip2 compression and decompression written in pure Rust. It

null 16 Oct 24, 2022