A Brotli implementation in pure and safe Rust

Overview

Brotli-rs - Brotli decompression in pure, safe Rust


Documentation

Brotli-rs provides a Decompressor struct implementing Read, wrapping a Brotli-compressed stream. A consumer can thus read the decompressed data with the usual convenience.

Changelog

v0.3.22 -> v0.3.23


  • Fixed a bug where the literal buffer was not populated when processing an uncompressed meta-block. (Reported and fixed by Daniel.)
  • Fixed a bug where a valid stream could be rejected too early as oversized, when its last bytes were shortened by an OMIT-type transformation after the early size check. (Reported and fixed by Daniel.)

v0.3.21 -> v0.3.22


Bug with metablock structure not getting fully reset when encountering a new metablock in the brotli stream, reported and fixed by Daniel.

v0.3.20 -> v0.3.21


Bug with multiple metablocks, reported and fixed by Daniel.

v0.3.19 -> v0.3.20

Worked around feature gate issue in nightly. (Thanks, Corey!)

v0.3.18 -> v0.3.19


Removed the 64k-entry pseudo-code lookup table, because building this trivial table probably took more time than doing the calculation on the fly. (Benchmarks suggest a ~1% time saving without the lookup table.)

v0.3.17 -> v0.3.18


Fixed case where a simple prefix code could have duplicate symbols.

v0.3.16 -> v0.3.17


Fixed case where a complex prefix code could have an incorrect checksum on its runlength code.
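The "checksum" referred to here is the Kraft-McMillan completeness condition on a set of code lengths. A self-contained sketch of the check (my own illustration, using the 5-bit maximum of Brotli's code-length code, not the crate's internals):

```rust
/// Returns true if the given code lengths form a complete prefix code.
/// With a maximum code length of 5, each length l contributes 32 >> l
/// to a sum that must reach exactly 32; a length of 0 means the symbol
/// is absent. Under- and oversubscribed codes both fail the check.
fn is_complete_prefix_code(code_lengths: &[u8]) -> bool {
    let sum: u32 = code_lengths
        .iter()
        .filter(|&&l| l > 0)
        .map(|&l| 32u32 >> l)
        .sum();
    sum == 32
}

fn main() {
    assert!(is_complete_prefix_code(&[1, 1]));       // {0, 1}: complete
    assert!(!is_complete_prefix_code(&[1, 0]));      // half the code space unused
    assert!(is_complete_prefix_code(&[2, 2, 2, 2])); // {00, 01, 10, 11}
}
```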

v0.3.15 -> v0.3.16


  • Fixed incorrect calculation of alphabet size for distance code.
  • Fixed evaluation where streams with excessive insert/copy lengths could be rejected early.

v0.3.14 -> v0.3.15


Fixed injection of invalid symbols in simple prefix code.

v0.3.13 -> v0.3.14


Fixed invalid block-type in switch command. (Thanks, Corey!).

v0.3.12 -> v0.3.13


Fixed uncaught non-positive distances. (Thanks, Corey!).

v0.3.11 -> v0.3.12


Fixed uncaught zero-byte in word transformation. (Thanks, Corey!).

v0.3.10 -> v0.3.11


Fixed possible arithmetic overflow in word transformation. (Thanks, Corey!).

v0.3.9 -> v0.3.10


Fixed incorrect type for runlength code. (Thanks, Corey!).

v0.3.8 -> v0.3.9


Fixed incorrect array index bound check in tree lookup. (Thanks, Corey!).

v0.3.7 -> v0.3.8


Fixed some value range checks on block types and ntree*. (Thanks, Corey!).

v0.3.6 -> v0.3.7


Went over "unreachable!()" statements, analyzed whether they could in fact be reached through invalid data, and where they could, replaced them with proper error handling.
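The pattern is to turn a panic on malformed input into a recoverable error. A hedged sketch (the function is illustrative, not the crate's actual code; 10..24 is Brotli's valid window-size range in bits):

```rust
use std::io::{Error, ErrorKind};

// Instead of marking the catch-all arm unreachable!(), return an
// InvalidData error, since corrupt input data can actually reach it.
fn window_size_from_wbits(wbits: u8) -> Result<u32, Error> {
    match wbits {
        10..=24 => Ok(1u32 << wbits),
        _ => Err(Error::new(ErrorKind::InvalidData, "invalid window size")),
    }
}

fn main() {
    assert_eq!(window_size_from_wbits(22).unwrap(), 1u32 << 22);
    assert!(window_size_from_wbits(9).is_err());
}
```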

v0.3.5 -> v0.3.6


Fixed a case where an invalid prefix code with all-zero codelengths could create an index-out-of-bounds panic. (Thanks, Corey!).

v0.3.4 -> v0.3.5


Fixed a case where an invalid insert-and-copy-length-code would produce a panic. (Thanks, Corey!).

v0.3.1 -> v0.3.4


A fair number of small internal improvements to code quality. Fixed a couple of cases where invalid streams would lead to panics and/or infinite loops. (Thanks, Corey!)

v0.3.0 -> v0.3.1


This is only a minor version bump, with no breakage in usage, but it's exciting nonetheless!

In Brotli, a lot of work is done with and by prefix codes. A change in the internal representation of prefix codes made it possible to speed up the reference benchmark by a factor of ~7. The benchmark decompresses the contents of the file data/monkey.compressed.

  • With linked-list-based, recursive tree implementation:
    test bench_monkey ... bench: 866,888 ns/iter (+/- 58,119)

  • With array-based, iterative tree implementation, before max-depth constraint:
    test bench_monkey ... bench: 704,282 ns/iter (+/- 220,068)

  • With array-based, iterative tree implementation, with max-depth constraint:
    test bench_monkey ... bench: 120,745 ns/iter (+/- 16,627)
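The array-based representation can be sketched as follows (a simplified illustration of the idea, not the crate's actual internals): the tree lives in one contiguous buffer where the children of node i sit at 2i+1 and 2i+2, so each lookup step is index arithmetic instead of pointer chasing, and the max-depth constraint bounds the buffer size up front.

```rust
/// Heap-style array tree for prefix-code lookup: node 0 is the root,
/// node i's children are 2*i + 1 (bit 0) and 2*i + 2 (bit 1).
/// None marks an internal node, Some(symbol) a leaf.
struct PrefixTree {
    nodes: Vec<Option<u16>>,
}

impl PrefixTree {
    /// The max-depth constraint fixes the buffer size in advance.
    fn with_max_depth(max_depth: u32) -> Self {
        PrefixTree { nodes: vec![None; (1usize << (max_depth + 1)) - 1] }
    }

    /// Insert a symbol under a code of `len` bits (most significant first).
    fn insert(&mut self, code: u32, len: u32, symbol: u16) {
        let mut i = 0usize;
        for bit_pos in (0..len).rev() {
            let bit = (code >> bit_pos) & 1;
            i = 2 * i + 1 + bit as usize;
        }
        self.nodes[i] = Some(symbol);
    }

    /// Walk the tree one bit at a time until a leaf is hit.
    fn decode(&self, bits: impl Iterator<Item = u32>) -> Option<u16> {
        let mut i = 0usize;
        for bit in bits {
            i = 2 * i + 1 + bit as usize;
            if let Some(symbol) = self.nodes[i] {
                return Some(symbol);
            }
        }
        None
    }
}

fn main() {
    // Code: symbol 0 = "0", symbol 1 = "10", symbol 2 = "11".
    let mut tree = PrefixTree::with_max_depth(2);
    tree.insert(0b0, 1, 0);
    tree.insert(0b10, 2, 1);
    tree.insert(0b11, 2, 2);
    assert_eq!(tree.decode([0u32].iter().cloned()), Some(0));
    assert_eq!(tree.decode([1u32, 0].iter().cloned()), Some(1));
}
```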

v0.2.0 -> v0.3.0


  • renamed crate compression -> brotli
  • restructured modules to avoid redundant paths like brotli::brotli::Decompressor (now it's just brotli::Decompressor)

v0.1.0 -> v0.2.0


  • Decompressor::new() now accepts a Read, as opposed to a BitReader.
Comments
  • zip file compressed with brotli doesn't roundtrip


    https://www.dropbox.com/s/ubfjbfe0oowfvtl/svg.zip?dl=0 when compressed to brotli is https://www.dropbox.com/s/9ifkxzaxblfhz50/svg.zip.compressed?dl=0

    and it fails to roundtrip with the following message

    "data/svg.zip.compressed":
    output length = 65703
    res = Err(Error { repr: Custom(Custom { kind: InvalidData, error: StringError("More uncompressed bytes than expected in meta-block") }) })
    ===========
    

    However the copy and insert lengths appear to be the same as with the google-provided decompressor https://www.dropbox.com/s/gc6rjqtca4mi4cx/svg.zip.cerrlog.txt?dl=0

    brotli's log:

    https://www.dropbox.com/s/vptya6rxmks5f97/svg.zip.errlog.txt?dl=0

    Here's the tail end of the log from the brotli-rs binary

    Insert And Copy Length = 137
    (m_len, self.meta_block.count_output, insert_length, copy_length) = (87897, 87883, 1, 3)
    btype = 0
    [p1, p2] = RingBuffer { buf: [0, 0], pos: 0, cap: 2 }
    Context Mode = 3
    Lit = 0 0
    Count output 87884
    Count output 87887
    Insert And Copy Length = 182
    (m_len, self.meta_block.count_output, insert_length, copy_length) = (87897, 87887, 6, 8)
    Fatal (m_len, a, b, c) = (87897, 87887, 87893, 87901)
    output length = 65703
    res = Err(Error { repr: Custom(Custom { kind: InvalidData, error: StringError("More uncompressed bytes than expected in meta-block") }) })
    ===========
    
    OK ()
    
    bug 
    opened by danielrh 11
  • Panic when decompressing


    extern crate brotli;
    
    use std::io::{self, Read};
    use brotli::Decompressor;
    
    fn main() {
        let mut input = vec![];
        let _ = Decompressor::new(&b"\x1b\x3f\xff\xff\xdb\x4f\xe2\x99\x80\x12".to_vec() as &[u8]).read_to_end(&mut input);
    }
    

    Crash discovered using afl.rs

    bug 
    opened by frewsxcv 6
  • Access to the 'brotli' crates.io name


    https://crates.io/crates/brotli

    Servo has ownership of this name right now, but we can grant you access if you want. You just need to log in to (and by doing so 'creating') your crates.io account.

    opened by frewsxcv 4
  • Uncompressed data is not fed into the literal_buf


    Brotli expects the literal_buf to be prepopulated with uncompressed_data, if such data is encountered before something needing context bytes.

    Currently the literal_buf is left to be 0,0 despite an uncompressed meta-block being encountered

    bug 
    opened by danielrh 2
  • brotli-rs fails to decompress item compressed with -q 4 -w 16


    I tried this on the bro binary from google and it fails to decompress with brotli-rs when compressed using -q 4 -w 16

    https://www.dropbox.com/s/il9uhagaaawxl6i/bro?dl=0

    https://www.dropbox.com/s/2lb4dzpjczbv8ll/bro.br?dl=0

    but this is true for almost every binary I checked... in fact I just wrote a checker script that simply, in a loop, compresses with bro -q X -w Y where x = {1..11} and y = {16..22} That's how I found this glitch.

    bug 
    opened by danielrh 2
  • Doesn't build on Rust nightly


       Compiling brotli v0.3.19 (file:///Users/coreyf/Development/rust/brotli-rs)
    src/lib.rs:686:18: 686:33 error: const indexing is an unstable feature
    src/lib.rs:686              sum += 32 >> code_lengths[i];
                                             ^~~~~~~~~~~~~~~
    src/lib.rs:686:18: 686:33 help: in Nightly builds, add `#![feature(const_indexing)]` to the crate attributes to enable
    src/lib.rs:1512:10: 1512:60 error: const indexing is an unstable feature
    src/lib.rs:1512                 1 << BROTLI_DICTIONARY_SIZE_BITS_BY_LENGTH[copy_length]
                                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    src/lib.rs:1512:10: 1512:60 help: in Nightly builds, add `#![feature(const_indexing)]` to the crate attributes to enable
    src/lib.rs:1518:34: 1518:84 error: const indexing is an unstable feature
    src/lib.rs:1518             let transform_id = word_id >> BROTLI_DICTIONARY_SIZE_BITS_BY_LENGTH[copy_length];
                                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    src/lib.rs:1518:34: 1518:84 help: in Nightly builds, add `#![feature(const_indexing)]` to the crate attributes to enable
    error: aborting due to 3 previous errors
    Could not compile `brotli`.
    
    To learn more, run the command again with --verbose.
    
    coreyf@frewbook-pro ~/D/r/brotli-rs (master) [101]> rustc -Vv
    rustc 1.6.0-nightly (d49e36552 2015-12-05)
    binary: rustc
    commit-hash: d49e365528e026df6f56fe5eb001e81e2383fbf5
    commit-date: 2015-12-05
    host: x86_64-apple-darwin
    release: 1.6.0-nightly
    
    bug 
    opened by frewsxcv 2
  • Panic while decompressing


    extern crate brotli;
    
    use std::io::Read;
    use brotli::Decompressor;
    
    fn main() {
        let mut input = vec![];
        let _ = Decompressor::new(&b"\x51\xac\x00\x48\x2f\x73\x14\x01\x14\x00\x00\x01\x00\x14\x14\xff\x00\x02\x00\x00\x00\x00\x00\x64\x14\x24\x14\x14\x14\x14\x14\x80\x00\x00\x14\xff\xff\x00\x00\x14\x14\x14\x14\x14\x14\x80\x00\x80".to_vec() as &[u8]).read_to_end(&mut input);
    }
    

    Crash discovered using afl.rs

    bug 
    opened by frewsxcv 2
  • Panic while decompressing


    extern crate brotli;
    
    use std::io::Read;
    use brotli::Decompressor;
    
    fn main() {
        let mut input = vec![];
        let _ = Decompressor::new(&b"\x15\x3f\x60\x00\x15\x3f\x60\x00\x27\xb0\xdb\xa8\x80\x25\x27\xb0\xdb\x40\x80\x12".to_vec() as &[u8]).read_to_end(&mut input);
    }
    
    coreyf@frewbook-pro /t/meow (master) [101]> cargo run
         Running `target/debug/meow`
    thread '<main>' panicked at 'shift operation overflowed', /Users/coreyf/.cargo/git/checkouts/brotli-rs-33811c30552ad7ba/master/src/lib.rs:1162
    Process didn't exit successfully: `target/debug/meow` (exit code: 101)
    

    Crash discovered using afl.rs

    bug 
    opened by frewsxcv 2
  • Panic while decompressing


    extern crate brotli;
    
    use std::io::Read;
    use brotli::Decompressor;
    
    fn main() {
        let mut input = vec![];
        let _ = Decompressor::new(&b"\x11\x3f\x00\x00\x24\xb0\xe2\x99\x80\x12".to_vec() as &[u8]).read_to_end(&mut input);
    }
    
    coreyf@frewbook-pro /t/meow (master) [101]> cargo run
         Running `target/debug/meow`
    thread '<main>' panicked at 'index out of bounds: the len is 255 but the index is 391', ../src/libcollections/vec.rs:1110
    Process didn't exit successfully: `target/debug/meow` (exit code: 101)
    

    Crash discovered using afl.rs

    bug 
    opened by frewsxcv 2
  • Fixed compiler warnings.


    This squelches the handful of compiler warnings which were being generated.

    It also swaps some assert_eq! instances for assert!s, since these can result in very lengthy console messages of essentially binary data.

    opened by Fraser999 1
  • Added a gitattributes file to ensure correct line endings are retained in test input files.


    On Windows, Git is often set up to convert Unix line-endings to Windows line-endings automatically for text files. This auto-conversion caused should_decompress_asyoulik_txt to fail for me on Windows.

    Adding the gitattributes file avoids the auto-conversion and allows the test to pass.

    opened by Fraser999 1
  • Performance 10x worse than C implementation


    I've been timing the C implementation versus the rust implementation and I generally notice about a factor of 8-10x difference.

    Do you know offhand any obvious performance tradeoffs that were made in the design of this version?

    Do you have any ideas about various strategies we could employ to bring it within a factor of two, or ideally to the same speed as the C version especially for multi-megabyte files?

    I noticed no inline assembly in the C version, so I'm hoping it is possible to bring the rust version to parity.

    Have you done any profiling of the existing code or compared it with the C code?

    opened by danielrh 4
Owner
Thomas Pickert