SIMD floating point and integer compressed vector library

Overview

compressed_vec

Floating point and integer compressed vector library, SIMD-enabled for fast processing/iteration over compressed representations.

This is a compressed vec library, rather than a compression library. What does that mean? A compression library takes some uncompressed data and provides essentially compress() and decompress() functions. Typically you have to decompress data to be able to do anything with it, resulting in extra latency and allocations.

On the other hand, this compressed vec library allows you to iterate over and process the compressed representations directly. It is designed to balance fast iteration and SIMD processing/filtering with compression to within 2x of the best columnar compression technology, such as Apache Parquet, using techniques such as delta and XOR encoding (the XOR idea is sketched after the list below). Applications:

  • Database engines
  • Large in-memory data processing
  • Games and other apps needing fast access to large quantities of FP vectors/matrices
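
To illustrate the XOR technique (a sketch of the general idea only, not this crate's actual encoding format): similar consecutive floats share their sign, exponent, and high mantissa bits, so XORing each value's bit pattern with its predecessor yields words that are mostly zero, which can then be packed into very few bits.

    // Sketch of XOR delta encoding over f32 bit patterns (illustrative only).
    fn xor_deltas(values: &[f32]) -> Vec<u32> {
        let mut prev = 0u32;
        values
            .iter()
            .map(|v| {
                let bits = v.to_bits();
                let delta = bits ^ prev; // mostly zero bits for similar values
                prev = bits;
                delta
            })
            .collect()
    }

    fn main() {
        // 1.0 is 0x3f800000 and 1.5 is 0x3fc00000, so their XOR is 0x00400000.
        println!("{:x?}", xor_deltas(&[1.0, 1.5, 2.0, 2.5]));
    }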

Performance Numbers

Numbers are from my laptop (2.9 GHz Core i9, 6/12 cores, 12 MB L3 cache, AVX2), running cargo bench vector, which benchmarks a filter-and-count-matches operation directly on the encoded/compressed vectors.

Vector type(s)                       Elements/sec         Raw GB/s
u32 dense (no sparsity)              1.7 Gelems/s         6.8 GB/s
u32 sparse (99% zeros)               13.9 Gelems/s        55.6 GB/s
Two u32 vectors (sparse + dense)*    1.3-5.2 Gelems/s     5-20 GB/s
u64 vector, dense                    955M-1.1 Gelems/s    7.6-9.1 GB/s
f32, XOR encoded, 60% density        985 Melems/s         3.9 GB/s

  • The two u32 vector filtering speed (using MultiVectorFilter) depends on the order of the vectors; it is faster to filter the sparse vector first.

Creation, Iteration

To create an f32 compressed vector:

    use compressed_vec::VectorF32XorAppender;
    let mut appender = VectorF32XorAppender::try_new(2048).unwrap();
    let encoded_bytes = appender.encode_all(vec![1.0, 1.5, 2.0, 2.5]).unwrap();
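
Integer vectors follow the same try_new / encode_all pattern; a minimal sketch using VectorU64Appender, the u64 counterpart (the try_new argument is the same size hint as in the f32 example above):

    use compressed_vec::VectorU64Appender;
    let mut appender = VectorU64Appender::try_new(1024).unwrap();
    let encoded_bytes = appender.encode_all(vec![1u64, 2, 3, 4]).unwrap();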

The simplest way to iterate over this compressed vector (note that this does not allocate at all):

    use compressed_vec::VectorReader;
    let reader = VectorReader::<f32>::try_new(&encoded_bytes[..]).unwrap();
    let sum = reader.iterate().sum::<f32>();   // Yay, no allocations!

Filtering and SIMD Processing

iterate() is the easiest API for going through the individual elements of the compressed vector, but it is not the fastest. Fast data processing, such as the filter-and-count benchmarks above, is done using Sinks, which are defined in the sink module. Sinks operate on a SIMD word at a time, and the sink API is designed for inlining.

For example, let's say we want to add 2.5 to each element of the f32 vector above, then write the results out to a Vec<f32>. Internally, XOR decoding is performed using a sink, and sinks can be stacked during decoding for an almost entirely SIMD pipeline:

  • XorSink (used automatically for f32 decoding)
  • AddConstSink (adds 2.5, again using SIMD)
  • VecSink (writes the output to a normal Vec)

    use compressed_vec::{VectorReader, AddConstSink, VecSink};
    let reader = VectorReader::<f32>::try_new(&encoded_bytes[..]).unwrap();
    let mut vecsink = VecSink::<f32>::new();
    let mut addsink = AddConstSink::new(2.5f32, &mut vecsink);
    reader.decode_to_sink(&mut addsink).unwrap();
    println!("And the transformed vector is: {:?}", vecsink.vec);

Vector Format

Details of the vector format can be found in the repository's format documentation.

The vector format follows columnar compression techniques used throughout the big data and database world, and roughly follows the Google Procella paper with its custom Artus format:

  • Compression within 2x of ZSTD while operating directly on the data
    • Compression for this format is within 2x of Parquet, but is written to allow filtering and operating on the data directly without needing a separate decompression step for the entire vector
  • Multi-pass encoding
    • The VectorAppender collects min/max and other stats on the raw data and uses it to decide on the best encoding strategy (delta, etc.)
  • Exposing dictionary indices to query engine and aggressive pushdown to the data format
    • The format is designed to filter over dictionary codes, which speeds up filtering
    • The use of sections allows for many optimizations for filtering. For example, null sections and constant sections allow for very fast filter short-circuiting (illustrated in the sketch after this list).
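
To illustrate the short-circuiting idea (a simplified sketch; these are not the crate's actual section or filter types): a filter over a null or constant section can produce its result for all 256 elements at once, without decoding any element data.

    // Simplified model of why null/constant sections make filtering fast.
    enum Section {
        Null,                 // all 256 elements are zero
        Constant(u64),        // all 256 elements share one value
        Packed(Vec<u64>),     // general case: must decode and compare
    }

    fn count_equals(sections: &[Section], target: u64) -> usize {
        sections
            .iter()
            .map(|s| match s {
                Section::Null => if target == 0 { 256 } else { 0 },         // no decode needed
                Section::Constant(v) => if *v == target { 256 } else { 0 }, // no decode needed
                Section::Packed(vals) => vals.iter().filter(|&&v| v == target).count(),
            })
            .sum()
    }

    fn main() {
        let sections = vec![Section::Null, Section::Constant(7), Section::Packed(vec![1, 7, 7, 3])];
        assert_eq!(count_equals(&sections, 7), 258); // 0 + 256 + 2
    }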

Collaboration

Please reach out to me to collaborate!

Comments
  • NibblePackMedFixedSect not completing decoding to sink

    The code below fails on the last assert: assert_eq!(sink.values, input);. It seems like the last 8 u64 values are not written to the sink.

    Other inputs (e.g. the commented out input) seem to not exhibit this behavior. This might be triggered by some edge case.

    Alternatively, maybe I'm not understanding how NibblePackMedFixedSect and Section256Sink should interact - but I'm pretty sure I do.

    Thanks!

    use compressed_vec::{
        section::{FixedSectReader, FixedSectionWriter, NibblePackMedFixedSect},
        sink::Section256Sink,
    };
    
    fn main() {
        let mut buf: [u8; 8196] = [0; 8196];
    
        //let input: &[u64] = &[
        //    10682, 262, 361, 338, 1620, 5231, 3668, 850, 1861, 5012, 3201, 1615, 178, 5102, 2697, 981,
        //    704, 1971, 4632, 2481, 1122, 7395, 4954, 480, 4919, 5734, 2776, 2890, 9841, 486, 2300,
        //    1213, 1811, 2988, 702, 272, 3402, 9653, 9262, 4785, 27, 3867, 11568, 5761, 1769, 4456,
        //    4903, 2640, 1310, 7593, 2628, 612, 1500, 11383, 13558, 5535, 1512, 3948, 267, 1004, 1081,
        //    7863, 5522, 9683, 2108, 8214, 6795, 718, 8660, 8385, 8558, 6113, 1869, 6180, 2740, 3675,
        //    624, 4414, 7107, 2213, 7064, 622, 4225, 1944, 2379, 3539, 4056, 5099, 1384, 3562, 2596,
        //    3529, 146, 4257, 1454, 4166, 7574, 6437, 8944, 871, 11499, 6378, 5295, 1148, 2687, 637,
        //    1098, 3182, 1632, 1143, 769, 636, 3661, 4218, 2053, 877, 3705, 10386, 4183, 5245, 2258,
        //    5056, 3455, 3477, 1601, 8634, 4061, 6162, 14001, 10630, 6878, 12633, 827, 9940, 1813, 4999,
        //    86, 3114, 1633, 8135, 7658, 6395, 7762, 1581, 537, 158, 1717, 985, 6889, 4274, 786, 6762,
        //    4948, 5343, 2726, 3503, 3593, 2913, 7946, 2151, 2832, 1775, 7711, 12284, 5265, 4605, 1277,
        //    1096, 3455, 8104, 4203, 8046, 6651, 5326, 9219, 5580, 3364, 3165, 358, 3138, 1309, 2485,
        //    15528, 7979, 5191, 447, 751, 2834, 3067, 2688, 2115, 5958, 7547, 3042, 5847, 9394, 7685,
        //    2810, 5621, 1296, 89, 1704, 4016, 587, 2675, 2291, 7394, 2753, 2147, 2203, 3212, 1844,
        //    2727, 3196, 3813, 3824, 2035, 4109, 9236, 6747, 6141, 6150, 3849, 6766, 5159, 3166, 3553,
        //    3330, 6426, 4739, 8799, 11030, 4191, 2352, 3695, 2581, 4502, 176, 86, 1685, 4734, 3428,
        //    8417, 102, 1506, 1397, 175, 2855, 5420, 7193, 13334, 4939, 3163, 3858, 4083, 5592,
        //];
        let input: &[u64] = &[
            13531, 7392, 4283, 6675, 7938, 941, 585, 2133, 2076, 5459, 392, 2292, 3250, 1825, 5508,
            6017, 5757, 672, 4624, 132, 402, 2981, 323, 4899, 8416, 1134, 4771, 9372, 3423, 6327, 1200,
            2324, 4299, 471, 1140, 5719, 3880, 10134, 4807, 1968, 4211, 4176, 6339, 1587, 9182, 144,
            12399, 5322, 9022, 5813, 6, 18032, 11663, 4331, 3, 3821, 4310, 4603, 5898, 1861, 2447, 452,
            2277, 5166, 470, 7767, 5067, 9082, 7012, 6223, 2557, 3980, 3638, 3006, 8407, 7006, 2607,
            6941, 4070, 1987, 9729, 10650, 6208, 11815, 4306, 3415, 6438, 1687, 3253, 2682, 645, 7900,
            10613, 27, 1131, 3078, 10048, 12075, 1502, 4404, 6598, 9191, 1057, 1125, 2442, 6261, 1697,
            9730, 6559, 5416, 2481, 2937, 6102, 1334, 5035, 4466, 3435, 3878, 797, 4110, 10547, 8950,
            1613, 241, 3913, 1563, 2719, 11268, 6433, 1495, 4726, 2072, 6455, 1651, 11, 6286, 2349,
            769, 6002, 8415, 6550, 877, 1845, 3706, 7287, 3437, 1924, 5488, 3732, 7175, 1100, 4858,
            3565, 1538, 5769, 1292, 3899, 3898, 5354, 33, 4835, 8852, 7507, 94, 3992, 3822, 15755,
            7538, 4285, 7928, 1154, 6991, 3513, 6162, 6323, 3522, 2783, 4858, 449, 2140, 1891, 2375,
            5864, 1120, 7935, 4548, 4851, 2100, 2345, 6842, 3508, 5753, 5440, 7403, 7540, 7407, 2290,
            3079, 6388, 3555, 1908, 3850, 477, 8699, 2314, 8643, 8138, 1117, 6544, 8067, 8328, 201,
            2523, 3797, 1754, 839, 6116, 1177, 2035, 103, 1950, 2941, 4466, 4799, 6622, 9989, 411,
            5080, 15, 3622, 6915, 1418, 2923, 5922, 190, 1603, 29, 2828, 4951, 875, 2031, 7626, 5507,
            3030, 2261, 4734, 2647, 2799, 529, 3186, 6761, 68244, 65351, 2124, 5974, 2569,
        ];
    
        // NibblePackMedFixedSect accepts slices of len 256
        assert_eq!(input.len(), 256);
    
        // Encode the 256 slice...
        let size = NibblePackMedFixedSect::gen_stats_and_write(&mut buf[..], 0, &input).unwrap();
    
        // Then decode it to a 256 sink, expecting the decoded = input
        let mut sink = Section256Sink::<u64>::new();
        let section = NibblePackMedFixedSect::<u64>::try_from(&buf[..size]).unwrap();
        section.decode_to_sink(&mut sink).unwrap();
        assert_eq!(sink.values, input);
    }
    
    opened by fsolleza
  • Expected behaviour?

    Hello,

    use compressed_vec::VectorU64Appender;
    use compressed_vec::VectorReader;

    let data: Vec<u64> = (0..100000).collect();
    let mut appender = VectorU64Appender::try_new(data.len()).unwrap();
    let result = appender.encode_all(data.clone()).unwrap();
    let reader = VectorReader::<u64>::try_new(&result[..]).unwrap();
    let result: Vec<u64> = reader.iterate().collect();
    assert_eq!(data, result);

    With the above, data and result are not equal.

    opened by sivad77
  • serde support?

    It would be very useful if the compressed vec could be serialized using serde, which would be beneficial when the vector needs to be transferred over a limited-bandwidth network, for example.

    opened by awaited-hare