Roaring bitmap implementation for Rust

Overview

RoaringBitmap travis-badge release-badge docs-badge rust-version-badge

This is not yet production ready. The API should be mostly complete now.

This is a Rust port of the Roaring bitmap data structure, initially defined as a Java library and described in Better bitmap performance with Roaring bitmaps.

Rust version policy

This crate only supports the current stable version of Rust, patch releases may use new features at any time.

Developing

This project uses clippy and denies warnings in CI builds. To ensure your changes will be accepted please check them with cargo clippy (available via cargo install clippy on nightly rust) before submitting a pull request (along with cargo test as usual).

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you shall be dual licensed as above, without any additional terms or conditions.

Comments
  • feat: implement optimised insert_range

    feat: implement optimised insert_range

    Implements an optimised insert_range() function for efficiently adding consecutive elements to the set.

    Previously the easiest way to do this was with extends() or calling insert() in a loop. To insert 100,000 elements this took about 1,000us - with the insert_range() this completes in ~2us.

    Bechmarks

    insert_range() vs. insert() in a loop inserting 0..N:

    add_range/roaring/10    time:   [191.80 ns 194.15 ns 197.16 ns]
                            thrpt:  [50.720 Melem/s 51.506 Melem/s 52.137 Melem/s]
                     change:
                            time:   [-58.031% -57.320% -56.475%] (p = 0.00 < 0.05)
                            thrpt:  [+129.75% +134.30% +138.27%]
                            Performance has improved.
    						
    add_range/pre_populated_roaring/10
                            time:   [177.77 ns 180.03 ns 182.21 ns]
                            thrpt:  [54.881 Melem/s 55.547 Melem/s 56.252 Melem/s]
                     change:
                            time:   [-36.465% -34.812% -33.125%] (p = 0.00 < 0.05)
                            thrpt:  [+49.532% +53.404% +57.393%]
                            Performance has improved.
    
    add_range/roaring/100   time:   [342.94 ns 346.12 ns 351.19 ns]
                            thrpt:  [284.75 Melem/s 288.92 Melem/s 291.60 Melem/s]
                     change:
                            time:   [-85.032% -84.862% -84.654%] (p = 0.00 < 0.05)
                            thrpt:  [+551.63% +560.58% +568.10%]
                            Performance has improved.
    
    add_range/pre_populated_roaring/100
                            time:   [366.68 ns 371.57 ns 376.29 ns]
                            thrpt:  [265.76 Melem/s 269.13 Melem/s 272.72 Melem/s]
                     change:
                            time:   [-83.662% -83.417% -83.156%] (p = 0.00 < 0.05)
                            thrpt:  [+493.70% +503.04% +512.08%]
                            Performance has improved.
    
    add_range/roaring/1000  time:   [1.6535 us 1.6579 us 1.6630 us]
                            thrpt:  [601.33 Melem/s 603.16 Melem/s 604.78 Melem/s]
                     change:
                            time:   [-91.569% -91.513% -91.457%] (p = 0.00 < 0.05)
                            thrpt:  [+1070.6% +1078.3% +1086.1%]
                            Performance has improved.
    
    add_range/pre_populated_roaring/1000
                            time:   [1.7225 us 1.7491 us 1.7824 us]
                            thrpt:  [561.03 Melem/s 571.72 Melem/s 580.56 Melem/s]
                     change:
                            time:   [-94.873% -94.807% -94.737%] (p = 0.00 < 0.05)
                            thrpt:  [+1800.0% +1825.8% +1850.5%]
                            Performance has improved.
    
    add_range/roaring/5000  time:   [317.68 ns 319.43 ns 321.60 ns]
                            thrpt:  [15.547 Gelem/s 15.653 Gelem/s 15.739 Gelem/s]
                     change:
                            time:   [-99.669% -99.665% -99.661%] (p = 0.00 < 0.05)
                            thrpt:  [+29373% +29744% +30135%]
                            Performance has improved.
    
    add_range/pre_populated_roaring/5000
                            time:   [844.07 ns 907.00 ns 966.49 ns]
                            thrpt:  [5.1733 Gelem/s 5.5127 Gelem/s 5.9237 Gelem/s]
                     change:
                            time:   [-98.691% -98.541% -98.365%] (p = 0.00 < 0.05)
                            thrpt:  [+6015.9% +6753.8% +7541.4%]
                            Performance has improved.
    
    add_range/roaring/10000 time:   [373.12 ns 373.91 ns 374.77 ns]
                            thrpt:  [26.683 Gelem/s 26.745 Gelem/s 26.801 Gelem/s]
                     change:
                            time:   [-99.736% -99.734% -99.732%] (p = 0.00 < 0.05)
                            thrpt:  [+37214% +37481% +37710%]
                            Performance has improved.
    
    add_range/pre_populated_roaring/10000
                            time:   [936.65 ns 1.0032 us 1.0666 us]
                            thrpt:  [9.3760 Gelem/s 9.9681 Gelem/s 10.676 Gelem/s]
                     change:
                            time:   [-99.237% -99.132% -99.035%] (p = 0.00 < 0.05)
                            thrpt:  [+10258% +11415% +13008%]
                            Performance has improved.
    
    add_range/roaring/100000
                            time:   [1.4464 us 1.4534 us 1.4613 us]
                            thrpt:  [68.431 Gelem/s 68.806 Gelem/s 69.135 Gelem/s]
                     change:
                            time:   [-99.866% -99.864% -99.861%] (p = 0.00 < 0.05)
                            thrpt:  [+71984% +73250% +74639%]
                            Performance has improved.
    
    add_range/pre_populated_roaring/100000
                            time:   [1.8819 us 1.9888 us 2.1003 us]
                            thrpt:  [47.612 Gelem/s 50.283 Gelem/s 53.137 Gelem/s]
                     change:
                            time:   [-99.864% -99.849% -99.831%] (p = 0.00 < 0.05)
                            thrpt:  [+59136% +66238% +73485%]
                            Performance has improved.
    

    The API exposed copies the remove_range() function, accepting a Range<u64> (a half-open range) in order to allow including u32::MAX.

    opened by domodwyer 22
  • The `insert_range` method does not properly handle boundary condition

    The `insert_range` method does not properly handle boundary condition

    Progress

    • [x] Switch inner parameters to RangeBounds
      • [x] insert_range
      • [x] remove_range
    • [x] Fix boundary condition in insert_range/remove_range
    • [x] Add more tests

    Experiments

    #[test]
    fn insert_range_multi() {
        let mut bitmap = RoaringBitmap::new();
        bitmap.insert_range(0..((1_u64 << 16) + 1));
        println!("{:?}", bitmap);
    }
    
    #[test]
    fn insert_multi() {
        let mut bitmap = RoaringBitmap::new();
        for i in 0..((1 << 16) + 1) {
            bitmap.insert(i);
        }
        println!("{:?}", bitmap);
    }
    
    #[test]
    fn insert_range_single() {
        let mut bitmap = RoaringBitmap::new();
        bitmap.insert_range(0..(1_u64 << 16));
        println!("{:?}", bitmap); // would panic here
    }
    
    #[test]
    fn insert_single() {
        let mut bitmap = RoaringBitmap::new();
        for i in 0..(1 << 16) {
            bitmap.insert(i);
        }
        println!("{:?}", bitmap);
    }
    

    single partition insertion

    After insert 0..(1_u64 << 16) to the bitmap, we expect only one container(bitmap), whose key is 0, exists in the bitmap.

    But actually we got:

    insert_range_single

    multiply partition insertion

    After insert 0..((1_u64 << 16) + 1) to the bitmap, we expect one container(bitmap), whose key is 0, followed by a container(array), whose key is 1.

    But we got:

    insert_range_multi

    bug 
    opened by oliverdding 21
  • Introduce a test that highlight a serialization bug

    Introduce a test that highlight a serialization bug

    Hello,

    I found a bug while using roaring in the next version of the MeiliSearch engine, it appears that the serialize function doesn't correctly represent the integers in the serialized format.

    I haven't had time to dive into the serialise function yet. Could you take a look please?

    Note that the original integers array was 77MB as an array text, I found a way to reduce that to 4096 values but there surely is a smaller input than this.

    Thank you for your help :)

    opened by Kerollmops 20
  • Replace set ops literal methods (`intersect_with`...) by the Rust std ops traits (`BitAnd`...)

    Replace set ops literal methods (`intersect_with`...) by the Rust std ops traits (`BitAnd`...)

    This pull request fixes #95 by implementing better versions of the set operations (union, intersection, difference, and symmetric difference) by efficiently using the standard ops traits (BitOr, BitAnd, Sub, and BitXor) with the concept of ownership. For example, if an operation must be done on two owned types, the library will take the owned internal store instead of cloning them.

    In a previous pull request we made the first step toward a more efficient version of the RoaringBitmap standard ops traits, but it was not applied to the Container and Store internal types, so the result was good but not enough.

    We also decided it was mandatory to deprecate the raw operation methods (union_with, intersect_with, difference_with, symmetric_difference_with) as those were only exposing one way to do the operations (mutable with borrowed) when users could sometime provide two owned bitmaps and get performance gains by using the appropriate ops trait variant. The only downside of eventually removing these raw methods is the loss in clarity, for me, at least, it is easier to understand intersect_with than &= or BitAndAssign::bitand_assign.

    opened by Kerollmops 18
  • Use trailing zeros in bitmap to_array conversion

    Use trailing zeros in bitmap to_array conversion

    This PR implements an optimization on bitmap to_array conversion by utilizing the trailing_zeros method.

    Curious if theres a good reason to not implement it this way. Im still learning rust so let me know if this looks off.

    this version is faster when there are gaps between set bits and performs about the same when all 64 bits are set. cc: @Nemo157

    opened by damnMeddlingKid 17
  • Introduce a set of benchmarks with real datasets

    Introduce a set of benchmarks with real datasets

    In this PR we introduce better benchmarks that use the official real-roaring-datasets repository, specifically the Wikileaks sorted files.

    We moved all of the benchmarks in a dedicated library in the benchmarks directory, this helps in isolating the required dependencies from the main crate. In this crate, we added a build.rs file that downloads and unzips the real-world datasets in the OUT_DIR folder (a folder inside the benchmarks/target directory).

    We introduced this new benchmarks crate in the README, to make sure that new contributors are aware of it and will more likely run them when modifying the crate and proposing changes.

    enhancement 
    opened by Kerollmops 15
  • Add scalar optimizations from CRoaring / arXiv:1709.07821 section 3

    Add scalar optimizations from CRoaring / arXiv:1709.07821 section 3

    Purpose

    This PR adds some optimizations from CRoaring as outlined in arXiv:1709.07821 section 3

    Overview

    • All inserts and removes are now branchless (!in arXiv:1709.0782, in CRoaring)
    • Section 3.1 was already implemented, except for BitmapIter. This is covered in #125
    • Implement Array-Bitset aggregates as outlined in section 3.2
      • Also branchless 😎
    • Tracks bitmap cardinality while performing bitmap-bitmap ops
      • This is a deviation from CRoaring, and will need to be benchmarked further before this Draft PR is ready
      • Curious to hear what you think about this @lemire
    • In order to track bitmap cardinality the len field had to moved into Store::Bitmap
      • This is unfortunately a cross cutting change
    • Store was quite large (LoC) and had many responsibilities. The largest change in this draft is decomposing Store such hat it's field variants are two new container types: each responsible for maintaining their invariants and implementing ops
      • Bitmap8K keeps track of it's cardinality
      • SortedU16Vec maintains its sorting
      • Store now only delegates to these containers
      • My hope is that this will be useful when implementing run containers. 🤞
      • Unfortunately so much code was moved this PR is HUGE

    Out of scope

    • Inline ASM for Array-Bitset aggregates
    • Section 4 (explicit SIMD). As noted by the paper authors: The compiler does a decent job of autovectorization, though not as good as hand-tuned

    Notes

    • I attempted to emulate the inline ASM Array-Bitset aggregates by using a mix of unsafe ptr arithmetic and x86-64 intrinsics, hoping to compile to the same instructions. I was unable to get it under 13 instructions per iteration (compared to the papers 5). While it was an improvement, I abandoned the effort in favor of waiting for the asm! macro to stabilize. https://github.com/rust-lang/rust/issues/72016
    opened by saik0 14
  • Use Vec::retain instead of Vec::remove

    Use Vec::retain instead of Vec::remove

    There are many places where we use the Vec::remove method. It can bring a lot of performances issues as every time this function is called, the right part of the Vec (after the removed element) is shifted to the left by one, when a lot of element must be removed, we should always use Vec::retain which is more optimized for this in the sense that it will avoid calling a lot the copy function for when multiple elements must be removed at the same time.

    I found that doing this change in the Store::intersect_with method for the Array-Array operation gave me much better performances (7m intersected with 3m and intersected with 4m again), moving from 240ms to 55ms. I am sure that we can optimize the current implementation more.

    • [x] Stores intersections.
    • [x] Stores differences.
    • [ ] Stores symmetric differences.
    opened by Kerollmops 14
  • Maintainership

    Maintainership

    Given that I'm not using this library and don't really have time to review PRs for it, I would like to hand over maintainership to someone (or a group) that does and can care for it properly.

    @Kerollmops it seems to me that you might be interested?

    @lemire do you have any thoughts on moving this repo under the aegis of https://github.com/RoaringBitmap?

    opened by Nemo157 13
  • Modify the from_sorted_iter and append methods to return a Result

    Modify the from_sorted_iter and append methods to return a Result

    This PR fixes #103, which means that the from_sorted_iter and append methods of the RoaringBit/Treemap types now return a Result. The Err value corresponds to the number of values that we appended to the set until we found an erroneous value, it eases debugging.

    These two methods were containing bugs in the sense that they were silently discarding invalid values instead of panicking, this bug was introduced by #99 which introduced a non-panicking push method.

    A possible improvement could be to introduce a new error type, something like a NonSortedIntegers struct with a stopped_at/valid_until method that returns the u64 we currently return as the Err.

    @MarinPostma could you please take a look at this PR? If you got the time?

    breaking 
    opened by Kerollmops 12
  • Improve RoaringBitmap ref with ref set operations

    Improve RoaringBitmap ref with ref set operations

    This is a follow-up of the #97 PR. It improves the specific &RoaringBitmap with &RoaringBitmap union, intersection difference and, symmetric difference operations.

    The previous implementation was cloning the smallest or biggest bitmap depending on the operation and applying an _assign operation on it, this was not quite efficient. In this new implementation, we no more clone the containers Vec but rather collect the required Containers from the left-hand side or the right-hand side.

    The performance gain is around 20% 🎉 🚀

    Capture d’écran 2021-06-02 à 15 08 02 Capture d’écran 2021-06-02 à 15 08 13 enhancement 
    opened by Kerollmops 12
  • Optimization: investigate faster search in array

    Optimization: investigate faster search in array

    Some ideas for improving the binary search for array stores: https://dirtyhandscoding.wordpress.com/2017/08/25/performance-comparison-linear-search-vs-binary-search/

    IMO the two should be combined - binary search should be used to narrow down the range to the point where linear search is faster, then continue with that.

    opened by RaduBerinde 0
  • Implement Hash

    Implement Hash

    Would it be possible to implement the Hash trait for RoaringBitmap? It looks like this can be derived automatically (via ArrayStore/BitmapStore, Store, and Container). Alternatively, perhaps something like the hash function in pyroaring would be more suitable?

    opened by michaelmior 6
  • Remove the dependency to the `retain_mut` crate

    Remove the dependency to the `retain_mut` crate

    We are depending on the retain_mut crate which has been deprecated on May 20 2022 as the standard library integrated this function since 1.61. However, we don't want to remove this dependency right now as it would force us to bump the MSRV to 1.61, we were in 1.56.1.

    I would like to bump the MSRV to 1.61 but not right now, maybe in 6 months.

    /remind me that must remove the retain_mut dependency in 6 months

    breaking reminder 
    opened by Kerollmops 1
  • Precompute correct store type for operations

    Precompute correct store type for operations

    This optimization has been unblocked by #168

    For operations that could cause a conversion between store types we can first compute the cardinality of the op. The operation can then continue, using the correct store type, removing the need to ensure_correct_store.

    This will make the best-case slower, but the worst-case faster. Delta for average-case will be data dependent.

    opened by saik0 0
  • Heuristics for array-array ops with disjoint bounds

    Heuristics for array-array ops with disjoint bounds

    Motivation

    For the cost of a few cheap instructions and one branch: we are able to avoid the computation entirely.

    I was inspired by the "broad phase" of collision detection used in various physics simulations. First shapes that might collide are detected by checking if their axis-aligned bounded boxes intersect. Only then perform the more expensive computation to check if the shapes intersect. This proposed scheme is similar, but over a single dimension.

    Overview

    Let the bounds of a container be [min,max]

    The bounds of array containers are cheap to compute: They are the first and last elements, both are guaranteed to be in cache.

    Determining if the bounds are disjoint should be cheap to compute.

    Operators

    Intersection

    If the bounds are disjoint the intersection is empty. We can return an empty Vec, which is non-allocating. For assignment, clear the existing vec, which equivalent to setting it's len.

    Difference

    If the bounds are disjoint the difference is equal to the minuend. Either a clone, or a no-op in the case of assignment.

    Union

    If the bounds are disjoint the union can be computed simply by copying the entire contents of both operands. Of course: beginning with the array with the smaller first element.

    Symmetric difference

    If the bounds are disjoint it follows that the intersection is empty. This is equivalent to a union.

    Extra notes

    1. This may also apply to run containers, I am still working on digesting the runs paper
    2. This would be applicable to bitsets only if min and max did not require linear scans. It might be worth experimenting to evaluate if tracking min/max on-the-fly as with cardinality to enable this optimization would be a net-perfoamnce gain (but requiring an extra 4 bytes per bitmap). My speculative guess is no, but data sometimes surprises me.

    Perhaps @lemire can illuminate us as to whether or not this has been experimented with

    opened by saik0 0
Releases(v0.10.1)
Owner
Roaring bitmaps: A better compressed bitset
Roaring bitmaps are compressed bitmaps. They can be hundreds of times faster. (Picture credit: tambako)
Roaring bitmaps: A better compressed bitset
Doubly-Linked List Implementation in Rust

Doubly-Linked List Implementation in Rust Purely for educational and recreational purposes. For real world production please use std::collections::Lin

Tsoding 9 Jul 28, 2022
Hash Table Implementation in Rust

Hash Table Implementation in Rust Purely for educational and recreational purposes. For real world production please use std::collections::HashMap. Fo

Tsoding 11 Sep 6, 2022
Rust implementation of Adobe's stlab::forest data structure

skog Swedish for "forest". A pure rust implementation of Adobe's stlab::forest data structure. Introduction A "forest" is a tree-like data structure.

Julian 1 Dec 8, 2021
A lightweight Rust library for BitVector Rank&Select operations, coupled with a generic Sparse Array implementation

A lightweight Rust library for BitVector Rank&Select operations, coupled with a generic Sparse Array implementation

Alperen Keleş 5 Jun 20, 2022
enum-map enum-map xfix/enum-map [enum-map] — An optimized map implementation for enums using an array to store values.

enum-map A library providing enum map providing type safe enum array. It is implemented using regular Rust arrays, so using them is as fast as using r

Konrad Borowski 57 Dec 19, 2022
A hashmap implementation, which uses hashset, and keys are contained within values.

A hashmap implementation, which uses hashset, and keys are contained within values.

Piotr Mikulski 2 Nov 29, 2022
Rust-algorithm-club - Learn algorithms and data structures with Rust

Rust Algorithm Club ?? ?? This repo is under construction. Most materials are written in Chinese. Check it out here if you are able to read Chinese. W

Weihang Lo 360 Dec 28, 2022
Ternary search tree collection in rust

tst Ternary search tree collection in rust with similar API to std::collections as it possible. Ternary search tree is a type of trie (sometimes calle

Alexey Pervushin 20 Dec 7, 2022
Array helpers for Rust's Vector and String types

array_tool Array helpers for Rust. Some of the most common methods you would use on Arrays made available on Vectors. Polymorphic implementations for

Daniel P. Clark 69 Dec 9, 2022
Generic array types in Rust

generic-array This crate implements generic array types for Rust. Requires minumum Rust version of 1.36.0, or 1.41.0 for From<[T; N]> implementations

Bartłomiej Kamiński 325 Dec 25, 2022
A priority queue for Rust with efficient change function.

PriorityQueue This crate implements a Priority Queue with a function to change the priority of an object. Priority and items are stored in an IndexMap

null 139 Dec 30, 2022
K-dimensional tree in Rust for fast geospatial indexing and lookup

kdtree K-dimensional tree in Rust for fast geospatial indexing and nearest neighbors lookup Crate Documentation Usage Benchmark License Usage Add kdtr

Rui Hu 152 Jan 2, 2023
Rust Persistent Data Structures

Rust Persistent Data Structures Rust Persistent Data Structures provides fully persistent data structures with structural sharing. Setup To use rpds a

Diogo Sousa 883 Dec 31, 2022
Rust crate to extend io::Read & io::Write types with progress callbacks

progress-streams Rust crate to provide progress callbacks for types which implement io::Read or io::Write. Examples Reader extern crate progress_strea

Pop!_OS 19 Dec 3, 2022
Parameterized routing for generic resources in Rust

Usher Usher provides an easy way to construct parameterized routing trees in Rust. The nodes of these trees is naturally generic, allowing Usher to le

Isaac Whitfield 34 Oct 22, 2022
RiteLinked - LinkedHashMap & LinkedHashSet in Rust

RiteLinked -- HashMap-like containers that hold their key-value pairs in a user controllable order RiteLinked provides more up to date versions of Lin

Rite Database 52 Aug 19, 2022
Rust library for string parsing of basic data structures.

afmt Simple rust library for parsing basic data structures from strings. Usage You can specify string formats to any strucute, via the use of the fmt

Eduard 4 May 8, 2021
SegVec data structure for rust. Similar to Vec, but allocates memory in chunks of increasing size.

segvec This crate provides the SegVec data structure. It is similar to Vec, but allocates memory in chunks of increasing size, referred to as "segment

Jacob Ryan McCollum 30 Dec 16, 2022
Serde is a framework for serializing and deserializing Rust data structures efficiently and generically.

Serde is a framework for serializing and deserializing Rust data structures efficiently and generically.

null 6.5k Dec 31, 2022