Succinct data structures in Rust

Overview

sucds: Succinct data structures in Rust

Documentation Crates.io License: MIT

sucds contains some succinct data structures written in Rust.

Data structures

So far, the following data structures are implemented. All of them are yet another Rust ports of implementations of C++ Succinct library by Ottaviano.

  • BitVector
    • Bit vector supporting some utilities such as update, chunking, predecessor, and successor.
  • RsBitVector
    • Rank/select data structure over bit vector through Vigna's rank9 and hinted selection technique.
  • EliasFano
    • Compressed monotone sequence through Elias-Fano encoding.
  • DArray
    • Constant-time select data structure over integer set through dense array technique by Okanohara and Sadakane.

Usage

To use sucds, depend on it in your Cargo manifest:

# Cargo.toml

[dependencies]
sucds = "0.1"

Licensing

This library is free software provided under MIT.

Comments
  • high peak memory consumption of WaveletMatrix::from_ints

    high peak memory consumption of WaveletMatrix::from_ints

    Building a 205 MB wavelet matrix using WaveletMatrix::from_ints temporarily consumes 3.9 GB memory, which may be too much for small VMs. Is there a way to reduce memory consumption while constructing a wavelet matrix?

    Screenshot from 2022-11-21 10-02-05

    opened by KonradHoeffner 7
  • Reduce memory consumption in wavelet matrix builder

    Reduce memory consumption in wavelet matrix builder

    This PR uses CompactVector instead of Vec for buffers in WaveletMatrixBuilder, allowing us to specify the minimum number of bits to store an input integer for working memory efficiency.

    I measured the peak resident set size during constructing WaveletMatrix from $2^{26}$ random bytes (code is here).

    # v0.5.0
    $ /usr/bin/time ../target/release/wmb
    [bench/src/wmb.rs:16] wm.len() = 67108864
    4.09user 0.41system 0:04.52elapsed 99%CPU (0avgtext+0avgdata 1152920maxresident)k
    0inputs+0outputs (0major+35199minor)pagefaults 0swaps
    
    # New version (will be v0.6.0)
    $ /usr/bin/time ../target/release/wmb
    [bench/src/wmb.rs:17] wm.len() = 67108864
    5.83user 0.10system 0:05.93elapsed 99%CPU (0avgtext+0avgdata 244280maxresident)k
    0inputs+0outputs (0major+68665minor)pagefaults 0swaps
    

    The working memory was reduced from 1.1 GB to 0.24 GB, although the speed was slower.

    In this PR, I removed the function WaveletMatrix::from_ints() because this can lead to a memory inefficient construction way.

    opened by kampersanda 0
  • Simplify API

    Simplify API

    Current APIs contain some redundant functions. For example, EliasFano::from_bits and EliasFano::from_bitvec can be generalized.

    For maintenance cost, such functions should be unified.

    opened by kampersanda 0
  • Remove reference from input iterator

    Remove reference from input iterator

    The current version implements input iterators taking references such as EliasFano::from_bits.

    But, for primitive types, call-by-value is enough and call-by-reference would not be useful.

    opened by kampersanda 0
  • Remove serde

    Remove serde

    In the previous version, the library implements serialization/deserialization with Serde.

    This PR removed the dependency and added some alternative functions.

    opened by kampersanda 0
  • Use unsafe for faster operations

    Use unsafe for faster operations

    Basically, sucds does not recommend to use unsafe approaches such as get_unchecked because deserialization from any data is supported, making it easy to build data structures that contain invalid values.

    However, unsafe approaches can be safely used to some parts by prechecking.

    opened by kampersanda 0
  • Implement rank/select bit vector trait

    Implement rank/select bit vector trait

    Related to #13

    There are several implementations of rank/select bit vector data structure, and some of them have already been / will be implemented in this library.

    Since there are data structures, like wavelet matrix, uses rank/select bit vector as a building block, it is better that implementing trait for rank/select bit vector as below and let user be able to choose concrete implementations:

    trait RsBitVector {
        fn rank0(&self, pos: usize) -> usize;
        fn rank1(&self, pos: usize) -> usize;
        fn select0(&self, k: usize) -> usize;
        fn select1(&self, k: usize) -> usize;
    ...
    }
    

    @kampersanda How do you think about above? If it is roughly OK, I will create PR to discuss this more concretely.

    opened by hirosassa 3
  • Implement Balanced Parentheses

    Implement Balanced Parentheses

    SSIA

    This would be accomplished by porting https://github.com/ot/succinct/blob/master/bp_vector.hpp, although it is necessary to understand the data structure first.

    opened by kampersanda 0
Releases(v0.6.0)
Owner
Shunsuke Kanda
TRIE Pokémon
Shunsuke Kanda
Rust Persistent Data Structures

Rust Persistent Data Structures Rust Persistent Data Structures provides fully persistent data structures with structural sharing. Setup To use rpds a

Diogo Sousa 883 Dec 31, 2022
A proof of concept implementation of cyclic data structures in stable, safe, Rust.

A proof of concept implementation of cyclic data structures in stable, safe, Rust. This demonstrates the combined power of the static-rc crate and the

null 157 Dec 28, 2022
Rust library for string parsing of basic data structures.

afmt Simple rust library for parsing basic data structures from strings. Usage You can specify string formats to any strucute, via the use of the fmt

Eduard 4 May 8, 2021
Serde is a framework for serializing and deserializing Rust data structures efficiently and generically.

Serde is a framework for serializing and deserializing Rust data structures efficiently and generically.

null 6.5k Dec 31, 2022
Algorithms and Data Structures of all kinds written in Rust.

Classic Algorithms in Rust This repo contains the implementation of various classic algorithms for educational purposes in Rust. Right now, it is in i

Alexander González 49 Dec 14, 2022
Library containing various Data Structures implemented using Rust.

rust-data-structures Library containing various Data Structures implemented using Rust. Running You can test the library by running cargo test, an exa

c1m50c 1 Jan 6, 2022
A library for comparing data structures in Rust, oriented toward testing

Delta: Structural differencing in Rust The delta crate defines the trait Delta, along with a derive macro for auto-generating instances of this trait

John Wiegley 19 Oct 7, 2022
A library for comparing data structures in Rust, oriented toward testing

The comparable crate defines the trait [Comparable], along with a derive macro for auto-generating instances of this trait for most data types. Primar

John Wiegley 19 Oct 7, 2022
Collection of Data Structures in Rust

cds - Collection of Data Structures !!! UNDER CONSTRUCTION !!! The version v0.0.1 is a crates.io placeholder. License Licensed under either of Apache

Rafael Buchbinder 2 May 23, 2022
Coding-challenge - Algorithms and Data-structures, problems and solutions in Rust language using cargo-workspaces

Coding Challenge LeetCode/Hackerrank e.t.c Using this as an opportunity to improve my knowledge of rust lang If you found this repo useful to you, add

Tolumide Shopein 17 Apr 24, 2022
Common data structures and algorithms in Rust

Contest Algorithms in Rust A collection of classic data structures and algorithms, emphasizing usability, beauty and clarity over full generality. As

Aram Ebtekar 3.3k Dec 27, 2022
Rust data structures and client for the PubChem REST API

pubchem.rs Rust data structures and client for the PubChem REST API. ?? Usage ?? Compound Create a Compound to query the PubChem API for a single comp

Martin Larralde 2 Jan 18, 2022
Dade is data definition for Rust structures.

dade dade is data definition for Rust structures. For the easy handle of data, the following will support it. Data validation. Data schema conforms Js

odd 3 May 1, 2022
NixEl is a Rust library that turns Nix code into a variety of correct, typed, memory-safe data-structures

?? NixEL Lexer, Parser, Abstract Syntax Tree and Concrete Syntax Tree for the Nix Expressions Language. NixEl is a Rust library that turns Nix code in

Kevin Amado 56 Dec 29, 2022
rust_aads - Rust Algorithms And Data Structures

rust_aads - Rust Algorithms And Data Structures rust_aads is an open repository with algorithms and data structures, used in computer science and comp

stepa 2 Dec 15, 2022
This repository aims to organize codes related to data structures in Rust. 🦀

Rust Data Structure A project with the objective of introducing the main concepts about data structure using Rust! Explore the docs and learn Rust » R

guto 12 Apr 3, 2023
Obake is a procedural macro for declaring and maintaining versioned data-structures.

Obake is a procedural macro for declaring and maintaining versioned data-structures. The name 'obake' is taken from the Japanese 'お化け (おばけ)', a class of supernatural beings in Japanese folklore that shapeshift.

Nathan Corbyn 174 Dec 26, 2022
Garbage Collector(Hyaline- Safe Memory Reclaimation) for lock free data structures

Hyaline-SMR This crate provides garbage collection using hyaline algorithm for building concurrent data structures. When a thread removes an object fr

Abishek 2 Dec 21, 2022
Library containing implementations of various sequential data-structures.

Library containing implementations of various sequential data-structures.

c1m50c 0 May 15, 2022