Compact, efficient data structures in contiguous byte arrays

Overview

Sokoban

Compact, efficient data structures in contiguous byte arrays.

Benchmarks

Based on simple benchmarks, the naive performance of Sokoban data structures are on par with, but slightly slower than, the Rust Standard Library.

test bench_tests::bench_sokoban_avl_tree_insert_1000_u128             ... bench:     134,301 ns/iter (+/- 4,033)
test bench_tests::bench_sokoban_avl_tree_insert_1000_u128_stack       ... bench:     134,135 ns/iter (+/- 3,620)
test bench_tests::bench_sokoban_avl_tree_insert_20000_u128            ... bench:   2,744,853 ns/iter (+/- 158,364)
test bench_tests::bench_sokoban_avl_tree_remove_u128                  ... bench:     355,992 ns/iter (+/- 22,770)
test bench_tests::bench_sokoban_critbit_insert_1000_u128              ... bench:      90,306 ns/iter (+/- 590)
test bench_tests::bench_sokoban_critbit_insert_1000_u128_stack        ... bench:      76,819 ns/iter (+/- 661)
test bench_tests::bench_sokoban_critbit_insert_20000_u128             ... bench:   2,839,050 ns/iter (+/- 207,241)
test bench_tests::bench_sokoban_critbit_remove_1000_u128              ... bench:      97,366 ns/iter (+/- 6,124)
test bench_tests::bench_sokoban_hash_map_insert_1000_u128             ... bench:      46,828 ns/iter (+/- 1,928)
test bench_tests::bench_sokoban_hash_map_insert_1000_u128_stack       ... bench:      46,686 ns/iter (+/- 1,691)
test bench_tests::bench_sokoban_hash_map_insert_20000_u128            ... bench:   1,492,742 ns/iter (+/- 43,362)
test bench_tests::bench_sokoban_hash_map_remove_1000_u128             ... bench:      59,896 ns/iter (+/- 1,782)
test bench_tests::bench_sokoban_red_black_tree_insert_1000_u128       ... bench:      69,574 ns/iter (+/- 8,581)
test bench_tests::bench_sokoban_red_black_tree_insert_1000_u128_stack ... bench:      66,057 ns/iter (+/- 8,853)
test bench_tests::bench_sokoban_red_black_tree_insert_20000_u128      ... bench:   1,905,406 ns/iter (+/- 25,546)
test bench_tests::bench_sokoban_red_black_tree_remove_1000_u128       ... bench:     128,889 ns/iter (+/- 13,508)
test bench_tests::bench_std_btree_map_insert_1000_u128                ... bench:      51,353 ns/iter (+/- 10,240)
test bench_tests::bench_std_btree_map_insert_20000_u128               ... bench:   1,535,224 ns/iter (+/- 21,645)
test bench_tests::bench_std_btree_map_remove_1000_u128                ... bench:     131,879 ns/iter (+/- 19,325)
test bench_tests::bench_std_hash_map_insert_1000_u128                 ... bench:      38,775 ns/iter (+/- 237)
test bench_tests::bench_std_hash_map_insert_20000_u128                ... bench:     797,904 ns/iter (+/- 10,719)
test bench_tests::bench_std_hash_map_remove_1000_u128                 ... bench:      57,452 ns/iter (+/- 364)

Why compact data structures?

For most applications, there is no reason to look past the Rust standard library for data structures. However, when the application has limited or expensive memory and is bottlenecked by performance, programmers will often need to design custom solutions to address those constraints. These types of constraints come up quite frequently in high frequency trading, embedded systems, and blockchain development.

Enter Sokoban: A library of data structures designed to simplify this exact problem.

Generic Node Allocator

Almost all data structures can be represented by some sort of connected graph of nodes and edges. The node-allocator module implements a raw node allocation data structure for contiguous buffers. Each entry in the buffer must contain objects of the same underlying type. Each entry will also have a fixed number of registers that contain metadata relating to the current node. These registers will usually be interpreted as graph edges.

#[repr(C)]
#[derive(Copy, Clone)]
pub struct NodeAllocator<
    T: Default + Copy + Clone + Pod + Zeroable,
    const MAX_SIZE: usize,
    const NUM_REGISTERS: usize,
> {
    /// Size of the allocator
    pub size: u64,
    /// Furthest index of the allocator
    bump_index: u32,
    /// Buffer index of the first element in the free list
    free_list_head: u32,
    pub nodes: [Node<NUM_REGISTERS, T>; MAX_SIZE],
}

#[repr(C)]
#[derive(Copy, Clone)]
pub struct Node<T: Copy + Clone + Pod + Zeroable + Default, const NUM_REGISTERS: usize> {
    /// Arbitrary registers (generally used for pointers)
    /// Note: Register 0 is ALWAYS used for the free list
    registers: [u32; NUM_REGISTERS],
    value: T,
}

The templated NodeAllocator object is flexible primitive data structure for implementing more complex types. Here's how one might use the NodeAllocator to implement a doubly-linked list:

// Register aliases
pub const PREV: u32 = 0;
pub const NEXT: u32 = 1;

#[derive(Copy, Clone)]
pub struct DLL<T: Default + Copy + Clone + Pod + Zeroable, const MAX_SIZE: usize> {
    pub head: u32,
    pub tail: u32,
    allocator: NodeAllocator<T, MAX_SIZE, 2>,
}

The DLL is essentially just a node allocator with 2 registers per node. These registers represent the prev and next pointers of a DLL node. The logic for how edges are created and removed are specific to the type, but the allocator struct provides an interface for implementing arbitrary types that have this property (trees and graphs).

Comments
  • add generic heap implementation

    add generic heap implementation

    PR to add a generic heap data structure that fits into fixed buffers.

    Tried to make the tests comparable to data structures implemented with NodeAllocator.

    opened by fabioibanez 2
  • Add avl tree implementation

    Add avl tree implementation

    PR to add an AVL tree implementation using the NodeAllocator. AVL trees (theoretically) offer faster lookups as trees are more strictly balanced compared to red-black ones – although they can be slower at insert/remove operations given that they require more rotations to maintain the tree balanced.

    The tests included are the same ones as the red-black tree. Adding lookup tests for both avl and red-black trees can illustrate the difference in performance:

    for _ in 0..200 {
        for k in keys.iter() {
            assert!(tree.get(k) != None);
        }
    }
    
    opened by febo 2
  • Jxiao/fix rbt bugs and add test

    Jxiao/fix rbt bugs and add test

    Massive refactor:

    1. Fixed glaring bug in insert: rotated the child node instead of the parent
    2. Resolved a number of issues in remove:
    • Add special logic to handle the case where the pivot node is a leaf to allow the fix_remove function to properly rebalance (a)
    • Explicitly handle the case where the removed node is a leaf node (b)
    • Added comments
    1. Added specialized test cases for inserts and remove
    2. Added functions to pretty print the tree as well as verify that it follows the RBT property
    opened by jarry-xiao 0
  • Improvements to bench tests

    Improvements to bench tests

    This PR includes changes to the bench tests to:

    1. Change the label of bench_sokoban_avl_tree_remove_u128 to bench_sokoban_avl_tree_remove_1000_u128 to match the name of the other remove tests.
    2. Move the insert step outside of the benchmark on the remove tests, so the measure only includes the time to remove the items.
    3. Add lookup tests.
    test bench_tests::bench_sokoban_avl_tree_insert_1000_u128             ... bench:      95,849 ns/iter (+/- 3,441)
    test bench_tests::bench_sokoban_avl_tree_insert_1000_u128_stack       ... bench:     108,724 ns/iter (+/- 1,801)
    test bench_tests::bench_sokoban_avl_tree_insert_20000_u128            ... bench:   2,175,486 ns/iter (+/- 26,504)
    test bench_tests::bench_sokoban_avl_tree_lookup_20000_u128            ... bench:     946,922 ns/iter (+/- 22,870)
    test bench_tests::bench_sokoban_avl_tree_remove_1000_u128             ... bench:       1,874 ns/iter (+/- 139)
    test bench_tests::bench_sokoban_critbit_insert_1000_u128              ... bench:      65,728 ns/iter (+/- 1,217)
    test bench_tests::bench_sokoban_critbit_insert_1000_u128_stack        ... bench:      66,342 ns/iter (+/- 2,142)
    test bench_tests::bench_sokoban_critbit_insert_20000_u128             ... bench:   1,963,832 ns/iter (+/- 84,912)
    test bench_tests::bench_sokoban_critbit_lookup_20000_u128             ... bench:   1,652,995 ns/iter (+/- 28,951)
    test bench_tests::bench_sokoban_critbit_remove_1000_u128              ... bench:       2,110 ns/iter (+/- 22)
    test bench_tests::bench_sokoban_hash_map_insert_1000_u128             ... bench:      28,473 ns/iter (+/- 476)
    test bench_tests::bench_sokoban_hash_map_insert_1000_u128_stack       ... bench:      28,533 ns/iter (+/- 473)
    test bench_tests::bench_sokoban_hash_map_insert_20000_u128            ... bench:   1,033,449 ns/iter (+/- 10,907)
    test bench_tests::bench_sokoban_hash_map_lookup_20000_u128            ... bench:     825,924 ns/iter (+/- 35,330)
    test bench_tests::bench_sokoban_hash_map_remove_1000_u128             ... bench:      13,045 ns/iter (+/- 665)
    test bench_tests::bench_sokoban_red_black_tree_insert_1000_u128       ... bench:      49,026 ns/iter (+/- 549)
    test bench_tests::bench_sokoban_red_black_tree_insert_1000_u128_stack ... bench:      49,218 ns/iter (+/- 859)
    test bench_tests::bench_sokoban_red_black_tree_insert_20000_u128      ... bench:   1,450,428 ns/iter (+/- 24,906)
    test bench_tests::bench_sokoban_red_black_tree_lookup_20000_u128      ... bench:   1,043,305 ns/iter (+/- 141,596)
    test bench_tests::bench_sokoban_red_black_tree_remove_1000_u128       ... bench:       2,275 ns/iter (+/- 143)
    test bench_tests::bench_std_btree_map_insert_1000_u128                ... bench:      39,719 ns/iter (+/- 465)
    test bench_tests::bench_std_btree_map_insert_20000_u128               ... bench:     971,612 ns/iter (+/- 12,728)
    test bench_tests::bench_std_btree_map_lookup_20000_u128               ... bench:     987,608 ns/iter (+/- 22,162)
    test bench_tests::bench_std_btree_map_remove_1000_u128                ... bench:       1,201 ns/iter (+/- 10)
    test bench_tests::bench_std_hash_map_insert_1000_u128                 ... bench:      20,465 ns/iter (+/- 462)
    test bench_tests::bench_std_hash_map_insert_20000_u128                ... bench:     484,015 ns/iter (+/- 9,921)
    test bench_tests::bench_std_hash_map_lookup_20000_u128                ... bench:     315,190 ns/iter (+/- 12,477)
    test bench_tests::bench_std_hash_map_remove_1000_u128                 ... bench:      11,380 ns/iter (+/- 217)
    
    test result: ok. 0 passed; 0 failed; 0 ignored; 28 measured; 0 filtered out; finished in 61.36s
    
    opened by febo 0
  • add succinct data structures

    add succinct data structures

    It looks like using succinct data structures in this context might be a good fit. Especially, if you are looking for tries, filters with range queries functionality.

    opened by KirillLykov 1
Releases(v0.3.0)
svgcleaner could help you to clean up your SVG files from the unnecessary data.

svgcleaner svgcleaner helps you clean up your SVG files, keeping them free from unnecessary data. Table of Contents Purpose Goals Alternatives Charts

Evgeniy Reizner 1.5k Jan 9, 2023
Graph data structure library for Rust.

petgraph Graph data structure library. Supports Rust 1.41 and later. Please read the API documentation here Crate feature flags: graphmap (default) en

null 2k Jan 9, 2023
A fast, lightweight and extensible implementation of a graph data structure.

fast-graph A fast, lightweight and extensible implementation of a graph data structure. Note ⚠️ There will be some breaking changes in the coming 1-2

Henrik 34 Jul 6, 2024
concat-arrays: a rust macro for concatenating fixed-size arrays

concat-arrays: a rust macro for concatenating fixed-size arrays This crate defines concat_arrays!, a macro that allows you to concatenate arrays.

Andrew Cann 5 May 25, 2021
Rust lib for a Vec-like structure that can store different types of different sizes contiguous with each other in memory.

hvec In memory of Anna Harren, who coined the term turbofish - which you'll see a lot of if you use this crate. The main purpose of this crate is the

Vi 2 Oct 23, 2022
wait-free 4-level 64-bit pagetable for contiguous low-contention concurrent metadata

pagetable Wait-free 4-level page table that maps from a u64 key to an &AtomicU64 value. Page fan-out is 2^16. If a key doesn't exist, intermediate pag

Komora 21 Nov 20, 2022
This library provides a data view for reading and writing data in a byte array.

Docs This library provides a data view for reading and writing data in a byte array. This library requires feature(generic_const_exprs) to be enabled.

null 2 Nov 2, 2022
Fast, efficient, and robust memory reclamation for concurrent data structures

Seize Fast, efficient, and robust memory reclamation for concurrent data structures. Introduction Concurrent data structures are faced with the proble

Ibraheem Ahmed 240 Dec 23, 2022
An efficient method of heaplessly converting numbers into their string representations, storing the representation within a reusable byte array.

NumToA #![no_std] Compatible with Zero Heap Allocations The standard library provides a convenient method of converting numbers into strings, but thes

Michael Murphy 42 Sep 6, 2022
🐎 A fast implementation of the Aho-Corasick algorithm using the compact double-array data structure. (Python wrapper for daachorse)

python-daachorse daachorse is a fast implementation of the Aho-Corasick algorithm using the compact double-array data structure. This is a Python wrap

Koichi Akabe 11 Nov 30, 2022
A compact generational arena data structure for Rust.

Compact Generational Arena This crate provides Arena<T>, a contiguous growable container which assigns and returns IDs to values when they are added t

Chevy Ray Johnston 17 Dec 6, 2022
A Rust library for building modular, fast and compact indexes over genomic data

mazu A Rust library for building modular, fast and compact indexes over genomic data Mazu (媽祖)... revered as a tutelary deity of seafarers, including

COMBINE lab 6 Aug 15, 2023
stringsext - search for multi-byte encoded strings in binary data

title stringsext - search for multi-byte encoded strings in binary data stringsext is a Unicode enhancement of the GNU strings tool with additional fu

Jens Getreu 89 Dec 14, 2022
Fast suffix arrays for Rust (with Unicode support).

suffix Fast linear time & space suffix arrays for Rust. Supports Unicode! Dual-licensed under MIT or the UNLICENSE. Documentation https://docs.rs/suff

Andrew Gallant 207 Dec 26, 2022
This crate allows writing a struct in Rust and have it derive a struct of arrays layed out in memory according to the arrow format.

Arrow2-derive - derive for Arrow2 This crate allows writing a struct in Rust and have it derive a struct of arrays layed out in memory according to th

Jorge Leitao 29 Dec 27, 2022
Vim plugin to quickly parse strings into arrays.

butcher Vim plugin to quickly parse strings into arrays. It is painful to write arrays in any programming language, so butcher makes it easy for you.

null 5 Dec 31, 2021
Rust implementations of Fast Fourier Transform convolution and correlation for n-dimensional arrays

fftconvolve Rust implementations of Fast Fourier Transform convolution and correlation for n-dimensional arrays Examples 1-dimensional use fftconvolve

Rhys Newell 33 Jan 5, 2023
Serde support for n-dimensional arrays from self-describing formats

serde-ndim Overview This crate provides a way to serialize and deserialize arrays of arbitrary dimensionality from self-described formats such as JSON

Ingvar Stepanyan 5 Apr 3, 2023
Extension trait to chunk iterators into const-length arrays.

const-chunks This crate provides an extension trait that lets you chunk iterators into constant-length arrays using const generics. See the docs for m

Louis Gariépy 6 Jun 12, 2023