VLists in Rust

Overview

Running

WORK IN PROGRESS!

(0. curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh)
1. cargo test

Motivation

In the urbit memory model every cell is stored as, effectively,

struct _u3a_boxed_cell {
    c3_w  siz_w;  // size of this box (6)
    c3_w  use_w;  // reference count
    struct u3a_cell {
        c3_w    mug_w;  // hash
        u3_noun hed;    // car
        u3_noun tal;    // cdr
    } cel_u;
    c3_w  siz_w;  // size of this box (still 6)
};

Assuming cells are something like 80% of the loom, this means that at any given point a quarter of non-free memory consists of the bytes 0x0000_0006 repeated over and over and over, which has at times struck me as… inelegant.

(use_w is almost always 1, and mug_w is frequently uninitialized at 0, but there are certainly merits in including them, if perhaps not once per byte of a "tape" linked list or per usually-nybble-sized nock formula. As for the cdr - we'll get to that.)

One simple way to reclaim those siz bytes is to store them in the pointer: e.g. devote alternating 4KB pages to cells and noncells, such that a single bit test can discriminate them. (You can of course still have indirect atoms and other structures larger than 4KB; they just have to begin on an odd-numbered page.) This produces a physical memory (cache etc.) savings of 25%, and about breaks even on virtual memory (better the closer you are to 50% of your boxes being cells), at the cost of some fragmentation (and 25% more address use if you have only cells or only noncells).
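The bit test itself is tiny - a minimal sketch, assuming 4KB pages where even-numbered pages hold cells (the page size, parity convention, and addresses here are illustrative, not this repository's layout):

```rust
// Page-parity pointer tagging: the page number's low bit encodes the
// type, so no per-box header word is needed.
const PAGE_BITS: u32 = 12; // hypothetical 4 KiB pages

fn is_cell(ptr: usize) -> bool {
    // Even-numbered page => cell, odd-numbered page => noncell.
    (ptr >> PAGE_BITS) & 1 == 0
}

fn main() {
    let cell_ptr = 0x2008; // offset 8 into page 2 (even)
    let atom_ptr = 0x3010; // offset 16 into page 3 (odd)
    assert!(is_cell(cell_ptr));
    assert!(!is_cell(atom_ptr));
}
```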

A more flexible, but slower, scheme is to use a bitvector of type tags - even at full-page granularity, 2^19 bits is only 64KB, fitting easily into L2 cache (at the cost of whatever other things you might on the margin want in L2 cache).
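As a sketch of that bitvector (the 2^19-page / 4KB-page sizing follows the arithmetic above; the API is hypothetical):

```rust
// One tag bit per 4 KiB page: 2^19 pages cover a 2 GiB loom, and the
// tags pack into 2^13 u64 words = 64 KiB.
struct PageTags {
    bits: Vec<u64>,
}

impl PageTags {
    fn new() -> Self {
        PageTags { bits: vec![0u64; 1 << 13] }
    }

    // Mark a page as holding cells.
    fn set_cell_page(&mut self, page: usize) {
        self.bits[page >> 6] |= 1 << (page & 63);
    }

    // Classify a pointer by looking up its page's tag bit.
    fn is_cell(&self, ptr: usize) -> bool {
        let page = ptr >> 12;
        (self.bits[page >> 6] >> (page & 63)) & 1 == 1
    }
}

fn main() {
    let mut tags = PageTags::new();
    tags.set_cell_page(5);
    assert!(tags.is_cell((5 << 12) | 0x40)); // anywhere in page 5 is a cell
    assert!(!tags.is_cell(6 << 12));
}
```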

The pesky cdr field

Of course we are not constrained to only the type tags currently present in u3 (cell vs. indirect atom). An ancient idea in the Lisp world is CDR coding, where you store [A B C D] as a linear array, despite its logical structure of [A [B [C D]]].

One can imagine a third "long cell" type, struct u3a_quad { u3_noun car; u3_noun cadr; u3_noun caddr; u3_noun cdddr; }, for storing tuples of 3 or 4 elements (aligned so that a pointer to its cadr is distinguishable from a pointer to its car, for the 3-element case). This introduces overhead when only 3 elements are present, though no more than if you were to store them as a pair linking to a second pair - but what is "overhead"? The use and mug have been swept under the rug.
A minimal answer is to add a header of use_dot; use_cdr; use_cddr; mug_dot; mug_cdr; mug_cddr, which rolls the space savings back to only the cdr pointers.
The three use counts can also be consolidated into a total "references anywhere inside this structure" count, at the cost of occasionally keeping alive whole quad cells whose only live data is a [caddr cdddr] at the end. As for mug, caching only mug_cddr and forbidding placing quad-cell dot/cdr pointers in another quad cell's car or cadr positions would maintain the constant-time bound. (You can, after all, always allocate a regular cell instead.)
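Putting the consolidated metadata together, a quad cell might look like this (field names and 32-bit nouns are illustrative; this is a sketch of the scheme above, not code from this repository):

```rust
// A quad cell with one total use count and only the innermost mug cached.
#[repr(C)]
struct QuadCell {
    use_total: u32, // references anywhere inside this structure
    mug_cddr: u32,  // only the [caddr cdddr] pair's hash is cached
    car: u32,
    cadr: u32,
    caddr: u32,
    cdddr: u32,
}

fn main() {
    // A 4-tuple fits in 6 words, versus 12 words for the two boxed
    // cells the canonical layout would use.
    assert_eq!(std::mem::size_of::<QuadCell>(), 24);
}
```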

Vlists

This repository implements an elaboration on the "quad cell" scheme, the VList data structure. The core conceit is to store a list ~[1 2 3 4 5 6 7 8 9] as an exponentially flatter sequence

a: {(-:! +<:!) +>-:1 +>+<:2 +>+>-:3 +>+>+<:4 +>+>+>-:5 +>+>+>+<:&b}
b: {-:6 +<:7 +>-:8 +>+:&c}
c: {-:9 +:~}

That is, whenever you find yourself consing onto the middle of a list segment, insert your car as the previous element (if unoccupied) and return a pointer to that; if you cons onto an entire list segment, allocate a new segment twice as big - about the current length of the entire list. This uses at most twice as many words as list elements (compare the original linked-list scheme, which always uses twice as many words), averaging 1.5x. Placing a bound of 1 page on segment size restricts the maximum absolute overhead to that one page. And now whenever you have a list with a thousand elements, they're mostly sequential in memory - operations like lent and snag admit log N jets, and unoptimized cdr in general-purpose code leaves the cache-line prefetcher much happier.
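The cons rule can be sketched as follows - a deliberately simplified model (head-only consing, owned segments, no mid-segment sharing) rather than this repository's actual representation:

```rust
// Segments fill back to front; a full segment spills into a new one
// twice its size, linked back to the old tail.
struct Segment {
    data: Vec<u32>,             // element slots
    first_used: usize,          // index of the frontmost occupied slot
    next: Option<Box<Segment>>, // logical cdr of the last slot
}

fn cons(mut seg: Box<Segment>, car: u32) -> Box<Segment> {
    if seg.first_used > 0 {
        // The previous slot is unoccupied: claim it.
        seg.first_used -= 1;
        let i = seg.first_used;
        seg.data[i] = car;
        seg
    } else {
        // Segment full: allocate one twice as big - about the current
        // length of the whole list - and link back to the old segment.
        let n = seg.data.len() * 2;
        let mut big = Box::new(Segment {
            data: vec![0; n],
            first_used: n - 1,
            next: Some(seg),
        });
        big.data[n - 1] = car;
        big
    }
}

fn main() {
    // Build ~[1 2 3 4 5 6 7 8 9] by consing 9, 8, ..., 1 onto nil.
    let mut seg = Box::new(Segment { data: vec![0; 1], first_used: 1, next: None });
    for x in (1..=9).rev() {
        seg = cons(seg, x);
    }
    // Segments of 8, 4, 2, 1 slots: 15 words for 9 elements, under 2x.
    assert_eq!(seg.data.len(), 8);
    assert_eq!(seg.data[seg.first_used], 1);
}
```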

Metadata

Reference counts are kept in a per-page parallel array of u16 (overflow scheme TBD for adversarial inputs), figuring that the locality savings for the simple case of read-only access to outer-road data (e.g. library code) outweigh the cache misses on updating them.
Further work could include implementing "immutable bean" borrow inference in the bytecode compiler to further reduce reference-count thrashing.
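A minimal sketch of the parallel-array layout (slot count, field names, and the saturating stand-in for the unspecified overflow scheme are all assumptions):

```rust
// Refcounts live beside the data rather than inline with it, so
// read-only traversals of the data never pull rc words into cache.
const SLOTS: usize = 128;

struct Page {
    #[allow(dead_code)]
    data: [u32; SLOTS], // noun payload
    rc: [u16; SLOTS],   // parallel refcounts, cold on read-only paths
}

impl Page {
    fn new() -> Self {
        Page { data: [0; SLOTS], rc: [0; SLOTS] }
    }

    fn gain(&mut self, slot: usize) {
        // Saturating add stands in for the TBD overflow scheme.
        self.rc[slot] = self.rc[slot].saturating_add(1);
    }

    // Returns true when the slot's count drops to zero (now free).
    fn lose(&mut self, slot: usize) -> bool {
        self.rc[slot] -= 1;
        self.rc[slot] == 0
    }
}

fn main() {
    let mut page = Page::new();
    page.gain(3);
    page.gain(3);
    assert!(!page.lose(3)); // still referenced
    assert!(page.lose(3));  // now free
}
```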

mug is not presently implemented, but could work in a manner similar to rc.

History

Originally implemented as a gist, which may be easier to follow.

Bytes?

In the canonical urbit allocator, (trip 'abcdefghijklmnop') producing "abcdefghijklmnop" takes a 16-byte indirect atom value (+16 bytes of overhead) and converts it into a 384-byte linked-list value, either a 12x or 24x increase. Under the proposed scheme (assuming small-but-indirect atoms are stuffed into the corresponding power-of-two segment slabs and pointer-tagged), the 16-byte value (+2-byte reference count) will as a linked list occupy ~64 bytes - we've gotten rid of the cdr, but the car still stores each character as a 31-bit direct atom.
You could instead imagine yet another pointer tag for "byte vector, but pretend it's a linked list" - car is "read the character pointed to as a direct atom", cdr is "increment the pointer, and if it runs off the end replace with ~", is_atom is false and successor is crash; technically that's all you need for nock. This suggests a broader field of data-encoding jets, to match nock execution jets - but this "readme" is getting long as is :)
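The fake-list semantics can be sketched in a few lines (names and the Option-for-~ representation are hypothetical):

```rust
// "Byte vector pretending to be a linked list": car reads one byte as
// a small direct atom, cdr bumps the position, and the list ends in ~
// (here None) once the bytes run out.
fn tape_car(bytes: &[u8], pos: usize) -> u32 {
    bytes[pos] as u32
}

fn tape_cdr(bytes: &[u8], pos: usize) -> Option<usize> {
    // None plays the role of the ~ terminator.
    if pos + 1 < bytes.len() { Some(pos + 1) } else { None }
}

fn main() {
    // Walk the "list" exactly as a tape consumer would.
    let bytes = b"abc";
    let mut out = String::new();
    let mut pos = Some(0);
    while let Some(p) = pos {
        out.push(tape_car(bytes, p) as u8 as char);
        pos = tape_cdr(bytes, p);
    }
    assert_eq!(out, "abc");
}
```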

Comments
  • Simplify grab_page()

    Create a new Iterator-implementing type to iterate over the contents of a page, and have grab_page() return that, rather than doing a bunch of messy type casts in grab_page() itself.

    opened by ronreg-ribdev 4
  • Build the obvious version of unify-as-isEqual

    • [ ] Write the unify function without any performance improvements
    • [ ] cons 1 2 = cons 1 2
    • [ ] list 1 thru 10 == list 1 thru 10
    • [ ] list of 10 fresh copies of 00 == ... and is not = list of (something else)

    Write test cases in nock, or call directly? In nock has more of a sense of progress, but... requires more working things. Don't do that yet.

    Example test case

        #[test]
        fn decrement(){
            use Op::*;
            let store = heap();
            let prog = program(store,&[
                ...stuff
            ]);                        
            assert_eq!(?, nybble(...));
        }                              
    
    // TODO detect when things are of the same optimized representation type and detect when it has an equality special case
    // TODO add bit streams - and then need equality for them
    // TODO heap: &mut Store, actually unify
    fn unify(heap: &Store, a: Elem, b: Elem) -> bool {
        if a == b { return true }
        use Noun::*;
        match (a.into(),b.into()) {
            //TODO test any of this
            (Ind(a), Ind(b)) => {
                if len(a) != len(b) { return false }
                for (ax,bx) in iter::zip(a.iter(), b.iter()) {
                    if ax != bx { return false }
                }
                return true
            },
            (Cel(a), Cel(b)) => {
                if len(a) != len(b) { return false }
                for (ax,bx) in iter::zip(a.iter(), b.iter()) {
                if !unify(heap, ax, bx) { return false }
                }
                return true
            },
            _ => false
        }
    }
    
    opened by compwron 0
  • implement minimal hint

    implement minimal hint that just prints text to console

                    Op::Hint => {
                        do_cons();
                        println!("hint: {:?}", Noun::from(subj));
                        //TODO implement hints
                        subj = stack.pop().expect("Underflow: no stack after hint");
                    }
    
    opened by compwron 0
  • fix lose: be able to run code without leaking memory

    Doing this may involve changing ... data structures... in order to be able to free...

    Be able to run code without leaking memory - by implementing freeing of memory and calling it

    • [ ] Make this method compile
    • [ ] Add a test to the main tests that freeing list of integers works fine before changing this
    • [ ] test: after running some nock that returns 0 or something, there's not more memory in use than before you ran the nock. In order to test this, might need to add another function to detect how much total memory is being used

    There will need to be code in nock.rs that may just be an exact copy of this? That will be the recursive version of the lose function... maybe using public interfaces for some % ...

    The commented out code frees lists of integers. We are trying to free lists of pointers, which involves some kind of architectural decisions... reckonings... contemplation...

        // fn lose(&mut self, mut idx: Index) {
        //     loop {
        //         let Index(page, list, _item) = idx;
        //         self.rc[page as usize][list as usize] -= 1;
        //         if self.rc[page as usize][list as usize] > 0 { break; }
        //         //TODO fix used
        //         if let Some(Ok(next)) = self.cdr(idx) { idx = next; } else { break; }
        //         //TODO free pages ever?
        //     }
        // }
    
    +    #[test]
    +    fn reclaims_memory(){
    +        use Op::*;
    +        let store = heap();
    +        let original_used = store.used_bytes();
    +        assert_eq!(0, nybble(store, 1, &[Dup, Cons, Lit(0)]));
    +        assert_eq!(original_used, store.used_bytes());
    +    }
    
    opened by compwron 0
  • [blocked] test overlarge data

    This is blocked on: right now the pointer is statically sliced into index and subindex of a page. Page size is going to have to be page size * the smallest element * how many elements you can have. Pages should be bigger in theory, unless the linker complains... not quite 4GB - "for one thing, you have to put the kernel somewhere..." xD The pointer is statically sliced into an 8-bit index and an 8-bit subindex ... the things it's being simplified for: 1. if pages are small, you can allocate a few things and have it overflow into the next page and test for that 2. the debug representation doesn't have a giant pile of zeros 3. you don't need to allocate too much to test what happens when you run out (or fake it)

    One of the blockers is: things are small to be easier to work with and debug. You can just leave the larger chunk sizes commented out in the enum of all the different page types... there are 4 being used, lots that aren't...

    //TODO 4096 maybe  L: this should be a test case - allocate a large thing that doesn't fit
    const PAGE_SIZE: usize = 128;
    
    blocked 
    opened by compwron 0
Owner: Anton Dyudin