Exploration of using Storage instead of Allocator to parameterize collections in Rust

Overview

storage-poc aims at exploring the usage of custom Storages, rather than custom Allocators.

Goals

This is a Proof-of-Concept aiming at:

  • Demonstrating the technical feasibility.
  • Showcasing the usage of storages for various collections, with different requirements.
  • Sketching out a potential API.

This experiment does not intend to provide a production-ready solution.

Why not Allocators?

The currently proposed API for allocators is centered around NonNull, that is a pointer.

Creating a Box based on a custom allocator means storing the pointer within the Box. If the allocator were to ever move, the pointer within the Box would become dangling.

This limitation prevents storing the content of the Box, and other collections, inline. And there are many benefits to doing so:

  • Imagine returning a Box<dyn Future<_>, InlineStorage>: abstract type, dynamically dispatched, without memory allocation.
  • Imagine storing Box<dyn FnOnce(), InlineStorage>: you can build a task queue, with no memory allocation in sight.
  • Imagine creating a const LOOKUP: BTreeMap<K, V, InlineStorage> = /**/;: no memory allocation, hence const able.

Abstracting away pointers may even allow storing those collections in shared memory.

Interested? Continue reading!

How to navigate?

The repository contains 3 parts:

  • traits.rs sketches out the API of the necessary storage traits.
  • collections sketches out how to adapt a few known collections with disparate needs to demonstrate the usage of the traits, in practice.
  • The other modules are implementations of the traits:
    • allocator.rs implementations simply adapt an Allocator.
    • inline.rs implementations store everything inline.
    • alternative.rs combines 2 storages, either of which is active at the moment. It is used to implement the small.rs family of storages.
    • fallback.rs combines 2 storages, using both simultaneously, with a preference for the first -- which should be cheaper.

What is the API?

The traits.rs defines an API around 2 axes:

  • Single vs Multi: whether the storage allocates a single item at a time, or allows juggling multiple items.
  • Element vs Range: whether the storage allocates a single element at a time, or allocates a range of elements.

The API was created to be higher-level than the Allocator API. Being higher-level makes it easier to use, and allows optimizations in the implementation.

The SingleElementStorage is (stripped down):

pub trait SingleElementStorage : ElementStorage {
    fn create<T: Pointee>(&mut self, value: T) -> Result<Self::Handle<T>, T>;

    fn allocate<T: ?Sized + Pointee>(&mut self, meta: T::MetaData) -> Result<Self::Handle<T>, AllocError>;
}

pub trait ElementStorage {
    type Handle<T: ?Sized + Pointee> : Clone + Copy;

    unsafe fn destroy<T: ?Sized + Pointee>(&mut self, handle: Self::Handle<T>);

    unsafe fn deallocate<T: ?Sized + Pointee>(&mut self, handle: Self::Handle<T>);

    unsafe fn get<T: ?Sized + Pointee>(&self, handle: Self::Handle<T>) -> NonNull<T>;
}

The Handle<T> is the magic allowing inline storage: they are something to be converted to an ephemeral pointer via storage.get(handle), but not a pointer themselves, and therefore supports relocations of the underlying storage.

From there, the lifecyle is rather obvious: create, get a few times, and finally destroy.

Note: the Pointee trait comes from RFC2580, as implemented in rfc2580, it's essentially a trait to split a pointer into pointer-to-data and meta-data and put it back together later.

And the SingleRangeStorage is (stripped down):

pub trait SingleRangeStorage : RangeStorage {
    fn allocate<T>(&mut self, capacity: Self::Capacity) -> Result<Self::Handle<T>, AllocError>;
}

pub trait RangeStorage {
    type Handle<T> : Clone + Copy;

    type Capacity : Capacity;

    fn maximum_capacity<T>(&self) -> Self::Capacity;

    unsafe fn deallocate<T>(&mut self, handle: Self::Handle<T>);

    unsafe fn get<T>(&self, handle: Self::Handle<T>) -> NonNull<[MaybeUninit<T>]>;

    unsafe fn try_grow<T>(&mut self, _handle: Self::Handle<T>, _new_capacity: Self::Capacity) -> Result<Self::Handle<T>, AllocError> {
        Err(AllocError)
    }

    unsafe fn try_shrink<T>(&mut self, _handle: Self::Handle<T>, _new_capacity: Self::Capacity) -> Result<Self::Handle<T>, AllocError> {
        Err(AllocError)
    }
}

Once again a Handle<T> to allow inline storage, and this time a Capacity to allow using a smaller type than usize when possible.

The trait is lower-level, as it makes no assumption about which portions of the range are initialized or not, leaving it up to the caller. This is necessary since Vec and VecDeque have different invariants here.

Can we replace the std collections tomorrow?

There is one blocker currently: the custom Box implementation is not coercible.

CoerceUnsized and Unsize are locked-down, and the custom implementation of RFC2580 therefore cannot implement CoerceUnsized to be able to unsize SizedMetadata<[u8; 3]> into SliceMetadata<[u8]>. This in turn prevents Box from being coercible. Oops.

For the purpose of this Proof-of-Concept, a stand-in coerce method has been added to the ElementStorage trait where the implementation is essentially to go from handle to pointer, let the compiler coerce it, and then go back to handle from there. Then a similar coerce method is implemented on Box, and things just work.

Then, there is the issue that the implementation relies on Generic Associated Types, which are quite unstable, though good enough that the code compiles and runs without issues on nightly... as long as one doesn't try to coerce-unsize the Box. It may be prudent to wait for the go-ahead of the compiler developers before switching any collection.

That's all folks!

And thanks for reading.

You might also like...
An automated CLI tool that optimizes gas usage in Solidity smart contracts, focusing on storage and function call efficiency.

Solidity-Gas-Optimizoor An high performance automated CLI tool that optimizes gas usage in Solidity smart contracts, focusing on storage and function

Safe, fast, small crypto using Rust

THE SOFTWARE IS PROVIDED "AS IS" AND BRIAN SMITH AND THE AUTHORS DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES

X25519 elliptic curve Diffie-Hellman key exchange in pure-Rust, using curve25519-dalek.
X25519 elliptic curve Diffie-Hellman key exchange in pure-Rust, using curve25519-dalek.

x25519-dalek A pure-Rust implementation of x25519 elliptic curve Diffie-Hellman key exchange, with curve operations provided by curve25519-dalek. This

QuickDash A modern alternative to QuickSFV using Rust.
QuickDash A modern alternative to QuickSFV using Rust.

QuickDash A modern alternative to QuickSFV using Rust. It's supports BLAKE3 and BLAKE2 hashes, CRC32, MD5, SHA1, SHA2, SHA3, xxHash The docs for user

A prototype project integrating jni rust into Kotlin and using protobuf to make them work together

KotlinRustProto a prototype project integrating jni rust into Kotlin and using protobuf to make them work together How to start add a RPC call in Droi

Port path module (and tests) of nodejs to rust using the same algorithms.

rusty_nodejs_path Port path module (and tests) of nodejs to rust using the same algorithms. crates.io Documents Progress posix path.basename(path[, ex

 A pure-Rust implementation of Bulletproofs using Ristretto.
A pure-Rust implementation of Bulletproofs using Ristretto.

Bulletproofs The fastest Bulletproofs implementation ever, featuring single and aggregated range proofs, strongly-typed multiparty computation, and a

Implementation of Proof of Existence consensus using Substrate Framework, Frame, Pallets, RUST

Substrate Node Template A fresh FRAME-based Substrate node, ready for hacking 🚀 Getting Started Follow the steps below to get started with the Node T

A blazingly fast, ShareX uploader coded in Rust (using actix web) which utilizes AES-256-GCM-SIV to securely store uploaded content.

Magnesium Oxide ❔ What is this? Magnesium-Oxide (MGO) is a secure file uploader with support for ShareX. 🌠 Features 🔥 Blazingly fast uploads and enc

Comments
  • Inline storage types need to use `UnsafeCell` or a `get_mut` resolver

    Inline storage types need to use `UnsafeCell` or a `get_mut` resolver

    <SingleElement as SingleElementStorage>::get allows you to go from &SingleElement to &mut T, violating &'s mutability restrictions. The same goes for any other inline storage. There are two solutions to this:

    • Use UnsafeCell in inline storages (pessimizes Box<T, InlineStorage>), or
    • Add a get_mut(&mut self, handle: Handle<T>) -> NonNull<T> resolver, and require that mutable pointee access goes through get_mut unless a marker trait SharedMutabilityStorage is implemented.

    (To allow e.g. split_at_mut style use cases, it could be get_mut(&mut self, [Handle<T>; N]) -> [NonNull<T>; N], get_mut(&mut self, Handle<T>) -> (&mut Self, NonNull<T>), or even just get(*mut self, Handle<T>) -> NonNull<T>.)

    opened by CAD97 6
  • Inline Storage - Size / Alignment compile-time assertions.

    Inline Storage - Size / Alignment compile-time assertions.

    One worthy goal would be to be able to perform compile-time size/alignment assertions for inline storage without changing the final api, and retaining the storage-agnostic goal of the final container objects. Heap storage should not do any assertions.

    My approach is to use generic associated types as assertion objects and have the final create function perform the check using the assertion object directly. This way each associated assertion is a policy for the final create function. For all storages the assertion is performed as a trait requirement on create, the trick is to make the assertion always pass for Heap storages but possibly-fail for Inline storages.

    Here is a POC of this.

    Unfortunately having the check being part of the signature means that all call sites will need to propagate the constraint even if we know that eg on HeapStorage it will always pass. The compiler first checks the trait bounds and refuses to evaluate the bound itself. This means that if we attempt to make an alias with HeapStorage the constraint will have to leak to user code as well.

    Eg a user can't write an external create function:

    fn create<T>(val : T) -> RawBox<T, HeapStorage> {
        RawBox::create(val)
    }
    

    even though the trait is always valid. Instead they have to write the following:

    fn create<T>(val : T) -> RawBox<T, HeapStorage> where [u8 ; size_assert::<HeapStorage, S>()] : Sized {
        RawBox::create(val)
    }
    

    it would be nice if the compiler attempted to evaluate the trait first. The above should be able to reduce to:

    [u8 ; {HeapStorage::Asserter<T>::VALUE as usize} - 1] : Sized =>
    // At this point AlwaysTrueAsserter is parameterized but we can avoid it.
    [u8 ; {AlwaysTrueAsserter::VALUE as usize} - 1] : Sized =>
    [u8 ; {true as usize} - 1] : Sized =>
    [u8 ; 0] : Sized
    

    So the fact that I have to specify the bound seems a bit unneeded to me.

    opened by liarokapisv 2
Owner
null
Darwinia networks' tracing runtime override collections

Darwinia Runtime Overrides USAGE: runtime-overrides [OPTIONS] --runtime <CHAIN> OPTIONS: -h, --help Print help information

Darwinia Network 4 Aug 3, 2022
Secure storage for cryptographic secrets in Rust

secrets secrets is a library to help Rust programmers safely held cryptographic secrets in memory. It is mostly an ergonomic wrapper around the memory

Stephen Touset 165 Dec 22, 2022
Rust API Client for ImageKit.io a file storage and image processing service

Rust API Client for ImageKit.io a file storage and image processing service Usage You must retrieve your Public and Private Keys from the ImageKit Dev

Esteban Borai 4 Jul 31, 2022
Hyperswitch Card Vault is an open-source sensitive information storage system built on Rust.

Tartarus - Rust Locker Overview The Hyperswitch Card Vault (Tartarus) is a highly performant and a secure vault to save sensitive data such as payment

Juspay Technologies 9 Nov 23, 2023
Easy to use cryptographic framework for data protection: secure messaging with forward secrecy and secure data storage. Has unified APIs across 14 platforms.

Themis provides strong, usable cryptography for busy people General purpose cryptographic library for storage and messaging for iOS (Swift, Obj-C), An

Cossack Labs 1.6k Dec 30, 2022
A file storage service

hashfs A file storage service How to use it? Start up the service at terminal # You can specify the storage root path and access domain when you start

null 17 Dec 18, 2021
Demo: Connect Phala's Fat Contract to external storage services, both centralized (Amazon s3) and decentralized .

This demo shows how to connect Phala's Fat Contract to external storage services, both centralized (Amazon s3) and decentralized (Arweave/Filecoin thr

Christopher Fok 2 Aug 30, 2022
Project Masterpass is a deterministic databaseless key management algorithm, aimed to help those who cannot protect their encryption keys in storage

Project Masterpass (working title) Attention! This project is still under heavy development, and SHOULD NOT be used in practice, as the algorithms cou

Gyorgy Wang 2 Sep 11, 2022
A simple key-value store with a log-structured, append-only storage architecture where data is encrypted with AES GCM.

akvdb A simple key-value store with a log-structured, append-only storage architecture where data is encrypted with AES GCM. Modified from the actionk

Olle W 3 Oct 10, 2022
Koofr Vault is an open-source, client-side encrypted folder for your Koofr cloud storage offering an extra layer of security for your most sensitive files.

Koofr Vault https://vault.koofr.net Koofr Vault is an open-source, client-side encrypted folder for your Koofr cloud storage offering an extra layer o

Koofr 12 Dec 30, 2022