A flexible, simple to use, immutable, clone-efficient String replacement for Rust

flexstr

A flexible, simple to use, immutable, clone-efficient String replacement for Rust. It unifies literals, inlined, and heap allocated strings into a single type.

Overview

Rust is great, but its String type is optimized as a mutable string buffer, not for typical string use cases. Most string use cases don't modify their contents, often need to copy strings around as if they were cheap like integers, typically concatenate instead of modify, and often end up being cloned with identical contents. Additionally, String isn't able to wrap a string literal without additional allocation and copying, forcing a choice between efficiency and storing two different types.

I believe Rust needs a new string type to unify usage of both literals and allocated strings for typical string use cases. This crate includes a new string type that is optimized for those use cases, while retaining the usage simplicity of String.

Example

String constants are easily wrapped into the unified string type. String data is automatically inlined when possible; otherwise it is allocated on the heap.

See the documentation or the Usage section for more examples.

use flexstr::{local_str, LocalStr, ToLocalStr};

fn main() {
  // Use `local_str` macro to wrap literals as compile-time constants
  const STATIC_STR: LocalStr = local_str!("This will not allocate or copy");
  assert!(STATIC_STR.is_static());

  // Strings up to 22 bytes (on 64-bit) will be inlined automatically
  // (demo only, use macro or `from_static` for literals as above)
  let inline_str = "inlined".to_local_str();
  assert!(inline_str.is_inline());

  // When a string is too long to be wrapped/inlined, it will heap allocate
  // (demo only, use macro or `from_static` for literals as above)
  let rc_str = "This is too long to be inlined".to_local_str();
  assert!(rc_str.is_heap());
}

Installation

Optional features:

  • fast_format = enables local_ufmt! and shared_ufmt! format!-like macros for very fast formatting (with some limitations)
  • fp_convert = Convert floating point types directly into a FlexStr
  • int_convert = Convert integer types directly into a FlexStr
  • serde = Serialization support for FlexStr
  • std = enabled by default (use default-features=false to enable #[no_std])
[dependencies.flexstr]
version = "0.9"
features = ["fast_format", "fp_convert", "int_convert", "serde"]

How Does It Work?

Internally, FlexStr uses a union with these variants:

  • Static - A simple wrapper around a static string literal (&'static str)
  • Inline - An inlined string (no heap allocation for small strings)
  • Heap - A heap allocated (reference counted) string

The type automatically chooses the best storage and allows you to use them interchangeably as a single string type.
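The three storage variants can be illustrated with a minimal std-only sketch. This is not the crate's implementation: the crate uses a union, while the hypothetical `SketchStr` below is a plain enum, and `INLINE_CAP`, `from_ref`, and the other names are assumptions made for illustration only.

```rust
use std::rc::Rc;

// Hypothetical capacity, chosen to match the 22-byte figure quoted for 64-bit targets.
const INLINE_CAP: usize = 22;

// Illustrative only: the crate itself uses a union; a safe enum is used here instead.
#[derive(Clone, Debug)]
enum SketchStr {
    Static(&'static str),
    Inline { len: u8, buf: [u8; INLINE_CAP] },
    Heap(Rc<str>),
}

impl SketchStr {
    /// Wraps a literal without copying or allocating.
    fn from_static(s: &'static str) -> Self {
        SketchStr::Static(s)
    }

    /// Copies a borrowed string, inlining it when it fits.
    fn from_ref(s: &str) -> Self {
        if s.len() <= INLINE_CAP {
            let mut buf = [0u8; INLINE_CAP];
            buf[..s.len()].copy_from_slice(s.as_bytes());
            SketchStr::Inline { len: s.len() as u8, buf }
        } else {
            SketchStr::Heap(Rc::from(s))
        }
    }

    /// Views any variant as a plain `&str`.
    fn as_str(&self) -> &str {
        match self {
            SketchStr::Static(s) => *s,
            SketchStr::Inline { len, buf } => {
                std::str::from_utf8(&buf[..*len as usize]).expect("copied from a valid str")
            }
            SketchStr::Heap(rc) => &rc[..],
        }
    }

    fn is_static(&self) -> bool {
        matches!(self, SketchStr::Static(_))
    }

    fn is_inline(&self) -> bool {
        matches!(self, SketchStr::Inline { .. })
    }

    fn is_heap(&self) -> bool {
        matches!(self, SketchStr::Heap(_))
    }
}

fn main() {
    let literal = SketchStr::from_static("literal");
    let short = SketchStr::from_ref("inlined");
    let long = SketchStr::from_ref("This is too long to be inlined");

    assert!(literal.is_static());
    assert!(short.is_inline());
    assert!(long.is_heap());
    println!("{} / {} / {}", literal.as_str(), short.as_str(), long.as_str());
}
```

Callers only ever see `as_str()`, which is what lets the three storage strategies be used interchangeably.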

Features

  • Optimized for immutability and cheap cloning
  • Allows for multiple ownership of the same string memory contents
  • Serves as a universal string type (unifying literals and allocated strings)
  • Doesn't allocate for literals and short strings (64-bit: up to 22 bytes)
  • The same inline size as a String (64-bit: 24 bytes)
  • Optional serde serialization support (feature = "serde")
  • Compatible with embedded systems (supports #[no_std])
  • Efficient conditional ownership (borrows can take ownership without allocation/copying)
  • Both single threaded compatible (LocalStr) and multi-thread safe (SharedStr) options
  • All dependencies are optional and based on feature usage
  • It is simple to use!

Types

NOTE: Both types are identical in handling both literals and inline strings. The only difference occurs when a heap allocation is required.

  • LocalStr - ultra-fast usage in the local thread
    • Heap storage based on Rc
  • SharedStr - provides Send / Sync for multithreaded use
    • Heap storage based on Arc
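The Rc / Arc distinction comes straight from the standard library. The sketch below uses plain `Arc<str>` rather than the crate's types, and the helper name `len_on_other_thread` is hypothetical; it shows the kind of cross-thread sharing that an `Arc`-backed string permits.

```rust
use std::sync::Arc;
use std::thread;

/// Shares one heap string with a worker thread and returns the length the worker saw.
fn len_on_other_thread(s: &Arc<str>) -> usize {
    // Cloning bumps an atomic reference count; the string bytes are not copied.
    let shared = Arc::clone(s);
    thread::spawn(move || shared.len())
        .join()
        .expect("worker thread panicked")
}

fn main() {
    let s: Arc<str> = Arc::from("shared across threads");
    assert_eq!(len_on_other_thread(&s), s.len());
    // The same code written with `Rc<str>` is rejected at compile time,
    // because `Rc` is neither `Send` nor `Sync`.
}
```

The price of `Arc` is an atomic increment on every clone, which is why a separate `Rc`-based type is worthwhile for single-threaded code.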

Usage

Hello World

use flexstr::local_str;

fn main() {
  // From literal - no copying or allocation
  let world = local_str!("world!");

  println!("Hello {world}");
}

Creation Scenarios

use flexstr::{local_str, LocalStr, IntoSharedStr, IntoLocalStr, ToLocalStr};

fn main() {
  // From literal - no runtime, all compile-time
  const literal: LocalStr = local_str!("literal");

  // From borrowed string - Copied into inline string
  let owned = "inlined".to_string();
  let str_to_inlined = owned.to_local_str();

  // From borrowed String - copied into `str` wrapped in `Rc`
  let owned = "A bit too long to be inlined!!!".to_string();
  let str_to_wrapped = owned.to_local_str();

  // From String - copied into inline string (`String` storage released)
  let inlined = "inlined".to_string().into_local_str();

  // From String - `str` wrapped in `Rc` (`String` storage released)
  let counted = "A bit too long to be inlined!!!".to_string().into_local_str();

  // *** If you want a Send/Sync type you need `SharedStr` instead ***

  // From LocalStr wrapped literal - no copying or allocation
  let literal2 = literal.into_shared_str();

  // From LocalStr inlined string - no allocation
  let inlined = inlined.into_shared_str();

  // From LocalStr `Rc` wrapped `str` - copies into `str` wrapped in `Arc`
  let counted = counted.into_shared_str();
}

Passing FlexStr to Conditional Ownership Functions

This has always been a confusing situation in Rust, but it is easy with FlexStr since multi ownership is cheap. By passing as &LocalStr instead of &str, you retain the option for very fast multi ownership.

use flexstr::{local_str, LocalStr};

struct MyStruct {
  s: LocalStr
}

impl MyStruct {
  fn to_own_or_not_to_own(s: &LocalStr) -> Self {
    let s = if s == "own me" {
      // Since a wrapped literal, no copy or allocation
      s.clone()
    } else {
      // Wrapped literal - no copy or allocation
      local_str!("own me")
    };

    Self { s }
  }
}

fn main() {
  // Wrapped literals - compile time constant
  const S: LocalStr = local_str!("borrow me");
  const S2: LocalStr = local_str!("own me");

  let struct1 = MyStruct::to_own_or_not_to_own(&S);
  let struct2 = MyStruct::to_own_or_not_to_own(&S2);

  assert_eq!(S2, struct1.s);
  assert_eq!(S2, struct2.s);
}
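The same idea can be seen with the standard library alone. In this sketch, which assumes a hypothetical `Holder` type and `own_if_matches` function and uses `Rc<str>` in place of the crate's string type, the callee receives a borrow and only shares ownership when it decides to keep the string. Had the parameter been `&str`, keeping it would have required a fresh allocation and copy.

```rust
use std::rc::Rc;

struct Holder {
    s: Rc<str>,
}

impl Holder {
    /// Borrows the string and takes shared ownership only when it equals `wanted`.
    fn own_if_matches(s: &Rc<str>, wanted: &str) -> Option<Self> {
        if &**s == wanted {
            // Reference count increment only: no allocation, no byte copy.
            Some(Holder { s: Rc::clone(s) })
        } else {
            None
        }
    }
}

fn main() {
    let name: Rc<str> = Rc::from("own me, I am too long to be inlined");

    let kept = Holder::own_if_matches(&name, "own me, I am too long to be inlined")
        .expect("the strings match");
    // Both handles point at the same heap allocation.
    assert!(Rc::ptr_eq(&name, &kept.s));

    assert!(Holder::own_if_matches(&name, "something else").is_none());
}
```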

Make Your Own String Type

All you need to do is pick a storage type. The storage type must implement Deref, From<&str>, and Clone. Pretty much all smart pointers do this already.

NOTE:

Custom concrete types need to specify a heap type with an exact size of two machine words (16 bytes on 64-bit, and 8 bytes on 32-bit). A heap type of any other size will cause a runtime panic on string creation.

use flexstr::{FlexStrBase, Repeat};

type BoxStr = FlexStrBase<Box<str>>;

fn main() {
  // Any need for a heap string will now be allocated in a `Box` instead of `Rc`
  // However, the below uses static and inline storage...because we can!
  let my_str = BoxStr::from_static("cool!").repeat_n(3);
  assert_eq!(my_str, "cool!cool!cool!");
}
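Before choosing a storage type, it may help to check the two-machine-word requirement from the note above. The following is a small std-only sketch; `is_two_words` is a hypothetical helper, not part of the crate.

```rust
use std::mem::size_of;
use std::rc::Rc;
use std::sync::Arc;

/// Returns true when `T` occupies exactly two machine words.
fn is_two_words<T>() -> bool {
    size_of::<T>() == 2 * size_of::<usize>()
}

fn main() {
    // Smart pointers to `str` are fat pointers: a data pointer plus a length.
    assert!(is_two_words::<Box<str>>());
    assert!(is_two_words::<Rc<str>>());
    assert!(is_two_words::<Arc<str>>());

    // `String` carries pointer, capacity, and length, so it would not qualify.
    assert!(!is_two_words::<String>());
}
```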

Performance Characteristics

  • Clones are cheap and never allocate
    • At minimum, they are just a copy of the union, and at most they add a reference count increment
  • Literals are just wrapped when used with into() and never copied
  • Calling into() on a String will result in an inline string (if short) otherwise copied into a str wrapped in Rc/Arc (which will allocate, copy, and then release original String storage)
  • into_local_str() and into_shared_str() are equivalent to calling into() on both literals and String (they are present primarily for let bindings so there is no need to declare a type)
  • to_local_str() and to_shared_str() are meant for taking ownership of borrowed strings and always copy into either an inline string (for short strings) or an Rc/Arc wrapped str (which will allocate)
  • to_string always copies into a new String
  • Conversions back and forth between SharedStr and LocalStr using into() are cheap when using wrapped literals or inlined strings
    • Inlined strings and wrapped literals just create a new union wrapper
    • Reference counted wrapped strings will always require an allocation and copy for the new Rc or Arc
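The clone and conversion costs for heap strings mirror the behavior of `Rc` and `Arc` themselves. The sketch below uses the standard library only, and `to_shared` is a hypothetical helper standing in for a local-to-shared conversion. It shows that a clone shares the allocation, while moving between `Rc` and `Arc` has to copy the bytes into a new one.

```rust
use std::rc::Rc;
use std::sync::Arc;

/// Converts a single-threaded heap string into a thread-safe one by copying its bytes.
fn to_shared(local: &Rc<str>) -> Arc<str> {
    Arc::from(&**local)
}

fn main() {
    let original: Rc<str> = Rc::from("A bit too long to be inlined!!!");

    // Clone: same allocation, reference count goes from 1 to 2.
    let copy = Rc::clone(&original);
    assert!(Rc::ptr_eq(&original, &copy));
    assert_eq!(Rc::strong_count(&original), 2);

    // Rc -> Arc: the counters differ in kind, so a new allocation is required.
    let shared = to_shared(&original);
    assert_eq!(&*shared, &*original);
    assert_ne!(shared.as_ptr(), original.as_ptr());
}
```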

Benchmarks

In general, inline/static creates are fast but heap creates are a tiny bit slower than String. Clones are MUCH faster and don't allocate/copy. Other operations (repeat, additions, etc.) tend to be about the same performance, but with some nuance depending on string size.

Full benchmarks

Downsides

There is no free lunch:

  • Due to usage of Rc (or Arc), when on-boarding String it will need to reallocate and copy
  • Due to the union wrapper, every string operation has the overhead of an extra branching operation
  • Since LocalStr is not Send or Sync, there is a need to consider single-threaded (LocalStr) and multi-threaded (SharedStr) use cases and convert accordingly
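The first downside can be observed with `Rc<str>` directly. In this std-only sketch, `onboard` is a hypothetical helper that records where the `String` kept its bytes before converting it. Because a reference-counted string stores its counters next to the data, the conversion allocates a new block and copies into it, so the address changes.

```rust
use std::rc::Rc;

/// Converts a `String` into an `Rc<str>`, returning the address of the
/// original buffer alongside the new reference-counted string.
fn onboard(s: String) -> (usize, Rc<str>) {
    let old_addr = s.as_ptr() as usize;
    // Allocates room for the counters plus the bytes, copies, then frees `s`.
    let rc: Rc<str> = Rc::from(s);
    (old_addr, rc)
}

fn main() {
    let (old_addr, rc) = onboard(String::from("A bit too long to be inlined!!!"));

    assert_eq!(&*rc, "A bit too long to be inlined!!!");
    // The bytes now live in a different allocation from the one `String` owned.
    assert_ne!(old_addr, rc.as_ptr() as usize);
}
```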

Status

This is currently beta quality and still needs testing. The API may well change, but semantic versioning will be followed.

License

This project is licensed optionally under either:

Comments
  • Minor unsafe code cleanup suggestion

    I've been meaning to give unions a try on kstring and finally did. A couple of differences in my implementation

    • I extract out the padding to a dedicated type. This helps enforce the unsafe invariant at compile time because "nothing" can access the MaybeUninit (yeah, technically you can since it's in the same file, but it'll be more obvious)
    • I created an unused TagVariant that exists just to extract the tag from the union. I feel this makes the intent of the code clearer than arbitrarily using a specific union variant to access the tag.

    If not interested, that's fine. I just figured I'd share in case you felt it improved things.

    enhancement wontfix 
    opened by epage 7
  • Store the length in the tag?

    This could allow FlexStr to store 23 bytes, instead of 22 bytes. Unsure if there is extra overhead from doing this that could make the benefits iffy.

    One step further, compact_str stores the length, tag, and the final byte in the 24th byte, leveraging unused bits in a terminating UTF-8 character.

    wontfix 
    opened by epage 5
  • Could not compile flexstr

    I'm unable to compile flexstr. cargo reports 64 errors such as:

    error: type parameters must be declared prior to const parameters
      --> /workspace/.cargo/registry/src/github.com-1ecc6299db9ec823/flexstr-0.9.1/src/traits.rs:12:75
       |
    12 | pub trait Repeat<const SIZE: usize, const PAD1: usize, const PAD2: usize, HEAP> {
       |                 ----------------------------------------------------------^^^^- help: reorder the parameters: lifetimes, then types, then consts: `<HEAP, const SIZE: usize, const PAD1: usize, const PAD2: usize>`
    
    error: type parameters must be declared prior to const parameters
      --> /workspace/.cargo/registry/src/github.com-1ecc6299db9ec823/flexstr-0.9.1/src/traits.rs:17:63
       |
    17 | impl<const SIZE: usize, const PAD1: usize, const PAD2: usize, HEAP> Repeat<SIZE, PAD1, PAD2, HEAP>
       |     ----------------------------------------------------------^^^^- help: reorder the parameters: lifetimes, then types, then consts: `<HEAP, const SIZE: usize, const PAD1: usize, const PAD2: usize>`
    
    error: type parameters must be declared prior to const parameters
      --> /workspace/.cargo/registry/src/github.com-1ecc6299db9ec823/flexstr-0.9.1/src/traits.rs:35:63
       |
    35 | impl<const SIZE: usize, const PAD1: usize, const PAD2: usize, HEAP> Repeat<SIZE, PAD1, PAD2, HEAP>
       |     ----------------------------------------------------------^^^^- help: reorder the parameters: lifetimes, then types, then consts: `<HEAP, const SIZE: usize, const PAD1: usize, const PAD2: usize>`
    

    My env:

    cargo 1.58.0 (f01b232bc 2022-01-19)
    release: 1.58.0
    commit-hash: f01b232bc7f4d94f0c4603930a5a96277715eb8c
    commit-date: 2022-01-19
    host: x86_64-unknown-linux-gnu
    libgit2: 1.3.0 (sys:0.13.23 vendored)
    libcurl: 7.80.0-DEV (sys:0.4.51+curl-7.80.0 vendored ssl:OpenSSL/1.1.1l)
    os: Ubuntu 20.04 (focal) [64-bit]
    
    opened by CyriacBr 4
  • FlexStr should implement Borrow<str>

    It would be good if FlexStr would implement std::borrow::Borrow<str>. Right now you cannot use it as the key in a hashmap and look up the associated value with a regular &str.

    bug 
    opened by oliver-giersch 3
  • Fix typo in features list

    It took me a while to realize what the issue was, as the error output was correct, but also easy to miss the actual typo/reason:

    the package `x` depends on `flexstr`, with features: `fast_format, fp_convert` but `flexstr` does not have these features.
    
    opened by asaaki 0
  • Update compact_str to v0.4

    Context: In the most recent version of compact_str, we renamed CompactStr to CompactString. You can continue to use CompactStr but there is a deprecation warning on it.

    Changes: This PR updates compact_str to v0.4, renaming uses of CompactStr to CompactString to prevent the warning.

    opened by ParkMyCar 2
  • Byte and OsStr variants?

    I'm still undecided on what type I'll use in clap (Cow<'static, T> or a more specialized crate) but I'll be needing both str and OsStr support.

    On reddit, it sounded like there is also interest in byte strings. Hopefully byte strings and OsStr will be merged soon but unsure if/when that'll happen.

    enhancement 
    opened by epage 5
  • Custom Ref/Cow to preserve `&'static str`?

    This was also one of the key early features for KString so that liquid could pass around data throughout the program and avoid allocations for this data.

    enhancement 
    opened by epage 13
Owner
Scott Meeuwsen