tiny_id is a Rust library for generating non-sequential, tightly-packed short IDs.

Related tags

Utilities tiny_id
Overview

tiny_id

wokflow state crates.io docs.rs

tiny_id is a Rust library for generating non-sequential, tightly-packed short IDs.

Most other short ID generators just string together random digits. Due to the birthday problem, that approach is prone to collisions. For example, a four-digit alphabetic code has a 50% chance of collision after 800 codes.

tiny_id uses a linear congruential generator to generate codes which do not overlap while retaining only a small, constant-sized piece of state. For the same four-digit alphabetic code, tiny_id has a 0% chance of collision until all 456,976 possible codes have been generated.

These codes are indended for use-cases where it's desirable to have short, human-readable codes such that two codes generated in a row are no more likely to resemble each other than codes that are not. It should not be used in cases where the codes need to be non-guessable. They also do not guard against a German tank problem-type analysis by someone sufficiently motivated.

How to use it

Basic use

use tiny_id::ShortCodeGenerator;

fn main() {
    // The length of generated codes
    let length: usize = 6;

    // Create a generator. The generator must be mutable, because each
    // code generated updates its state.
    let mut generator = ShortCodeGenerator::new_alphanumeric(length);

    // Generate the next short code, and update the internal generator state.
    let code = generator.next_string();
    assert_eq!(length, code.len());
}

Alphabets

, // but gen is a ShortCodeGenerator , so we need to call // next() instead, which returns a Vec . let result: Vec = gen.next(); assert_eq!(length, result.len()); } ">
use tiny_id::ShortCodeGenerator;

fn main() {
    let length: usize = 6;

    // There are several built-in alphabets with convenience constructors.
    
    // Numeral digits (0-9), like "769458".
    ShortCodeGenerator::new_numeric(length);

    // Numeral digits and lowercase letters (a-z), like "l2sx2b".
    ShortCodeGenerator::new_lowercase_alphanumeric(length);

    // Numeral digits, lowercase, and uppercase letters, like "qAh6Gg".
    ShortCodeGenerator::new_alphanumeric(length);

    // Uppercase letters only, like "MEPQOD".
    ShortCodeGenerator::new_uppercase(length);

    // You can also provide an alphabet with any unicode characters.
    // I hope you don't use it for emoji, but you could:
    ShortCodeGenerator::with_alphabet(
        "๐Ÿ˜›๐Ÿต๐Ÿ˜Ž".chars().collect(),
        length
    );

    // The generator can also be used with non-char types, as long
    // as they implement Copy.
    let mut gen = ShortCodeGenerator::with_alphabet(
        vec![true, false],
        length
    );

    // next_string() is only implemented on ShortCodeGenerator
      
       ,
      
    // but gen is a ShortCodeGenerator
      
       , so we need to call
      
    // next() instead, which returns a Vec
      
       .
      
    let result: Vec<bool> = gen.next();
    assert_eq!(length, result.len());
}

Exhaustion Strategies

Eventually, all short code generators reach a point where they run out of codes of a given length. There are three options for what to do when this happens:

  • Increment the length. This corresponds to ExhaustionStrategy::IncreaseLength, which is the default.
  • Cycle. This repeats the cycle of codes from the beginning. The order of codes is the same in every cycle. Corresponds to ExhaustionStrategy::Cycle.
  • Panic. In the spirit of fail-fast, this panics when all codes have been used, for cases where exhaustion is unexpected and assumed by the rest of the program not to happen. Corresponds to ExhaustionStrategy::Panic.
use tiny_id::{ShortCodeGenerator, ExhaustionStrategy};

fn main() {
    // Increase length (default).
    
    let mut gen = ShortCodeGenerator::new_uppercase(2);

    for _ in 0..(26*26) {
        let result = gen.next();
        assert_eq!(2, result.len());
    }

    // We've exhausted all options, so our next code will be longer.
    let result = gen.next();
    assert_eq!(3, result.len());

    // Cycle.
    
    let mut gen = ShortCodeGenerator::new_uppercase(2)
        .exhaustion_strategy(ExhaustionStrategy::Cycle);
    
    let first = gen.next();

    for _ in 0..(26*26-1) {
        let result = gen.next();
        assert_eq!(2, result.len())
    }

    // We've exhausted all options, so our next code will be a
    // repeat of the first.
    let result = gen.next();
    assert_eq!(first, result);
}

How it works

A linear congruential generator (LCG) is a simple, non-cryptographic pseudorandom number generator.

LCGs are interesting because they generate numbers in a cycle, and the length of that cycle as a function of the parameters to the LCG is well-studied. In particular, one thing that's well understood is how to make an LCG that generates the numbers 1..m with a cycle size of m, i.e., to generate a permutation of the numbers 1..m. This is called the Hull-Dobell Theorem.

For an alphabet of size N and an ID length of L, there are N ^ L possible codes. We can convert back and forth between the numbers 1 .. N^L and those codes by treating the codes as base-N representations of the numbers.

Combining these two facts, our approach is:

  • Using the Hull-Dobell Theorem, construct an LCG such that it will โ€œvisitโ€ every number from 1 to N^L in some random(ish) cycle.
  • Using the base conversion method, turn each of these numbers into a short ID.

Notes

Note that the ShortCodeGenerator object itself contains a small amount of state, which is updated every time a short code is generated. Short codes must be generated by the same ShortCodeGenerator object in order to avoid collisions.

With the serde crate option, enabled by default, ShortCodeGenerator objects can be serialized to any format supported by serde. This can be used to persist the state of a generator for later use. (If you are using a custom alphabet, the type of that alphabet must also be serializable.)

The total number of possible codes (alphabet size to the power of length) must fit in a u64. If you're working with large enough codes and alphabets that that's a problem, you probably don't need this library anyway (as random collisions will be more rare).

Randomness is only used during the construction of ShortCodeGenerator. Code generation itself is entirely deterministic based on the current generator state.

All operations use constant time and space, except for ShortCodeGenerator construction. Construction technically has time complexity superlinear to the cardinality of the alphabet provided. For reasonable alphabet sizes (say, <1000), this should be negligible.

If you provide an alphabet rather than use one of the built-in alphabets, that alphabet must not contain any repeated entries. This is not enforced by the library, but failure to abide will result in collisions.

Partitioning

If you need two machines to be able to issue short IDs without coordinating, one approach would be to:

  1. Create an initial ShortCodeGenerator state on the first machine, and serialize it.
  2. Deserialize the state on the second machine, and generate exactly one code to advance the state.
  3. Every time a short code is needed on either machine, first generate and throw away exactly one code, and then generate the code.

This will ensure that each code is only used once (the first machine will use the even-indexed codes, and the second machine will use the odd-indexed ones.)

This can work with an arbitrary number of machines, as long as the number is known in advance, but becomes less efficient with large numbers of partitions.

You might also like...
A high level diffing library for rust based on diffs
A high level diffing library for rust based on diffs

Similar: A Diffing Library Similar is a dependency free crate for Rust that implements different diffing algorithms and high level interfaces for it.

A reactive DOM library for Rust in WASM

maple A VDOM-less web library with fine-grained reactivity. Getting started The recommended build tool is Trunk. Start by adding maple-core to your Ca

transmute-free Rust library to work with the Arrow format

Arrow2: Transmute-free Arrow This repository contains a Rust library to work with the Arrow format. It is a re-write of the official Arrow crate using

Cross-platform Window library in Rust for Tauri. [WIP]

Cross-platform application window creation library in Rust that supports all major platforms like Windows, macOS, Linux, iOS and Android. Built for you, maintained for Tauri.

A library in Rust for theorem proving with Intuitionistic Propositional Logic.

Prop Propositional logic with types in Rust. A library in Rust for theorem proving with Intuitionistic Propositional Logic. Supports theorem proving i

A Rust library for constructing tilings of regular polygons
A Rust library for constructing tilings of regular polygons

tiling tiling is a library for constructing tilings of regular polygons and their dual tilings. Resources Documentation Tilings by regular polygons Li

miette is a diagnostic library for Rust. It includes a series of traits/protocols that allow you to hook into its error reporting facilities, and even write your own error reports!
miette is a diagnostic library for Rust. It includes a series of traits/protocols that allow you to hook into its error reporting facilities, and even write your own error reports!

miette is a diagnostic library for Rust. It includes a series of traits/protocols that allow you to hook into its error reporting facilities, and even write your own error reports!

Generative arts library in Rust
Generative arts library in Rust

Generative Generative (WIP) is 2D generational arts creation library written in Rust. Currently it is in nascent stage and is somewhat unstable. Examp

Membrane is an opinionated crate that generates a Dart package from a Rust library. Extremely fast performance with strict typing and zero copy returns over the FFI boundary via bincode.

Membrane is an opinionated crate that generates a Dart package from a Rust library. Extremely fast performance with strict typing and zero copy returns over the FFI boundary via bincode.

Comments
  • Deserialized generators are not deterministic across length increases.

    Deserialized generators are not deterministic across length increases.

    When the exhaustion strategy IncreaseLength is used (the default), a new generator is chosen with a new seed using the system RNG. This means that these generators are non-deterministic, which is not the desired behavior.

    As a solution, I'm making the following changes:

    • ShortCodeGenerator contains a deterministic RNG, which is seeded once initially and then never again.
    • If an attempt is made to increase the length of a generator, the behavior depends on whether the (new) getrandom crate feature is enabled:
      • If it is, a system random seed is used, mirroring the prior behavior.
      • If it is not, the code will panic with a message that points to this issue.

    This problem applies to versions 0.1.3 and earlier.

    opened by paulgb 1
  • Partitioning between multiple threads

    Partitioning between multiple threads

    Hi, I found your crate while searching for non-colliding short alphanumeric id generator, and it seems like this will fit very well for my purpose.

    One problem is that I have several threads that will be generating ids concurrently and I cannot have collision between those threads.

    To avoid collisions, I could make the generator as a global variable behind a lock, but I think this approach might be slow.

    Cloning the incremented generator when each thread gets created, and skipping by the number of total threads every generation sounds like a better approach, but I was wondering if you would be willing to implement this natively (as mentioned in README.md).

    opened by chocological00 1
Owner
Paul Butler
Paul Butler
Generate short, memorable phrases for throw-away names.

Generates three-word phrases of the form intensifier-adjective-noun, just like GitHub default repo names.

null 6 Dec 25, 2021
A library for generating TypeScript definitions from rust code.

Tsify is a library for generating TypeScript definitions from rust code. Using this with wasm-bindgen will automatically output the types to .d.

Madono Haru 60 Dec 30, 2022
Experimental Rust tool for generating FFI definitions allowing many other languages to call Rust code

Diplomat is an experimental Rust tool for generating FFI definitions allowing many other languages to call Rust code. With Diplomat, you can simply define Rust APIs to be exposed over FFI and get high-level C, C++, and JavaScript bindings automatically!

null 255 Dec 30, 2022
Generating Permutations with Rust

rust_permutations Generating Permutations with Rust Permutation Input: a string of unknown length that can contain only three characters: 0, 1, or . F

Flavio Rasseli 1 Oct 25, 2021
Type-check non-existing `Phantom` code for Fun And Profit

Sometimes you may want to write Rust code that ought to be type-checked (e.g., borrow-checked) in the same fashion as real Rust code even though that code is never intended to be run / to affect or even reach code generation.

Daniel Henry-Mantilla 4 Jun 5, 2022
A tool of generating and viewing dice roll success distributions.

AZDice A GUI tool for generating and visualising dice roll probability distributions. Aims Intended to help people trying to get game balance just rig

null 13 Mar 2, 2021
Ever wanted to torture your CPU by generating every image possible?

dumpsterfire Ever wanted to torture your CPU by generating every image possible? Well, now you can! This thing is worse than mining Bitcoin, since the

null 2 Jun 14, 2022
A library to compile USDT probes into a Rust library

sonde sonde is a library to compile USDT probes into a Rust library, and to generate a friendly Rust idiomatic API around it. Userland Statically Defi

Ivan Enderlin 40 Jan 7, 2023
A Rust library for calculating sun positions

sun A rust port of the JS library suncalc. Install Add the following to your Cargo.toml [dependencies] sun = "0.2" Usage pub fn main() { let unixti

Markus Kohlhase 36 Dec 28, 2022
A cross-platform serial port library in Rust.

Introduction serialport-rs is a general-purpose cross-platform serial port library for Rust. It provides a blocking I/O interface and port enumeration

Bryant Mairs 143 Nov 5, 2021