fanum tax 64-bit integers with LEB128

Overview

rizz64

Fanum* tax 64-bit integers.

* Fanum is a popular streamer who taxes his friends by taking bites of their food.


This crate provides an efficient and powerful variable-length integer codec using LEB128 compression.

Why rizz64 is helpful

It's super efficient in representing smaller numbers with fewer bytes. Instead of a fixed-size solution like u64, this form of encoding can store values using less space or memory.

Basic metrics:

Decimal Powers of 2 # of Bytes Needed
0..127 2^0-1..2^7-1 1
128..16383 2^7..2^14-1 2
16384..2097151 2^14..2^21-1 3
2097152..268435455 2^21..2^28-1 4

and so on... to compute the number of bytes required, you can use the following:

The upper bound of log2(max(x, 1)) / 7, where x is your number. This is derived from the fact that each byte can represent 7 bits. rizz64 uses a base-2 logarithm to determine the magnitude in bits. We call max(x, 1) to ensure this value could never be 0. We take the upper bound and round to the nearest whole number.

How encoding works

For a given u64, we process the number by groups of 7 bits. For every group, if there are more than 7 bits still to be encoded, we loop.

In the loop, the least significant 7 bits are extracted and packed into a buffer. The 8th bit is the most significant bit (MSB); we set the MSB to 1 to indicate there are more groups ahead. Then, we shift right (>>=) by 7 bits and we process the next group.

After looping, the remaining bits of the number are packed into the buffer. This is the last byte and since we're halting, we don't set the MSB to 1.

Literature

LEB128
sqlite variable-length ints
varint.go

Todo

  • Go outside
You might also like...
Rust implementations of finite-field arithmetic based on big integers with configurable limb sizes

multiprecision TODO: rewrite readme implement CIOS for 16-bit limbs All operations are in little-endian form (where the digit in memory at the smalles

A tiny 32 bit kernel written in Rust
A tiny 32 bit kernel written in Rust

rustboot A tiny 32 bit kernel written in Rust. I was inspired to download Rust and try to do this after seeing zero.rs - a stub that lets Rust program

A bit-packed k-mer representation (and relevant utilities) for rust

K-mer class for rust The purpose of this repository is to build a simple library that exposes a bit-packed k-mer class for use in rust-based bioinform

Fast conversion between linear float and 8-bit sRGB

fast-srgb8 Small crate implementing fast conversion between linear float and 8-bit sRGB. Includes API for performing 4 simultaneous conversions, which

A little bit fast and modern Ruby version manager written in Rust
A little bit fast and modern Ruby version manager written in Rust

A little bit fast and modern Ruby version manager written in Rust Features Pure Rust implementation not using ruby-build Cross-platform support (macOS

Tiny Rust crate to iterate bit combinations

bit_combi_iter bit_combi_iter is a small dependency-free crate to enumerate all bit combinations less than given unsigned integer value keeping 1s in

Controls the RGB on the keyboard for the Legion 5 Pro from Lenovo. Mostly used for learning a bit of rust.

L5P Keyboard RGB Control Program A fun little experiment. Probably contains bugs. ⚠️ Use at your own risk, the developer is not responsible for any da

kani - 64 bit multiboot toy OS written in Rust

kani - 64 bit multiboot toy OS written in Rust Sorry, no GUI, Serial only.

A naive native 128-bit cityhash v102 implementation

Naive CityHash naive-cityhash is a naive native 128-bit cityhash v102 implementation for clickhouse*. Contact Chojan Shang - @PsiACE - psiace@outlook.

Advent of Code 2021, also an attempt to practice a bit of Rust.

Advent of Code 2021 Advent of Code 2021 (my first one!), also an attempt to practice a bit of Rust. Running (Assuming that the respective inputs are i

StdFuzzer - StdFuzzer is the reference implementation of a generic bit-level fuzzer with LibAFL

StdFuzzer StdFuzzer is the reference implementation of a generic bit-level fuzzer with LibAFL Building Build with $ cargo build --release Compiling a

A bit vector with the Rust standard library's portable SIMD API

bitsvec A bit vector with the Rust standard library's portable SIMD API Usage Add bitsvec to Cargo.toml: bitsvec = "x.y.z" Write some code like this:

Isn't it time to be a bit nicer to rustc?

politeness-macro Aren't we all too rude to computers? Isn't it time to bring a bit more politeness into our programming? Shouldn't we be a bit nicer t

An Intel HAXM powered, protected mode, 32 bit, hypervisor addition calculator, written in Rust.

HyperCalc An Intel HAXM powered, protected mode, 32 bit, hypervisor addition calculator, written in Rust. Purpose None 😏 . Mostly just to learn Rust

wait-free 4-level 64-bit pagetable for contiguous low-contention concurrent metadata

pagetable Wait-free 4-level page table that maps from a u64 key to an &AtomicU64 value. Page fan-out is 2^16. If a key doesn't exist, intermediate pag

Bit fields and masks for rust!

ubits Bit fields and masks for rust! Provides a macro for generating bit field types complete with flags and some helpful trait implementations. Suppo

Steggy CLI Tool - hides data within the least significant bit of an image
Steggy CLI Tool - hides data within the least significant bit of an image

Written in Rust, features a simple cli and a client-side webapp. This tool hides data within the least significant bit of an image. Obfuscation techniques are utilized to make the

Fast encoder/decoder for the lossless DTM 16 bit image format

DTM Image Format Fast encoder/decoder for the DTM image format. The DTM image format is a 16-bit lossless image format supporting one to four channels

Bitpack a boolean into a pointer using bit magic.

ptr-bool tl;dr: a pointer and boolean with the same size as a pointer. A convenience crate used to bitpack a boolean and pointer into the same eight b

Comments
  • feat: skip bound checks by iterating

    feat: skip bound checks by iterating

    Instead of running:

    loop {
         unsafe { *buf.get_mut_unchecked(i) = .... } 
    }
    

    The rust compiler seems to skip this bound check when shoving this into an iterator that is:

    for i in 0..buf.len() {
         // same stuff here
    }
    

    compiler explorer: https://godbolt.org/z/49P653o9j

    Performance

    Running benchmarks indicate significantly faster execution times.

    Benchmarking Log/write_u64 1152921504606846976: Collecting 10
    Log/write_u64 1152921504606846976
                            time:   [321.67 ps 325.25 ps 329.86 ps]
                            change: [-91.933% -91.871% -91.799%] (p = 0.00 < 0.05)
                            Performance has improved.
    Found 12 outliers among 100 measurements (12.00%)
    

    It also seems to beat out the leb128 crate by significant measures:

    Benchmarking Log/leb128::write::unsigned 1152921504606846976:
    Log/leb128::write::unsigned 1152921504606846976
                            time:   [1.6255 ns 1.6417 ns 1.6615 ns]
    Found 11 outliers among 100 measurements (11.00%)
      5 (5.00%) high mild
      6 (6.00%) high severe
    

    Note a more robust benchmarking report is needed to fully assert these claims

    opened by friendlymatthew 0
Owner
Matthew Kim
Matthew Kim
📈 The fastest map possible in Rust, where keys are integers and the capacity is fixed (faster than Vec!)

It is an alternative on-heap implementation of a map with keys of type usize and a fixed capacity. It works much faster than a standard HashMap becaus

Yegor Bugayenko 6 Apr 26, 2023
Rust implementations of finite-field arithmetic based on big integers with configurable limb sizes

multiprecision TODO: rewrite readme implement CIOS for 16-bit limbs All operations are in little-endian form (where the digit in memory at the smalles

Geometry 4 Mar 30, 2024
A bit-packed k-mer representation (and relevant utilities) for rust

K-mer class for rust The purpose of this repository is to build a simple library that exposes a bit-packed k-mer class for use in rust-based bioinform

COMBINE lab 41 Dec 15, 2022
Isn't it time to be a bit nicer to rustc?

politeness-macro Aren't we all too rude to computers? Isn't it time to bring a bit more politeness into our programming? Shouldn't we be a bit nicer t

Rin 6 Mar 11, 2022
Bitpack a boolean into a pointer using bit magic.

ptr-bool tl;dr: a pointer and boolean with the same size as a pointer. A convenience crate used to bitpack a boolean and pointer into the same eight b

Zack 2 Oct 24, 2022
Simple bit-level protocol definitions in Rust.

bin-proto Simple & fast structured bit-level binary co/dec in Rust. An improved and modernized fork of protocol. A more efficient but (slightly) less

null 16 Jun 13, 2024
Rust library of custom number malarkey, including variable-bit-width integers

Numberwang The Numberwang crate is a library of custom number types and functionality, including variable-bit-width integers. It is named after the fi

Dan Williams 3 Nov 12, 2024
Rust library to convert RGB 24-bit colors into ANSI 256 (8-bit) color codes with zero dependencies and at compile-time.

rgb2ansi256 rgb2ansi256 is a small Rust library to convert RGB 24-bit colors into ANSI 256 (8-bit) color codes with zero dependencies and const fn. Th

Linda_pp 7 Nov 17, 2022
A bit like tee, a bit like script, but all with a fake tty. Lets you remote control and watch a process

teetty teetty is a wrapper binary to execute a command in a pty while providing remote control facilities. This allows logging the stdout of a process

Armin Ronacher 259 Jan 3, 2023
📈 The fastest map possible in Rust, where keys are integers and the capacity is fixed (faster than Vec!)

It is an alternative on-heap implementation of a map with keys of type usize and a fixed capacity. It works much faster than a standard HashMap becaus

Yegor Bugayenko 6 Apr 26, 2023