In the Windows world UTF-16 strings are not only encountered when interfacing with APIs, but also in a few on-disk structures (e.g. NT registry hives or NTFS filesystems).
This complicates interoperability with Rust's UTF-8 world, especially in `no_std` environments.
My current approach when writing a parser for such an on-disk structure is as follows:
- I define my own `Utf16ByteString` type that just wraps a `&[u8]`.
- All parser functions that output a string just return the byte slice encompassing that string in a `Utf16ByteString`. This has zero cost.
- For users with `alloc` or `std`, my `Utf16ByteString` provides a `to_string` function that uses `char::decode_utf16(bytes.chunks_exact(2).map(|two_bytes| u16::from_le_bytes(two_bytes.try_into().unwrap())))` internally. Apart from the required allocations, this function also comes with decoding overhead (see the sketch after this list).
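To make this concrete, here is a minimal sketch of such a wrapper; the plain newtype layout, the fallible `Result` return type, and the handling of a trailing odd byte are my assumptions, not necessarily how the real type looks:

```rust
#![no_std]
extern crate alloc;

use alloc::string::String;

/// Zero-copy wrapper around the raw UTF-16LE bytes of a string,
/// exactly as they appear in the on-disk structure.
pub struct Utf16ByteString<'a>(pub &'a [u8]);

impl<'a> Utf16ByteString<'a> {
    /// Decodes the wrapped bytes into an owned `String`.
    /// Requires `alloc` and pays both the allocation and the
    /// UTF-16 decoding cost.
    pub fn to_string(&self) -> Result<String, core::char::DecodeUtf16Error> {
        char::decode_utf16(
            // A trailing odd byte, if any, is silently dropped by
            // `chunks_exact`.
            self.0
                .chunks_exact(2)
                .map(|two_bytes| u16::from_le_bytes(two_bytes.try_into().unwrap())),
        )
        .collect()
    }
}
```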
Of course, I like to avoid using `to_string`, and a frequent case where this should be possible is (case-sensitive) comparisons.
Currently, I have to create the comparison byte buffers by hand though, e.g. `let hello = &[b'H', 0, b'e', 0, b'l', 0, b'l', 0, b'o', 0]`.
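Put together, a case-sensitive check against such a hand-written buffer is nothing more than a slice comparison (the constant and function names here are just for illustration):

```rust
/// Hand-written UTF-16LE buffer for the ASCII string "Hello".
const HELLO_UTF16LE: &[u8] = &[b'H', 0, b'e', 0, b'l', 0, b'l', 0, b'o', 0];

/// Case-sensitive comparison against raw on-disk bytes: a plain
/// byte-wise slice comparison, no decoding, no allocation.
fn name_is_hello(raw_name_bytes: &[u8]) -> bool {
    raw_name_bytes == HELLO_UTF16LE
}
```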
Latest `const-utf16` is no help here, as its `encode!` only outputs a `&[u16]`. I could transmute my `&[u8]` to a `&[u16]`, but that would be an unsafe hack and prone to endian problems.
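Spelled out, that hack would look roughly like the following sketch; beyond the endianness issue, note that it is also only sound if the byte slice happens to be 2-byte aligned, which a parsed `&[u8]` does not guarantee:

```rust
/// The kind of unsafe hack I would like to avoid: reinterpreting the
/// raw on-disk bytes as a `&[u16]` so they can be compared against the
/// `&[u16]` produced by `const_utf16::encode!`.
unsafe fn as_native_u16s(bytes: &[u8]) -> &[u16] {
    // Undefined behaviour unless `bytes` is 2-byte aligned, and the
    // resulting `u16`s are in native byte order, so the comparison
    // silently breaks on big-endian targets.
    core::slice::from_raw_parts(bytes.as_ptr().cast::<u16>(), bytes.len() / 2)
}
```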
Could `const-utf16` therefore be extended to alternatively output a UTF-16LE `&[u8]` slice for such comparisons?
Or am I missing a zero-cost alternative here?
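For concreteness, this is the kind of compile-time usage I have in mind (the macro name is purely hypothetical and does not exist in `const-utf16` today):

```rust
// Purely hypothetical sibling of `encode!` that would emit UTF-16LE
// bytes instead of native-endian `u16`s.
const HELLO_UTF16LE: &[u8] = const_utf16::encode_utf16le_bytes!("Hello");

// The comparison against raw on-disk bytes would then stay a plain,
// zero-cost slice comparison: `raw_name_bytes == HELLO_UTF16LE`.
```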