Idiomatic Rust implementations for various Windows string types (like UNICODE_STRING)

Overview

nt-string

crates.io docs.rs license: MIT OR Apache-2.0

by Colin Finck <[email protected]>

Provides idiomatic Rust implementations for various Windows string types:

Other useful UTF-16 string types are already provided by the excellent widestring crate. This crate tries to integrate as best as possible with them.

In Action


Debugging a Rust application in WinDbg and using the dS command to display a UNICODE_STRING created in Rust. As you see, this crate's NtUnicodeString and the original UNICODE_STRING are fully compatible.

Details

The UNICODE_STRING type was designed for the C programming language, which only knows about NUL-terminated buffers of characters. To determine the length of such a buffer, you need to iterate over all characters until finding the NUL. Bad enough? It gets worse: A classic buffer overflow occurs if the buffer contains no NUL, but an algorithm attempts to find it anyway.

To overcome these performance and security hazards, UNICODE_STRINGs consist of a buffer, a buffer capacity ("maximum length"), and a field to indicate the actually used length. Determining length and capacity is now as simple as querying the corresponding fields. Length and capacity are 16-bit values and expressed in bytes.

UNICODE_STRING has been widely used by the Windows kernel team and also spilled over to some user-mode APIs. This crate makes UNICODE_STRING a first-class Rust citizen. Safety is achieved via the following measures:

  • UNICODE_STRING is split into 3 Rust types to handle references, mutable references, and owned strings separately. You should never need to call an unsafe method.
  • All methods are fallible (except for allocations and traits like Add, where Rust currently does not provide fallible alternatives).
  • The internal buffer is NUL-terminated whenever possible. While not required according to the specification, this defensive approach guards against external applications that never understood UNICODE_STRING and mistakenly treat its internal buffer as a NUL-terminated string.

Additionally, this crate provides the U16StrLe type. With UTF-16 being the ubiquitous character encoding in Windows, many on-disk strings are stored in UTF-16 Little-Endian. U16StrLe allows to perform basic operations on byte slices of such strings without converting them to another string type first. One user is the ntfs crate.

Examples

You can work with these string types just like you work with other Rust string types:

let mut string = NtUnicodeString::try_from("Hello! ").unwrap();
string.try_push_str("Moin!").unwrap();
println!("{string}");

Conversions are also supported from raw u16 string buffers as well as the U16CStr and U16Str types of the widestring crate:

let abc = NtUnicodeString::try_from_u16(&[b'A' as u16, b'B' as u16, b'C' as u16]).unwrap();
let de = NtUnicodeString::try_from_u16_until_nul(&[b'D' as u16, b'E' as u16, 0]).unwrap();
let fgh = NtUnicodeString::try_from(u16cstr!("FGH")).unwrap();
let ijk = NtUnicodeString::try_from(u16str!("IJK")).unwrap();

Just like a String automatically dereferences to a &str when you pass it to an appropriate function, you can do the same with an NtUnicodeString and it will dereference to an &NtUnicodeStr:

let string = NtUnicodeString::try_from("My String").unwrap();
subfunction(&string);

fn subfunction(str_ref: &NtUnicodeStr) {
    println!("Hello from subfunction with \"{str_ref}\".");
}

Constant UNICODE_STRINGs can be created at compile-time. This provides strings with a 'static lifetime and saves a UTF-16 conversion at runtime:

const MY_CONSTANT_STRING: NtUnicodeStr<'static> = nt_unicode_str!("My Constant String");

Finally, you most likely want to pass your NtUnicodeStr, NtUnicodeStrMut or NtUnicodeString to an FFI function that expects a pointer to a UNICODE_STRING. Use the as_ptr or as_mut_ptr methods to get an immutable or mutable pointer.

no_std support

The crate is no_std-compatible and therefore usable in all contexts.

However, the heap-allocating NtUnicodeString struct is only available via the alloc feature (enabled by default). If you want to use the crate in a pure no_std environment without heap allocations, include it with default-features = false to disable the default alloc feature.

License

This crate is licensed under either of

at your option.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

You might also like...
The Fast Vector Similarity Library is designed to provide efficient computation of various similarity measures between vectors.
The Fast Vector Similarity Library is designed to provide efficient computation of various similarity measures between vectors.

Fast Vector Similarity Library Introduction The Fast Vector Similarity Library is designed to provide efficient computation of various similarity meas

Stack heap flexible string designed to improve performance for Rust

flexible-string A stack heap flexible string designed to improve performance. FlexibleString was first implemented in spdlog-rs crate, which improved

Rust crate for obfuscating string literals.

Obfustring This crate provides a obfuscation macro for string literals. This makes it easy to protect them from common reverse engineering attacks lik

Rust based magic-string with source map chains support

enhanced-magic-string Rust implementation of https://www.npmjs.com/package/magic-string with original sourcemap chain support. license. This project i

Simple string matching with questionmark- and star-wildcard operator

wildmatch Match strings against a simple wildcard pattern. Tests a wildcard pattern p against an input string s. Returns true only when p matches the

A crate for converting an ASCII text string or file to a single unicode character

A crate for converting an ASCII text string or file to a single unicode character. Also provides a macro to embed encoded source code into a Rust source file. Can also do the same to Python code while still letting the code run as before by wrapping it in a decoder.

A special rope, designed to work with any data type that is not String

AnyRope AnyRope is an arbitrary data type rope for Rust, designed for similar operations that a rope would do, but targeted at data types that are not

Compact, clone-on-write vector and string.

ecow Compact, clone-on-write vector and string. Types An EcoVec is a reference-counted clone-on-write vector. It takes up two words of space (= 2 usiz

Parses a relative time string and returns a `Duration`

humantime_to_duration A Rust crate for parsing human-readable relative time strings and converting them to a Duration. Features Parses a variety of hu

Owner
Colin Finck
Long-time ReactOS developer and Rust enthusiast, bringing both worlds together
Colin Finck
A peer-reviewed collection of articles/talks/repos which teach concise, idiomatic Rust.

This repository collects resources for writing clean, idiomatic Rust code. Please bring your own. ?? Idiomatic coding means following the conventions

Matthias 4.2k Dec 30, 2022
Rust-idiomatic, compliant, flexible and performant BIP21 crate

Rust implementation of BIP21 Rust-idiomatic, compliant, flexible and performant BIP21 crate. About Important: while lot of work went into polishing th

Martin Habovštiak 6 Dec 15, 2022
alto provides idiomatic Rust bindings for OpenAL 1.1 and extensions (including EFX).

alto alto provides idiomatic Rust bindings for OpenAL 1.1 and extensions (including EFX). WARNING Because Alto interacts with global C state via dynam

null 80 Aug 7, 2022
Safe, idiomatic bindings to cFE and OSAL APIs for Rust

n2o4 The n2o4 crate provides safe, idiomatic Rust bindings to the APIs of cFE and OSAL, the libraries of the Core Flight System (cFS). IMPORTANT NOTE

null 3 Aug 29, 2022
The Computer Language Benchmarks Game: Rust implementations

The Computer Language Benchmarks Game: Rust implementations This is the version I propose to the The Computer Language Benchmarks Game. For regex-dna,

Guillaume P. 69 Jul 11, 2022
Rust implementations of finite-field arithmetic based on big integers with configurable limb sizes

multiprecision TODO: rewrite readme implement CIOS for 16-bit limbs All operations are in little-endian form (where the digit in memory at the smalles

Geometry 4 Mar 30, 2024
The Rust Compiler Collection is a collection of compilers for various languages, written with The Rust Programming Language.

rcc The Rust Compiler Collection is a collection of compilers for various languages, written with The Rust Programming Language. Compilers Language Co

null 2 Jan 17, 2022
CloudLLM is a Rust library designed to seamlessly bridge applications with remote Language Learning Models (LLMs) across various platforms.

CloudLLM CloudLLM is a Rust library designed to seamlessly bridge applications with remote Language Learning Models (LLMs) across various platforms. W

null 4 Oct 13, 2023
📱️🚫️🌝️💾️ 3FakeIM is a joke program meant to imitate various fictional characters, and the "[CHARACTER] CALLED ME AT 3:00 AM" clickbait trend, while poking fun.

3FakeIM ??️??️??️??️ 3FakeIM is a joke program meant to imitate various fictional characters, and the "[CHARACTER] CALLED ME AT 3:00 AM" clickbait tre

Sean P. Myrick V19.1.7.2 2 Jul 3, 2023