Display strings in a safe platform-appropriate way

os_display

Printing strings can be tricky. They may contain control codes that mess up the message or the whole terminal. On Unix even filenames can contain characters like that.

Filenames may also contain invalid Unicode, which is not preserved by Path::display.
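
The loss can be observed with the standard library alone. The following is a minimal, Unix-only sketch (the helper name display_lossy is made up for illustration): two different invalid bytes both come out as the replacement character U+FFFD, so the original filename can't be recovered from the printed text.

```rust
use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;
use std::path::Path;

/// Hypothetical helper: renders raw bytes as a path using `Path::display`.
/// Unix-only, because it builds the `OsStr` directly from bytes.
fn display_lossy(bytes: &[u8]) -> String {
    Path::new(OsStr::from_bytes(bytes)).display().to_string()
}
```

Calling display_lossy(b"x\xFFy") and display_lossy(b"x\xFEy") yields the same string, even though the two names refer to different files.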

Finally, they can contain special characters that aren't safe to use in a command without quoting or escaping.

This library lets you add quoting to filenames (and other strings) to display them more safely and usefully. The goal is to render them in such a way that they can be copied and pasted back into a shell without information loss.

On Unix (and other platforms) values are quoted using bash/ksh syntax, while on Windows, PowerShell syntax is used.

When should I use this?

This library is best suited for command line programs that deal with arbitrary filenames or other "dirty" text. mv, for example, is the very tool you use to rename files with problematic names, so it's nice if its messages handle them well.

Programs that aren't expected to deal with weird data don't get as much benefit.

The output is made for shells, so displaying it in e.g. a GUI may not make sense.

Most programs get along fine without this. You likely don't strictly need it, but you may find it a nice improvement.

Usage

Import the Quotable trait:

use os_display::Quotable;

This adds two methods to the common string types (including OsStr): .quote() and .maybe_quote(). They return Quoted, a wrapper with a custom Display implementation.

.quote() always puts quotes around the text:

// Found file 'filename'
println!("Found file {}", "filename".quote());

// Found file "foo'bar"
println!("Found file {}", "foo'bar".quote());

// Unix: Found file $'foo\nbar'
// Windows: Found file "foo`nbar"
println!("Found file {}", "foo\nbar".quote());

.maybe_quote() only adds them if necessary because of whitespace or special characters:

// filename: Not found
println!("{}: Not found", "filename".maybe_quote());

// 'foo bar': Not found
println!("{}: Not found", "foo bar".maybe_quote());

// '*?$': Not found
println!("{}: Not found", "*?$".maybe_quote());

.quote() is best used inside longer sentences while .maybe_quote() can be used for text that's already separated some other way (like by a colon).

Limitations

  • Unicode may be quoted but only control characters are escaped. The printed text can still look weird, and a few (buggy) terminals drop certain characters.
  • This library should not be used to interpolate text into shell scripts. It's designed for readability, not absolute safety. Consider using the shell-escape crate instead (or ideally, passing in the values in some other way).
  • The output is not compatible with every single shell.
  • PowerShell treats quotes differently in arguments to external commands. This library is tuned for arguments to internal cmdlets.
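
As one way of "passing in the values in some other way", a program can hand each value to the child process as its own argument instead of building a shell command line. The sketch below uses only std::process::Command; the function name list_command and the choice of ls are illustrative assumptions, not part of this library.

```rust
use std::ffi::OsStr;
use std::process::Command;

/// Hypothetical helper: builds (but does not run) `ls -l -- <name>`.
/// The name is passed as a single argument, so no shell ever parses it
/// and no quoting or escaping is needed.
fn list_command(name: &OsStr) -> Command {
    let mut cmd = Command::new("ls");
    // `--` keeps names that start with `-` from being read as options.
    cmd.arg("-l").arg("--").arg(name);
    cmd
}
```

A name such as "foo; rm -rf $HOME" stays one literal argument here. os_display is still useful for the message you print about that file.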

Invalid Unicode

On Unix:

use std::ffi::OsStr;
use std::os::unix::ffi::OsStrExt;

// \xFF makes this invalid UTF-8, so to_str() would fail
let bad_string = OsStr::from_bytes(&[b'x', 0xFF, b'y']);
assert_eq!(bad_string.quote().to_string(), r#"$'x\xFFy'"#);

On Windows:

use std::ffi::OsString;
use std::os::windows::ffi::OsStringExt;

// 0xD800 is an unpaired surrogate, making this invalid UTF-16
let bad_string = OsString::from_wide(&[b'a' as u16, 0xD800, b'b' as u16]);
assert_eq!(bad_string.quote().to_string(), r#""a`u{D800}b""#);

Zero-width Unicode

Some codepoints are zero-width. They can make a string invisible, or they can make it hard to select. GNU tools struggle with this:

$ wc $'\u200b'
wc: ​: No such file or directory

os_display adds quotes in such cases:

assert_eq!("\u{200B}".maybe_quote().to_string(), "'\u{200B}'");

It still misleadingly looks like '' when printed, but it's possible to copy and paste it and get the right result.
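
To see why such strings need special handling, consider a naive check built from the standard library's character predicates. This is a sketch with made-up helper names and is not how os_display decides when to quote: U+200B is neither whitespace nor a control character as far as char::is_whitespace and char::is_control are concerned, so the naive check lets it through.

```rust
/// Naive, hypothetical check: a string is "plain" when it is non-empty and
/// has no whitespace or control characters. Zero-width characters slip past.
fn naive_is_plain(s: &str) -> bool {
    !s.is_empty() && !s.chars().any(|c| c.is_whitespace() || c.is_control())
}

/// Lists each character as a `U+XXXX` label, which makes invisible
/// characters show up when debugging.
fn codepoints(s: &str) -> Vec<String> {
    s.chars().map(|c| format!("U+{:04X}", c as u32)).collect()
}
```

naive_is_plain("\u{200B}") returns true, while codepoints("a\u{200B}") reveals the hidden character as U+200B.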

Feature flags

By default you can only use the current platform's quoting style. That's appropriate most of the time.

windows/unix

The windows and unix optional features can be enabled to add constructors to Quoted.

Quoted::unix("some string") will quote with bash/ksh syntax no matter the platform, and Quoted::windows("etc") uses PowerShell syntax.

Quoted::unix_raw and Quoted::windows_raw take &[u8] (for malformed UTF-8) and &[u16] (for malformed UTF-16), respectively.
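
The raw constructors matter for data that can't be represented as a &str. The following standard-library sketch (helper names are made up for illustration) shows how to tell which inputs fall into that category, and how to obtain UTF-16 code units from ordinary text.

```rust
/// Hypothetical helper: true if `bytes` is valid UTF-8 and could be a `&str`.
fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

/// Hypothetical helper: true if `units` is valid UTF-16, that is,
/// it contains no unpaired surrogates.
fn is_valid_utf16(units: &[u16]) -> bool {
    String::from_utf16(units).is_ok()
}

/// Hypothetical helper: encodes text as UTF-16 code units, the shape a
/// `&[u16]` parameter expects.
fn to_wide(s: &str) -> Vec<u16> {
    s.encode_utf16().collect()
}
```

Inputs such as [b'x', 0xFF, b'y'] or [0x61, 0xD800, 0x62] fail these checks, so they can only be quoted through the raw variants.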

native

The native feature (enabled by default) is required for the Quotable trait and the Quoted::native(&str) and Quoted::native_raw(&OsStr) constructors. If it's not enabled then the quoting style has to be chosen explicitly.

alloc/std

This crate is no_std-compatible if the alloc and/or std features are disabled.

The std feature is required to quote OsStrs. The alloc feature is required for Quoted::windows_raw.

Alternative constructors

Quoted has constructors for specific styles as well as Quoted::native() and Quoted::native_raw(). These can be used as an alternative to the Quotable trait if you prefer boring functions.

By default quotes are always added. To get behavior like .maybe_quote() use the .force() method:

println!("{}", Quoted::native(x).force(false));

Testing

The Unix implementation has been fuzzed against bash, zsh, mksh, ksh93 and busybox to ensure all output is interpreted back as the original string. It has been fuzzed to a more limited extent against fish, dash, tcsh, posh, and yash (which don't support all of the required syntax).

The PowerShell implementation has been fuzzed against PowerShell Core 7.1.4 running on Linux.

Acknowledgments

This library is modeled after the quoting done by Gnulib as seen in the GNU coreutils. The behavior is not identical, however:

  • GNU uses octal escapes, like \377 instead of \xFF.
  • GNU eagerly switches quoting style midway through, like ''$'\n''xyz' instead of $'\nxyz'. os_display avoids this unless necessary.
  • GNU escapes unassigned codepoints instead of leaving their handling up to the terminal.
  • GNU doesn't handle zero-width codepoints specially.

The first version of this code was written for the uutils project. The feedback and the opportunity to use it in a large codebase were helpful.

Releases: v0.1.3
Owner: Jan Verbeek