SubStrings, Slices and Random String Access in Rust

Overview

SubStrings, Slices and Random String Access in Rust

This is a simple way to do it.

Description

  • Rust string processing is kind of hard, because text in a UTF-8 world has many complex details, and Rust exposes all that power and all that complexity to you, the programmer. Sometimes it can be over whelming. Sometimes you only want to have a simple substring or a slice and you don’t mind to pay it’s cost, because you really need this feature and the Standard Library doesn’t help you a lot there.

  • Fortunately carlomilanesi made this code available to all
    https://users.rust-lang.org/t/how-to-get-a-substring-of-a-string/1351/11

  • But if you have to do many text operations based on the positions of chars inside a strings this isn’t a really good option, because you have to scan all the strings to the correct position, from the start, to have the string divided it into the correct boundaries of the chars. In this context, you would happily pay a up front cost of transforming the string into a Vec, Vec of chars with individual chars separated, and process it as positional chars with access cost of 1 and then, slice them, range them, append to them at the end (or if you need to append in the start or the middle paying the cost of copy to a new buffer, but you can do it if you need to). The following code is my expansion to the code of carlomilanesi. It will allow you to do it.

. let mut vc = s.get_vec_chars(); println!("{}, {}, {}, {}.", vc[..5].to_string(), vc.to_string(), vc[3..8].to_string(), vc[3..].to_string()); // Reusing a String buffer from a Vec . let mut buf = String::new(); print!("{}, ", vc[..5].to_string_buf(& mut buf)); print!("{}, ", vc[..].to_string_buf(& mut buf)); print!("{}, ", vc[3..8].to_string_buf(& mut buf)); print!("{}.\n", vc[3..].to_string_buf(& mut buf)); // Random access to the Vec . for i in 0..(vc.len() - 2) { print!("{} ", vc[i..i+3].to_string_buf(& mut buf)); } println!(""); // Random modifications to the Vec . for i in (0..(vc.len() / 3) + 1).rev() { vc.insert(i*3, '#'); } println!("{} ", vc.to_string()); println!("{} ", vc.to_string_buf(& mut buf)); } // Output: // abcdè, abcdèfghij, dèfgh, dèfghij. // abcdè, abcdèfghij, dèfgh, dèfghij. // abcdè, abcdèfghij, dèfgh, dèfghij. // // abcdè, abcdèfghij, dèfgh, dèfghij. // abcdè, abcdèfghij, dèfgh, dèfghij. // abc bcd cdè dèf èfg fgh ghi hij // #abc#dèf#ghi#j // #abc#dèf#ghi#j ">
use std::ops::{Bound, RangeBounds};

trait StringUtils {
    fn substring(&self, start: usize, len: usize) -> &str;
    fn slice(&self, range: impl RangeBounds<usize>) -> &str;
    fn get_vec_chars(&self) -> Vec<char>;
}

impl StringUtils for str {
    fn substring(&self, start: usize, len: usize) -> &str {
        let mut char_pos = 0;
        let mut byte_start = 0;
        let mut it = self.chars();
        loop {
            if char_pos == start { break; }
            if let Some(c) = it.next() {
                char_pos += 1;
                byte_start += c.len_utf8();
            }
            else { break; }
        }
        char_pos = 0;
        let mut byte_end = byte_start;
        loop {
            if char_pos == len { break; }
            if let Some(c) = it.next() {
                char_pos += 1;
                byte_end += c.len_utf8();
            }
            else { break; }
        }
        &self[byte_start..byte_end]
    }
    fn slice(&self, range: impl RangeBounds<usize>) -> &str {
        let start = match range.start_bound() {
            Bound::Included(bound) | Bound::Excluded(bound) => *bound,
            Bound::Unbounded => 0,
        };
        let len = match range.end_bound() {
            Bound::Included(bound) => *bound + 1,
            Bound::Excluded(bound) => *bound,
            Bound::Unbounded => self.len(),
        } - start;
        self.substring(start, len)
    }
    fn get_vec_chars(&self) -> Vec<char> { self.chars().collect() }
}

trait StringUtilsVecChars {
    fn to_string(&self) -> String;
    fn to_string_buf<'a>(&self, buf: & 'a mut String) -> & 'a String;
}

impl StringUtilsVecChars for Vec<char> {
    fn to_string(&self) -> String { self.iter().collect() }
    fn to_string_buf<'a>(&self, buf: & 'a mut String) -> & 'a String {
        buf.clear();
        for c in self.iter() { buf.push(*c); }
        buf
    }
}

trait StringUtilsSlices {
    fn to_string(&self) -> String;
    fn to_string_buf<'a>(&self, buf: & 'a mut String) -> & 'a String;
}

impl StringUtilsSlices for [char] {
    fn to_string(&self) -> String { self.iter().collect() }
    fn to_string_buf<'a>(&self, buf: & 'a mut String) -> & 'a String {
        buf.clear();
        for c in self.iter() { buf.push(*c); }
        buf
    }
}

fn main() {
    let s = "abcdèfghij";
    // All three statements should print:
    // "abcdè, abcdèfghij, dèfgh, dèfghij."
    println!("{}, {}, {}, {}.",
        s.substring(0, 5),
        s.substring(0, 50),
        s.substring(3, 5),
        s.substring(3, 50));
    println!("{}, {}, {}, {}.",
        s.slice(..5),
        s.slice(..50),
        s.slice(3..8),
        s.slice(3..));
    println!("{}, {}, {}, {}.",
        s.slice(..=4),
        s.slice(..=49),
        s.slice(3..=7),
        s.slice(3..));

    // Allocating a string from Vec
      
       .
      
    let mut vc = s.get_vec_chars(); 
    println!("{}, {}, {}, {}.",
        vc[..5].to_string(),
        vc.to_string(),
        vc[3..8].to_string(),
        vc[3..].to_string());

    // Reusing a String buffer from a Vec
      
       .
      
    let mut buf = String::new();
    print!("{}, ", vc[..5].to_string_buf(& mut buf));
    print!("{}, ", vc[..].to_string_buf(& mut buf));
    print!("{}, ", vc[3..8].to_string_buf(& mut buf));
    print!("{}.\n", vc[3..].to_string_buf(& mut buf));
    
    // Random access to the Vec
      
       . 
      
    for i in 0..(vc.len() - 2) {
        print!("{} ", vc[i..i+3].to_string_buf(& mut buf));
    }
    println!("");
    
    // Random modifications to the Vec
      
       .
      
    for i in (0..(vc.len() / 3) + 1).rev() {
        vc.insert(i*3, '#');
    }
    println!("{} ", vc.to_string());
    println!("{} ", vc.to_string_buf(& mut buf));
}

// Output:
//    abcdè, abcdèfghij, dèfgh, dèfghij.
//    abcdè, abcdèfghij, dèfgh, dèfghij.
//    abcdè, abcdèfghij, dèfgh, dèfghij.
//
//    abcdè, abcdèfghij, dèfgh, dèfghij.
//    abcdè, abcdèfghij, dèfgh, dèfghij.
//    abc bcd cdè dèf èfg fgh ghi hij 
//    #abc#dèf#ghi#j
//    #abc#dèf#ghi#j

My Rust guide

All my other guides

Have fun!

Best regards,
Joao Nuno Carvalho

You might also like...
Rust crate which provides direct access to files within a Debian archive

debarchive This Rust crate provides direct access to files within a Debian archive. This crate is used by our debrep utility to generate the Packages

Low level access to ATmega32U4 registers in Rust

Deprecation Note: This crate will soon be deprecated in favor of avr-device. The approach of generating the svd from hand-written register definitions

Register access crate for AVR microcontrollers

avr-device Auto-generated wrappers around registers for AVR microcontrollers. Usage Add the following to Cargo.toml: [dependencies.avr-device] version

A simple string interner / symbol table for Rust projects.

Symbol Interner A small Rust crate that provides a naïve string interner. Consult the documentation to learn about the types that are exposed. Install

Rust library to detect bots using a user-agent string

Rust library to detect bots using a user-agent string

A flexible, simple to use, immutable, clone-efficient String replacement for Rust

A flexible, simple to use, immutable, clone-efficient String replacement for Rust. It unifies literals, inlined, and heap allocated strings into a single type.

A simple string parsing utility library for Rust, supporting no_std contexts.

strp Utility library for parsing data from an input string, or stdin if built with the std feature. Supports no_std contexts when built without the st

microtemplate - A fast, microscopic helper crate for runtime string interpolation.

microtemplate A fast, microscopic helper crate for runtime string interpolation. Design Goals Very lightweight: I want microtemplate to do exactly one

A memory efficient immutable string type that can store up to 24* bytes on the stack

compact_str A memory efficient immutable string type that can store up to 24* bytes on the stack. * 12 bytes for 32-bit architectures About A CompactS

Owner
João Nuno Carvalho
Hi, I’m a software engineer - computer science and electronics. I develop in many programming lang. I do command line, GUI and Web, Machine Learning and others.
João Nuno Carvalho
Count and convert between different indexing schemes on utf8 string slices

Str Indices Count and convert between different indexing schemes on utf8 string slices. The following schemes are currently supported: Chars (or "Unic

Nathan Vegdahl 11 Dec 25, 2022
A simple interpreter for the mathematical random-access machine

Random-access machine runner A simple Rust RAM program runner. Lexer/Parser Program executor Code formatter Web Compiled to WASM to run in the browser

Marcin Wojnarowski 5 Jan 14, 2023
ᎩᎦᎨᎢ (IPA: [gigagei]) is a random quote fetching console utility. Written in Rust.

gigagei ᎩᎦᎨᎢ (IPA: [gigagei]) is a random quote fetching console utility. Written in Rust. Installing Use latest pre-built binary from releases Buildi

veleth 10 Jun 17, 2022
A Rust library for random number generation.

A Rust library for random number generation.

null 1.3k Jan 6, 2023
A random.org client library for Rust

random.org A https://random.org client library. The randomness comes from atmospheric noise, which for many purposes is better than the pseudo-random

Victor Polevoy 7 Mar 10, 2022
Pure rust implementation of python's random module with compatible generator behaviour.

pyrand Pure rust implementation of (parts of) python's random module with compatible PRNG behaviour: seeding with equivalent values will yield identic

Stefan V. 4 Feb 10, 2024
File Tree Fuzzer allows you to create a pseudo-random directory hierarchy filled with some number of files.

FTZZ File Tree Fuzzer allows you to create a pseudo-random directory hierarchy filled with some number of files. Installation $ cargo +nightly install

Alex Saveau 22 Dec 28, 2022
Quad-rand implements pseudo-random generator

quad-rand quad-rand implements pseudo-random generator based on rust atomics. Compatible with wasm and any oth

Fedor Logachev 10 Feb 11, 2022
Rust wrapper for the Google Places API. Access their hundreds of millions of places, reviews, and ratings.

Google Places API Working Examples cargo run --example nearby_search cargo run --example place_details cargo run --example find_place cargo run --exam

Jared Ucherek 4 Jun 12, 2023
A string truncator and scroller written in Rust

scissrs A string truncator and scroller written in Rust. Usage scissrs --help covers the definitions of this program's flags.

Skybbles 5 Aug 3, 2022