Static-checked parsing of regexes into structs

Overview

Statically-checked regex parsing into structs.

This avoids common regex pitfalls such as:

  • Off-by-one capture indexes
  • Trying to fetch captures that don't exist
  • Capture names in the regex drifting out of sync with the names used to fetch fields

Installation

cargo add structre

Use

Define a structure and use this macro to generate a parser helper struct:

#[structre("(?P<key>[^:]+): (?P<value>\\d+)")]
struct KV {
    key: String,
    value: usize,
}

The parser is named KVFromRegex.

Instantiate the parser and use it like:

let kv_parser = KVFromRegex::new();
let m: KV = kv_parser.parse("hi: 39393")?;

parse returns a Result. An Err is returned only when there is no match or when one of the fields' from_str calls returns an Err. It follows that if you only have String fields, an Err always means no match, so you can treat it as a None.
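The two failure modes can be sketched with the standard library alone. The parse_kv function below is a hypothetical hand-written stand-in that only approximates the KV regex above; it is not the code the macro generates, and the ParseError type is invented for illustration. It shows input that fails to match, input that matches but overflows usize in from_str, and how .ok() collapses either case into an Option.

```rust
use std::num::ParseIntError;

#[derive(Debug, PartialEq)]
struct KV {
    key: String,
    value: usize,
}

/// Illustrative error type (an assumption, not structre's own error type).
#[derive(Debug, PartialEq)]
enum ParseError {
    /// The input did not have the expected shape at all.
    NoMatch,
    /// The input had the right shape, but a field's from_str failed.
    Field(ParseIntError),
}

/// Hypothetical stand-in for the generated parser: it roughly mimics
/// `(?P<key>[^:]+): (?P<value>\d+)` using plain string operations.
fn parse_kv(input: &str) -> Result<KV, ParseError> {
    let (key, digits) = input.split_once(": ").ok_or(ParseError::NoMatch)?;
    let shape_ok = !key.is_empty()
        && !digits.is_empty()
        && digits.bytes().all(|b| b.is_ascii_digit());
    if !shape_ok {
        return Err(ParseError::NoMatch);
    }
    // All digits, yet parsing can still fail: the number may overflow usize.
    let value = digits.parse::<usize>().map_err(ParseError::Field)?;
    Ok(KV { key: key.to_string(), value })
}

/// When only the "did it parse?" question matters, drop the error with `.ok()`.
fn parse_kv_opt(input: &str) -> Option<KV> {
    parse_kv(input).ok()
}
```

With a numeric field like value, an Err is not always "no match": a very long run of digits has the right shape but still fails in from_str. That is why the Err-as-None shortcut is only lossless when every field is a String.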

Supported types

The parsed data can be a structure with named fields or a tuple.

If it has named fields:

  • Every field must have an associated named capture
  • Every named capture must have an associated field
  • There must be no unnamed captures
  • Nested structures and tuples within structures aren't supported, since there's no clearly correct way to handle them.

If it's a tuple:

  • The number of captures must equal the number of tuple elements
  • There must be no named captures
  • Nested tuples are okay

Any field that implements std::str::FromStr will work.

Note: the static analysis stops at any non-inline type. For example, in struct MyData(SomeOtherData); the macro cannot see inside SomeOtherData, so even if SomeOtherData is a tuple or has fields, these will not be checked, and SomeOtherData will be parsed with FromStr from a single capture.
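In practice this means a separately defined type has to parse the whole text of its capture by itself. Below is a minimal sketch of such a type, assuming a made-up "WIDTHxHEIGHT" text format. The name SomeOtherData is borrowed from the note above, while the format and the SomeOtherDataError type are invented for illustration.

```rust
use std::fmt;
use std::str::FromStr;

/// Stand-in for a type defined elsewhere. The macro would not look inside it;
/// it would hand the text of one capture to `from_str`.
#[derive(Debug, PartialEq)]
struct SomeOtherData(u32, u32);

/// Illustrative error type carrying the rejected input.
#[derive(Debug, PartialEq)]
struct SomeOtherDataError(String);

impl fmt::Display for SomeOtherDataError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid SomeOtherData: {:?}", self.0)
    }
}

impl std::error::Error for SomeOtherDataError {}

impl FromStr for SomeOtherData {
    type Err = SomeOtherDataError;

    /// Parses the assumed "WIDTHxHEIGHT" format, e.g. "640x480".
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let err = || SomeOtherDataError(s.to_string());
        let (w, h) = s.split_once('x').ok_or_else(|| err())?;
        let w = w.parse::<u32>().map_err(|_| err())?;
        let h = h.parse::<u32>().map_err(|_| err())?;
        Ok(SomeOtherData(w, h))
    }
}
```

Any mismatch between the regex capture and what from_str accepts shows up only at run time as an Err, since the macro cannot check it.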

Limitations

As noted above, little type information is available to the macro, so only the annotated structure itself can be checked. Externally defined types will not be inspected.

I was hoping to ensure that the regex only admits valid characters for numeric fields, but because of the above and the difficulty of reasoning about the contents of regex ASTs, I had to scrap that.

Non-Unicode parsing isn't currently supported. I couldn't find an ASCII float parsing library, and this isn't maximally optimized in the first place due to the use of anyhow. If this is important and you have a vision of how it could work, please raise an issue!

Owner
Andrew Baxter