A small Rust library for creating regex-based lexers

Overview

reglex


Example

use reglex::{RuleList, rule_list, lex};

#[derive(Debug, PartialEq)]
enum Token {
    Keyword,
    Number(u64),
    Left,
    Right,
}

fn lexer(input: &String) -> Result<Vec<Token>, usize> {
    // Each rule pairs a regex with a closure that maps the matched text
    // to an optional token.
    let regexes: RuleList<Token> = rule_list! [
        "kw" => |_| Some(Token::Keyword),
        r"\d+" => |s: &str| Some(Token::Number(s.parse().unwrap())),
        r"\{" => |_| Some(Token::Left),
        r"\}" => |_| Some(Token::Right),
        // Whitespace yields no token.
        r"\s" => |_| None
    ];

    lex(&regexes, input)
}

fn main() {
    assert_eq!(
        lexer(&"kw  { 12 53 }".to_string()),
        Ok(vec![
            Token::Keyword,
            Token::Left,
            Token::Number(12),
            Token::Number(53),
            Token::Right
        ])
    );

    // No rule matches `ERROR!`, which starts at index 3.
    assert_eq!(lexer(&"kw ERROR! { 12 53 }".to_string()), Err(3));
}
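
To make the Ok/Err behaviour in the example easier to follow, here is a minimal std-only sketch of how a rule-list lexer of this kind could work. It is not reglex's actual implementation: the names Rule, lex_sketch and lexer_sketch are hypothetical, hand-written prefix matchers stand in for regexes so that no external crate is needed, and the "first matching rule wins" policy is an assumption, since the README does not say how reglex resolves overlapping rules.

```rust
// Std-only sketch of a rule-list lexer. NOT reglex's implementation:
// plain prefix-matching functions replace regexes (an assumption made so
// the example needs no external crates).

#[derive(Debug, PartialEq)]
enum Token {
    Keyword,
    Number(u64),
    Left,
    Right,
}

/// Hypothetical rule: `matcher` returns the byte length of a match at the
/// start of its input; `action` turns the matched text into an optional token.
struct Rule<T> {
    matcher: fn(&str) -> Option<usize>,
    action: fn(&str) -> Option<T>,
}

/// Tries the rules in order at the current position; the first match wins
/// (assumed policy). Returns Err(byte offset) when no rule matches.
fn lex_sketch<T>(rules: &[Rule<T>], input: &str) -> Result<Vec<T>, usize> {
    let mut tokens = Vec::new();
    let mut pos = 0;
    'outer: while pos < input.len() {
        let rest = &input[pos..];
        for rule in rules {
            if let Some(len) = (rule.matcher)(rest) {
                // Skip empty matches so the loop always makes progress.
                if len > 0 {
                    if let Some(tok) = (rule.action)(&rest[..len]) {
                        tokens.push(tok);
                    }
                    pos += len;
                    continue 'outer;
                }
            }
        }
        return Err(pos);
    }
    Ok(tokens)
}

// Stand-ins for the regexes "kw", r"\d+", r"\{", r"\}" and r"\s".
fn kw(s: &str) -> Option<usize> {
    s.starts_with("kw").then_some(2)
}
fn digits(s: &str) -> Option<usize> {
    let n = s.bytes().take_while(|b| b.is_ascii_digit()).count();
    (n > 0).then_some(n)
}
fn left(s: &str) -> Option<usize> {
    s.starts_with('{').then_some(1)
}
fn right(s: &str) -> Option<usize> {
    s.starts_with('}').then_some(1)
}
fn space(s: &str) -> Option<usize> {
    s.chars().next().filter(|c| c.is_whitespace()).map(|c| c.len_utf8())
}

/// Mirrors the `lexer` function from the example, using the sketch above.
fn lexer_sketch(input: &str) -> Result<Vec<Token>, usize> {
    let rules: Vec<Rule<Token>> = vec![
        Rule { matcher: kw, action: |_| Some(Token::Keyword) },
        Rule { matcher: digits, action: |s| Some(Token::Number(s.parse().unwrap())) },
        Rule { matcher: left, action: |_| Some(Token::Left) },
        Rule { matcher: right, action: |_| Some(Token::Right) },
        Rule { matcher: space, action: |_| None },
    ];
    lex_sketch(&rules, input)
}
```

Under these assumptions, lexer_sketch("kw ERROR! { 12 53 }") consumes "kw" and one space, then finds no rule that matches at byte offset 3 and reports that offset as the error, which is consistent with the Err(3) in the example. A production lexer would more likely anchor each compiled regex at the current position and might prefer the longest match over the first one.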
  
Owner
nph