nom-text

Goal: a library that extends nom to provide better tools for text formats (programming languages, configuration files).

current needs

Recognizing line and column in input data: nom-locate
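As a rough illustration of the information this need is about, the sketch below computes a 1-based line and column from a byte offset using only the standard library. It does not use nom or nom-locate; `line_col` is a hypothetical helper name, and counting columns in bytes is an assumption made to keep the example short.

```rust
/// Returns the 1-based (line, column) of a byte offset in `input`.
/// Hypothetical helper for illustration only; columns are counted in bytes.
pub fn line_col(input: &str, offset: usize) -> (usize, usize) {
    let bytes = input.as_bytes();
    // Clamp so that an offset past the end points at end of input.
    let offset = offset.min(bytes.len());
    let before = &bytes[..offset];
    // The line number is the number of newlines seen so far, plus one.
    let line = before.iter().filter(|&&b| b == b'\n').count() + 1;
    // The current line starts right after the last newline, or at 0.
    let line_start = before
        .iter()
        .rposition(|&b| b == b'\n')
        .map_or(0, |i| i + 1);
    (line, offset - line_start + 1)
}
```

Rescanning from the start of the input on every lookup is wasteful for large files; a span type that carries its position along, as nom-locate does, avoids that cost.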

aggregating more precise errors

Using nom-supreme

error recovery

Parsers need to be fault tolerant: recovering from the current error allows parsing the rest of the file. See https://eyalkalderon.com/blog/nom-error-recovery/

We should also support several error levels, such as warnings that do not stop compilation.
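A minimal sketch of both ideas, assuming a line-oriented `key = value` format invented for this example: a malformed line is recorded as an error and skipped, so parsing resumes at the next line, while an empty value only produces a warning. `Severity`, `Diagnostic` and `parse_config` are hypothetical names, and the code uses only the standard library rather than nom.

```rust
/// Hypothetical severity levels: only `Error` should stop compilation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Warning,
    Error,
}

/// A problem found while parsing, with the 1-based line it occurred on.
#[derive(Debug, PartialEq, Eq)]
pub struct Diagnostic {
    pub severity: Severity,
    pub line: usize,
    pub message: String,
}

/// Parses `key = value` lines. A bad line is reported and skipped, so one
/// mistake does not hide the entries and problems that come after it.
pub fn parse_config(input: &str) -> (Vec<(String, String)>, Vec<Diagnostic>) {
    let mut entries = Vec::new();
    let mut diagnostics = Vec::new();
    for (index, raw) in input.lines().enumerate() {
        let line = raw.trim();
        if line.is_empty() {
            continue;
        }
        match line.split_once('=') {
            Some((key, value)) if !key.trim().is_empty() => {
                let (key, value) = (key.trim(), value.trim());
                if value.is_empty() {
                    // Suspicious but still usable: warn and keep the entry.
                    diagnostics.push(Diagnostic {
                        severity: Severity::Warning,
                        line: index + 1,
                        message: format!("empty value for `{}`", key),
                    });
                }
                entries.push((key.to_string(), value.to_string()));
            }
            // Recovery point: record the error, then move on to the next line.
            _ => diagnostics.push(Diagnostic {
                severity: Severity::Error,
                line: index + 1,
                message: "expected `key = value`".to_string(),
            }),
        }
    }
    (entries, diagnostics)
}
```

A caller can then decide to fail only when at least one diagnostic has `Severity::Error`, and still print the warnings either way.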

precedence

Precedence rules can be annoying to write by hand.

precedence support: https://github.com/Geal/nom/pull/1362
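To show the kind of code this would save, here is a small precedence-climbing evaluator for integers with `+`, `-`, `*` and `/`, written with the standard library only. It is a sketch of the general technique, not the API proposed in the pull request above; all function names are hypothetical.

```rust
/// Binding power of a binary operator; a higher value binds tighter.
fn precedence(op: u8) -> Option<u8> {
    match op {
        b'+' | b'-' => Some(1),
        b'*' | b'/' => Some(2),
        _ => None,
    }
}

fn skip_spaces(input: &[u8], pos: &mut usize) {
    while *pos < input.len() && input[*pos] == b' ' {
        *pos += 1;
    }
}

/// Parses an unsigned decimal integer, returning None if there is none.
fn number(input: &[u8], pos: &mut usize) -> Option<i64> {
    skip_spaces(input, pos);
    let start = *pos;
    let mut value: i64 = 0;
    while *pos < input.len() && input[*pos].is_ascii_digit() {
        let digit = i64::from(input[*pos] - b'0');
        value = value.checked_mul(10)?.checked_add(digit)?;
        *pos += 1;
    }
    if *pos > start { Some(value) } else { None }
}

/// Parses and evaluates operators whose precedence is at least `min_prec`.
fn expr(input: &[u8], pos: &mut usize, min_prec: u8) -> Option<i64> {
    let mut lhs = number(input, pos)?;
    loop {
        skip_spaces(input, pos);
        let op = match input.get(*pos) {
            Some(&op) => op,
            None => return Some(lhs),
        };
        let prec = match precedence(op) {
            Some(prec) if prec >= min_prec => prec,
            _ => return Some(lhs),
        };
        *pos += 1;
        // Left associativity: the right operand may only contain operators
        // that bind strictly tighter than the current one.
        let rhs = expr(input, pos, prec + 1)?;
        lhs = match op {
            b'+' => lhs.checked_add(rhs)?,
            b'-' => lhs.checked_sub(rhs)?,
            b'*' => lhs.checked_mul(rhs)?,
            _ => lhs.checked_div(rhs)?, // only b'/' can reach this arm
        };
    }
}

/// Evaluates a whole expression; None on syntax error, overflow or x / 0.
pub fn evaluate(input: &str) -> Option<i64> {
    let bytes = input.as_bytes();
    let mut pos = 0;
    let value = expr(bytes, &mut pos, 1)?;
    skip_spaces(bytes, &mut pos);
    if pos == bytes.len() { Some(value) } else { None }
}
```

With this table, `evaluate("1 + 2 * 3")` gives `Some(7)` and `evaluate("8 - 3 - 2")` gives `Some(3)`. Adding an operator means adding one line to `precedence` and one arm to the final `match`, which is the part a reusable combinator could take over.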

bounded recursion

Most language parsers end up calling themselves recursively, which can overflow the stack on deeply nested input. The recursion depth should be bounded.
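One common way to bound recursion is to pass a depth counter down the call chain and fail with a regular error once it exceeds a limit. The sketch below does this for nested `[...]` groups, using the standard library only; `MAX_DEPTH`, `NestError` and both function names are hypothetical, and the limit of 64 is an arbitrary value chosen for the example.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum NestError {
    /// The input nests deeper than `MAX_DEPTH`.
    TooDeep,
    /// A bracket is missing or unexpected.
    Unbalanced,
}

/// Arbitrary limit chosen for this sketch.
pub const MAX_DEPTH: usize = 64;

/// Parses a run of `[...]` groups at nesting level `depth` and returns the
/// deepest level reached. Fails before recursing past `MAX_DEPTH`.
fn groups(input: &[u8], pos: &mut usize, depth: usize) -> Result<usize, NestError> {
    if depth > MAX_DEPTH {
        return Err(NestError::TooDeep);
    }
    let mut deepest = depth;
    while input.get(*pos) == Some(&b'[') {
        *pos += 1;
        deepest = deepest.max(groups(input, pos, depth + 1)?);
        if input.get(*pos) != Some(&b']') {
            return Err(NestError::Unbalanced);
        }
        *pos += 1;
    }
    Ok(deepest)
}

/// Returns the maximum nesting depth of a bracket-only input.
pub fn max_nesting(input: &str) -> Result<usize, NestError> {
    let bytes = input.as_bytes();
    let mut pos = 0;
    let deepest = groups(bytes, &mut pos, 0)?;
    if pos == bytes.len() {
        Ok(deepest)
    } else {
        Err(NestError::Unbalanced)
    }
}
```

Because the check runs before each recursive descent, an input made of thousands of opening brackets returns `NestError::TooDeep` after a fixed number of calls instead of crashing the process.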

nicer error diagnostics

If we have error spans, we should be able to display more context, as rustc's errors do.
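A minimal sketch of what a span makes possible: given a byte range in the source, print the message, the position, the offending line and a row of carets under the span. The output layout is loosely inspired by rustc but invented for this example; `render` is a hypothetical name, columns are counted in bytes, and spans that cross a line break are cut at the end of the first line.

```rust
/// Renders `message` with the source line containing `start` and carets
/// under the bytes `start..end`. Illustrative only: tabs and wide
/// characters are not accounted for when aligning the carets.
pub fn render(source: &str, start: usize, end: usize, message: &str) -> String {
    let bytes = source.as_bytes();
    let start = start.min(bytes.len());
    // The line starts after the previous newline and ends before the next.
    let line_start = bytes[..start]
        .iter()
        .rposition(|&b| b == b'\n')
        .map_or(0, |i| i + 1);
    let line_end = bytes[start..]
        .iter()
        .position(|&b| b == b'\n')
        .map_or(bytes.len(), |i| start + i);
    let line = bytes[..start].iter().filter(|&&b| b == b'\n').count() + 1;
    let column = start - line_start + 1;
    // Always draw at least one caret, and never run past the end of the line.
    let width = end.min(line_end).saturating_sub(start).max(1);
    format!(
        "error: {}\n --> line {}, column {}\n  | {}\n  | {}{}\n",
        message,
        line,
        column,
        &source[line_start..line_end],
        " ".repeat(start - line_start),
        "^".repeat(width),
    )
}
```

For the input `let x = ;` with the span `8..9`, this prints the message, then `line 1, column 9`, then the source line with a single caret under the `;`.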

tokenization

Should we support multiple phases, with &str -> Vec<Token> functions followed by &[Token] -> AST functions?
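To make the question concrete, here is a sketch of the two phases for a tiny parenthesized-list syntax invented for this example, using the standard library only. `Token`, `Ast`, `tokenize` and `parse` are hypothetical names, and the exact signatures are one possible choice among several.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Token {
    LParen,
    RParen,
    Atom(String),
}

#[derive(Debug, PartialEq, Eq)]
pub enum Ast {
    Atom(String),
    List(Vec<Ast>),
}

/// Phase 1: &str -> Vec<Token>. Whitespace separates atoms and is dropped.
pub fn tokenize(input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = input.chars().peekable();
    while let Some(&c) = chars.peek() {
        match c {
            '(' => {
                tokens.push(Token::LParen);
                chars.next();
            }
            ')' => {
                tokens.push(Token::RParen);
                chars.next();
            }
            c if c.is_whitespace() => {
                chars.next();
            }
            _ => {
                // Collect characters until a delimiter or whitespace.
                let mut atom = String::new();
                while let Some(&c) = chars.peek() {
                    if c == '(' || c == ')' || c.is_whitespace() {
                        break;
                    }
                    atom.push(c);
                    chars.next();
                }
                tokens.push(Token::Atom(atom));
            }
        }
    }
    tokens
}

/// Phase 2: &[Token] -> AST. Returns one node and the unconsumed tokens,
/// mirroring the (output, remaining input) shape of parser combinators.
pub fn parse(tokens: &[Token]) -> Option<(Ast, &[Token])> {
    match tokens.split_first()? {
        (Token::Atom(name), rest) => Some((Ast::Atom(name.clone()), rest)),
        (Token::LParen, mut rest) => {
            let mut items = Vec::new();
            loop {
                if let Some((Token::RParen, after)) = rest.split_first() {
                    return Some((Ast::List(items), after));
                }
                // An unclosed list runs out of tokens here and yields None.
                let (item, after) = parse(rest)?;
                items.push(item);
                rest = after;
            }
        }
        (Token::RParen, _) => None,
    }
}
```

The second phase never looks at characters or whitespace, which keeps the grammar code small. The cost is that tokens no longer carry source positions unless each one stores a span, and `parse` here is itself unbounded recursion, so a real design would need to address both.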
