Rust Text processing
154 Repositories
A lightweight library with vehicle tuning utilities. This includes utilities for communicating with OBD-II services, firmware downloading/flashing, and table modifications.
fasttext-rs fastText Rust binding Installation Add it to your Cargo.toml: [dependencies] fasttext = "0.6" Add extern crate fasttext to your crate root
word2vec Rust interface to word2vec word vectors. This crate provides a way to read a trained word vector file from word2vec. It doesn't provide model
pafplot A base-level sequence alignment rasterizer / dotplot generator overview In the process of generating alignments between whole genomes, we ofte
webster-rs A Rust library containing an offline version of Webster's dictionary. Add to Cargo.toml webster = "0.3.0" Simple example: fn main() { le
bytelines This library provides an easy way to read in input lines as byte slices for high efficiency. It's basically lines from the standard library,
lipi Lipi (Sanskrit for 'writing, letters, alphabet') is a pure Rust crate that provides font independent text analysis support for shaping and layout
A simple cross-platform markdown viewer Usage markdown-viewer use the system file dialog to choose a markdown file to view markdown-viewer my_file.md
GUT CHAMPION Summary Gut Champion is a platformer fighter inspired by Super Smash Bros. The goal is to knock the enemy off stage. The more you hit you
BlingFire in Rust blingfire is a thin Rust wrapper for the BlingFire tokenization library. Add the library to Cargo.toml to get started cargo add blin
STFU-8: Sorta Text Format in UTF-8 STFU-8 is a hacky text encoding/decoding protocol for data that might be not quite UTF-8 but is still mostly UTF-8.
sweetpaste sweetpaste is a sweet n' simple pastebin server. It's completely server-side, with zero client-side code. Configuration The configuration w
CREATURE FEATUR(ization) A crate for polymorphic ML & NLP featurization that leverages zero-cost abstraction. It provides composable n-gram combinator
ingLix / ˈɪŋ glɪʃ / English done right. An opinionated, better system for spelling words in English. Preamble Click to expand. The English language is
ter - Text Expression Runner ter is a cli to run text expressions and perform basic text operations such as filtering, ignoring and replacing on the c
cutters A rule based sentence segmentation library. 🚧 This library is experimental. 🚧 Features Full UTF-8 support. Robust parsing. Language specific
NomBytes nombytes is a library that provides a wrapper for the bytes::Bytes byte container for use with nom. I originally made this so that I could ha
backtrack-rs 🦀 backtrack lets you solve backtracking problems simply and generically. Problems are defined by their scope and checks against possible
serde_sheets Read and write structs directly from google sheets using serde and csv Implement serde::Serialize to write and serde::Deserialize to read
A lightweight and snappy crate to remove emojis from a string.
Sobol-Burley A seedable Owen-scrambled Sobol sequence based on the paper Practical Hash-based Owen Scrambling by Brent Burley, but with an improved ha
Front-coding string dictionary in Rust This is a Rust library of the (plain) front-coding string dictionary described in Martínez-Prieto et al., Pract
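ffi-pinyin This project converts Chinese characters to pinyin, using Rust to build a dynamic library that PHP can call. The library was built mainly to improve the performance of Chinese-to-pinyin conversion in PHP (especially for long articles). Environment Requires PHP 7.4 or later with the FFI extension enabled. If you want to compile the library yourself, you also need the Rust toolchain. You also need to set php.ini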
helfsteal Simple Data Stealer Hi All, I published basic data stealer malware with Rust. FOR EDUCATIONAL PURPOSES. You can use it for Red Team operatio
miniball Minimum enclosing ball. NOTE: This crate requires nightly Rust. Finds circumscribed n-ball of set of bounds. Finds minimum n-ball enclosing s
SEFF - Simple Embeddable Font Format This crate is designed to allow decent text rendering in resource-constrained environments like microcontrollers.
Yoin - A Japanese Morphological Analyzer yoin is a Japanese morphological analysis engine written in pure Rust. mecab-ipadic is embedded in yoin. :) $
Warning: sticker is succeeded by SyntaxDot, which supports many new features: Multi-task learning. Pretrained transformer models, such as BERT and XL
lumberjack Read and process constituency trees in various formats. Install: From crates.io: cargo install lumberjack-utils From GitHub: cargo install
Todo_r Find all your notes with one command! Todo_r is a simple Rust command-line utility that keeps track of your todo items in code. It is pronounce
Naive CityHash naive-cityhash is a naive native 128-bit cityhash v102 implementation for clickhouse*. Contact Chojan Shang - @PsiACE - psiace@outlook.
Lax An argument substitution utility intended to make working on projects with deeply nested directories a bit easier. "Lax" stands for "Locate Args a
zgclp Zgclp (Zero-grammar definition command-line parser) is one of Rust's command-line parsers. A normal command-line parser generates a parser from
A Rust wrapper for the Text synthesization service TextSynth API
DeepFrog - NLP Suite Introduction DeepFrog aims to be a (partial) successor of the Dutch-NLP suite Frog. Whereas the various NLP modules in Frog were b
A small Rust library for creating regex-based lexers
xtop extremely-simplified top Depends xtop depends on below relatively primitive crates: ncurses: TUI sysconf: only to get a jiffy. signal-hook: to ha
dpar Introduction dpar is a neural network transition-based dependency parser. The original Go version can be found in the oldgo branch. Dependencies
Convert KOReader highlights and notes to Markdown
quiz it's all kinda shoddy right now but, assuming sqlx cli is installed, and npm on lts or something recent-ish cd api docker-compose up -d sqlx data
Markdown-like text parser.
distance distance: Uses the Jaro similarity metric to measure the distance between two strings FYI, this was just to test Neon, I do not recommend usi
Bindings for CNTK library Simple low level bindings for CNTK library from Microsoft. API Documentation Status Currently exploring ways how to interact
attorand from 'atto', meaning smaller than small, and 'rand', short for random. A small random number generator hacked on top of Rust's standard libra
Periodic Words Have you ever wondered which words you can spell using only element abbreviations from the periodic table? Well thanks to this extremel
SloWord2Vec This is a naive implementation of Word2Vec implemented in Rust. The goal is to learn the basic principles and formulas behind Word2Vec. BT