144 Repositories
🎼 🧬 lightmotif A lightweight platform-accelerated library for biological motif scanning using position weight matrices. 🗺️ Overview Motif scanning
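murasaki: Nostr to Speech ⚠ This software is an alpha release ⚠ A timeline read-aloud tool that uses VOICEVOX. It reads aloud the global timeline of the specified relays, or the timeline of the users followed by the specified public key. Usage Install Rust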
Quickner ⚡ A simple, fast, and easy to use NER annotator for Python Quickner is a new tool to quickly annotate texts for NER (Named Entity Recognition
mtop mtop: top for Memcached. Features Display real-time statistics about your memcached servers such as Memory usage/limit Current/max connections Hi
tx-decoder A quick way to decode a contract's transaction data with only the contract address and ABI. E.g., let tx_data = "0xe70dd2fc00000000000000000
Is It Planet Yet? This is just an experiment: me trying to learn how to build real-time procedural planets. Any suggestions or contributions are welcome.
rust-bert Rust native Transformer-based models implementation. Port of Hugging Face's Transformers library, using the tch-rs crate and pre-processing
nlprule A fast, low-resource Natural Language Processing and Error Correction library written in Rust. nlprule implements a rule- and lookup-based app
regex A Rust library for parsing, compiling, and executing regular expressions. Its syntax is similar to Perl-style regular expressions, but lacks a f
simdutf8 – High-speed UTF-8 validation for Rust Blazingly fast API-compatible UTF-8 validation for Rust using SIMD extensions, based on the implementa
triple_accel Rust edit distance routines accelerated using SIMD. Supports fast Hamming, Levenshtein, restricted Damerau-Levenshtein, etc. distance cal
Noq Not Coq. Simple expression transformer that is not Coq. Quick Start $ cargo run ./examples/add.noq Main Idea The Main Idea is being able to define
frawk frawk is a small programming language for writing short programs processing textual data. To a first approximation, it is an implementation of t
A backend for mdBook written in Rust for generating PDF based on headless chrome and Chrome DevTools Protocol.
Provides an implementation of today's most used tokenizers, with a focus on performance and versatility. Main features: Train new vocabularies and tok
PDFRip Fast PDF password cracking utility equipped with commonly encountered password format builders and dictionary attacks. 📖 Table of Contents Int
cpc calculation + conversion cpc parses and evaluates strings of math, with support for units and conversion. 128-bit decimal floating points are used
glam A simple and fast 3D math library for games and graphics. Development status glam is in beta stage. Base functionality has been implemented and t
Table of Contents What does this library do? Why does this library exist? Which languages are supported? How good is it? Why is it better than other l
vtext NLP in Rust with Python bindings This package aims to provide a high performance toolkit for ingesting textual data for machine learning applica
Difftastic is an experimental structured diff tool that compares files based on their syntax.
rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc. rga is a line-oriented search tool that allows you to look for a r
Introduction finalfusion is a crate for reading, writing, and using embeddings in Rust. finalfusion primarily works with its own format which supports
Source text parsing, lexing, and AST related functionality for Deno.
rust-tokenizers Rust-tokenizer offers high-performance tokenizers for modern language models, including WordPiece, Byte-Pair Encoding (BPE) and Unigra
aho-corasick A library for finding occurrences of many patterns at once with SIMD acceleration in some cases. This library provides multiple pattern s
ipa_renamer A command line tool for renaming your ipa files quickly and easily. Usage ipa_renamer 0.0.1 A command line tool for renaming your ipa file
sabi Japanese version of https://github.com/bnjbvr/rouille. Shamelessly copied and updated from it. You can write Rust programs in Japanese! Example main.rs sabi::sabi! {
Table of Contents What does this tool do? Do I still need to learn to write regexes then? Current features How to install? 4.1 The command-line tool 4
The fastest way to identify anything lemmeknow ⚡ Identify any mysterious text or analyze strings from a file, just ask lemmeknow. lemmeknow can be use
html5gum html5gum is a WHATWG-compliant HTML tokenizer. use std::fmt::Write; use html5gum::{Tokenizer, Token}; let html = "title hello world/tit
sudachi.rs - English README 2021-12-09 UPDATE: 0.6.2 Release Try it: pip install --upgrade 'sudachipy>=0.6.2' sudachi.rs is a Rust implementation of Su
N-grams Documentation This crate takes a sequence of tokens and generates an n-gram for it. For more information about n-grams, check wikipedia: https
lingua-py lingua-rs Python binding. An accurate natural language detection library, suitable for long and short text alike. Installation pip install l
Vidyut मा भूदेवं क्षणमपि च ते विद्युता विप्रयोगः ॥ Vidyut is a lightning-fast toolkit for processing Sanskrit text. Vidyut aims to provide standard co
bottom encodes UTF-8 text into a sequence comprised of bottom emoji (with , sprinkled in for good measure) followed by 👉👈. It can encode any valid UTF-8 - being a bottom transcends language, after all - and decode back into UTF-8.
finalfrontier Introduction finalfrontier is a Rust program for training word embeddings. finalfrontier currently has the following features: Models: s
Rust subtitle utilities Are you looking for substudy? Try here. (substudy has been merged into the subtitles-rs project.) This repository contains a n
🐍 python-vibrato 🎤 Vibrato is a fast implementation of tokenization (or morphological analysis) based on the Viterbi algorithm. This is a Python wra
uwuify fastest text uwuifier in the west transforms Hey... I think I really love you. Do you want a headpat? into hey... i think i w-weawwy wuv you.
NNSplit A tool to split text using a neural network. The main application is sentence boundary detection, but e. g. compound splitting for German is a
2021-07-07 UPDATE: The official Sudachi team will take over this project (cf. 日本語形態素解析器 SudachiPy の 現状と今後について - Speaker Deck) sudachi.rs An official S
Find Files (ff) The Find Files (ff) utility recursively searches for files whose names match the specified RegExp pattern in the provided directory (defau
Ruplacer Find and replace text in source files: $ ruplacer old new src/ Patching src/a_dir/sub/foo.txt -- old is everywhere, old is old ++ new is ever
rs-natural Natural language processing library written in Rust. Still very much a work in progress. Basically an experiment, but hey maybe something c
decimal-rs High precision decimal with maximum precision of 38. Optional features serde When this optional dependency is enabled, Decimal implements t
Whatlang Natural language detection for Rust with focus on simplicity and performance. Content Features Get started Documentation Supported languages
WriteForAll: tips to make text better WriteForAll is a text file style checker, that compares text documents with editorial tips to make text better.
SyntaxDot Introduction SyntaxDot is a sequence labeler and dependency parser using Transformer networks. SyntaxDot models can be trained from scratch
Lindera A Japanese morphological analysis library in Rust. This project is a fork of fulmicoton's kuromoji-rs. Lindera aims to build a library which is e
Treebender A symbolic natural language parsing library for Rust, inspired by HDPSG. What is this? This is a library for parsing natural or constructed
suffix Fast linear time & space suffix arrays for Rust. Supports Unicode! Dual-licensed under MIT or the UNLICENSE. Documentation https://docs.rs/suff
Snips NLU Rust Installation Add it to your Cargo.toml: [dependencies] snips-nlu-lib = { git = "https://github.com/snipsco/snips-nlu-rs", branch = "mas
Textwrap Textwrap is a library for wrapping and indenting text. It is most often used by command-line programs to format dynamic output nicely so it l