229 Repositories
Rust text-processing Libraries
Tantivy is a full text search engine library written in Rust.
Tantivy is a full text search engine library written in Rust. It is closer to Apache Lucene than to Elasticsearch or Apache Solr in the sense it is no
hj is a command line tool to convert HTTP/1-style text into JSON
hj hj is a command line tool to convert HTTP/1-style text into JSON. This command is inspired by yusukebe/rj, which is a standalone HTTP client that s
Terminal text styling via ANSI escape sequences.
Iridescent Features iridescent is a library for styling terminal text easily. It supports basic ANSI sequences, Xterm-256 colors, and RGB. You can ope
Gyroflow is an application that can stabilize your video by using motion data from a gyroscope and optionally an accelerometer
Gyroflow is an application that can stabilize your video by using motion data from a gyroscope and optionally an accelerometer. Modern cameras record that data internally (GoPro, Sony, Insta360, etc.), and this application uses it to stabilize the captured footage precisely. It can also use gyro data from an external source (e.g. from Betaflight blackbox).
Ultra-fast, spookily accurate text summarizer that works on any language
pithy 0.1.0 - an absurdly fast, strangely accurate, summariser Quick example: pithy -f your_file_here.txt --sentences 4 --help: Print this help messa
frawk is a small programming language for writing short programs processing textual data
frawk frawk is a small programming language for writing short programs processing textual data. To a first approximation, it is an implementation of t
Dataflow is a data processing library, primarily for machine learning
Dataflow Dataflow is a data processing library, primarily for machine learning. It provides efficient pipeline primitives to build a directed acyclic
A Rust wrapper for the TextSynth text synthesis API
A Rust wrapper for the TextSynth text synthesis API
Turns lines of text into SVG files.
Sentences 2 svg Does what it says on the tin. This takes in a file with some sentences and outputs numbered svgs. There are 3 arguments to keep in min
Wasm video filter booth app written in Rust
Video effect booth written in Rust and WebAssembly Play with it here: https://mtharrison.github.io/wasmbooth/ Aim I wrote this purely to teach myself
Deduplicating Training Data Makes Language Models Better
Deduplicating Training Data Makes Language Models Better This repository contains code to deduplicate language model datasets as described in the paper
Rust bindings for darknet
Rust bindings for darknet Darknet: Convolutional Neural Networks todo rewrite the demo function used in yolo.c in rust Examples Link existing files an
A lightweight full-text search library that provides full control over the scoring calculations
probly-search · A full-text search library, optimized for insertion speed, that provides full control over the scoring calculations. This start initia
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
Apache Arrow Powering In-Memory Analytics Apache Arrow is a development platform for in-memory analytics. It contains a set of technologies that enabl
Making Postgres and Elasticsearch work together like it's 2021
Making Postgres and Elasticsearch work together like it's 2021 Readme ZomboDB brings powerful text-search and analytics features to Postgres by using
Web-app that simulates different types of color blindness from a user-provided input image.
What is Dalted? Web-app that simulates different types of color blindness from a user-provided input image. Resources The color blindness simulation i
A Text User Interface library for the Rust programming language
Cursive Cursive is a TUI (Text User Interface) library for Rust. It uses ncurses by default, but other backends are available. It allows you to build
UNIC: Unicode and Internationalization Crates for Rust
UNIC: Unicode and Internationalization Crates for Rust https://github.com/open-i18n/rust-unic UNIC is a project to develop components for the Rust pro
Ginkgo is a text editor built entirely in Rust
Ginkgo is a text editor built entirely in Rust. It supports cursor movements, CTRL commands, select vim commands, insert vs. normal modes, and more. Ginkgo is based on my text editor JED, which itself was based on the popular online editor Kilo.
An advanced image processing library for Rust.
ImageProc Maintainers: @chyh1990 Note: this project is under active development, API may change! imageproc is an advanced image processing library fo
IntelliJ version of the Afterglow Sublime Text theme
Afterglow IntelliJ This theme for IntelliJ is based on the Afterglow Sublime Text theme, and replaces the default sidebar icons and colour of Inte
A tool that generates a Sublime Text project file that helps you get started using Scoggle.
README A tool that generates a Sublime Text project file that helps you get started using Scoggle. While Scoggle-Gen may not find every single source
Cross-platform realtime MIDI processing in Rust.
midir Cross-platform, realtime MIDI processing in Rust. Features midir is inspired by RtMidi and supports the same features*, including virtual ports
dye is a tool to easily color text in shell.
Dye dye is a tool to easily color text in shell. Usage See the gif below to see these commands in action. echo $(dye --red WARN) This tool will knock
Allows processing of iterators of Result types
try-continue try-continue provides one method, try_continue, which allows you to work with iterators of type Result<T, _>, as if they were simply iter
Merge together and efficiently time-sort compressed .pcap files stored in AWS S3 object storage (or locally) to stdout for pipelined processing.
Merge together and efficiently time-sort compressed .pcap files stored in AWS S3 object storage (or locally) to stdout for pipelined processing. High performance and parallel implementation for 10 Gbps playback throughput with large numbers of files (~4k).
Cloud backup service for the Rite text editor
rite-cloud A cloud backup solution for the rite text editor. Written in Rust using Tide. Usage You can try out rite-cloud using the instance hosted he
My own personal code editor built with Rust + OpenGL
Glyph This is my personal code editor that I am building for fun and to get more familiar with OpenGL. Glyph currently supports Vim keybinds, syntax h
Parallel iterator processing library for Rust
Parallel iterator processing library for Rust I keep needing one, so I wrote it. See [IteratorExt] for supported operations. In essence, if you have:
Synchronized shadow state of Solana programs available for off-chain processing.
Solana Shadow The Solana Shadow crate adds shadows to Solana on-chain accounts for off-chain processing. This crate synchronises all accounts and the
Markdown-like text parser.
Markdown-like text parser.
WriteForAll is a text file style checker, that compares text documents with editorial tips to make text better.
WriteForAll: tips to make text better WriteForAll is a text file style checker, that compares text documents with editorial tips to make text better.
Auritia is a DAW coded in Rust and Vue in hopes of having cross-platform compatibility, while also providing enough features for anyone to use professionally
Steps Install WebView if you're not on Windows 11 Install Node deps npm i To run the dev server do npm run tauri dev Compiling Linux You will need to
Subtext is a text-based, block-oriented hypertext format.
Subtext: markup for note-taking Subtext is a text-based, block-oriented hypertext format. It is designed with note-taking in mind. It has a simple, pe
Image proxy and embed generator.
January Description Image proxy and metadata scraper. Features: Can scrape metadata from websites, e.g. OpenGraph Can scrape embeds from websites, e.g
Searches plain-text files for lines that match a given string. Built with Rust.
Getting Started This is a minimal grep command-line utility built with Rust. It searches plain-text files for lines that match a given str
Common processing blocks used with your Runes.
Common Processing Blocks (API Docs) Processing blocks built by Hammer of the Gods that you can use with your Runes. License This project is licensed u
rbdt is a python library (written in rust) for parsing robots.txt files for large scale batch processing.
rbdt 🚨 🚨 🚨 🚨 rbdt is a work in progress, currently being extracted out of another (private) project for the purpose of open sourcing and better so
Text-based to-do management CLI & language server
☑️ Todome (日本語版はこちら) Todome is a notation developed for making and editing to-do lists. It is inspired by Todo.txt, and in fact some of the todome not
Scripting language focused on processing tabular data.
ogma Welcome to the ogma project! ogma is a scripting language focused on ergonomically and efficiently processing tabular data, with batteries includ
lingua-rs Python binding. An accurate natural language detection library, suitable for long and short text alike.
lingua-py lingua-rs Python binding. An accurate natural language detection library, suitable for long and short text alike. Installation pip install l
Font independent text analysis support for shaping and layout.
lipi Lipi (Sanskrit for 'writing, letters, alphabet') is a pure Rust crate that provides font independent text analysis support for shaping and layout
Following along with the Geometry Processing with Intrinsic Triangulations course in Rust.
Intrinsic Triangulations in Rust This repo contains code I wrote while following along with Nicholas Sharp, Mark Gillespie, and Keenan Crane's course on geomet
🐎 Daac Horse: Double-Array Aho-Corasick in Rust
🐎 daachorse Daac Horse: Double-Array Aho-Corasick Overview A fast implementation of the Aho-Corasick algorithm using Double-Array Trie. Examples use
🔎 A simple in-memory search for collections and key-value stores.
Indicium Search 🔎 A simple in-memory search for collections (Vec, HashMap, BTreeMap, etc) and key-value stores. Features autocompletion. There are ma
A simple image average color extractor written in 🦀 Rust
A simple image average color extractor written in 🦀 Rust
colorStyle is a library of styles for command-line text, written in Rust.
Colorstyle colorStyle is a library of styles for command-line text. Inspired by flylog/colorstyle (golang) Example let text = colorstyle::green("gre
better tools for text parsing
nom-text Goal: a library that extends nom to provide better tools for text formats (programming languages, configuration files). current needs Recogni
Swash is a pure Rust, cross-platform crate that provides font introspection, complex text shaping and glyph rendering.
Swash is a pure Rust, cross-platform crate that provides font introspection, complex text shaping and glyph rendering. Goals This crate aims to
full text search manpages
buke full text search manpages cargo run --release -- --build builds an sqlite3 database out of all manpages in your $MANPATH cargo run --release -- "
Source text parsing, lexing, and AST related functionality for Deno
Source text parsing, lexing, and AST related functionality for Deno.
The fastest way to identify any mysterious text or analyze strings from a file, just ask `lemmeknow` !
The fastest way to identify anything lemmeknow ⚡ Identify any mysterious text or analyze strings from a file, just ask lemmeknow. lemmeknow can be use
mdzk is a plain text Zettelkasten system that is based on the mdBook API.
mdzk A lovingly designed system and static publishing tool for your plain text Zettelkasten mdzk is a plain text Zettelkasten system that is based on
Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies.
Untanglr Untanglr takes in some mangled words and makes sense out of them so you don't have to. It goes through the input and splits it probabilistic
Automated image compression for efficiently distributing images on the web.
Imager Apparently this project made it into the GitHub Archive Program. About Imager is a tool for automated image compression, and can competitively
Vim-fork focused on extensibility and usability
Documentation | Chat | Twitter Neovim is a project that seeks to aggressively refactor Vim in order to: Simplify maintenance and encourage contributio
A Modern Real-Time Data Processing & Analytics DBMS with Cloud-Native Architecture, built to make the Data Cloud easy
A Modern Real-Time Data Processing & Analytics DBMS with Cloud-Native Architecture, built to make the Data Cloud easy
A library that creates a terminal-like window with feature-packed drawing of text and easy input handling. MIRROR.
BearLibTerminal provides a pseudoterminal window with a grid of character cells and a simple yet powerful API for flexible textual output and uncompli
Termbox is a library that provides a minimalistic API for writing text-based user interfaces.
Termbox is a library that provides a minimalistic API for writing text-based user interfaces.
A Rust library for processing application configuration easily
Configure me A Rust library for processing application configuration easily About This crate aims to help with reading configuration of application fr
Sorta Text Format in UTF-8
STFU-8: Sorta Text Format in UTF-8 STFU-8 is a hacky text encoding/decoding protocol for data that might be not quite UTF-8 but is still mostly UTF-8.
hck is a shortening of hack, a rougher form of cut.
🪓 hck A sharp cut(1) clone. hck is a shortening of hack, a rougher form of cut. A close to drop in replacement for cut that can use a regex delimiter
Experimental playground for wiktionary data
wikt Experimental playground for wiktionary data. This document might not update as often as the code does. Set up You'll want a minimum of 10 GB free
🔍TinySearch is a lightweight, fast, full-text search engine. It is designed for static websites.
tinysearch TinySearch is a lightweight, fast, full-text search engine. It is designed for static websites. TinySearch is written in Rust, and then com
A full-text search engine in rust
Toshi A Full-Text Search Engine in Rust Please note that this is far from production ready, also Toshi is still under active development, I'm just slo
A full-text search and indexing server written in Rust.
Bayard Bayard is a full-text search and indexing server written in Rust built on top of Tantivy that implements Raft Consensus Algorithm and gRPC. Ach
A Modern Real-Time Data Processing & Analytics DBMS with Cloud-Native Architecture, written in Rust
Datafuse Modern Real-Time Data Processing & Analytics DBMS with Cloud-Native Architecture Datafuse is a Real-Time Data Processing & Analytics DBMS wit
Rust-nlp is a library for using natural language processing algorithms with Rust
nlp Rust-nlp Implemented algorithm Distance Levenshtein (Explanation) Jaro / Jaro-Winkler (Explanation) Phonetics Soundex (Explanation) Metaphone (Exp
DataFrame / Series data processing in Rust
black-jack While PRs are welcome, the approach taken only allows for concrete types (String, f64, i64, ...) I'm not sure this is the way to go. I want
ReVi is a cross-platform terminal based Vim inspired text editor.
ReVi Table Of Contents: About Usage Install Clone && Installing Development Q&A KeyBindings Roadmap Changelog About ReVi is a cross-platform terminal
Fang - Background job processing library for Rust.
Fang Background job processing library for Rust. Currently, it uses Postgres to store state. But in the future, more backends will be supported.
Kaolinite - A crate to assist in the creation of TUI text editors.
Kaolinite - A crate to assist in the creation of TUI text editors.
GStreamer HTTP Live Streaming Plugin
A highly configurable GStreamer HLS sink plugin. Based on the hlssink2 element. The flexhlssink is written in Rust and has various options to configure the HLS output playlist generation.
Rustymind is a driver and parser for NeuroSky MindWave EEG headset written in pure Rust.
Rustymind is a driver and parser for NeuroSky MindWave EEG headset written in pure Rust. You can use it to connect, interact, and plot real time data from the headset.
Lust is a static image server designed to automatically convert uploaded images to several formats and preset sizes
What is Lust? Lust is a static image server designed to automatically convert uploaded images to several formats and preset sizes with scaling in mind.
A crate using DeepSpeech bindings to convert mic audio from speech to text
DS-TRANSCRIBER Need an Offline Speech To Text converter? Records your mic, and returns a String containing what was said. Features Begins transcriptio
Amp: A text editor for your terminal.
Amp: A text editor for your terminal. Heavily inspired by Vi/Vim. Amp aims to take the core interaction model of Vim, simplify it, and bundle in the e
Intuitive find & replace CLI (sed alternative)
sd - s[earch] & d[isplace] sd is an intuitive find & replace CLI. The Pitch Why use it over any existing tools? Painless regular expressions sd uses r
A collection of filters for real-time audio processing
Audio Filters A collection of filters for real-time audio processing Feature Progress #![no_std] (via libm) f32 & f64 capable (via num-traits) SIMD Do
syntect is a syntax highlighting library for Rust that uses Sublime Text syntax definitions.
syntect is a syntax highlighting library for Rust that uses Sublime Text syntax definitions. It aims to be a good solution for any Rust project that needs syntax highlighting, including deep integration with text editors written in Rust.
Text Renderer written in Rust using HarfBuzz for shaping, FreeType for rasterization and OpenGL for rendering.
Provok Text Renderer written in Rust using HarfBuzz for shaping, FreeType for rasterization and OpenGL for rendering. Input Provok is fed with a JSON
Rust read/write support for well-known text (WKT)
wkt Rust read/write support for well-known text (WKT). License Licensed under either of Apache License, Version 2.0 (LICENSE-APACHE or http://www.apac
discord bot for uwuifying text
uwubot This crate defines a discord bot using serenity for uwuifying text via uwuify. Installation You can install uwubot from source or from crates.i
A tray icon to uwuify your text
uwu-tray fastest uwuifier in the west... now in your system tray for even faster uwufication! FAQ Which platforms are supported? Windows Linux MacOS H
fastest text uwuifier in the west
uwuify fastest text uwuifier in the west transforms Hey... I think I really love you. Do you want a headpat? into hey... i think i w-weawwy wuv you.
📜🔁🎶 A CLI which converts morse code into sound
morse2sound 🎵 A CLI which converts morse code to sound Big shoutout to Br1ght0ne for guiding me how to use Rust on stream
A high-performance, high-reliability observability data pipeline.
Quickstart • Docs • Guides • Integrations • Chat • Download What is Vector? Vector is a high-performance, end-to-end (agent & aggregator) observabilit
Natural Language Processing for Rust
rs-natural Natural language processing library written in Rust. Still very much a work in progress. Basically an experiment, but hey maybe something c
A fast implementation of Aho-Corasick in Rust.
aho-corasick A library for finding occurrences of many patterns at once with SIMD acceleration in some cases. This library provides multiple pattern s
A fast, low-resource Natural Language Processing and Text Correction library written in Rust.
nlprule A fast, low-resource Natural Language Processing and Error Correction library written in Rust. nlprule implements a rule- and lookup-based app
Semantic text segmentation. For sentence boundary detection, compound splitting and more.
NNSplit A tool to split text using a neural network. The main application is sentence boundary detection, but e. g. compound splitting for German is a
👄 The most accurate natural language detection library in the Rust ecosystem, suitable for long and short text alike
Table of Contents What does this library do? Why does this library exist? Which languages are supported? How good is it? Why is it better than other l
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
Provides an implementation of today's most used tokenizers, with a focus on performance and versatility. Main features: Train new vocabularies and tok
🦀 Example of serving deep learning models in Rust with batched prediction
rust-dl-webserver This project provides an example of serving a deep learning model with batched prediction using Rust. In particular it runs a GPT2 m
Rust version of the Haskell ERD tool. Translates a plain text description of a relational database schema to dot files representing an entity relation diagram.
erd-rs Rust CLI tool for creating entity-relationship diagrams from plain text markup. Based on erd (uses the same input format and output rendering).
Text Expression Runner – Readable and easy to use text expressions
ter - Text Expression Runner ter is a cli to run text expressions and perform basic text operations such as filtering, ignoring and replacing on the c
Employ your built-in wetware pattern recognition and signal processing facilities to understand your network traffic
Nethoscope Employ your built-in wetware pattern recognition and signal processing facilities to understand your network traffic. Check video on how it
A (self hosted) pastebin for easily sharing text right from the terminal
termpad termpad allows you to easily host a pastebin server for saving and viewing text right from the terminal, or the browser. Client Usage Assuming