Rust Text processing Libraries


WTF Was That?

What The Fuck Was That? Find out what just happened to your Substrate Transaction by attaching a Rust debugger and re-executing it in a unit test envi

11 Apr 19, 2024

Simple, robust, BitTorrent's Mainline DHT implementation

Mainline Simple, robust, BitTorrent's Mainline DHT implementation. This library is focused on being the best and simplest Rust client for Mainline, es

4 Nov 21, 2023

A small CLI utility, made in Rust, for helping you learn Japanese words 🦀

Memofante (Click here to view in Portuguese) Memofante is here, a biiiig help: Do you often forget Japanese words you really didn't want to forget? Do you

3 Nov 4, 2023

What if we could check declarative macros before using them?

expandable An opinionated attribute-macro based macro_rules! expansion checker. Textbook example rustc treats macro definitions as some opaque piece o

13 Oct 26, 2023

SHA256 sentence: discover a SHA256 checksum that matches a sentence's description of hex digit words.

SHA256 sentence "The SHA256 for this sentence begins with: one, eight, two, a, seven, c and nine." Inspired by @lauriewired post Inspired by @humbleha

16 Oct 9, 2023

Low rank adaptation (LoRA) for Candle.

candle-lora LoRA (low rank adaptation) implemented in Rust for use with Candle. This technique interchanges the fully-trainable layers of the model wi

34 Oct 6, 2023

Ferrugem is Rust but in Portuguese-BR

ferrugem Aren't you pistola from writing Rust programs in English? Do you like saying "caralho" a lot? Would you like to try something different, in a

6 Oct 5, 2023

The Bytepiece Tokenizer Implemented in Rust.

bytepiece Implementation of Su's bytepiece. Bytepiece is a new tokenization method that uses UTF-8 bytes as unigrams to process text. It needs little prep

11 Oct 2, 2023

Implementation of sentence embeddings with BERT in Rust, using the Burn library.

Sentence Transformers in Burn This library provides an implementation of the Sentence Transformers framework for computing text representations as vec

4 Sep 4, 2023

A "Naive" Implementation of the Wavefront Algorithm for Sequence Alignment with Gap-Affine Scoring

A "Naive" Implementation of the Wavefront Algorithm for Sequence Alignment with Gap-Affine Scoring This repository contains some simple code that I wr

3 Jul 24, 2023

A lightweight platform-accelerated library for biological motif scanning using position weight matrices.

🎼 🧬 lightmotif A lightweight platform-accelerated library for biological motif scanning using position weight matrices. 🗺️ Overview Motif scanning

16 May 4, 2023

murasaki: Nostr to Speech (in Japanese)

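murasaki: Nostr to Speech ⚠ This software is an alpha version ⚠ A timeline read-aloud tool that uses VOICEVOX. It reads aloud the global timeline of the specified relays, or the timeline of the users followed by the specified public key. Usage Install Rust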

16 Mar 27, 2023

Quickner is a new tool to quickly annotate texts for NER (Named Entity Recognition). It is written in Rust and accessible through a Python API.

Quickner ⚡ A simple, fast, and easy to use NER annotator for Python Quickner is a new tool to quickly annotate texts for NER (Named Entity Recognition

7 Mar 3, 2023

mtop: top for Memcached

mtop mtop: top for Memcached. Features Display real-time statistics about your memcached servers such as Memory usage/limit Current/max connections Hi

6 Feb 27, 2023

A quick way to decode a contract's transaction data with only the contract address and abi.

tx-decoder A quick way to decode a contract's transaction data with only the contract address and abi. E.g., let tx_data = "0xe70dd2fc00000000000000000

15 Feb 13, 2023

Learning how to build real-time procedural planets

Is It Planet Yet? This is just an experiment: me trying to learn how to build real-time procedural planets. Any suggestions/contributions are welcome.

3 Jan 10, 2023

Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)

rust-bert Rust native Transformer-based models implementation. Port of Hugging Face's Transformers library, using the tch-rs crate and pre-processing

1.3k Jan 8, 2023

A fast, low-resource Natural Language Processing and Text Correction library written in Rust.

nlprule A fast, low-resource Natural Language Processing and Error Correction library written in Rust. nlprule implements a rule- and lookup-based app

496 Jan 8, 2023

An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.

regex A Rust library for parsing, compiling, and executing regular expressions. Its syntax is similar to Perl-style regular expressions, but lacks a f

2.6k Jan 8, 2023

SIMD-accelerated UTF-8 validation for Rust.

simdutf8 – High-speed UTF-8 validation for Rust Blazingly fast API-compatible UTF-8 validation for Rust using SIMD extensions, based on the implementa

441 Jan 8, 2023
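
For readers unfamiliar with what UTF-8 validation returns, here is a minimal sketch using only the standard library's `std::str::from_utf8`. It assumes that the "API-compatible" wording in the simdutf8 entry refers to this standard function; the excerpt is truncated and does not say so. The helper name `validate` is hypothetical and is not part of simdutf8.

```rust
/// Hypothetical helper (not part of simdutf8): checks `bytes` with the
/// standard library validator. On success it returns the bytes viewed as
/// a `&str`; on failure it returns the byte offset of the first invalid
/// sequence, as reported by `Utf8Error::valid_up_to`.
fn validate(bytes: &[u8]) -> Result<&str, usize> {
    std::str::from_utf8(bytes).map_err(|e| e.valid_up_to())
}
```

Calling `validate(b"hello")` yields `Ok("hello")`, while a stray `0xff` byte after one valid ASCII byte is reported at offset 1.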

Rust edit distance routines accelerated using SIMD. Supports fast Hamming, Levenshtein, restricted Damerau-Levenshtein, etc. distance calculations and string search.

triple_accel Rust edit distance routines accelerated using SIMD. Supports fast Hamming, Levenshtein, restricted Damerau-Levenshtein, etc. distance cal

75 Jan 8, 2023
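
As a reference point for the distances named in the triple_accel entry, the following is a plain scalar sketch of Hamming and Levenshtein distance over byte slices. It is not triple_accel's API and uses no SIMD; the function names `hamming` and `levenshtein` are chosen here for illustration only.

```rust
/// Hamming distance: the number of positions at which two equal-length
/// byte sequences differ. Returns `None` when the lengths differ, because
/// the distance is only defined for equal lengths.
fn hamming(a: &[u8], b: &[u8]) -> Option<usize> {
    if a.len() != b.len() {
        return None;
    }
    Some(a.iter().zip(b).filter(|(x, y)| x != y).count())
}

/// Levenshtein distance: the minimum number of single-byte insertions,
/// deletions, and substitutions needed to turn `a` into `b`, computed
/// with a two-row dynamic program.
fn levenshtein(a: &[u8], b: &[u8]) -> usize {
    // At the start of outer iteration i, prev[j] is the distance
    // between a[..i] and b[..j].
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    let mut curr = vec![0; b.len() + 1];
    for (i, &ca) in a.iter().enumerate() {
        // Turning a[..=i] into the empty prefix takes i + 1 deletions.
        curr[0] = i + 1;
        for (j, &cb) in b.iter().enumerate() {
            let substitute = prev[j] + usize::from(ca != cb);
            let delete = prev[j + 1] + 1;
            let insert = curr[j] + 1;
            curr[j + 1] = substitute.min(delete).min(insert);
        }
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}
```

Both functions work on bytes rather than Unicode scalar values, so multi-byte UTF-8 characters count as several positions. The classic pair `b"kitten"` and `b"sitting"` has Levenshtein distance 3.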

Simple expression transformer that is not Coq.

Noq Not Coq. Simple expression transformer that is not Coq. Quick Start $ cargo run ./examples/add.noq Main Idea The Main Idea is being able to define

187 Jan 7, 2023

frawk is a small programming language for writing short programs processing textual data

frawk frawk is a small programming language for writing short programs processing textual data. To a first approximation, it is an implementation of t

1k Jan 7, 2023

A backend for mdBook written in Rust for generating PDF based on headless chrome and Chrome DevTools Protocol.

52 Jan 7, 2023

💥 Fast State-of-the-Art Tokenizers optimized for Research and Production

Provides an implementation of today's most used tokenizers, with a focus on performance and versatility. Main features: Train new vocabularies and tok

6.2k Jan 5, 2023

Fast PDF password cracking utility equipped with commonly encountered password format builders and dictionary attacks.

PDFRip Fast PDF password cracking utility equipped with commonly encountered password format builders and dictionary attacks. 📖 Table of Contents Int

226 Jan 4, 2023

Text calculator with support for units and conversion

cpc calculation + conversion cpc parses and evaluates strings of math, with support for units and conversion. 128-bit decimal floating points are used

82 Jan 4, 2023

A simple and fast linear algebra library for games and graphics

glam A simple and fast 3D math library for games and graphics. Development status glam is in beta stage. Base functionality has been implemented and t

953 Jan 3, 2023

👄 The most accurate natural language detection library in the Rust ecosystem, suitable for long and short text alike

Table of Contents What does this library do? Why does this library exist? Which languages are supported? How good is it? Why is it better than other l

569 Jan 3, 2023

Simple NLP in Rust with Python bindings

vtext NLP in Rust with Python bindings This package aims to provide a high performance toolkit for ingesting textual data for machine learning applica

133 Jan 3, 2023

Difftastic is an experimental structured diff tool that compares files based on their syntax.

13.9k Jan 2, 2023

rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc.

rga: ripgrep, but also search in PDFs, E-Books, Office documents, zip, tar.gz, etc. rga is a line-oriented search tool that allows you to look for a r

5.2k Jan 2, 2023

finalfusion embeddings in Rust

Introduction finalfusion is a crate for reading, writing, and using embeddings in Rust. finalfusion primarily works with its own format which supports

55 Jan 2, 2023

Source text parsing, lexing, and AST related functionality for Deno

Source text parsing, lexing, and AST related functionality for Deno.

90 Jan 1, 2023

Rust-tokenizer offers high-performance tokenizers for modern language models, including WordPiece, Byte-Pair Encoding (BPE) and Unigram (SentencePiece) models

rust-tokenizers Rust-tokenizer offers high-performance tokenizers for modern language models, including WordPiece, Byte-Pair Encoding (BPE) and Unigra

165 Jan 1, 2023
