A rust implementation of some popular snowball stemming algorithms

Overview

Rust Stemmers

This crate implements some stemmer algorithms found in the snowball project which are compiled to rust using the rust-backend of the snowball compiler.

Supported Algorithms

  • Arabic
  • Armenian
  • Danish
  • Dutch
  • English
  • French
  • German
  • Greek
  • Hungarian
  • Italian
  • Norwegian
  • Portuguese
  • Romanian
  • Russian
  • Spanish
  • Swedish
  • Tamil
  • Turkish

Usage

extern crate rust_stemmers;
use rust_stemmers::{Algorithm, Stemmer};

// Create a stemmer for the english language
let en_stemmer = Stemmer::create(Algorithm::English);

// Stemm the word "fruitlessly"
// Please be aware that all algorithms expect their input to only contain lowercase characters.
assert_eq!(en_stemmer.stem("fruitlessly"), "fruitless");

Related Projects

  • The stemmer crate provides bindings to the C Snowball implementation.
You might also like...
Simple, robust, BitTorrent's Mainline DHT implementation

Mainline Simple, robust, BitTorrent's Mainline DHT implementation. This library is focused on being the best and simplest Rust client for Mainline, es

Rust-nlp is a library to use Natural Language Processing algorithm with RUST

nlp Rust-nlp Implemented algorithm Distance Levenshtein (Explanation) Jaro / Jaro-Winkler (Explanation) Phonetics Soundex (Explanation) Metaphone (Exp

Fast suffix arrays for Rust (with Unicode support).
Fast suffix arrays for Rust (with Unicode support).

suffix Fast linear time & space suffix arrays for Rust. Supports Unicode! Dual-licensed under MIT or the UNLICENSE. Documentation https://docs.rs/suff

Elastic tabstops for Rust.

tabwriter is a crate that implements elastic tabstops. It provides both a library for wrapping Rust Writers and a small program that exposes the same

An efficient and powerful Rust library for word wrapping text.

Textwrap Textwrap is a library for wrapping and indenting text. It is most often used by command-line programs to format dynamic output nicely so it l

⏮ ⏯ ⏭ A Rust library to easily read forwards, backwards or randomly through the lines of huge files.

EasyReader The main goal of this library is to allow long navigations through the lines of large files, freely moving forwards and backwards or gettin

Natural language detection library for Rust. Try demo online: https://www.greyblake.com/whatlang/
Natural language detection library for Rust. Try demo online: https://www.greyblake.com/whatlang/

Whatlang Natural language detection for Rust with focus on simplicity and performance. Content Features Get started Documentation Supported languages

A Rust library for generically joining iterables with a separator

joinery A Rust library for generically joining iterables with a separator. Provides the tragically missing string join functionality to rust. extern c

Rust edit distance routines accelerated using SIMD. Supports fast Hamming, Levenshtein, restricted Damerau-Levenshtein, etc. distance calculations and string search.

triple_accel Rust edit distance routines accelerated using SIMD. Supports fast Hamming, Levenshtein, restricted Damerau-Levenshtein, etc. distance cal

Comments
  • Make serde optional

    Make serde optional

    I added a cargo feature to make serde optional, but default. Since the default feature set includes serde, it is opt out and should not break any crates that depend on this one.

    There was problems building the Armenian algorithm, which I am not using, but I had to correct to get this crate to build. I don't know if the casts from usize to u32 are inappropriate or not, since they could overflow, but I don't think your cursors would require more than 32-bits. Another fork removed the Armenian algorithm, probably because they were having problems too.

    opened by riley-ashton 0
Owner
CurrySoftware GmbH
CurrySoftware GmbH
An implementation of regular expressions for Rust. This implementation uses finite automata and guarantees linear time matching on all inputs.

regex A Rust library for parsing, compiling, and executing regular expressions. Its syntax is similar to Perl-style regular expressions, but lacks a f

The Rust Programming Language 2.6k Jan 8, 2023
Multilingual implementation of RAKE algorithm for Rust

RAKE.rs The library provides a multilingual implementation of Rapid Automatic Keyword Extraction (RAKE) algorithm for Rust. How to Use Append rake to

Navid 26 Dec 16, 2022
Snips NLU rust implementation

Snips NLU Rust Installation Add it to your Cargo.toml: [dependencies] snips-nlu-lib = { git = "https://github.com/snipsco/snips-nlu-rs", branch = "mas

Snips 327 Dec 26, 2022
A fast implementation of Aho-Corasick in Rust.

aho-corasick A library for finding occurrences of many patterns at once with SIMD acceleration in some cases. This library provides multiple pattern s

Andrew Gallant 662 Dec 31, 2022
🦀 A Rust implementation of a RoBERTa classification model for the SNLI dataset

RustBERTa-SNLI A Rust implementation of a RoBERTa classification model for the SNLI dataset, with support for fine-tuning, predicting, and serving. Th

AI2 11 Oct 17, 2022
Gomez - A pure Rust framework and implementation of (derivative-free) methods for solving nonlinear (bound-constrained) systems of equations

Gomez A pure Rust framework and implementation of (derivative-free) methods for solving nonlinear (bound-constrained) systems of equations. Warning: T

Datamole 19 Dec 24, 2022
Implementation of sentence embeddings with BERT in Rust, using the Burn library.

Sentence Transformers in Burn This library provides an implementation of the Sentence Transformers framework for computing text representations as vec

Tyler Vergho 4 Sep 4, 2023
A naive native 128-bit cityhash v102 implementation

Naive CityHash naive-cityhash is a naive native 128-bit cityhash v102 implementation for clickhouse*. Contact Chojan Shang - @PsiACE - psiace@outlook.

Chojan Shang 5 Apr 4, 2022
A naive (read: slow) implementation of Word2Vec. Uses BLAS behind the scenes for speed.

SloWord2Vec This is a naive implementation of Word2Vec implemented in Rust. The goal is to learn the basic principles and formulas behind Word2Vec. BT

Lloyd 2 Jul 5, 2018
A "Navie" Implementation of the Wavefront Algorithm For Sequence Alignment with Gap-Affine Scoring

A "Naive" Implementation of the Wavefront Algorithm for Sequence Alignment with Gap-Affine Scoring This repository contains some simple code that I wr

Jason Chin 3 Jul 24, 2023