Rust-nlp is a library to use Natural Language Processing algorithm with RUST

Related tags

Text processing nlp
Overview

nlp

Build Status Coverage Status

Rust-nlp

Implemented algorithm

Distance

Phonetics

Development workflow

Use multirust with nightly.

$ cd to/project
$ cargo test

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

You might also like...
Multilingual implementation of RAKE algorithm for Rust

RAKE.rs The library provides a multilingual implementation of Rapid Automatic Keyword Extraction (RAKE) algorithm for Rust. How to Use Append rake to

A "Navie" Implementation of the Wavefront Algorithm For Sequence Alignment with Gap-Affine Scoring

A "Naive" Implementation of the Wavefront Algorithm for Sequence Alignment with Gap-Affine Scoring This repository contains some simple code that I wr

nombytes is a library that provides a wrapper for the bytes::Bytes byte container for use with nom.

NomBytes nombytes is a library that provides a wrapper for the bytes::Bytes byte container for use with nom. I originally made this so that I could ha

Text Expression Runner – Readable and easy to use text expressions
Text Expression Runner – Readable and easy to use text expressions

ter - Text Expression Runner ter is a cli to run text expressions and perform basic text operations such as filtering, ignoring and replacing on the c

Subtitles-rs - Use SRT subtitle files to study foreign languages

Rust subtitle utilities Are you looking for substudy? Try here. (substudy has been merged into the subtitles-rs project.) This repository contains a n

Rust-tokenizer offers high-performance tokenizers for modern language models, including WordPiece, Byte-Pair Encoding (BPE) and Unigram (SentencePiece) models

rust-tokenizers Rust-tokenizer offers high-performance tokenizers for modern language models, including WordPiece, Byte-Pair Encoding (BPE) and Unigra

The Reactive Extensions for the Rust Programming Language

This is an implementation of reactive streams, which, at the high level, is patterned off of the interfaces and protocols defined in http://reactive-s

Simple, extendable and embeddable scripting language.

duckscript duckscript SDK CLI Simple, extendable and embeddable scripting language. Overview Language Goals Installation Homebrew Binary Release Ducks

Query textual streams with PromQL-like language

pq - query textual streams with PromQL Glossary Time Series - a stream of timestamped values, aka samples sharing the same metric name and, optionally

Comments
  • Relicense under dual MIT/Apache-2.0

    Relicense under dual MIT/Apache-2.0

    This issue was automatically generated. Feel free to close without ceremony if you do not agree with re-licensing or if it is not possible for other reasons. Respond to @cmr with any questions or concerns, or pop over to #rust-offtopic on IRC to discuss.

    You're receiving this because someone (perhaps the project maintainer) published a crates.io package with the license as "MIT" xor "Apache-2.0" and the repository field pointing here.

    TL;DR the Rust ecosystem is largely Apache-2.0. Being available under that license is good for interoperation. The MIT license as an add-on can be nice for GPLv2 projects to use your code.

    Why?

    The MIT license requires reproducing countless copies of the same copyright header with different names in the copyright field, for every MIT library in use. The Apache license does not have this drawback. However, this is not the primary motivation for me creating these issues. The Apache license also has protections from patent trolls and an explicit contribution licensing clause. However, the Apache license is incompatible with GPLv2. This is why Rust is dual-licensed as MIT/Apache (the "primary" license being Apache, MIT only for GPLv2 compat), and doing so would be wise for this project. This also makes this crate suitable for inclusion and unrestricted sharing in the Rust standard distribution and other projects using dual MIT/Apache, such as my personal ulterior motive, the Robigalia project.

    Some ask, "Does this really apply to binary redistributions? Does MIT really require reproducing the whole thing?" I'm not a lawyer, and I can't give legal advice, but some Google Android apps include open source attributions using this interpretation. Others also agree with it. But, again, the copyright notice redistribution is not the primary motivation for the dual-licensing. It's stronger protections to licensees and better interoperation with the wider Rust ecosystem.

    How?

    To do this, get explicit approval from each contributor of copyrightable work (as not all contributions qualify for copyright, due to not being a "creative work", e.g. a typo fix) and then add the following to your README:

    ## License
    
    Licensed under either of
    
     * Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
     * MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)
    
    at your option.
    
    ### Contribution
    
    Unless you explicitly state otherwise, any contribution intentionally submitted
    for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any
    additional terms or conditions.
    

    and in your license headers, if you have them, use the following boilerplate (based on that used in Rust):

    // Copyright 2016 nlp developers
    //
    // Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
    // http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
    // <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
    // option. This file may not be copied, modified, or distributed
    // except according to those terms.
    

    It's commonly asked whether license headers are required. I'm not comfortable making an official recommendation either way, but the Apache license recommends it in their appendix on how to use the license.

    Be sure to add the relevant LICENSE-{MIT,APACHE} files. You can copy these from the Rust repo for a plain-text version.

    And don't forget to update the license metadata in your Cargo.toml to:

    license = "MIT/Apache-2.0"
    

    I'll be going through projects which agree to be relicensed and have approval by the necessary contributors and doing this changes, so feel free to leave the heavy lifting to me!

    Contributor checkoff

    To agree to relicensing, comment with :

    I license past and future contributions under the dual MIT/Apache-2.0 license, allowing licensees to chose either at their option.
    

    Or, if you're a contributor, you can check the box in this repo next to your name. My scripts will pick this exact phrase up and check your checkbox, but I'll come through and manually review this issue later as well.

    • [ ] @Freyskeyd
    • [ ] @djudd
    opened by emberian 1
Owner
Simon Paitrault
Simon Paitrault
Natural Language Processing for Rust

rs-natural Natural language processing library written in Rust. Still very much a work in progress. Basically an experiment, but hey maybe something c

Chris Tramel 211 Dec 28, 2022
Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)

rust-bert Rust native Transformer-based models implementation. Port of Hugging Face's Transformers library, using the tch-rs crate and pre-processing

null 1.3k Jan 8, 2023
Simple NLP in Rust with Python bindings

vtext NLP in Rust with Python bindings This package aims to provide a high performance toolkit for ingesting textual data for machine learning applica

Roman Yurchak 133 Jan 3, 2023
An NLP-suite powered by deep learning

DeepFrog - NLP Suite Introduction DeepFrog aims to be a (partial) successor of the Dutch-NLP suite Frog. Whereas the various NLP modules in Frog wre b

Maarten van Gompel 16 Feb 28, 2022
Probabilistically split concatenated words using NLP based on English Wikipedia unigram frequencies.

Untanglr Untanglr takes in a some mangled words and makes sense out of them so you dont have to. It goes through the input and splits it probabilistic

Andrei Butnaru 15 Nov 23, 2022
Natural language detection library for Rust. Try demo online: https://www.greyblake.com/whatlang/

Whatlang Natural language detection for Rust with focus on simplicity and performance. Content Features Get started Documentation Supported languages

Sergey Potapov 805 Dec 28, 2022
👄 The most accurate natural language detection library in the Rust ecosystem, suitable for long and short text alike

Table of Contents What does this library do? Why does this library exist? Which languages are supported? How good is it? Why is it better than other l

Peter M. Stahl 569 Jan 3, 2023
lingua-rs Python binding. An accurate natural language detection library, suitable for long and short text alike.

lingua-py lingua-rs Python binding. An accurate natural language detection library, suitable for long and short text alike. Installation pip install l

messense 7 Dec 30, 2022
A HDPSG-inspired symbolic natural language parser written in Rust

Treebender A symbolic natural language parsing library for Rust, inspired by HDPSG. What is this? This is a library for parsing natural or constructed

Theia Vogel 32 Dec 26, 2022
frawk is a small programming language for writing short programs processing textual data

frawk frawk is a small programming language for writing short programs processing textual data. To a first approximation, it is an implementation of t

Eli Rosenthal 1k Jan 7, 2023