rusty-machine

Machine Learning library for Rust

Overview

This library is no longer actively maintained.


The crate is currently on version 0.5.4.

Read the API Documentation to learn more.

There is also a document detailing development efforts, including a projected timeline for immediate features. Please feel free to give feedback and let me know if there are any features you believe should take precedence.


Summary

Rusty-machine is a general purpose machine learning library implemented entirely in Rust. It aims to combine speed and ease of use - without requiring a huge number of external dependencies.

This project began as a way for me to learn Rust and brush up on some less familiar machine learning algorithms and techniques. Now the project aims to provide a complete, easy to use, machine learning library for Rust.

This library is still in the early stages of development. Although there are a good number of algorithms, many other things are missing. Rusty-machine is probably not the best choice for any serious projects yet - but hopefully that can change in the near future!

Contributing

This project is currently looking for contributors of all capacities!

I have now created a dedicated page for contributing. If you're interested please take a look.


Implementation

This project is implemented using Rust. Currently there are no non-Rust dependencies! We are, however, planning to introduce optional BLAS/LAPACK dependencies soon.


Current Progress

Rusty-machine uses rulinalg for its linear algebra back end. This is fairly complete, but there is still lots of room for optimization, and we should provide BLAS/LAPACK support.

Machine Learning

  • Linear Regression
  • Logistic Regression
  • Generalized Linear Models
  • K-Means Clustering
  • Neural Networks
  • Gaussian Process Regression
  • Support Vector Machines
  • Gaussian Mixture Models
  • Naive Bayes Classifiers
  • DBSCAN
  • k-Nearest Neighbor Classifiers
  • Principal Component Analysis

There is also a basic stats module behind a feature flag.


Usage

The library usage is described well in the API documentation, including example code. Below is a brief overview of the library in its current state and its intended usage.

Installation

The library is most easily used with cargo. Simply include the following in your Cargo.toml file:

[dependencies]
rusty-machine = "0.5.4"

And then import the library using:

extern crate rusty_machine as rm;

The library consists of two core components. The linear algebra module and the learning module.

Linalg

The linear algebra module contains re-exports from the rulinalg crate. This is to provide easy access to components which are used frequently within rusty-machine.

More detailed coverage can be found in the API documentation.

Learning

The learning module contains machine learning models. The machine learning implementations are designed with simplicity and customization in mind. This means you can control the optimization algorithms while still retaining the ease of using default values. This is an area I am actively trying to improve!

The models all provide predict and train methods enforced by the SupModel and UnSupModel traits.
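As a rough illustration of that trait shape, here is a simplified sketch - not the crate's exact definitions (the actual trait methods also wrap their results for error handling) - with a toy model that learns the mean target and always predicts it:

```rust
// Simplified sketch of a supervised-model trait: train on inputs plus
// targets, then predict on new inputs. (An unsupervised counterpart would
// take only inputs during training.)
trait SupModel<I, T> {
    fn train(&mut self, inputs: &I, targets: &T);
    fn predict(&self, inputs: &I) -> T;
}

// A toy "model" that learns the mean of the targets and always predicts it.
struct MeanModel {
    mean: f64,
}

impl SupModel<Vec<f64>, Vec<f64>> for MeanModel {
    fn train(&mut self, _inputs: &Vec<f64>, targets: &Vec<f64>) {
        self.mean = targets.iter().sum::<f64>() / targets.len() as f64;
    }
    fn predict(&self, inputs: &Vec<f64>) -> Vec<f64> {
        vec![self.mean; inputs.len()]
    }
}

fn main() {
    let mut model = MeanModel { mean: 0.0 };
    model.train(&vec![1.0, 2.0, 3.0], &vec![2.0, 4.0, 6.0]);
    let preds = model.predict(&vec![10.0, 20.0]);
    assert_eq!(preds, vec![4.0, 4.0]);
    println!("{:?}", preds);
}
```

The point is only the shape of the API: every model exposes the same train/predict pair, so models can be swapped without changing the surrounding code.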

There are some examples within this repository that can help you familiarize yourself with the library.

Comments
  • Updated neural net implementation


    This PR is a response to #91, but is not yet ready to be merged.

This PR introduces the NetLayer trait for describing the requirements for being a layer in a neural network. It changes the method of storing layers in neural networks, from only storing their sizes to explicitly storing a Vec<Box<NetLayer>>, allowing different types of layers to be defined and combined.

The code currently passes all tests, and the and-gate example works as well, but I could use some feedback before this is merge ready. More specifically, there were several decisions I made while writing this code that I am not sure were the best choices.

    Linear doesn't have a bias

Linear layers (which are standard fully connected layers) don't include a bias term. Instead there is a separate Bias layer which appends a 1 to the end of the input, allowing a following linear layer to emulate having a bias. This feels a little roundabout, but not including a bias simplifies the linear code, and lets the user easily choose whether or not she wants to include a bias.
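The bias-emulation trick can be illustrated with plain vectors (a hypothetical sketch with made-up helper names, not the PR's code):

```rust
// A Bias "layer" appends a constant 1.0 to its input, so the last weight
// of a following linear layer acts as the bias term.
fn bias_layer(input: &[f64]) -> Vec<f64> {
    let mut out = input.to_vec();
    out.push(1.0);
    out
}

// A linear layer without a bias: a plain dot product with a weight vector.
fn linear_layer(input: &[f64], weights: &[f64]) -> f64 {
    input.iter().zip(weights).map(|(x, w)| x * w).sum()
}

fn main() {
    let x = [2.0, 3.0];
    let extended = bias_layer(&x); // [2.0, 3.0, 1.0]
    let weights = [0.5, 0.5, -1.0]; // the last weight is the emulated bias
    let y = linear_layer(&extended, &weights);
    assert_eq!(y, 1.5); // 0.5*2.0 + 0.5*3.0 - 1.0
    println!("{}", y);
}
```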

    num_params/param_shape redundant

    The NetLayer trait requires a function that returns the number of parameters, and a separate function that returns the shape of the parameter matrix. You could easily get the number of parameters from their shape, so is the first function necessary?

    Name of linear layer

    I am not sure Linear is the most obvious name for the layer, but I couldn't think of a better one.

    Default Implementations on NetLayer

Some layers (Bias, ActivationFunc, pooling layers in the future) have no parameters. Would it be worthwhile to have default implementations for num_params, default_params, etc. on the NetLayer trait that assume there are no parameters, or is it better not to have those, to make sure no one forgets to implement them on a layer that does have parameters?

    Box<NetLayer> vs &NetLayer

    Neural networks currently store a vector of Box<NetLayer>. Should they instead store &NetLayer?

    NetLayer/ActivationFunc extend other traits

NetLayer inherits from Debug, and ActivationFunc now inherits from Clone + Debug. I ran into issues with NetLayer needing Debug since BaseNeuralNetwork derives it, and similar issues led me to have ActivationFunc inherit Clone + Debug.

    Old weight functions

If you compile the code as is, the compiler will warn about a few unused functions, all having to do with the weights of the network. These were all written with the previous way of storing layers in mind. I was planning on just deleting them, but decided to leave them there for now in case someone sees value in keeping them, adapted for the new storage method.

    Aside from those, the only other major thing to know is that the implementation doesn't take regularization into account yet during backpropagation. Any feedback, complaints, or improvements?

    opened by NivenT 47
  • matrixmultiply


    matrixmultiply does f32 and f64 matrix multiplication quite competently for larger matrices. It's a simpler dependency than blas, so it might be nice to just use. It does need type specific dispatch (unstable stabilization, TypeId, or specific trait) at this point though.

    What do you think, could I help you integrate with it, or are you developing something equivalent?

    opened by bluss 20
  • Add cross validation


    Not ready to merge yet, but wanted to get a bit of feedback.

    The main issue at the moment is that we allocate fresh matrices for each of the k times we train and validate. As discussed previously, we should be able to allocate once and reuse the allocation for each of the folds. However, this would require the inputs and targets of k_fold_validate to be MatrixSlices, and I don't think any of the existing models support this yet.
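For illustration, the per-fold bookkeeping might look something like this (a hypothetical sketch, not the PR's code: it builds index vectors per fold, leaving the sample data itself untouched):

```rust
// Split n samples into k contiguous folds; for each fold, return the
// training indices (everything outside the fold) and the test indices
// (the fold itself).
fn k_fold_indices(n: usize, k: usize) -> Vec<(Vec<usize>, Vec<usize>)> {
    (0..k)
        .map(|fold| {
            let start = fold * n / k;
            let end = (fold + 1) * n / k;
            let test: Vec<usize> = (start..end).collect();
            let train: Vec<usize> = (0..start).chain(end..n).collect();
            (train, test)
        })
        .collect()
}

fn main() {
    let folds = k_fold_indices(6, 3);
    assert_eq!(folds[0].1, vec![0, 1]); // first fold tests on rows 0..2
    assert_eq!(folds[0].0, vec![2, 3, 4, 5]); // and trains on the rest
    println!("{:?}", folds);
}
```

Copying the selected rows into fresh matrices (as the PR currently does via copy_rows) happens after this step, which is exactly the allocation discussed above.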

The other issue is that there's a lot of boilerplate here to avoid allocating or copying when creating the training and testing indices for each fold, which seems a bit overkill given all the work we do copying the sample data around (even if we manage to get rid of the allocations in copy_rows).

    Finally, we might want to run this in parallel, but then we really would need a separate copy of the input data for each fold.

    Any thoughts?

    opened by theotherphil 19
  • Add setters and a specified constructor: kmeans


    Add a constructor with specified values for all options for the KMeansClassifier. In addition, add setters for each field. Not super attached to any of the components of the patch besides KMeansClassifier::new_specified. I also added some rules to .gitignore for the sanity of vim users :smile:



    opened by dlrobertson 18
  • [Breaking Change] update rulinalg to v0.3.0


Adds BaseMatrix and BaseMatrixMut almost everywhere.

I benchmarked the examples and didn't notice a regression, but it would be better to double check on another system.

Once merged, it'll probably open the door to new optimizations, now that slices have many more functions than before.

    opened by tafia 15
  • Add cross validation function


    At least a naïve version, with basic reporting. Ideally also a fast version for models admitting a monoidal structure, as in https://github.com/mikeizbicki/HLearn/blob/master/README.md.

    opened by theotherphil 12
  • Implement IntoIterator for MatrixSlice


    Fixes part of #94 but first I want to discuss some things with this implementation.

    1. I changed MatrixSlice.iter() to take an external lifetime bound. It seems to compile this way but I'm not sure if it is correct. This was necessary since the deduced lifetime was invalid: into_iter takes self as Self, so the borrowed value was only alive inside the method.
    2. I duplicated the implementation for all three cases (value, &, and &mut); this could surely be solved with some macro magic. It would be nice if someone could help me there.
    3. I'm not sure how to test this. The only thing that comes to mind is that I could manually iterate and check each value individually, but that seems like a lot of duplicated code.

    Small compile test

    fn into_test() {
        let a = Matrix::new(3, 3, vec![2.0; 9]);
        let mut b = MatrixSlice::from_matrix(&a, [1, 1], 2, 2);

        for _v in b {}

        {
            let x = &b;
            for _v in x {}
        }

        {
            let y = &mut b;
            for _v in y {}
        }
    }

    opened by DarkDrek 12
  • Adding error handling to model traits


    This PR adds some error handling to the SupModel and UnSupModel traits. It's flagged as WIP as I have a few tweaks planned - but the core content is ready for review and I'd appreciate some feedback.

    Note that I haven't attempted to cover all possible panics from within the train and predict functions. I think it is non-trivial to decide when such things should occur and so I'd prefer to take a lightweight approach for now and add them in as we go (as this will not cause breakage in the api). With that said - if you see something that should be returning an Error here please comment so I can address/create an issue.

    opened by AtheMathmo 11
  • Use more `iter_rows` and `select_rows`


    Partly closes #93

    I grepped for data() to find relevant lines.

    Replaced them with iter_rows and select_rows wherever I could.

    Also sometimes used a zip iterator instead of enumerate. Hopefully it'll lead to fewer bounds checks.

    opened by tafia 9
  • Nnet rewrite


    Small rewrite of nnet module.

    • in create_weights: use rand::distributions::range::Range as advised in the rand docs
      • removed the initialize_weights function and put the logic directly in create_weights
    • in compute_grad: removed an unnecessary intermediary grad vector. In my very unscientific test, running the nnet-and_gate example is 2.5% faster.
    opened by tafia 9
  • A couple samples


    This PR contains 2 samples based on the SVM and NNET APIs, plus a README.md, as we discussed.

    The PR attempts to address the couple points we discussed in this thread so far.

    Note the NNET sample is not the easiest we can do, but I have tried to make it a bit exciting (as in "meaningful") as well as self-contained.

    opened by ic 8
  • predictions.row_iter()  method not found in `rusty_machine::prelude::Matrix<f64>`


    8 | for (dog, prediction) in test_dogs.iter().zip(predictions.row_iter()).take(unprinted_total) {
      |                                                          ^^^^^^^^ method not found in `rusty_machine::prelude::Matrix<f64>`

    opened by yutiansut 0
  • Adam Optimizer implementation


    This is my first pull request ever, so hopefully it's not too far off from usable. I'm having some difficulties testing my code. I plugged my Adam optimizer into the example and the trained network guessed 0.0 four times, while the default MLP code within the example folder worked as planned.

    For the overall structuring of the code I did my best to follow the RMSProp optimizer.

    EDIT: Fixed some math errors and it seems to operate as expected now: the cost is decreasing and it correctly passes the example test. I also improved the memory usage of the previous version by using the in-place util.
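For reference, a single Adam update step can be sketched as below. This is a minimal standalone sketch with the usual default hyperparameters, not the PR's actual code:

```rust
// Adam keeps per-parameter running averages of the gradient (m) and of the
// squared gradient (v), with bias correction for the early steps.
struct Adam {
    lr: f64,
    beta1: f64,
    beta2: f64,
    eps: f64,
    m: Vec<f64>,
    v: Vec<f64>,
    t: u64,
}

impl Adam {
    fn new(n: usize) -> Self {
        Adam {
            lr: 0.001,
            beta1: 0.9,
            beta2: 0.999,
            eps: 1e-8,
            m: vec![0.0; n],
            v: vec![0.0; n],
            t: 0,
        }
    }

    // Update the parameters in place given the current gradient.
    fn step(&mut self, params: &mut [f64], grad: &[f64]) {
        self.t += 1;
        for i in 0..params.len() {
            self.m[i] = self.beta1 * self.m[i] + (1.0 - self.beta1) * grad[i];
            self.v[i] = self.beta2 * self.v[i] + (1.0 - self.beta2) * grad[i] * grad[i];
            let m_hat = self.m[i] / (1.0 - self.beta1.powi(self.t as i32));
            let v_hat = self.v[i] / (1.0 - self.beta2.powi(self.t as i32));
            params[i] -= self.lr * m_hat / (v_hat.sqrt() + self.eps);
        }
    }
}

fn main() {
    let mut params = vec![1.0, -1.0];
    let mut opt = Adam::new(2);
    opt.step(&mut params, &[0.5, -0.5]);
    // The first step moves each parameter against its gradient's sign.
    assert!(params[0] < 1.0 && params[1] > -1.0);
    println!("{:?}", params);
}
```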

    opened by sixilli 0
  • thread 'main' panicked at 'assertion failed: `(left == right)`


    I get the error message in the title of this issue. It's not very informative about what exactly the problem is or how I can fix it. I don't know whether this is a problem in my code or in the library code.

    Code: https://gist.github.com/flip111/af93e5597df5b35ca45a69d0e3571290

    Terminal:

    » RUST_BACKTRACE=1 cargo run
       Compiling rust v0.1.0 (/home/flip111/titanic/linear regression/rust)
        Finished dev [unoptimized + debuginfo] target(s) in 2.23s
         Running `target/debug/rust`
    thread 'main' panicked at 'assertion failed: `(left == right)`
      left: `891`,
     right: `714`', /home/flip111/.cargo/registry/src/github.com-1ecc6299db9ec823/rulinalg-0.3.7/src/vector.rs:314:9
    stack backtrace:
       0: backtrace::backtrace::libunwind::trace
                 at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.37/src/backtrace/libunwind.rs:88
       1: backtrace::backtrace::trace_unsynchronized
                 at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.37/src/backtrace/mod.rs:66
       2: std::sys_common::backtrace::_print_fmt
                 at src/libstd/sys_common/backtrace.rs:76
       3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
                 at src/libstd/sys_common/backtrace.rs:60
       4: core::fmt::write
                 at src/libcore/fmt/mod.rs:1030
       5: std::io::Write::write_fmt
                 at src/libstd/io/mod.rs:1412
       6: std::sys_common::backtrace::_print
                 at src/libstd/sys_common/backtrace.rs:64
       7: std::sys_common::backtrace::print
                 at src/libstd/sys_common/backtrace.rs:49
       8: std::panicking::default_hook::{{closure}}
                 at src/libstd/panicking.rs:196
       9: std::panicking::default_hook
                 at src/libstd/panicking.rs:210
      10: std::panicking::rust_panic_with_hook
                 at src/libstd/panicking.rs:473
      11: std::panicking::continue_panic_fmt
                 at src/libstd/panicking.rs:380
      12: std::panicking::begin_panic_fmt
                 at src/libstd/panicking.rs:335
      13: rulinalg::vector::Vector<T>::elemul
                 at /home/flip111/.cargo/registry/src/github.com-1ecc6299db9ec823/rusty-machine-0.5.4/<::std::macros::panic macros>:9
      14: <rusty_machine::learning::toolkit::cost_fn::CrossEntropyError as rusty_machine::learning::toolkit::cost_fn::CostFunc<rulinalg::vector::Vector<f64>>>::cost
                 at /home/flip111/.cargo/registry/src/github.com-1ecc6299db9ec823/rusty-machine-0.5.4/src/learning/toolkit/cost_fn.rs:87
      15: <rusty_machine::learning::logistic_reg::BaseLogisticRegressor as rusty_machine::learning::optim::Optimizable>::compute_grad
                 at /home/flip111/.cargo/registry/src/github.com-1ecc6299db9ec823/rusty-machine-0.5.4/src/learning/logistic_reg.rs:190
      16: <rusty_machine::learning::optim::grad_desc::GradientDesc as rusty_machine::learning::optim::OptimAlgorithm<M>>::optimize
                 at /home/flip111/.cargo/registry/src/github.com-1ecc6299db9ec823/rusty-machine-0.5.4/src/learning/optim/grad_desc.rs:83
      17: <rusty_machine::learning::logistic_reg::LogisticRegressor<A> as rusty_machine::learning::SupModel<rulinalg::matrix::Matrix<f64>,rulinalg::vector::Vector<f64>>>::train
                 at /home/flip111/.cargo/registry/src/github.com-1ecc6299db9ec823/rusty-machine-0.5.4/src/learning/logistic_reg.rs:122
      18: rust::run
                 at src/main.rs:167
      19: rust::main
                 at src/main.rs:178
      20: std::rt::lang_start::{{closure}}
                 at /rustc/4560ea788cb760f0a34127156c78e2552949f734/src/libstd/rt.rs:64
      21: std::rt::lang_start_internal::{{closure}}
                 at src/libstd/rt.rs:49
      22: std::panicking::try::do_call
                 at src/libstd/panicking.rs:292
      23: __rust_maybe_catch_panic
                 at src/libpanic_unwind/lib.rs:80
      24: std::panicking::try
                 at src/libstd/panicking.rs:271
      25: std::panic::catch_unwind
                 at src/libstd/panic.rs:394
      26: std::rt::lang_start_internal
                 at src/libstd/rt.rs:48
      27: std::rt::lang_start
                 at /rustc/4560ea788cb760f0a34127156c78e2552949f734/src/libstd/rt.rs:64
      28: main
      29: __libc_start_main
      30: _start
    
    opened by flip111 0
  • OneHotEncoder


    I wanted to try Rust for some data analysis after getting started with Python. I saw that scikit-learn has a OneHotEncoder, which is useful for categorical data. Could such an encoder be considered for rusty-machine? It would save me writing a bunch of boilerplate code.

    By the way, i saw that the rust bindings for tensorflow have an open issue for the same thing https://github.com/tensorflow/rust

    The scikit-learn OneHotEncoder is not "the perfect design" in my opinion, but it's still nice to have one :) Perhaps a different design would be more suitable for rusty-machine.
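For illustration, a minimal one-hot encoder might look like this (a hypothetical sketch; `one_hot` is not part of rusty-machine):

```rust
use std::collections::HashMap;

// Map each distinct category to a column (in order of first appearance),
// then emit one row per label with a 1.0 in that label's column.
fn one_hot(labels: &[&str]) -> (Vec<Vec<f64>>, Vec<String>) {
    let mut index: HashMap<String, usize> = HashMap::new();
    let mut names: Vec<String> = Vec::new();
    for &l in labels {
        if !index.contains_key(l) {
            index.insert(l.to_string(), names.len());
            names.push(l.to_string());
        }
    }
    let rows = labels
        .iter()
        .map(|&l| {
            let mut row = vec![0.0; names.len()];
            row[index[l]] = 1.0;
            row
        })
        .collect();
    (rows, names)
}

fn main() {
    let (rows, names) = one_hot(&["cat", "dog", "cat"]);
    assert_eq!(names, vec!["cat".to_string(), "dog".to_string()]);
    assert_eq!(rows, vec![vec![1.0, 0.0], vec![0.0, 1.0], vec![1.0, 0.0]]);
    println!("{:?}", rows);
}
```

The resulting rows could then be assembled into a rulinalg Matrix for use as model inputs.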

    opened by flip111 0