Deep recommender systems for Rust

Overview

sbr

Crates.io badge Docs.rs badge Build Status

An implementation of sequence recommenders based on the wyrm autodifferentiation library.

sbr-rs

sbr implements efficient recommender algorithms which operate on sequences of items: given previous items a user has interacted with, the model will recommend the items the user is likely to interact with in the future.

Implemented models:

  • LSTM: a model that uses an LSTM network over the sequence of a user's interactions to predict their next action;
  • EWMA: a model that uses a simpler exponentially-weighted average of past actions to predict future interactions.

Which model performs best will depend on your dataset. The EWMA model is much quicker to fit and is probably a good starting point.
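To illustrate the EWMA idea, here is a hypothetical standalone sketch (not the crate's implementation): a user representation is built by folding the embeddings of past items with exponential decay, so recent interactions dominate.

```rust
// A sketch of the EWMA idea: fold a user's interaction history
// (as item embeddings, oldest first) into a single vector,
// weighting recent items more heavily via exponential decay.
fn ewma_user_representation(history: &[Vec<f32>], alpha: f32) -> Vec<f32> {
    let dim = history.first().map_or(0, |e| e.len());
    let mut state = vec![0.0f32; dim];
    for embedding in history {
        for (s, e) in state.iter_mut().zip(embedding) {
            // New state = alpha * current item + (1 - alpha) * old state.
            *s = alpha * *e + (1.0 - alpha) * *s;
        }
    }
    state
}

fn main() {
    // Two 2-dimensional item embeddings; with alpha = 0.5 the later
    // interaction ends up with the larger weight.
    let history = [vec![1.0, 0.0], vec![0.0, 1.0]];
    println!("{:?}", ewma_user_representation(&history, 0.5)); // [0.25, 0.5]
}
```

A model like this has no parameters to train beyond the embeddings themselves, which is why fitting it is so much faster than fitting an LSTM.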

Example

You can fit a model on the Movielens 100K dataset in about 10 seconds:

use std::time::Instant;

use rand::prng::XorShiftRng;
use rand::SeedableRng;

let mut data = sbr::datasets::download_movielens_100k().unwrap();

let mut rng = XorShiftRng::from_seed([42; 16]);

let (train, test) = sbr::data::user_based_split(&mut data, &mut rng, 0.2);
let train_mat = train.to_compressed();
let test_mat = test.to_compressed();

println!("Train: {}, test: {}", train.len(), test.len());

let mut model = sbr::models::lstm::Hyperparameters::new(data.num_items(), 32)
    .embedding_dim(32)
    .learning_rate(0.16)
    .l2_penalty(0.0004)
    .lstm_variant(sbr::models::lstm::LSTMVariant::Normal)
    .loss(sbr::models::Loss::WARP)
    .optimizer(sbr::models::Optimizer::Adagrad)
    .num_epochs(10)
    .rng(rng)
    .build();

let start = Instant::now();
let loss = model.fit(&train_mat).unwrap();
let elapsed = start.elapsed();
let train_mrr = sbr::evaluation::mrr_score(&model, &train_mat).unwrap();
let test_mrr = sbr::evaluation::mrr_score(&model, &test_mat).unwrap();

println!(
    "Train MRR {} at loss {} and test MRR {} (in {:?})",
    train_mrr, loss, test_mrr, elapsed
);
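The mrr_score calls above compute mean reciprocal rank over held-out interactions. The metric itself can be sketched standalone (the function and argument names here are illustrative, not the crate's API): for each interaction, take 1 / (rank of the true item in the model's ranking), then average.

```rust
// Mean reciprocal rank, sketched standalone: for each interaction take
// 1 / (rank of the true item in the model's ranking), then average.
fn mrr(rankings: &[Vec<usize>], true_items: &[usize]) -> f32 {
    let total: f32 = rankings
        .iter()
        .zip(true_items)
        .map(|(ranking, truth)| {
            // position() is 0-based; ranks are 1-based.
            let rank = ranking.iter().position(|item| item == truth).unwrap() + 1;
            1.0 / rank as f32
        })
        .sum();
    total / true_items.len() as f32
}

fn main() {
    // True item ranked 1st for one interaction and 2nd for another:
    // MRR = (1/1 + 1/2) / 2 = 0.75.
    let rankings = [vec![7, 3, 9], vec![3, 7, 9]];
    println!("{}", mrr(&rankings, &[7, 7]));
}
```

An MRR of 1.0 would mean the model always ranks the true next item first; random rankings over many items drive the score towards zero.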

License: MIT

You might also like...
Practice repo for learning Rust. Currently going through "Rust for JavaScript Developers" course.

rust-practice πŸ¦€ Practice repo for learning Rust. Directories /rust-for-js-dev Files directed towards "Rust for JavaScript Developers" course. Thank y

A Rust library with homemade machine learning models to classify the MNIST dataset. Built in an attempt to get familiar with advanced Rust concepts.

mnist-classifier Ideas UPDATED: Finish CLI Flags Parallelize computationally intensive functions Class-based naive bayes README Image parsing Confusio

πŸ¦€Rust Turkiye - Rust Dersleri

Rust Turkiye - Rust Dersleri CURIOSITY - Featuring Richard Feynman Bu repo Rust Turkiye tarafindan duzenlenen Rust Dersleri egitiminin alistirma ve ko

A Rust machine learning framework.

Linfa linfa (Italian) / sap (English): The vital circulating fluid of a plant. linfa aims to provide a comprehensive toolkit to build Machine Learning

Machine Learning library for Rust

rusty-machine This library is no longer actively maintained. The crate is currently on version 0.5.4. Read the API Documentation to learn more. And he

Rust library for Self Organising Maps (SOM).

RusticSOM Rust library for Self Organising Maps (SOM). Using this Crate Add rusticsom as a dependency in Cargo.toml [dependencies] rusticsom = "1.1.0"

Rust language bindings for TensorFlow

TensorFlow Rust provides idiomatic Rust language bindings for TensorFlow. Notice: This project is still under active development and not guaranteed to

Machine learning crate for Rust

rustlearn A machine learning package for Rust. For full usage details, see the API documentation. Introduction This crate contains reasonably effectiv

Rust bindings for the C++ api of PyTorch.

tch-rs Rust bindings for the C++ api of PyTorch. The goal of the tch crate is to provide some thin wrappers around the C++ PyTorch api (a.k.a. libtorc

Comments
  • Not fully backpropagated error

    In the example given in the README:

    use std::time::Instant;
    
    use rand;
    use rand::prng::XorShiftRng;
    use rand::SeedableRng;
    use sbr;
    
    fn main() {
        let mut data = sbr::datasets::download_movielens_100k().unwrap();
    
        let mut rng = XorShiftRng::from_seed([42; 16]);
    
        let (train, test) = sbr::data::user_based_split(&mut data, &mut rng, 0.2);
        let train_mat = train.to_compressed();
        let test_mat = test.to_compressed();
    
        println!("Train: {}, test: {}", train.len(), test.len());
    
        let mut model = sbr::models::lstm::Hyperparameters::new(data.num_items(), 32)
            .embedding_dim(32)
            .learning_rate(0.16)
            .l2_penalty(0.0004)
            .lstm_variant(sbr::models::lstm::LSTMVariant::Normal)
            .loss(sbr::models::Loss::WARP)
            .optimizer(sbr::models::Optimizer::Adagrad)
            .num_epochs(10)
            .rng(rng)
            .build();
    
        let start = Instant::now();
        let loss = model.fit(&train_mat).unwrap();
        let elapsed = start.elapsed();
        let train_mrr = sbr::evaluation::mrr_score(&model, &train_mat).unwrap();
        let test_mrr = sbr::evaluation::mrr_score(&model, &test_mat).unwrap();
    
        println!(
            "Train MRR {} at loss {} and test MRR {} (in {:?})",
            train_mrr, loss, test_mrr, elapsed
        );
    }
    

    I get the following error:

    Train: 82948, test: 17052
    thread '<unnamed>' panicked at 'Not fully backpropagated.', /home/nineleaps/.cargo/registry/src/github.com-1ecc6299db9ec823/wyrm-0.9.1/src/nodes.rs:45:9
    note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
    thread '<unnamed>' panicked at 'Not fully backpropagated.', /home/nineleaps/.cargo/registry/src/github.com-1ecc6299db9ec823/wyrm-0.9.1/src/nodes.rs:45:9
    thread '<unnamed>' panicked at 'Not fully backpropagated.', /home/nineleaps/.cargo/registry/src/github.com-1ecc6299db9ec823/wyrm-0.9.1/src/nodes.rs:45:9
    thread '<unnamed>' panicked at 'Not fully backpropagated.', /home/nineleaps/.cargo/registry/src/github.com-1ecc6299db9ec823/wyrm-0.9.1/src/nodes.rs:45:9
    thread '<unnamed>' panicked at 'Not fully backpropagated.', /home/nineleaps/.cargo/registry/src/github.com-1ecc6299db9ec823/wyrm-0.9.1/src/nodes.rs:45:9
    thread '<unnamed>' panicked at 'Not fully backpropagated.', /home/nineleaps/.cargo/registry/src/github.com-1ecc6299db9ec823/wyrm-0.9.1/src/nodes.rs:45:9
    thread '<unnamed>' panicked at 'Not fully backpropagated.', /home/nineleaps/.cargo/registry/src/github.com-1ecc6299db9ec823/wyrm-0.9.1/src/nodes.rs:45:9
    thread '<unnamed>' panicked at 'Not fully backpropagated.', /home/nineleaps/.cargo/registry/src/github.com-1ecc6299db9ec823/wyrm-0.9.1/src/nodes.rs:45:9
    
    opened by infinite-Joy 0
  • Unable to fit model when num_threads > number of logical cores on machine

    Hi, I ran into an issue around multithreading support when running goodbooks-recommender (https://maciejkula.github.io/2018/07/27/recommending-books-with-rust/)

    When instantiating hyperparameters with num_threads greater than the number of cores on my CPU, the task never completes, and goodbooks-recommender's CPU usage drops to 0 almost immediately.

    After some println!() driven debugging I narrowed down the issue to this section of sbr https://github.com/maciejkula/sbr-rs/blob/master/src/models/sequence_model.rs#L101-L168

    I think the issue is caused by a combination of factors in how the worker threads synchronize on the optimizer's barrier.

    Inserting a println!() before and after this line in wyrm (https://github.com/maciejkula/wyrm/blob/12715ae99ca531db6557dca786e4a480ec608101/src/optim/mod.rs#L81) illustrates this issue.

    ...
    println!("pre barrier sync");
    let _barrier = self.barrier_guard.synchronize();
    println!("post barrier sync");
    ...
    

    With the thread count set to 4, all is fine and I see repeated "pre barrier sync" and "post barrier sync" messages. If I set the thread count to 5, I see four "pre barrier sync" messages followed by nothing, and the program hangs.
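    That behaviour is consistent with how a counted barrier works: a barrier sized for n participants only releases its waiters once n threads have called wait(). A minimal self-contained sketch using the standard library's std::sync::Barrier (not wyrm's own barrier type):

    ```rust
    use std::sync::{Arc, Barrier};
    use std::thread;

    // Spawn `num_threads` workers that all rendezvous on a barrier sized
    // to match them; returns how many made it past the barrier.
    fn run_barrier(num_threads: usize) -> usize {
        let barrier = Arc::new(Barrier::new(num_threads));
        let handles: Vec<_> = (0..num_threads)
            .map(|_| {
                let barrier = Arc::clone(&barrier);
                thread::spawn(move || {
                    // Blocks until `num_threads` threads have called wait().
                    // With Barrier::new(5) but only 4 threads ever reaching
                    // this line, every call would block forever -- the hang
                    // described above.
                    barrier.wait();
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).count()
    }

    fn main() {
        println!("{} threads passed the barrier", run_barrier(4));
    }
    ```

    Here the participant count matches the threads spawned, so every thread passes; if the barrier were sized larger than the number of threads that ever reach wait(), none would.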

    I'm not sure how to solve this, but hopefully this report is helpful nonetheless.

    opened by daniel-ferguson 1
Releases (v0.3.0)
Owner
Maciej Kula
πŸ¦€ Example of serving deep learning models in Rust with batched prediction

rust-dl-webserver This project provides an example of serving a deep learning model with batched prediction using Rust. In particular it runs a GPT2 m

Evan Pete Walsh 28 Dec 15, 2022
miniature: a toy deep learning library written in Rust

miniature: a toy deep learning library written in Rust A miniature is a toy deep learning library written in Rust. The miniature is: implemented for a

Takuma Seno 4 Nov 29, 2021
Deep learning superresolution in pure rust

Rusty_SR A Rust super-resolution tool, which when given a low resolution image utilises deep learning to infer the corresponding high resolution image

zza 189 Dec 9, 2022
Open deep learning compiler stack for cpu, gpu and specialized accelerators

Open Deep Learning Compiler Stack Documentation | Contributors | Community | Release Notes Apache TVM is a compiler stack for deep learning systems. I

The Apache Software Foundation 8.9k Jan 4, 2023
Awesome deep learning crate

NeuroFlow is fast neural networks (deep learning) Rust crate. It relies on three pillars: speed, reliability, and speed again. Hello, everyone! Work o

Mikhail Kravets 70 Nov 20, 2022
Messing around with deep learning

Deep Learning Test Implementing deep learning in Rust using just a linear algebra library (nalgebra). The neural network (4 hidden layers, 32 neurons

Dmitry Zamkov 9 Jun 22, 2022
High performance distributed framework for training deep learning recommendation models based on PyTorch.

PERSIA (Parallel rEcommendation tRaining System with hybrId Acceleration) is developed by AI platform@Kuaishou Technology, collaborating with ETH. It

null 340 Dec 30, 2022
☁ Puff ☁ - The deep stack framework.

☁ Puff ☁ Python with an async runtime built-in Rust for GraphQL, ASGI, WSGI, Postgres, PubSub, Redis, Distributed Tasks, and HTTP2 Client. What is Puf

Kyle Hanson 290 Jan 8, 2023
Deep learning at the speed of light.

luminal Deep learning at the speed of light. use luminal::prelude::*; // Setup graph and tensors let mut cx = Graph::new(); let a = cx.new_tensor::<R

Joe Fioti 3 Jul 25, 2023
Msgpack serialization/deserialization library for Python, written in Rust using PyO3, and rust-msgpack. Reboot of orjson. msgpack.org[Python]

ormsgpack ormsgpack is a fast msgpack library for Python. It is a fork/reboot of orjson It serializes faster than msgpack-python and deserializes a bi

Aviram Hassan 139 Dec 30, 2022