A machine learning library for supervised training of parametrized models

Blue Yonder GmbH

Last update: May 10, 2022

Overview

Vikos

Vikos is a library for supervised training of parameterized regression and classification models.

Design Goals

Model representations, cost functions, and optimization algorithms can be changed independently of each other.
Generics: Not committed to a particular data structure for inputs, targets, etc.
If the design goals above can only be achieved by sacrificing performance, so be it.
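
The first two goals can be pictured as a handful of independent traits. The following is only a sketch under that assumption: the trait names, method names, and the `gradient_step` function are hypothetical and are not taken from the vikos API. It shows an optimization step written once against any model and any cost, generic over the input type.

```rust
// Hypothetical traits for illustration only; these are not the vikos API.

/// A model maps an input of type `I` to a prediction and exposes its coefficients.
trait Model<I> {
    fn predict(&self, input: &I) -> f64;
    fn num_coefficients(&self) -> usize;
    fn coefficient(&mut self, index: usize) -> &mut f64;
    /// Derivative of the prediction with respect to one coefficient.
    fn gradient(&self, index: usize, input: &I) -> f64;
}

/// A cost function only needs to know the prediction and the truth.
trait Cost {
    /// Derivative of the cost with respect to the prediction.
    fn derivative(&self, prediction: f64, truth: f64) -> f64;
}

/// One gradient descent step that works for any model and any cost.
fn gradient_step<I, M: Model<I>, C: Cost>(model: &mut M, cost: &C, rate: f64, input: &I, truth: f64) {
    let outer = cost.derivative(model.predict(input), truth);
    for i in 0..model.num_coefficients() {
        let g = outer * model.gradient(i, input);
        *model.coefficient(i) -= rate * g;
    }
}

/// y = m * x + c, stored as [m, c].
struct Line([f64; 2]);

impl Model<f64> for Line {
    fn predict(&self, x: &f64) -> f64 {
        self.0[0] * x + self.0[1]
    }
    fn num_coefficients(&self) -> usize {
        2
    }
    fn coefficient(&mut self, index: usize) -> &mut f64 {
        &mut self.0[index]
    }
    fn gradient(&self, index: usize, x: &f64) -> f64 {
        if index == 0 { *x } else { 1.0 }
    }
}

/// C(p, t) = (p - t)^2
struct LeastSquares;

impl Cost for LeastSquares {
    fn derivative(&self, prediction: f64, truth: f64) -> f64 {
        2.0 * (prediction - truth)
    }
}

/// Fits a line to three noise-free samples of y = 2x + 1.
fn train_line() -> Line {
    let history = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)];
    let mut model = Line([0.0, 0.0]);
    for _ in 0..2000 {
        for &(x, y) in history.iter() {
            gradient_step(&mut model, &LeastSquares, 0.05, &x, y);
        }
    }
    model
}

fn main() {
    let model = train_line();
    println!("m = {:.3}, c = {:.3}", model.0[0], model.0[1]);
}
```

Swapping `LeastSquares` for another `Cost`, or `Line` for another `Model`, would not require touching `gradient_step`, which is the kind of independence the design goals describe.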

Current State

The traits are still being worked out by continuously trying new use cases and implementing the learning algorithms. If you are looking for more mature Rust libraries in the domain of ML, you might want to check out:

rusty-machine
leaf.

Getting Started

Install the Rust package manager cargo. Go to rustup and follow the instructions on the page (in my experience this works fine for Windows, Ubuntu and OS X).
Run cargo new --bin hello_vikos.
Switch to the hello_vikos directory.
Run cargo run to execute the hello world program.

Edit the Cargo.toml file. Add vikos = "0.2" to your dependencies. The file should now look somewhat like this:

[package]
name = "hello_vikos"
version = "0.1.0"
authors = ["..."]

[dependencies]
vikos = "0.2"

Insert use vikos; as the first line of src/main.rs.
You can now start replacing code in main with code from the tutorial.
```
fn main() {
    /* tutorial code goes here */
}
```

Documentation

Thanks to the folks of docs.rs for building and hosting the documentation!

Contributing

Want to help out? Just create an issue, pull request or contact [email protected].

Comments

All implementations of Cost::gradient simply multiply a gradient by gradient_error_by_coefficient

All implementations of Cost::gradient simply multiply a gradient by gradient_error_by_coefficient. That is, with the currently implemented Cost functions, we could just drop the gradient_error_by_coefficient variable from the interface and let the caller of Cost::gradient multiply with such a value.

Of course, if it were to be expected that some implementations of the cost function would use that parameter in a non-multiplicative way, having it remain in the interface might be a reasonable choice.

opened by HolgerPeters 2
use Vector for gradient
Mathematically sound

Easier code (subjective, but we need fewer loops)

Sharing of intermediate results which would otherwise be recalculated for each coefficient

Will help avoid dynamic allocations in future. (First step to get rid of Vec in training states)
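
To make the points above concrete, here is a minimal sketch comparing a per-coefficient gradient with a gradient returned as one vector. The logistic model, the function names, and the `[w0, w1, b]` coefficient layout are assumptions chosen for illustration and do not come from the vikos source.

```rust
// Hypothetical illustration, not the vikos API: a logistic model with two
// features and a bias, coefficients stored as [w0, w1, b].

fn sigmoid(z: f64) -> f64 {
    1.0 / (1.0 + (-z).exp())
}

fn predict(coefficients: &[f64; 3], x: &[f64; 2]) -> f64 {
    sigmoid(coefficients[0] * x[0] + coefficients[1] * x[1] + coefficients[2])
}

/// Per-coefficient interface: every call recomputes the prediction.
fn gradient_by_coefficient(coefficients: &[f64; 3], index: usize, x: &[f64; 2]) -> f64 {
    let p = predict(coefficients, x);
    let inner = if index < 2 { x[index] } else { 1.0 };
    p * (1.0 - p) * inner
}

/// Vector interface: the prediction and p * (1 - p) are computed once, and
/// the result lives in a fixed-size array, so nothing is heap allocated.
fn gradient(coefficients: &[f64; 3], x: &[f64; 2]) -> [f64; 3] {
    let p = predict(coefficients, x);
    let shared = p * (1.0 - p);
    [shared * x[0], shared * x[1], shared]
}

fn main() {
    let coefficients = [0.5, -0.25, 0.1];
    let x = [2.0, 4.0];
    println!("{:?}", gradient(&coefficients, &x));
}
```

Both functions yield the same numbers; the vector version evaluates the sigmoid once per event instead of once per coefficient.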
opened by pacman82 1
Implement generalized linear models

vikos supports logistic and linear regression. Actually we could change it to train any given GLM model (https://en.wikipedia.org/wiki/Generalized_linear_model).

Such models are of the type y = g^-1(X*a + b), where g is the link function.
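
For reference, the two models mentioned above are special cases of this form. Writing $\mu$ for the expected value of the target and $g$ for the link function, the standard choices are:

```latex
\begin{align*}
\text{linear regression:}   \quad & g(\mu) = \mu,                    & y &= Xa + b \\
\text{logistic regression:} \quad & g(\mu) = \ln\frac{\mu}{1 - \mu}, & y &= \frac{1}{1 + e^{-(Xa + b)}}
\end{align*}
```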

opened by MaxBenChrist 1
Tutorial and docs

Mostly tutorial-related changes (formatting according to the rust guidelines, added missing bits and pieces to the tutorial text).

Also spell out "Stochastic Gradient Descent" for SGD in every instance where it is mentioned in the docs.

opened by HolgerPeters 1
Multiclass

The Model trait now supports target types different from f64. This extension of the trait has been used to support a OneVsRest model in the library. The iris sample has therefore been stripped of its custom model implementation and uses the one from the library instead.

opened by pacman82 0
Model input types
This branch splits the old Model trait into a new Model trait for accessing the coefficients and an Expert trait for modeling the relation between those coefficients and the target.

Motivation: This branch started out with the humble goal of removing the std::marker::PhantomData hack in model::Constant, which had been required for type inference to kick in so that the user does not have to specify the feature type explicitly. However, the only way to get rid of it without breaking type inference was to split the Model trait as stated above. Despite being born out of the technical necessity to please the compiler, I found that this split has intriguing consequences for the domain of ML. Consider the urban legend about the barometer question: Niels Bohr was supposedly asked to determine the height of a building using a barometer. Among others, he suggested:

1. Throwing the barometer from the building
2. Marking off the number of barometer lengths vertically along the emergency staircase
3. Measuring the pressure difference between ground and roof

Let's say we had a set of measurements for each of 1, 2 and 3. We can write three Experts for the same set of coefficients (in this case the height of the building). The generalization still works even if not all experts use all coefficients (e.g. if the length of the barometer were unknown, we would then need to learn it, but only 2 is influenced by it).

On a less philosophical note, this branch made it possible to implement the Model trait directly for f64 and get rid of Constant. Once the target type is generic, the same classifier will be able to implement an Expert returning a crisp prediction (true | false) or a probability (f64).
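
The barometer example might be sketched as follows. Everything here is hypothetical: the `Expert` trait shown is a simplified stand-in and not the trait from this branch, and the physical constants are rough assumptions. The point is only that three experts can share one set of coefficients while using different subsets of it.

```rust
// Hypothetical sketch of the Model/Expert split; not the actual vikos traits.

/// Coefficients shared by all experts:
/// [building height in m, barometer length in m].
type Coefficients = [f64; 2];

/// An expert relates the shared coefficients to one kind of measurement.
trait Expert {
    fn predict(&self, coefficients: &Coefficients) -> f64;
}

const G: f64 = 9.81; // gravitational acceleration in m/s^2
const AIR_DENSITY: f64 = 1.2; // kg/m^3, assumed constant over the height

/// 1. Fall time in seconds of a barometer dropped from the roof.
struct FallTime;
/// 2. Number of barometer lengths along the staircase.
struct BarometerLengths;
/// 3. Pressure difference in Pa between ground and roof.
struct PressureDifference;

impl Expert for FallTime {
    fn predict(&self, c: &Coefficients) -> f64 {
        (2.0 * c[0] / G).sqrt()
    }
}

impl Expert for BarometerLengths {
    // The only expert that depends on the barometer length.
    fn predict(&self, c: &Coefficients) -> f64 {
        c[0] / c[1]
    }
}

impl Expert for PressureDifference {
    fn predict(&self, c: &Coefficients) -> f64 {
        AIR_DENSITY * G * c[0]
    }
}

fn main() {
    let coefficients: Coefficients = [20.0, 0.25];
    let experts: [(&str, &dyn Expert); 3] = [
        ("fall time [s]", &FallTime),
        ("barometer lengths", &BarometerLengths),
        ("pressure difference [Pa]", &PressureDifference),
    ];
    for (name, expert) in experts.iter() {
        println!("{}: {:.2}", name, expert.predict(&coefficients));
    }
}
```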
opened by pacman82 0
Ymca

Resolves #17 by treating truth (Y), model (M), cost (C) and teacher (alias algorithm, A) as generic types in traits. Before this pull request, the YMCA types were modelled as associated types in traits and generic types in implementations. This also removes the usage of std::marker::PhantomData in all but one place in the library.

opened by pacman82 0
Add a Gitter chat badge to README.md

blue-yonder/vikos now has a Chat Room on Gitter

@HolgerPeters has just created a chat room. You can visit it here: https://gitter.im/vikos-optimization/Lobby.

This pull-request adds this badge to your README.md:

If my aim is a little off, please let me know.

Happy chatting.

opened by gitter-badger 0
Remove gradient_error_by_coefficient argument from Cost::gradient

I created this PR as a PR to the base of the cost-trait branch, so that the diff is a bit easier and so that we can discuss how we want to refactor the gradient function's dependency on the derivative of the model (see Issue #10).

I refactored Cost to have a default implementation of the gradient function, and added a function outer_derivative for the derivative of the cost function with respect to the model prediction.
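
A minimal sketch of what such a refactoring could look like, assuming all values are plain f64. The method names follow the description above, but the exact signatures are assumptions and not copied from the pull request.

```rust
// Hypothetical sketch; the signatures are assumptions, not the vikos source.

trait Cost {
    /// Derivative of the cost with respect to the model prediction.
    fn outer_derivative(&self, prediction: f64, truth: f64) -> f64;

    /// Default implementation via the chain rule: the outer derivative times
    /// the derivative of the prediction with respect to one coefficient.
    fn gradient(&self, prediction: f64, truth: f64, gradient_error_by_coefficient: f64) -> f64 {
        self.outer_derivative(prediction, truth) * gradient_error_by_coefficient
    }
}

/// C(p, t) = (p - t)^2
struct LeastSquares;

impl Cost for LeastSquares {
    // Only the outer derivative has to be provided by an implementation.
    fn outer_derivative(&self, prediction: f64, truth: f64) -> f64 {
        2.0 * (prediction - truth)
    }
}

fn main() {
    let g = LeastSquares.gradient(3.0, 1.0, 0.5);
    println!("gradient = {}", g);
}
```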

opened by HolgerPeters 0
Gradients are not the only valuable attribute of a cost function

Currently, the Cost trait can be defined by just specifying fn gradient(prediction, truth, gradient_error_by_coefficient). A library user will probably also want to be able to obtain absolute costs as a complement to the gradient. I think this is feasible, since I cannot imagine a cost function where we can give the gradient but not an absolute value.

Another benefit: at the moment, when implementing the Cost trait, the actual cost function is not documented in code (in doc comments at best). So when reading https://github.com/blue-yonder/vikos/blob/master/src/cost.rs#L38 it is not obvious what the cost function actually calculates.
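
As a sketch of the idea, a cost trait could carry the absolute cost next to its derivative. The trait below and the `numeric_derivative` helper are hypothetical and only illustrate the proposal: once the cost itself is written down in code, the hand-written derivative can be checked against it.

```rust
// Hypothetical extension of a cost trait; not the vikos API.

trait Cost {
    /// Absolute cost of a prediction given the truth.
    fn cost(&self, prediction: f64, truth: f64) -> f64;
    /// Derivative of `cost` with respect to the prediction.
    fn outer_derivative(&self, prediction: f64, truth: f64) -> f64;
}

/// C(p, t) = |p - t|
struct LeastAbsoluteDeviation;

impl Cost for LeastAbsoluteDeviation {
    fn cost(&self, prediction: f64, truth: f64) -> f64 {
        (prediction - truth).abs()
    }

    fn outer_derivative(&self, prediction: f64, truth: f64) -> f64 {
        if prediction > truth {
            1.0
        } else if prediction < truth {
            -1.0
        } else {
            0.0
        }
    }
}

/// Central finite difference of the documented cost, usable as a sanity
/// check for the hand-written derivative away from kinks.
fn numeric_derivative<C: Cost>(cost: &C, prediction: f64, truth: f64) -> f64 {
    let h = 1e-6;
    (cost.cost(prediction + h, truth) - cost.cost(prediction - h, truth)) / (2.0 * h)
}

fn main() {
    let c = LeastAbsoluteDeviation;
    println!("cost = {}", c.cost(3.0, 1.0));
    println!("analytic = {}", c.outer_derivative(3.0, 1.0));
    println!("numeric = {}", numeric_derivative(&c, 3.0, 1.0));
}
```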

opened by HolgerPeters 0
Persist expertises

For most real-world applications we need to persist our expertises, so we can reuse them without training. Most likely using rustserialize, but I have not had a closer look at the library yet.

opened by pacman82 0
Compare vikos to vowpal wabbit

At the moment, vikos fits its models with online algorithms, one event at a time. This is quite similar to Vowpal Wabbit.

Vowpal Wabbit (VW) project is a fast out-of-core learning system sponsored by Microsoft Research and (previously) Yahoo! Research. Source

Vowpal Wabbit also fits logistic and linear regression models. It would be interesting to compare both projects in terms of speed and design decisions.

opened by MaxBenChrist 1
Add link functions for common glm models

Now vikos is able to fit arbitrary glm models.

It would be nice to add the link functions for common GLM models such as Normal, Exponential, and so on.

You can find an overview here: Table custom link functions
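
As a starting point for discussion, the canonical links of a few common distributions could be collected in an enum. This is a hypothetical sketch: the `Link` type and its methods are not part of vikos, and only the textbook canonical links are included.

```rust
// Hypothetical sketch of canonical link functions; not part of vikos.

#[derive(Clone, Copy, Debug)]
enum Link {
    /// Normal distribution: g(mu) = mu
    Identity,
    /// Exponential and Gamma distributions: g(mu) = -1 / mu
    NegativeInverse,
    /// Poisson distribution: g(mu) = ln(mu)
    Log,
    /// Bernoulli distribution: g(mu) = ln(mu / (1 - mu))
    Logit,
}

impl Link {
    /// Maps the mean mu to the linear predictor eta.
    fn link(self, mu: f64) -> f64 {
        match self {
            Link::Identity => mu,
            Link::NegativeInverse => -1.0 / mu,
            Link::Log => mu.ln(),
            Link::Logit => (mu / (1.0 - mu)).ln(),
        }
    }

    /// Maps the linear predictor eta back to the mean mu.
    fn inverse(self, eta: f64) -> f64 {
        match self {
            Link::Identity => eta,
            Link::NegativeInverse => -1.0 / eta,
            Link::Log => eta.exp(),
            Link::Logit => 1.0 / (1.0 + (-eta).exp()),
        }
    }
}

fn main() {
    let links = [Link::Identity, Link::NegativeInverse, Link::Log, Link::Logit];
    for link in links.iter() {
        let eta = link.link(0.3);
        println!("{:?}: eta = {:.4}, mu = {:.4}", link, eta, link.inverse(eta));
    }
}
```

A GLM prediction is then `link.inverse(x_dot_a + b)`, so adding a distribution amounts to adding one variant with its two match arms.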

opened by MaxBenChrist 1
Point out the selling points of this package

What I am missing is a description of the selling points of this package.

It should answer the question: "What does Vikos do better than other Rust ML libraries?"

(It seems that data is processed in a streaming fashion, one sample at a time.)

opened by MaxBenChrist 0
teach_event with immutable model

Currently, in the Teacher trait, Teacher<M>::teach_event<Y, C>(&self, &mut Self::Training, &mut M, &C, &M::Input, Y) has two mutable arguments: the associated training type and the model. My question is whether we could rewrite this into a function with immutable arguments that returns new values for these arguments. This way we might be able to rewrite learn_history's loop over the history as a folding operation (reduction).
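
To illustrate the question, here is a simplified sketch of the folding variant. It is not the Teacher trait: the model is a single slope coefficient, the cost is hard-coded to least squares, and the learning rate schedule is an assumption made up for this example.

```rust
// Hypothetical sketch; `teach_event` here is a free function taking its
// arguments by value, not the vikos Teacher trait.

/// Training state: counts the events seen so far to decay the learning rate.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Training {
    events_seen: u32,
}

/// Takes the old training state and model (slope m of y = m * x) and
/// returns the new ones instead of mutating them in place.
fn teach_event(training: Training, model: f64, input: f64, truth: f64) -> (Training, f64) {
    let rate = 0.1 / (1.0 + f64::from(training.events_seen) / 100.0);
    // Least squares gradient with respect to the slope.
    let gradient = 2.0 * (model * input - truth) * input;
    (
        Training { events_seen: training.events_seen + 1 },
        model - rate * gradient,
    )
}

/// The loop over the history becomes a fold over (training, model).
fn learn_history(history: &[(f64, f64)]) -> (Training, f64) {
    history.iter().fold(
        (Training { events_seen: 0 }, 0.0),
        |(training, model), &(x, y)| teach_event(training, model, x, y),
    )
}

fn main() {
    // Noise-free samples of y = 3x, repeated 50 times.
    let history: Vec<(f64, f64)> = (0..50)
        .flat_map(|_| vec![(1.0, 3.0), (2.0, 6.0)])
        .collect();
    let (training, model) = learn_history(&history);
    println!("events = {}, slope = {:.4}", training.events_seen, model);
}
```

For small `Copy` states like this one the by-value version costs little; for models with many coefficients, returning new values could imply extra copies, which is a trade-off to weigh.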

opened by HolgerPeters 1
readme should show short Example

I received some verbal feedback that it would be nice to have a really short example in the README: an elevator pitch in the form of a code snippet. It was also remarked that the README should highlight the existence of a tutorial.

opened by pacman82 1

Releases(v0.1.8)

v0.1.8(Nov 16, 2016)

With this release, vikos supports Target types other than f64. This allows the Model trait to be used for regression as well as multiclass classification.
