An implementation of the Pair Adjacent Violators algorithm for isotonic regression in Rust

Overview

Pair Adjacent Violators for Rust

An implementation of the Pair Adjacent Violators algorithm for isotonic regression. Note this algorithm is also known as "Pool Adjacent Violators".

What is "Isotonic Regression" and why should I care?

Imagine you have two variables, x and y, and you don't know the exact relationship between them, but you do know that y increases whenever x increases and decreases whenever x decreases. Alternatively, the relationship may run the opposite way: y decreases when x increases, and increases when x decreases.

Examples of such isotonic or monotonic relationships include:

  • x is the pressure applied to the accelerator in a car, y is the acceleration of the car (acceleration increases as more pressure is applied)
  • x is the rate at which a web server is receiving HTTP requests, y is the CPU usage of the web server (server CPU usage will increase as the request rate increases)
  • x is the price of an item, and y is the probability that someone will buy it (this would be a decreasing relationship, as x increases y decreases)

These are all examples of an isotonic relationship between two variables, where the relationship is likely to be more complex than linear.

So we know the relationship between x and y is isotonic, and let's also say that we've been able to collect data about actual x and y values that occur in practice.

What we'd really like to be able to do is estimate, for any given x, what y will be, or alternatively for any given y, what x would be required.

But of course real-world data is noisy and is unlikely to be strictly isotonic. We want something that lets us feed in this raw, noisy data, recover the underlying relationship between x and y, and then use it to predict y given x, or to find the value of x that will give us a particular value of y. This is the purpose of the Pair Adjacent Violators algorithm.
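To make the idea concrete, here is a minimal sketch of pooling adjacent violators for an increasing relationship. It is an illustration written for this explanation, not this crate's implementation or API: the `Block` type and the `pav_ascending` function are hypothetical names, every point is assumed to have weight 1, and pooled points are assumed to be replaced by their weighted mean in both x and y.

```rust
/// A pool of one or more input points, stored as their weighted mean.
/// (Hypothetical type for this sketch, not part of the crate's API.)
#[derive(Debug, Clone, Copy, PartialEq)]
struct Block {
    x: f64,
    y: f64,
    weight: f64,
}

/// Fits a non-decreasing sequence of blocks to noisy `(x, y)` pairs.
/// Assumption: every input point carries weight 1.
fn pav_ascending(points: &[(f64, f64)]) -> Vec<Block> {
    // Work on a copy sorted by x so that "adjacent" means adjacent in x.
    let mut sorted = points.to_vec();
    sorted.sort_by(|a, b| a.0.total_cmp(&b.0));

    let mut blocks: Vec<Block> = Vec::with_capacity(sorted.len());
    for (x, y) in sorted {
        let mut current = Block { x, y, weight: 1.0 };
        // A violation is a previous block whose y lies above the current y.
        // Pool the two into their weighted mean and re-check against the
        // block before that, until the sequence is non-decreasing again.
        while let Some(prev) = blocks.last().copied() {
            if prev.y <= current.y {
                break;
            }
            blocks.pop();
            let weight = prev.weight + current.weight;
            current = Block {
                x: (prev.x * prev.weight + current.x * current.weight) / weight,
                y: (prev.y * prev.weight + current.y * current.weight) / weight,
                weight,
            };
        }
        blocks.push(current);
    }
    blocks
}
```

For the inputs (0, 1), (1, 2), (2, 1.5), the last two points violate the increasing order, so this sketch pools them into a single block at (1.5, 1.75) with weight 2. A decreasing fit can be obtained the same way by flipping the comparison.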

...and why should I care?

Using the examples above:

  • A self-driving car could use it to learn how much pressure to apply to the accelerator to give a desired amount of acceleration
  • An autoscaling system could use it to help predict how many web servers it needs to handle a given amount of web traffic
  • A retailer could use it to choose a price for an item that maximizes their profit (aka "yield optimization")

Isotonic regression in online advertising

If you have an hour to spare and are interested in learning more about how online advertising works, you should check out this lecture that I gave in 2015, where I explain how we were able to use pair adjacent violators to solve some fun problems.

A picture is worth a thousand words

Here is the relationship that PAV extracts from some very noisy input data where there is an increasing relationship between x and y:

[Figure: PAV in action]

Features

  • Smart linear interpolation between points and extrapolation outside the training data domain
  • Fairly efficient implementation without compromising code readability
  • Will intelligently extrapolate to compute y for values of x greater or less than those used to build the PAV model
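As a rough illustration of the interpolation and extrapolation features, the sketch below predicts y from a list of already fitted points. The `interpolate` function here is a hypothetical standalone helper, not the crate's method of the same name, and its extrapolation rule (extending the outermost line segment) is an assumption made for this example; the crate's actual rule is not described in this document.

```rust
/// Predicts y at `x` from fitted points sorted by ascending x.
/// Returns `None` when there are no fitted points.
/// (Hypothetical helper for this sketch; the extrapolation rule is assumed.)
fn interpolate(fitted: &[(f64, f64)], x: f64) -> Option<f64> {
    match fitted {
        [] => None,
        // A single point gives no slope, so predict its y everywhere.
        [(_, y)] => Some(*y),
        _ => {
            // Index of the first fitted point whose x is not below the query.
            let upper = fitted.partition_point(|p| p.0 < x);
            // Clamp so that (hi - 1, hi) is always a valid adjacent pair.
            // Outside the data range this reuses the outermost segment,
            // which is what makes the function extrapolate.
            let hi = upper.clamp(1, fitted.len() - 1);
            let (x0, y0) = fitted[hi - 1];
            let (x1, y1) = fitted[hi];
            if x1 == x0 {
                // Guard against dividing by zero when two points share an x.
                return Some((y0 + y1) / 2.0);
            }
            Some(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
        }
    }
}
```

Because an isotonic fit never changes direction, extending the first or last segment keeps the prediction monotonic outside the training range, which is why this simple rule was chosen for the sketch.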

License

Released under the LGPL version 3 by Ian Clarke.

Comments
  • NaN interpolation for descending direction

    Hi,

    Thanks for developing this crate, it's very convenient for my application! I was able to make it work pretty easily. However, I'm having issues with the interpolate function in the case of a descending direction. Here's an example:

    let regression = IsotonicRegression::new_descending(&[
        Point::new(0.0, 3.0),
        Point::new(1.0, 2.0),
        Point::new(2.0, 1.0),
    ]);
    println!("{}", regression.interpolate(0.5));
    // Returns NaN
    

    This issue, however, doesn't occur in the ascending direction when I reverse the order of the y coordinates. Any idea what's happening?

    Thank you!

    bug 
    opened by Puumanamana 2