Fast hierarchical agglomerative clustering in Rust.

Overview

kodama

This crate provides a fast implementation of agglomerative hierarchical clustering.

This library is released under the MIT license.

The ideas and implementation in this crate are heavily based on the work of Daniel Müllner, and in particular, his 2011 paper, Modern hierarchical, agglomerative clustering algorithms. Parts of the implementation have also been inspired by his C++ library, fastcluster. Müllner's work, in turn, is based on the hierarchical clustering facilities provided by MATLAB and SciPy.

The runtime performance of this library is on par with Müllner's fastcluster implementation.

For a more detailed example of how to use hierarchical clustering, see the example in the API documentation.
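
To make the term concrete, here is a naive sketch of what agglomerative clustering computes. It is not kodama's algorithm or API: the names `Merge`, `condensed_index` and `naive_single_linkage` are hypothetical, the linkage criterion is fixed to single linkage, and the running time is far worse than the algorithms described in Müllner's paper. It assumes the input is a condensed dissimilarity matrix (the upper triangle in row-major order) and labels the cluster created at step k as n + k, in the style of SciPy.

```rust
/// One merge step: the labels of the two clusters joined, the dissimilarity
/// at which they were joined, and the size of the resulting cluster.
#[derive(Debug, Clone, PartialEq)]
struct Merge {
    a: usize,
    b: usize,
    dissimilarity: f64,
    size: usize,
}

/// Position of the pair (i, j), i != j, in a condensed matrix over n observations.
fn condensed_index(n: usize, i: usize, j: usize) -> usize {
    let (i, j) = if i < j { (i, j) } else { (j, i) };
    n * i - i * (i + 1) / 2 + (j - i - 1)
}

/// Naive single-linkage clustering (illustration only, not kodama's implementation).
fn naive_single_linkage(condensed: &[f64], n: usize) -> Vec<Merge> {
    assert_eq!(condensed.len(), n * n.saturating_sub(1) / 2);
    // Every observation starts as its own cluster: (label, members).
    let mut clusters: Vec<(usize, Vec<usize>)> = (0..n).map(|i| (i, vec![i])).collect();
    let mut merges = Vec::with_capacity(n.saturating_sub(1));
    while clusters.len() > 1 {
        // Find the pair of active clusters with the smallest dissimilarity.
        let mut best = (0, 1, f64::INFINITY);
        for x in 0..clusters.len() {
            for y in (x + 1)..clusters.len() {
                // Single linkage: the minimum over all cross-cluster pairs.
                let mut d = f64::INFINITY;
                for &p in &clusters[x].1 {
                    for &q in &clusters[y].1 {
                        d = d.min(condensed[condensed_index(n, p, q)]);
                    }
                }
                if d < best.2 {
                    best = (x, y, d);
                }
            }
        }
        let (x, y, d) = best;
        // Remove the later position first so that `x` stays valid.
        let (label_y, members_y) = clusters.remove(y);
        let (label_x, mut members) = clusters.remove(x);
        members.extend(members_y);
        merges.push(Merge { a: label_x, b: label_y, dissimilarity: d, size: members.len() });
        // The cluster created at step k gets the label n + k.
        clusters.push((n + merges.len() - 1, members));
    }
    merges
}
```

For four points on a line at 0, 1, 5 and 7, the condensed matrix is `[1.0, 5.0, 7.0, 4.0, 6.0, 2.0]`, and `naive_single_linkage(&[1.0, 5.0, 7.0, 4.0, 6.0, 2.0], 4)` produces three merges: observations 0 and 1 at dissimilarity 1, observations 2 and 3 at dissimilarity 2, and then the two resulting clusters (labels 4 and 5) at dissimilarity 4.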

Documentation

https://docs.rs/kodama

Usage

Add this to your Cargo.toml:

[dependencies]
kodama = "0.1"

and this to your crate root (only needed on the Rust 2015 edition):

extern crate kodama;

C API and Go bindings

This repository includes kodama-capi, which provides a C interface to hierarchical clustering.

This repository also includes Go FFI bindings via cgo to the aforementioned C API. Documentation for the Go library can be found at godoc.org/github.com/diffeo/kodama/go-kodama.

Comments
  • fix deprecated stuff, update to Rust 2018 and switch to GitHub Actions

    Pretty much this entire PR is a set of cosmetic changes that bring kodama up to the latest edition of Rust, update idioms and apply rustfmt.

    opened by BurntSushi 2
  • fix empty dendrogram steps bug

    In a recent commit, we fixed the linkage routines to handle the case of an empty matrix gracefully. Unfortunately, this seems to have uncovered another bug. When asking for the steps from an empty dendrogram, the Rust FFI code does something like this:

    creating an empty dendrogram:

      let mut steps = vec![]; // `mut` is needed in order to push
      for i in 0..num_steps { // num_steps == 0 here
        steps.push(build_step(...));
      }
      dendrogram{steps: steps}

    asking for the steps:

      dendrogram.steps.as_ptr()
    

    It turns out that when Rust creates a Vec that has zero capacity, it specifically does not allocate. (This is a very desirable quality, since it makes creating empty Vecs very very cheap.) However, since the Vec doesn't allocate but still provides routines for accessing its inner raw pointer (e.g., as_ptr() above), it must still be able to provide a non-null pointer. The Vec implementation achieves this by using a pointer value that is known to never point to valid memory.

    Go does not like this, which makes sense in a way, because it has a garbage collector and garbage collectors like to do funny things with pointers. The end result here is that attempting to cast the pointer returned by Rust to an array on which we can index causes the compiler to generate code that attempts to dereference the pointer (that we know is invalid). This in turn causes a nil-dereference panic.

    A simple way to fix this is to replace this

    let mut steps = vec![];

    with this

    let mut steps = Vec::with_capacity(1);
    

    This creates a Vec with zero elements but a capacity of 1, which forces an allocation and therefore yields a pointer to valid memory. This makes Go happy.

    Alternatively, we can detect this case from the Go code (the dendrogram is empty) and just return an empty slice. We choose to do that in this commit, for no particularly strong reason other than that it seems weird to create a non-empty allocation on the Rust side.

    opened by BurntSushi 0
  • fix panic when running linkage on empty matrices

    The commit messages should explain things.

    This PR also bumps kodama to 0.2 to reflect a corresponding bump to num-traits (which is a public dependency).

    opened by BurntSushi 0
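
The empty-Vec behaviour described in the "fix empty dendrogram steps bug" comment above can be checked with the standard library alone. This is a minimal sketch; the helper name `vec_report` is hypothetical and is not part of kodama or its FFI code.

```rust
/// Returns (length, capacity, whether the raw pointer is null) for a vector.
/// Hypothetical helper, used only to illustrate the behaviour described above.
fn vec_report(v: &Vec<u64>) -> (usize, usize, bool) {
    (v.len(), v.capacity(), v.as_ptr().is_null())
}

/// An empty vector built the way the comment describes: no allocation happens,
/// yet `as_ptr()` still returns a non-null (dangling) pointer.
fn empty_without_allocation() -> Vec<u64> {
    Vec::new()
}

/// The workaround considered in the comment: zero elements, but a capacity of
/// at least 1, which forces an allocation and so a pointer to valid memory.
fn empty_with_allocation() -> Vec<u64> {
    Vec::with_capacity(1)
}
```

Calling `vec_report(&empty_without_allocation())` gives a length and capacity of 0 with a non-null pointer, which is the pointer that must not be dereferenced on the Go side. With `empty_with_allocation()` the length is still 0 but the capacity is at least 1.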
Owner
Diffeo