Program implementing the approximate version of DBSCAN introduced by Gan and Tao

Overview

appr_dbscan_rust

Rust implementation of the approximate version of DBSCAN introduced by Gan and Tao in this paper

Notice

An updated version of this library is available in the linfa ecosystem.

Recognized data files

Accepted data files should contain one data point per line and nothing else. Each line should contain the components of the point separated by whitespace.

Each component of a point is read and stored as a 64-bit floating-point value.

Example

1.0 1.1 0.5
2.3 3.4 6.2
...
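As an illustration of the format above, here is a minimal sketch of a reader for such files using only the standard library. It is not the crate's own parser: the names `parse_points` and `read_points` are hypothetical, and skipping blank lines is an assumption made for convenience.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Parses whitespace-separated text into points, one point per non-empty line.
/// On failure, returns a message naming the first line with an invalid component.
/// (Hypothetical helper, not part of the crate.)
fn parse_points(text: &str) -> Result<Vec<Vec<f64>>, String> {
    let mut points = Vec::new();
    for (index, line) in text.lines().enumerate() {
        if line.trim().is_empty() {
            continue; // assumption: tolerate blank lines such as a trailing newline
        }
        let mut point = Vec::new();
        for token in line.split_whitespace() {
            // Each component is stored as a 64-bit float, as described above.
            let value: f64 = token
                .parse()
                .map_err(|e| format!("line {}: cannot parse {:?}: {}", index + 1, token, e))?;
            point.push(value);
        }
        points.push(point);
    }
    Ok(points)
}

/// Reads a file and parses it with `parse_points`.
/// (Hypothetical helper, not part of the crate.)
fn read_points<P: AsRef<Path>>(path: P) -> io::Result<Vec<Vec<f64>>> {
    let text = fs::read_to_string(path)?;
    parse_points(&text).map_err(|msg| io::Error::new(io::ErrorKind::InvalidData, msg))
}
```

Calling `parse_points` on the two example lines above yields two points with three components each.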

Usage

This library has four main functions, which differ in the kind of input they accept.

Warning

Each function below expects all points in the data file to have the same dimensionality and panics otherwise.
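Since a dimensionality mismatch causes a panic, it may be worth validating points before clustering. The following is a minimal sketch of such a check for points held as `Vec<Vec<f64>>`. The function name `common_dimensionality` is hypothetical and not part of the crate.

```rust
/// Returns `Ok(Some(d))` if every point has exactly `d` components,
/// `Ok(None)` if there are no points, and `Err` describing the first
/// point whose length differs from the first point's length.
/// (Hypothetical helper, not part of the crate.)
fn common_dimensionality(points: &[Vec<f64>]) -> Result<Option<usize>, String> {
    let mut expected: Option<usize> = None;
    for (index, point) in points.iter().enumerate() {
        match expected {
            // The first point fixes the expected dimensionality.
            None => expected = Some(point.len()),
            Some(d) if d != point.len() => {
                return Err(format!(
                    "point {} has {} components, expected {}",
                    index,
                    point.len(),
                    d
                ));
            }
            Some(_) => {}
        }
    }
    Ok(expected)
}
```

Running this check first lets the caller report a malformed input with an error instead of a panic.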

Approximated DBSCAN from data file with fixed dimensionality

If the dimensionality of the points is known at compile time (that is, it is not the result of another calculation), this function can be used:

pub fn do_appr_dbscan_file<P, const D: usize>(
    filename: P, 
    epsilon: f64, 
    rho: f64, 
    min_pts: usize
) -> DBSCANResult<D> 
where
    P: AsRef<Path>, 

Example

extern crate appr_dbscan;
use appr_dbscan::do_appr_dbscan_file;
use appr_dbscan::utils::DBSCANResult;
 
let res : DBSCANResult<2> = do_appr_dbscan_file("./datasets/out_test_1.txt", 0.3, 0.1, 10);
let clusters_count = res.len() - 1;
let noise_points_count = res[0].len();
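The example above reads the result as a list of clusters in which index 0 holds the noise points. The sketch below expresses that convention as a small helper. It works on a plain `Vec<Vec<[f64; D]>>` as a stand-in for the result type, which is an assumption made so the snippet compiles without the crate. The name `summarize` is hypothetical.

```rust
/// Returns `(clusters_count, noise_points_count)`, treating the entry at
/// index 0 as the noise points and every later entry as one cluster.
/// An empty result is reported as `(0, 0)` rather than underflowing.
/// (Hypothetical helper; the slice type is a stand-in for the crate's result type.)
fn summarize<const D: usize>(res: &[Vec<[f64; D]>]) -> (usize, usize) {
    match res.split_first() {
        Some((noise, clusters)) => (clusters.len(), noise.len()),
        None => (0, 0),
    }
}
```

Unlike `res.len() - 1`, this form does not panic if the result happens to be empty.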

Approximated DBSCAN from data file with unknown dimensionality

If the dimensionality of the data points is not known at compile time (for example, when looping over multiple files with different dimensionalities), this function can be used:

pub fn do_appr_dbscan_auto_dimensionality_file<P>(
    filename: P, 
    epsilon: f64, 
    rho: f64, 
    min_pts: usize
) -> (VectorDBSCANResult, usize) 
where
    P: AsRef<Path>, 

Example

extern crate appr_dbscan;
use appr_dbscan::do_appr_dbscan_auto_dimensionality_file;

let (res,dimensionality) = do_appr_dbscan_auto_dimensionality_file("./datasets/out_test_1.txt", 0.3, 0.1, 10);
println!("Points dimensionality: {}",dimensionality);
let clusters_count = res.len() - 1;
let noise_points_count = res[0].len();

Approximated DBSCAN from vector of points of fixed dimensionality

If you have a vector of points of type Vec<[f64; D]>, this function can be used:

pub fn do_appr_dbscan_points<const D: usize>(
    points: Vec<Point<D>>, 
    epsilon: f64, 
    rho: f64, 
    min_pts: usize
) -> DBSCANResult<D>

Example

extern crate appr_dbscan;
use appr_dbscan::do_appr_dbscan_points;
use appr_dbscan::utils::DBSCANResult;
 
let points = vec![[0.0,0.0],[1.0,1.0],[0.0,1.0],[1.0,0.0],[2.0,1.0],[0.0,2.0],[2.0,1.0],[1.0,1.0]];
let res : DBSCANResult<2> = do_appr_dbscan_points(points, 0.3, 0.1, 10);
let clusters_count = res.len() - 1;
let noise_points_count = res[0].len();
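If points arrive as `Vec<Vec<f64>>` but their dimensionality is known at compile time, they can be converted to the array form before calling the function above. A minimal sketch, using the standard library's `TryFrom<Vec<T>>` implementation for arrays. The name `to_fixed` is hypothetical and not part of the crate.

```rust
use std::convert::TryFrom;

/// Converts dynamically sized points into fixed-size arrays of length `D`.
/// Returns `Err(index)` with the index of the first point whose length is not `D`.
/// (Hypothetical helper, not part of the crate.)
fn to_fixed<const D: usize>(points: Vec<Vec<f64>>) -> Result<Vec<[f64; D]>, usize> {
    let mut fixed = Vec::with_capacity(points.len());
    for (index, point) in points.into_iter().enumerate() {
        // The conversion succeeds only when the vector has exactly D elements.
        let array = <[f64; D]>::try_from(point).map_err(|_| index)?;
        fixed.push(array);
    }
    Ok(fixed)
}
```

For instance, `to_fixed::<2>(points)` produces a `Vec<[f64; 2]>` or reports the first point that does not have two components.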

Approximated DBSCAN from vector of points of unknown dimensionality

If you have a vector of points of type Vec<Vec<f64>> (for example, when clustering different vectors in a loop), this function can be used:

pub fn do_appr_dbscan_auto_dimensionality_points(
    points: Vec<VectorPoint>, 
    epsilon: f64, 
    rho: f64, 
    min_pts: usize
) -> (VectorDBSCANResult, usize)

Example

extern crate appr_dbscan;
use appr_dbscan::do_appr_dbscan_auto_dimensionality_points;

let points = vec![vec![0.0,0.0],vec![1.0,1.0],vec![0.0,1.0],vec![1.0,0.0],vec![2.0,1.0],vec![0.0,2.0],vec![2.0,1.0],vec![1.0,1.0]];
let (res, dimensionality) = do_appr_dbscan_auto_dimensionality_points(points, 0.3, 0.1, 10);
println!("Points dimensionality: {}",dimensionality);
let clusters_count = res.len() - 1;
let noise_points_count = res[0].len();
Owner
Ivano Donadi
I'm an engineering student that enjoys coding using c, c++, and rust. I mostly enjoy building CLI programs and IoT stuff