K-dimensional tree in Rust for fast geospatial indexing and lookup

Overview

kdtree Build Status

K-dimensional tree in Rust for fast geospatial indexing and nearest neighbors lookup

Usage

Add kdtree to Cargo.toml

[dependencies]
kdtree = "0.5.1"

Add points to kdtree and query nearest n points with distance function

use kdtree::KdTree;
use kdtree::ErrorKind;
use kdtree::distance::squared_euclidean;

let a: ([f64; 2], usize) = ([0f64, 0f64], 0);
let b: ([f64; 2], usize) = ([1f64, 1f64], 1);
let c: ([f64; 2], usize) = ([2f64, 2f64], 2);
let d: ([f64; 2], usize) = ([3f64, 3f64], 3);

let dimensions = 2;
let mut kdtree = KdTree::new(dimensions);

kdtree.add(&a.0, a.1).unwrap();
kdtree.add(&b.0, b.1).unwrap();
kdtree.add(&c.0, c.1).unwrap();
kdtree.add(&d.0, d.1).unwrap();

assert_eq!(kdtree.size(), 4);
assert_eq!(
    kdtree.nearest(&a.0, 0, &squared_euclidean).unwrap(),
    vec![]
);
assert_eq!(
    kdtree.nearest(&a.0, 1, &squared_euclidean).unwrap(),
    vec![(0f64, &0)]
);
assert_eq!(
    kdtree.nearest(&a.0, 2, &squared_euclidean).unwrap(),
    vec![(0f64, &0), (2f64, &1)]
);
assert_eq!(
    kdtree.nearest(&a.0, 3, &squared_euclidean).unwrap(),
    vec![(0f64, &0), (2f64, &1), (8f64, &2)]
);
assert_eq!(
    kdtree.nearest(&a.0, 4, &squared_euclidean).unwrap(),
    vec![(0f64, &0), (2f64, &1), (8f64, &2), (18f64, &3)]
);
assert_eq!(
    kdtree.nearest(&a.0, 5, &squared_euclidean).unwrap(),
    vec![(0f64, &0), (2f64, &1), (8f64, &2), (18f64, &3)]
);
assert_eq!(
    kdtree.nearest(&b.0, 4, &squared_euclidean).unwrap(),
    vec![(0f64, &1), (2f64, &0), (2f64, &2), (8f64, &3)]
);

Benchmark

cargo bench with 2.3 GHz Intel i5-7360U:

cargo bench
     Running target/release/deps/bench-9e622e6a4ed9b92a

running 2 tests
test bench_add_to_kdtree_with_1k_3d_points       ... bench:         106 ns/iter (+/- 25)
test bench_nearest_from_kdtree_with_1k_3d_points ... bench:       1,237 ns/iter (+/- 266)

test result: ok. 0 passed; 0 failed; 0 ignored; 2 measured; 0 filtered out

Thanks Eh2406 for various fixes and perf improvements.

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Comments
  • add ability to remove item

    add ability to remove item

    From limited testing this seems to work well. Had to add a partialeq constraint to filter data and points. I don't rebalance the tree, just remove the items, not sure if that would be helpful. Fixes #30

    opened by etrombly 4
  • Is it possible to have a function nearest_radius that includes the maximum radius?

    Is it possible to have a function nearest_radius that includes the maximum radius?

    I think that everything is ready to allow users to pass the maximum radius when searching for nearest points. Unfortunately rust does not support optional arguments yet, so something like this could be trivial to implement:

    pub fn nearest_radius<F>(
            &self,
            point: &[A],
            num: usize,
            radius: A,
            distance: &F,
        )
    

    The current function

    pub fn nearest<F>(
            &self,
            point: &[A],
            num: usize,
            distance: &F,
        )
    

    calls to

    self.nearest_step(
                    point,
                    num,
                    A::infinity(),
                    distance,
                    &mut pending,
                    &mut evaluated,
                );
    

    which already has the maximum radius as A::infinity().

    opened by exepulveda 4
  • Bug: distance_to_space on wrong thing. Massive performance wins.

    Bug: distance_to_space on wrong thing. Massive performance wins.

    This is big! I stumbled on this experimenting with a refactoring.

    New bench:

    running 2 tests
    test bench_add_to_kdtree_with_1k_3d_points       ... bench:         156 ns/iter (+/- 44)
    test bench_nearest_from_kdtree_with_1k_3d_points ... bench:       3,603 ns/iter (+/- 3,217)
    

    Old bench:

    running 2 tests
    test bench_add_to_kdtree_with_1k_3d_points       ... bench:         139 ns/iter (+/- 61)
    test bench_nearest_from_kdtree_with_1k_3d_points ... bench:       6,677 ns/iter (+/- 6,613)
    

    Maby evan "0.3.1"?

    opened by Eh2406 4
  • Stack overflow in add_to_bucket

    Stack overflow in add_to_bucket

        I get the same error when I use this crate.
    

    And I figured out the reason.

    The avg will be equal to a, under the limit of f64's accuracy. As a result, the fn split() can't split this node successfully. In this situation, fn add_to_bucket() and fn split() will call each other infinitely.

        let a = 0.47945351705599926f64;
        let b = 0.47945351705599931f64;
        
        let avg = (a + b) / 2.0;
    
        println!("{:.20E}", a);
        println!("{:.20E}", b);
        println!("{:.20E}", avg);
    

    And the resolution is extremely easy. Just modify fn belongs_in_left() like this

    fn belongs_in_left(&self, point: &[A]) -> bool {
            // before
            // point[self.split_dimension.unwrap()] < self.split_value.unwrap()
            // after
            point[self.split_dimension.unwrap()] <= self.split_value.unwrap()
    }
    

    Originally posted by @TYPEmber in https://github.com/mrhooray/kdtree-rs/issues/20#issuecomment-1344093398

    opened by TYPEmber 2
  • What is a computational complexity of queuing nearest point?

    What is a computational complexity of queuing nearest point?

    For unknown reason the kdtree-rs library takes about 1000x more time than Python's sklearn.neighbors.KDTree for very large datasets (like several millions of points). Can you confirm we have O(log(n)) for the nearest point query?

    opened by ababo 2
  • Some cleanups

    Some cleanups

    • Run cargo fmt
    • Run cargo +nightly clippy and fixed warnings
    • Run cargo fix --edition
    • Fixed docs in src/distance.rs, and other cosmetic changes
    • Replaced assert!(x == y) with assert_eq!(x, y)
    opened by ordovicia 2
  • Stack overflow in add_to_bucket

    Stack overflow in add_to_bucket

    I haven't minimised yet, but in some circumstances, add_to_bucket results in a stack overflow. Without looking too closely, I think there's an infinite (or exponential) loop where add_to_bucket calls split, which in turn calls add_to_bucket, etc.

    Edit: If I create a KdTree with sufficient capacity to begin with, the stack overflow doesn't occur, so the problem is with it dynamically resizing.

    opened by varkor 2
  • Add serde to support serialization

    Add serde to support serialization

    I have a project encoding location data into a kdtree. Because all the fields are private it is a hassle to add getters for all of them. Supporting serde here would be a lot more convenient.

    opened by etrombly 2
  • Question about U on kdtre.rs

    Question about U on kdtre.rs

    Hi,

    I'm interested in space aware datastructures and just learning rust, I was wondering what does the U: AsRef<f64> means in the kdtree.rs file. Does it mean that there should be a generic parameter U that defaults to a slice of f64 items that are default implemented and functions like:

        pub fn nearest<F>(&self,
                          point: &[f64],
                          num: usize,
                          distance: &F)
                          -> Result<Vec<(f64, &T)>, ErrorKind>
    

    can be implemented to use a type U rather than f64?

    opened by odarbelaeze 2
  • Release a new Version

    Release a new Version

    It seems like a lot has changed since the last release (the documentation doesn't even match anymore). So it would be cool if you could release a new version. I'll have to use the git version directly till then I guess.

    opened by CryZe 2
  • Do you have a plan to upgrade the version of crate ?

    Do you have a plan to upgrade the version of crate ?

    Hello ,
    First of all, thank you for your work. I'm planning to use your kdtree crate in rust. In my checking, the latest crate version is 0.6.0 , which is almost 3 years ago. But it seems that git hub has more commits after it. Do you have a plan to upgrade the version of crate ?

    opened by cdragon-choi 1
Owner
Rui Hu
Rui Hu
Nearest neighbor search algorithms including a ball tree and a vantage point tree.

petal-neighbors Nearest neighbor search algorithms including a ball tree and a vantage point tree. Examples The following example shows how to find tw

Petabi 6 Oct 19, 2022
Hamming Weight Tree from the paper Online Nearest Neighbor Search in Hamming Space

hwt Hamming Weight Tree from the paper Online Nearest Neighbor Search in Hamming Space To understand how the data structure works, please see the docs

Rust Computer Vision 7 Oct 9, 2021
An Implementation of the Context Tree Weighting (CTW) Sequence Prediction Algorithm

Context Tree Weighting (CTW) CTW is a lightweight, practical and well performing sequence prediction algorithm discovered by Frans Willems, Yuri Shtar

null 7 Dec 23, 2022
Ecosystem of libraries and tools for writing and executing extremely fast GPU code fully in Rust.

Ecosystem of libraries and tools for writing and executing extremely fast GPU code fully in Rust.

Riccardo D'Ambrosio 2.1k Jan 5, 2023
Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.

The Rust CUDA Project An ecosystem of libraries and tools for writing and executing extremely fast GPU code fully in Rust Guide | Getting Started | Fe

Rust GPU 2.1k Dec 30, 2022
Robust and Fast tokenizations alignment library for Rust and Python

Robust and Fast tokenizations alignment library for Rust and Python

Yohei Tamura 14 Dec 10, 2022
MesaTEE GBDT-RS : a fast and secure GBDT library, supporting TEEs such as Intel SGX and ARM TrustZone

MesaTEE GBDT-RS : a fast and secure GBDT library, supporting TEEs such as Intel SGX and ARM TrustZone MesaTEE GBDT-RS is a gradient boost decision tre

MesaLock Linux 179 Nov 18, 2022
A fast, safe and easy to use reinforcement learning framework in Rust.

RSRL (api) Reinforcement learning should be fast, safe and easy to use. Overview rsrl provides generic constructs for reinforcement learning (RL) expe

Thomas Spooner 139 Dec 13, 2022
💥 Fast State-of-the-Art Tokenizers optimized for Research and Production

Provides an implementation of today's most used tokenizers, with a focus on performance and versatility. Main features: Train new vocabularies and tok

Hugging Face 6.2k Jan 2, 2023
Fast, accessible and privacy friendly AI deployment

Mithril Security - BlindAI Website | LinkedIn | Blog | Twitter | Documentation | Discord Fast, accessible and privacy friendly AI deployment ?? ?? Bli

Mithril Security 312 Dec 23, 2022
A fast and cross-platform Signed Distance Function (SDF) viewer, easily integrated with your SDF library.

SDF Viewer (demo below) A fast and cross-platform Signed Distance Function (SDF) viewer, easily integrated with your SDF library. A Signed Distance Fu

null 39 Dec 21, 2022
scalable and fast unofficial osu! server implementation

gamma! the new bancho server for theta! built for scalability and speed configuration configuration is done either through gamma.toml, or through envi

null 3 Jan 7, 2023
Fast hierarchical agglomerative clustering in Rust.

kodama This crate provides a fast implementation of agglomerative hierarchical clustering. This library is released under the MIT license. The ideas a

Diffeo 61 Oct 7, 2022
Fwumious Wabbit, fast on-line machine learning toolkit written in Rust

Fwumious Wabbit is a very fast machine learning tool built with Rust inspired by and partially compatible with Vowpal Wabbit (much love! read more abo

Outbrain 115 Dec 9, 2022
Rust wrapper for the Fast Artificial Neural Network library

fann-rs Rust wrapper for the Fast Artificial Neural Network (FANN) library. This crate provides a safe interface to FANN on top of the low-level bindi

Andreas Fackler 12 Jul 17, 2022
Instance Distance is a fast pure-Rust implementation of the Hierarchical Navigable Small Worlds paper

Fast approximate nearest neighbor searching in Rust, based on HNSW index

Instant Domain Search, Inc. 135 Dec 24, 2022
l2 is a fast, Pytorch-style Tensor+Autograd library written in Rust

l2 • ?? A Pytorch-style Tensor+Autograd library written in Rust Installation • Contributing • Authors • License • Acknowledgements Made by Bilal Khan

Bilal Khan 163 Dec 25, 2022
A blazing fast CLIP gRPC service in rust.

CLIP as service in Rust A blazing fast gRPC server for CLIP model, powered by ONNX. Only text model can be used now. Build cargo build --bin clip-as-s

Rorical 6 Mar 6, 2023
FFSVM stands for "Really Fast Support Vector Machine"

In One Sentence You trained a SVM using libSVM, now you want the highest possible performance during (real-time) classification, like games or VR. Hig

Ralf Biedert 53 Nov 24, 2022