An fast, offline reverse geocoder (>1,000 HTTP requests per second) in Rust.

Overview

Rust Reverse Geocoder

A fast reverse geocoder in Rust. Inspired by Python reverse-geocoder.

Links

Description

rrgeo takes a latitude and longitude as input and returns the closest city, country, latitude, and longitude, using a k-d tree to efficiently find the nearest neighbour based on a known list of locations. This can be useful if you need to reverse geocode a large number of coordinates quickly, or just need the rough location of coordinates but don't want the expense or complication of an online reverse geocoder.

This crate is implemented as a library, an Actix REST API, a Warp REST API, and as a command-line utility, thanks to Cargo workspaces.

Usage

Command line search

Example usage:

> cargo run -p rrgeo-cmd --release 40 -73
71 ms to load cities.csv
3 ms to build the KdTree
(40.72788, -73.09761): West Sayville New York Suffolk County US

Actix Web Server

Example usage:

cargo run -p rrgeo-actix --release

Navigate to the local web server.

Warp Web Server

Example usage:

cargo run -p rrgeo-warp --release

Navigate to the local web server.

Benchmarks

Benchmarked on Intel Core i7 4790K at 4.00GHz.

The core library (measured with criterion):

> cargo bench
Benchmarking search: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 11.7s or reduce sample count to 40.
search                  time:   [2.3144 ms 2.3440 ms 2.3760 ms]
                        change: [-3.7824% -1.2338% +1.6347%] (p = 0.36 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe

Served via Actix Web:

> cargo run --release --bin rrgeo-actix
> oha http://localhost:3000/\?lat\=40\&long\=\-73 -z 5sec
Summary:
  Success rate: 1.0000
  Total:        5.0166 secs
  Slowest:      0.0531 secs
  Fastest:      0.0023 secs
  Average:      0.0166 secs
  Requests/sec: 2985.8853

  Total data:   1.47 MiB
  Size/request: 103.00 B
  Size/sec:     300.34 KiB

Response time histogram:
  0.004 [606]  |■■■■
  0.009 [2031] |■■■■■■■■■■■■■■■
  0.013 [4248] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.017 [3815] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.022 [2415] |■■■■■■■■■■■■■■■■■■
  0.026 [1097] |■■■■■■■■
  0.030 [542]  |■■■■
  0.035 [130]  |
  0.039 [55]   |
  0.043 [34]   |
  0.048 [6]    |

Latency distribution:
  10% in 0.0092 secs
  25% in 0.0124 secs
  50% in 0.0158 secs
  75% in 0.0206 secs
  90% in 0.0252 secs
  95% in 0.0284 secs
  99% in 0.0347 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.0043 secs, 0.0008 secs, 0.0099 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0001 secs

Status code distribution:
  [200] 14979 responses

Served via Warp:

> cargo run --release --bin rrgeo-warp
Summary:
Summary:
  Success rate: 1.0000
  Total:        5.0006 secs
  Slowest:      0.2553 secs
  Fastest:      0.0113 secs
  Average:      0.1263 secs
  Requests/sec: 388.9503

  Total data:   275.42 KiB
  Size/request: 145.00 B
  Size/sec:     55.08 KiB

Response time histogram:
  0.022 [9]    |
  0.044 [10]   |
  0.067 [8]    |
  0.089 [9]    |
  0.111 [78]   |■
  0.133 [1793] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.155 [0]    |
  0.177 [0]    |
  0.200 [0]    |
  0.222 [0]    |
  0.244 [38]   |

Latency distribution:
  10% in 0.1229 secs
  25% in 0.1239 secs
  50% in 0.1250 secs
  75% in 0.1266 secs
  90% in 0.1277 secs
  95% in 0.1285 secs
  99% in 0.2504 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.0045 secs, 0.0009 secs, 0.0097 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0001 secs

Status code distribution:
  [200] 1945 responses

Performance

Below we have comparisons between the Rust, Python and Node.js versions.

Rust Node
Load CSV 61ms 1221ms
Build KdTree 4ms 805ms
Search 1.1ms 0.5ms

Most of the performance differences appear to be in time taken to load the CSV file and create the k-d tree, but not searching the tree. Searching time resembles algorithmic complexity of k-d tree. Python version is partly implemented in C++ meaning it is not a purely Python implementation. (It might be interesting to see how a pure Python version performs.) The Node.js version is pure JavaScript, as in, not using C add-ons.

Rust --release build

     Running `target/release/web`
PT0.062677465S seconds to load cities.csv
PT0.003835230S seconds to build the KdTree
PT0.068904911S seconds to search
PT0.002596743S seconds to search
PT0.002887542S seconds to search

Rust --debug build

     Running `target/debug/web`
PT1.198010357S seconds to load cities.csv
PT0.124435778S seconds to build the KdTree
PT1.401588031S seconds to search
PT0.077837996S seconds to search
PT0.078178297S seconds to search

Python mode 1 (single threaded K-D tree)

➜  reverse-geocoder git:(master) ✗ time python mode1.py
Loading formatted geocoded file...
[{'name': 'Saint Louis Park', 'cc': 'US', 'lon': '-93.34801', 'admin1': 'Minnesota', 'admin2': 'Hennepin County', 'lat': '44.9483'}]

python mode1.py  1.60s user 0.22s system 98% cpu 1.847 total

Python mode 2 (multi threaded K-D tree)

➜  reverse-geocoder git:(master) ✗ time python mode2.py
Loading formatted geocoded file...
[{'name': 'Saint Louis Park', 'cc': 'US', 'lon': '-93.34801', 'admin1': 'Minnesota', 'admin2': 'Hennepin County', 'lat': '44.9483'}]

python mode2.py  2.82s user 0.34s system 142% cpu 2.221 total

nreverse (Node.js version)

load modules: 12.619ms
load cities.csv: 1221.833ms
create kdtree: 805.310ms
search tree: 0.758ms
search tree: 0.086ms
search tree: 0.198ms
search tree: 0.104ms
search tree: 0.031ms
total_heap_size 114mb
total_heap_size_executable 5mb
total_physical_size 112mb
total_available_size 1325mb
used_heap_size 83mb
heap_size_limit 1432mb
malloced_memory 0mb
peak_malloced_memory 4mb
does_zap_garbage 0mb

License

Licensed under either of

at your option.

Comments
  • Replace old dependencies for reverse-geocoder

    Replace old dependencies for reverse-geocoder

    Currently, it uses rustc-serialize which has been deprecated in favor of serde (https://crates.io/crates/rustc-serialize) and quick-csv, which depends on rustc-serialize and has not been updated in 5 years.

    opened by jqnatividad 3
  • use new_with_capacity to reduce allocations

    use new_with_capacity to reduce allocations

    I also made the lookup return an Option<&Record> so clients can choose themselves if they want to clone, and rewrote some code to be more rustic.

    opened by llogiq 2
  • performance: switch to kiddo kd tree library for ~800x improvement on benchmark 😲

    performance: switch to kiddo kd tree library for ~800x improvement on benchmark 😲

    I tried a little experiment and switched out the old kdtree library for kiddo, a performance-improved fork of the now 2-year-old kdtree. (fair disclosure: I'm the maintainer of kiddo).

    This resulted in a huge improvement in performance on the benchmark. Tested on a Ryzen 5900X, I got the following result for the old master branch:

    image

    316.1 microseconds.

    After my refactoring, I got the following result:

    image

    409.9 nanoseconds - i.e. 0.4 microseconds! Faster by a factor of 790 😎

    opened by sdd 1
  • use quick-csv

    use quick-csv

    Gets tiny boost using quick-csv instead of csv

    csv
    $ time target/release/rreverse
    PT0.355788687S seconds to load cities.csv
    PT0.214264474S seconds to build the KdTree
    (44.9483, -93.34801): Saint Louis Park Minnesota Hennepin County US
    
    real    0m0.616s
    user    0m0.548s
    sys 0m0.068s
    
    quick-csv
    $ time target/release/rreverse
    PT0.276284187S seconds to load cities.csv
    PT0.212306487S seconds to build the KdTree
    (44.9483, -93.34801): Saint Louis Park Minnesota Hennepin County US
    
    real    0m0.539s
    user    0m0.468s
    sys 0m0.068s
    
    opened by tafia 1
  • Bump axum-core from 0.2.7 to 0.2.8

    Bump axum-core from 0.2.7 to 0.2.8

    Bumps axum-core from 0.2.7 to 0.2.8.

    Release notes

    Sourced from axum-core's releases.

    axum-core - v0.2.8

    Security

    • breaking: Added default limit to how much data Bytes::from_request will consume. Previously it would attempt to consume the entire request body without checking its length. This meant if a malicious peer sent an large (or infinite) request body your server might run out of memory and crash.

      The default limit is at 2 MB and can be disabled by adding the new DefaultBodyLimit::disable() middleware. See its documentation for more details.

      This also applies to String which used Bytes::from_request internally.

      (#1346)

    #1346: tokio-rs/axum#1346

    Commits

    Dependabot compatibility score

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
    • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
    • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
    • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

    You can disable automated security fix PRs for this repo from the Security Alerts page.

    dependencies 
    opened by dependabot[bot] 0
  • V2.0

    V2.0

    V2.0

    Update and improve error handling and function signatures.

    • Simplify the search API function signature
    • Use standard 2018 edition Rust error handling idioms
    • actix_web handler demonstrating exhaustive match of all errors
    opened by gx0r 0
  • More information about bundled cities.csv file

    More information about bundled cities.csv file

    Hi @gx0r ! First off, thanks again for this awesome library!

    However, could you include some information on how the bundled cities.csv was prepared?

    Specifically, how it was transformed to its current format from the files published by Geonames here - http://download.geonames.org/export/dump/?

    Also, having a README when the cities.csv file was last updated would be great too!

    I'm asking as I'd like to take advantage of the Locations::from_path method to upload the latest version from Geonames.

    opened by jqnatividad 2
Owner
Grant Miner
Grant Miner
Reverse-geocoder microservice

Reverse Geocoder This is a simple and very lightweight reverse geocoder for offline use. It's implemented in Rust and uses a k-d-tree to find the clos

Treestack GmbH 3 May 12, 2023
A set of tools for generating isochrones and reverse isochrones from geographic coordinates

This library provides a set of tools for generating isochrones and reverse isochrones from geographic coordinates. It leverages OpenStreetMap data to construct road networks and calculate areas accessible within specified time limits.

null 3 Feb 22, 2024
A single-binary, GPU-accelerated LLM server (HTTP and WebSocket API) written in Rust

Poly Poly is a versatile LLM serving back-end. What it offers: High-performance, efficient and reliable serving of multiple local LLM models Optional

Tommy van der Vorst 13 Nov 5, 2023
Fast 2D Delaunay triangulation in Rust. A port of Delaunator.

delaunator-rs A very fast static 2D Delaunay triangulation library for Rust. A port of Delaunator. Documentation Example use delaunator::{Point, trian

Vladimir Agafonkin 123 Dec 20, 2022
Fast shortest path calculations for Rust

Fast Paths The most famous algorithms used to calculate shortest paths are probably Dijkstra's algorithm and A*. However, shortest path calculation ca

Andi 226 Jan 2, 2023
A fast R-tree for Rust. Ported from an implementation that's designed for Tile38.

rtree.rs A fast R-tree for Rust. Ported from an implementation that's designed for Tile38. Features Optimized for fast inserts and updates. Ideal for

Josh Baker 102 Dec 30, 2022
Fast Geospatial Feature Storage API

Hecate OpenStreetMap Inspired Data Storage Backend Focused on Performance and GeoJSON Interchange Hecate Feature Comparison Feature Hecate ESRI MapSer

Mapbox 243 Dec 19, 2022
Blazing fast and lightweight PostGIS vector tiles server

Martin Martin is a PostGIS vector tiles server suitable for large databases. Martin is written in Rust using Actix web framework. Requirements Install

Urbica 921 Jan 7, 2023
Trees for fast location-to-value lookup.

HexTree hextree provides tree structures that represent geographic regions with H3 cells. The primary structures are: HexTreeMap: an H3 cell-to-value

Jay Kickliter 38 Dec 15, 2022
Rust crate for performing coordinate transforms

Synopsis A Rust crate use for performing coordinate transformations. The crate relies on nalgebra vectors to perform the coordinate transformations. C

Dave 25 Aug 20, 2022
Geospatial primitives and algorithms for Rust

geo Geospatial Primitives, Algorithms, and Utilities The geo crate provides geospatial primitive types such as Point, LineString, and Polygon, and pro

GeoRust 989 Dec 29, 2022
Rust bindings for GDAL

gdal [] GDAL bindings for Rust. So far, you can: open a raster dataset for reading/writing get size and number of bands get/set projection and geo-tra

GeoRust 211 Jan 4, 2023
Rust bindings for the latest stable release of PROJ

PROJ Coordinate transformation via bindings to the PROJ v7.2.1 API. Two coordinate transformation operations are currently provided: projection (and i

GeoRust 96 Dec 21, 2022
Geohash for Rust

Rust-Geohash Rust-Geohash is a Rust library for Geohash algorithm. Ported from node-geohash module. Documentation Docs Check the API doc at docs.rs Li

GeoRust 74 Sep 8, 2022
Rust read/write support for well-known text (WKT)

wkt Rust read/write support for well-known text (WKT). License Licensed under either of Apache License, Version 2.0 (LICENSE-APACHE or http://www.apac

GeoRust 40 Dec 11, 2022
Google Encoded Polyline encoding & decoding in Rust.

polyline Google Encoded Polyline encoding & decoding in Rust. A Note on Coordinate Order This crate uses Coordinate and LineString types from the geo-

GeoRust 14 Dec 11, 2022
Geocoding library for Rust.

geocoding Rust utilities to enrich addresses, cities, countries, and landmarks with geographic coordinates through third-party geocoding web services.

GeoRust 55 Dec 12, 2022
Rust read/write support for GPS Exchange Format (GPX)

gpx gpx is a library for reading and writing GPX (GPS Exchange Format) files. It uses the primitives provided by geo-types to allow for storage of GPS

GeoRust 63 Dec 5, 2022
Geospatial primitives and algorithms for Rust

geo Geospatial Primitives, Algorithms, and Utilities The geo crate provides geospatial primitive types such as Point, LineString, and Polygon, and pro

GeoRust 990 Jan 1, 2023