Fast shortest path calculations for Rust

Andi

Last update: Jan 2, 2023

Related tags

Geospatial road-network shortest-paths routing-algorithm dijkstra-algorithm traffic-simulation contraction-hierarchies

Overview

Fast Paths

The most famous algorithms used to calculate shortest paths are probably Dijkstra's algorithm and A*. However, shortest path calculation can be done much faster by preprocessing the graph.

Fast Paths uses Contraction Hierarchies, one of the best known speed-up techniques for shortest path calculation. It is especially suited to calculate shortest paths in road networks, but can be used for any directed graph with positive, non-zero edge weights.

Installation

In Cargo.toml

[dependencies]
fast_paths = "0.1.0"

Basic usage

// begin with an empty graph
let mut input_graph = InputGraph::new();

// add an edge between nodes with ID 0 and 6, the weight of the edge is 12.
// Note that the node IDs should be consecutive, if your graph has N nodes use 0...N-1 as node IDs,
// otherwise performance will degrade.
input_graph.add_edge(0, 6, 12);
// ... add many more edges here

// freeze the graph before using it (you cannot add more edges afterwards, unless you call thaw() first)
input_graph.freeze();

// prepare the graph for fast shortest path calculations. note that you have to do this again if you want to change the
// graph topology or any of the edge weights
let fast_graph = fast_paths::prepare(&input_graph);

// calculate the shortest path between nodes with ID 8 and 6 
let shortest_path = fast_paths::calc_path(&fast_graph, 8, 6);

match shortest_path {
    Some(p) => {
        // the weight of the shortest path
        let weight = p.get_weight();
        
        // all nodes of the shortest path (including source and target)
        let nodes = p.get_nodes();
    },
    None => {
        // no path has been found (nodes are not connected in this graph)
    }
}

Batch-wise shortest path calculation

For batch-wise calculation of shortest paths the method described above is inefficient. You should keep the PathCalculator object to execute multiple queries instead:

// ... see above
// create a path calculator (note: not thread-safe, use a separate object per thread)
let mut path_calculator = fast_paths::create_calculator(&fast_graph);
let shortest_path = path_calculator.calc_path(&fast_graph, 8, 6);

Serializing the prepared graph

FastGraph implements standard Serde serialization.

To be able to use the graph in a 32bit WebAssembly environment, it needs to be transformed to a 32bit representation when preparing it on a 64bit system. This can be achieved with the following two methods, but it will only work for graphs that do not exceed the 32bit limit, i.e. the number of nodes and edges and all weights must be below 2^32.

use fast_paths::{deserialize_32, serialize_32, FastGraph};

#[derive(Serialize, Deserialize)]
struct YourData {
    #[serde(serialize_with = "serialize_32", deserialize_with = "deserialize_32")]
    graph: FastGraph,
    // the rest of your struct
}

Preparing the graph after changes

The graph preparation can be done much faster using a fixed node ordering, which is just a permutation of node ids. This can be done like this:

let fast_graph = fast_paths::prepare(&input_graph);
let node_ordering = fast_graph.get_node_ordering();

let another_fast_graph = fast_paths::prepare_with_order(&another_input_graph, &node_ordering);

For this to work another_input_graph must have the same number of nodes as input_graph, otherwise prepare_with_order will return an error. Also performance will only be acceptable if input_graph and another_input_graph are similar to each other, say you only changed a few edge weights.

Benchmarks

FastPaths was run on a single core on a consumer-grade laptop using the road networks provided for the DIMACS implementation challenge graphs. The following graphs were used for the benchmark:

area	number of nodes	number of edges
New York	264.346	733.846
California&Nevada	1.890.815	4.630.444
USA	23.947.347	57.708.624

graph	metric	preparation time	average query time (micros)
NY city	distance	24 s	162
CAL&NV	distance	100 s	430
USA	distance	35 min	3980
NY city	time	14 s	77
CAL&NV	time	62 s	222
USA	time	13 min	1086

The shortest path calculation time was averaged over 100k random routing queries.

There are also some benchmarks using smaller maps included in the test suite. You can run them like this:

export RUST_TEST_THREADS=1; cargo test --release -- --ignored --nocapture

Graph limitations

loop-edges (from node A to node A) will be ignored, because since we are only considering positive non-zero edge-weights they cannot be part of a shortest path
in case the graph has duplicate edges (multiple edges from node A to node B) only the edge with the lowest weight will be considered

Special Thanks

Thanks to Dustin Carlino from A/B Street!

License

This project is licensed under either of

Apache License, Version 2.0, (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in fast_paths by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Comments

bidijkstra stall-on-demand

I am not sure if I understand stall-on-demand fully.

this one will improve query time for sure :wink: i have seen @easbar implemented it for graphhopper, so i guess you understood everything.

please check my understanding: the blocking happens before the neighboring nodes are added to the heap. Because adding nodes to the heap is the expensive part.

so we iterate over incoming edges with its neighbors and check if a node with a higher level is already settled. if so we end here.? if not we iterate as usual over outgoing edges with its neighbors and add them to the heap.

do we have to store something when a node is blocked?

opened by Stunkymonkey 21
Why does the preparation time vary so much for different maps?

The preparation time varies drastically depending on the graph we use. Take for example these graphs (included in this repository), that are similar in size:

|name|nodes|edges|edges/nodes|preparation time (ms)|fast_graph edges|fast_graph edges/node|prep per node (μs)|query time (μs)| |-|-|-|-|-|-|-|-|-| |Bremen dist|40_461|85_090|2.10|429|66_520|1.64|10|18| |Bremen time|40_461|85_090|2.10|6_868|62_518|1.54|169|13| |Bremen uniform*|40_461|85_090|2.10|307|65_669|1.62|8|15| |South Seattle|29_763|49_604|1.67|16_750|64_029|2.15|560|32|

The measurements shown in the table are just the result of running the tests in lib.rs on my laptop.

*Update: For the Bremen uniform graph I used the Bremen time graph, but set all edge weights to a constant value of 10.

I'd really like to find out where this difference comes from and it would be especially useful to speed up the preparation of the abstreet graphs (South Seattle here) somehow. Also related: #33.

opened by easbar 13
Allow storing the graph for 32bit systems on a 64bit system.

This PR adds two helper methods to save/load a graph (which uses usize integers) in a 32bit representation. This allows writing a 32bit graph also on a 64bit system. A 32bit graph might be needed when using FastPaths with WebAssembly.

Fixes #12.

opened by easbar 10
CH improve amount of created shortcuts

as seen here, the amount of shortcuts is not optimal. The current approach as described in the original publication is creating shortcuts, if the contracting node is not part of the shortest path from previous neighbor to the next neighbor. As seen here this can lead to non perfect amount of shortcuts. A "newer" approach is, to have a distinct set of previous neighbors and another distinct set with following neighbors. Then use Dijkstra and calculate the travel-costs. If the cost is equal to the cost from previous to contracting node to the following neighbor then the path from neighbor to neighbor is the optimal path. Therefore a new shortcut has to be created. Otherwise another better path is available and will create a shortcut in the future. The path itself does not have to be fully resolved. Only the costs are important. Additional Improvement

i already did an implementation in rust, but with a different data-structure: link

maybe I find some time and could try implement it for your model. Is this wanted? Or do your prefer doing this yourself?

opened by Stunkymonkey 9
Allow pathfinding from multiple sources to a single destination.

Hi @easbar,

I have a strange new use case for CHs, and as far as I can tell, this very simple change seems to work. I want to calculate the shortest path between multiple sources and a single destination. The resulting path should use the closest source. (I'm not trying to calculate one path from each source, but rather let multiple sources all be possibilities for the result.)

The reason for this: I model directed road segments as nodes, and turns in intersections as edges. When a route begins at a building, currently vehicles always turn one direction onto the road connected to the driveway -- right in the US, left in the UK. So the pathfinding starts from just one node. But I want to try allowing vehicles to turn left or right out of a driveway -- in many cases, this will produce a much more realistic path.

I looked at how the start node is used in calc_path, and it looks extremely straightforward to generalize -- so I did. And in my initial tests, it works -- on my end, I'm hacking every request to use two start nodes, one of them at a fixed position on the map. I have a UI where I can click one start point and visualize the path to a destination I hover over. With this experimental code hooked up, the path goes from the clicked start point, until I cross the threshold where my fixed starting point is closer to the destination.

However, I don't have a theoretical understanding of how the forwards+backwards search happens in CHs. So although the code seems to work, I don't know if it even should, or if I'm somehow getting lucky in my experiment so far.

If this kind of change is algorithmically valid, then I'll clean up the API, add docs, and add a unit test.

And maybe later, I'll investigate the same idea for multiple end points -- to model entering the destination driveway from either side of the road.

Thanks!

opened by dabreegster 8
Adding derives for Clone and Copy where it makes sense

Adds Clone Derives on custom structs, and a few Copy, to allow users of the lib to clone InputGraph and FastGraph.

Haven't added a unit-test as it compiles and is a derivation so it seems pointless, did a quick sanity check though and the cloning works

opened by Alfred-Mountfield 8
Extremely high memory usage when generating fast graph
I am facing extremely high memory usage when generating a fast graph(>25GB)

I have 873k nodes and 900k edges. My code looks like this:

let mut input_graph = InputGraph::new(); for e in edges { input_graph.add_edge(e.start_node, e.end_node, e.weight); } println!("Freezing graph..."); input_graph.freeze(); println!("Calculating optimized graph..."); let fast_graph = fast_paths::prepare(&input_graph); println!("Done.\r\nSaving graph...\t"); fast_paths::save_to_disk(&fast_graph, format!("{}.ff", filename).as_ref()); println!("Done.");

It starts swallowing memory at let fast_graph = fast_paths::prepare(&input_graph);. I tried creating a new params with a ratio of 0.01 and 10.0, which didn't seem to help. (I'm not entirely sure what this value does, which is why I tried a small and large value)

I looked through the benchmark code but it doesn't look like I'm doing anything fundamentally different.
opened by scd31 8
Document how to run the benchmarks

Hello! Could you please share how you ran the benchmarks listed in the README? The only tests I can see are in the lib.rs, and running cargo test after removing the #[ignore] still compiles the code in unoptimized mode. 'grep'-ing for '#[bench]' also shows nothing.

opened by PayasR 6
Add a convenience function to save an InputGraph to a file.

Every time I send over a sample input graph, I think I wind up writing some code like this; not sure why it was missing originally.

Also, we could just use serde and automatically serialize/deserialize input graphs to JSON, bincode, etc, instead of this custom format. But maybe that's less useful, since the internal format of InputGraph might evolve over time.

opened by dabreegster 5
Why not more object oriented?

I noticed a lot of methods are used like this:

fast_paths::a_method(&graph, a, b, ...)

Whereas a more OO method call would look like this:

graph.a_method(a, b, ...)

Is there any reason the first is used so often? I would like to send in a PR to move over to the second way, but want to make sure it would be accepted.

opened by scd31 5
usize does not work well for de/serialization.

WASM is 32bit always (afaik).

So if you serialize a FastGraph on a 64bit platform the resulting bytes will not be deserializable in a WASM binary.

Is there a real need for usize or would you be amenable to a more concrete type? Or do you know of some way to tell serde that usize should read 8 bytes (u64) when it's decoding into a u32?

opened by Carterj3 4
Possibly wrong path?

Hey, i am using this lib for the current advent of code. And I got the wrong result for my input. The test input is working and everything else also. Maybe you can verify that the code is working correctly from your point of view?

I got the result of 3019 but it should be 3012. I rerun the code multiple times and also verified, that the enlarged matrix is correct (it is :/). So I cannot find any issues in my code. My next thought is, that your lib maybe has an issue? :D

Maybe you want/can take a look :)

You can find the code here: https://github.com/auryn31/aoc_2021/blob/main/day_15/src/main.rs

But there is a solid chance my code is wrong :D

Best

opened by auryn31 0
Improve preparation time
Some of the ideas taken from this discussion: https://github.com/Stunkymonkey/osm_ch/issues/1

[x] cancel witness searches after a certain amount of nodes have been explored -> see #37

[ ] use contracted_neighbors heuristic

[ ] use flat array for preparation graph

[ ] parallelization https://github.com/Stunkymonkey/osm_ch/issues/1#issuecomment-877780244

[ ] contract only a certain fraction of nodes

[x] find out why abstreet graph preparation is slow compared to e.g. plain OSM maps https://github.com/Stunkymonkey/osm_ch/issues/1#issuecomment-876583423 -> see #37

[ ] clean up query code? https://github.com/Stunkymonkey/osm_ch/issues/1#issuecomment-877780244
opened by easbar 0
Add a way to get all nodes in a radius

My main use of this algorithm is in pandana ware the common need is to for each knode, calculate summary statistics about all the nodes that are within a thresholded.

opened by Eh2406 7