Disaggregate zone-based origin/destination data to specific points

Dustin Carlino

Last update: Jun 29, 2022


Overview

odjitter

This crate contains an implementation of the ‘jittering’ technique for pre-processing origin-destination (OD) data. Jittering in a data visualisation context refers to the addition of “random noise to the data” to prevent points in graphs from overlapping, as described by Wickham (2016) and in the documentation page for the function geom_jitter().

In the context of OD data, jittering refers to randomly moving the start and end points associated with OD pairs, as described in a paper currently under review (Lovelace et al. under review). The technique is implemented in the function od_jitter() in the od R package. The functionality contained in this repo is an extended and much faster implementation: according to our benchmarks on a large dataset, it was around 1000 times faster than the R implementation.

The crate is still a work in progress: the API may change. Issues and pull requests are particularly useful at this stage.

Installation

Install the package from the system command line as follows (you need to have installed and set up cargo first):

cargo install --git https://github.com/dabreegster/odjitter

To check that the installation worked, run the odjitter command without arguments. If it prints the following message, congratulations, it works 🎉

odjitter

## error: The following required arguments were not provided:
##     --od-csv-path <OD_CSV_PATH>
##     --zones-path <ZONES_PATH>
##     --output-path <OUTPUT_PATH>
##     --max-per-od <MAX_PER_OD>
## 
## USAGE:
##     odjitter [OPTIONS] --od-csv-path <OD_CSV_PATH> --zones-path <ZONES_PATH> --output-path <OUTPUT_PATH> --max-per-od <MAX_PER_OD>
## 
## For more information try --help

Usage

To run the algorithm you need a minimum of three inputs, examples of which are provided in the data/ folder of this repo:

A .csv file containing OD data with two columns containing zone IDs (specified with --origin-key=geo_code1 --destination-key=geo_code2 by default) and other columns representing trip counts:

geo_code1	geo_code2	all	bus	car_driver	car_passenger	bicycle	foot	other
S02001616	S02001616	82	3	6	0	2	71	0
S02001616	S02001620	188	42	26	3	11	105	1
S02001616	S02001621	99	13	7	3	15	61	0

A .geojson file representing zones that contains values matching the zone IDs in the OD data (the field containing zone IDs is specified with --zone-name-key=InterZone by default):

head -6 data/zones.geojson

## {
## "type": "FeatureCollection",
## "name": "zones_min",
## "crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } },
## "features": [
## { "type": "Feature", "properties": { "InterZone": "S02001616", "Name": "Merchiston and Greenhill", "TotPop2011": 5018, "ResPop2011": 4730, "HHCnt2011": 2186, "StdAreaHa": 126.910911, "StdAreaKm2": 1.269109, "Shape_Leng": 9073.5402482000009, "Shape_Area": 1269109.10155 }, "geometry": { "type": "MultiPolygon", "coordinates": [ [ [ [ -3.2040366, 55.9333372 ], [ -3.2036354, 55.9321624 ], [ -3.2024036, 55.9321874 ], [ -3.2019838, 55.9315586 ], [ -3.2005071, 55.9317411 ], [ -3.199902, 55.931113 ], [ -3.2033504, 55.9308279 ], [ -3.2056319, 55.9309507 ], [ -3.2094979, 55.9308666 ], [ -3.2109753, 55.9299985 ], [ -3.2107073, 55.9285904 ], [ -3.2124928, 55.927854 ], [ -3.2125633, 55.9264661 ], [ -3.2094928, 55.9265616 ], [ -3.212929, 55.9260741 ], [ -3.2130774, 55.9264384 ], [ -3.2183973, 55.9252709 ], [ -3.2208941, 55.925282 ], [ -3.2242732, 55.9258683 ], [ -3.2279975, 55.9277452 ], [ -3.2269867, 55.928489 ], [ -3.2267625, 55.9299817 ], [ -3.2254561, 55.9307854 ], [ -3.224148, 55.9300725 ], [ -3.2197791, 55.9315472 ], [ -3.2222706, 55.9339127 ], [ -3.2224909, 55.934809 ], [ -3.2197844, 55.9354692 ], [ -3.2204535, 55.936195 ], [ -3.218362, 55.9368806 ], [ -3.2165749, 55.937069 ], [ -3.215582, 55.9380761 ], [ -3.2124132, 55.9355465 ], [ -3.212774, 55.9347972 ], [ -3.2119068, 55.9341947 ], [ -3.210138, 55.9349668 ], [ -3.208051, 55.9347716 ], [ -3.2083105, 55.9364224 ], [ -3.2053546, 55.9381495 ], [ -3.2046077, 55.9395298 ], [ -3.20356, 55.9380951 ], [ -3.2024323, 55.936318 ], [ -3.2029121, 55.935831 ], [ -3.204832, 55.9357555 ], [ -3.2040366, 55.9333372 ] ] ] ] } },

A .geojson file representing a transport network from which origin and destination points are sampled:

head -6 data/road_network.geojson

## {
## "type": "FeatureCollection",
## "name": "road_network_min",
## "crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:OGC:1.3:CRS84" } },
## "features": [
## { "type": "Feature", "properties": { "osm_id": "3468", "name": "Albyn Place", "highway": "tertiary", "waterway": null, "aerialway": null, "barrier": null, "man_made": null, "access": null, "bicycle": null, "service": null, "z_order": 4, "other_tags": "\"lit\"=>\"yes\",\"lanes\"=>\"3\",\"maxspeed\"=>\"20 mph\",\"sidewalk\"=>\"both\",\"lanes:forward\"=>\"2\",\"lanes:backward\"=>\"1\"" }, "geometry": { "type": "LineString", "coordinates": [ [ -3.207438, 55.9533584 ], [ -3.2065953, 55.9535098 ] ] } },

The jitter function requires you to set the maximum number of trips allowed per desire line in the jittered result (the --max-per-od argument). A value of 1 will create a line for every trip in the dataset; a value above the maximum number of trips in the ‘all’ column of the OD data will result in a jittered dataset that has the same number of desire lines (the geographic representation of OD pairs) as the input (50 in this case).
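To make the effect of --max-per-od concrete, here is a minimal sketch of how one OD row could be split. It is based only on the wording of the help text (rows at or below the maximum pass through, larger rows are repeated so that each output row obeys the maximum and the sum matches the original). The function name split_od_row and the equal-share rule are assumptions for illustration, not the crate's actual code.

```rust
/// Split one OD row's trip total into output rows that each carry at most
/// `max_per_od` trips. Returns the trips assigned to each output row.
/// ASSUMPTION: trips are shared equally between the repeated rows; this is
/// one reading of the `--max-per-od` help text, not odjitter's real code.
fn split_od_row(total_trips: f64, max_per_od: f64) -> Vec<f64> {
    assert!(max_per_od > 0.0, "max_per_od must be positive");
    if total_trips <= max_per_od {
        // Rows that already obey the maximum pass through unchanged.
        return vec![total_trips];
    }
    // Smallest number of output rows such that every row obeys the maximum.
    let rows = (total_trips / max_per_od).ceil() as usize;
    // Equal shares, so the sum over output rows matches the input total.
    vec![total_trips / rows as f64; rows]
}
```

Under these assumptions, the row with 188 trips in the example OD data would become 4 rows of 47 trips with a maximum of 50, 19 rows with a maximum of 10, and 188 single-trip rows with a maximum of 1.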

With reference to the test data in this repo, you can run the jitter command line tool as follows:

odjitter --od-csv-path data/od.csv \
  --zones-path data/zones.geojson \
  --subpoints-path data/road_network.geojson \
  --max-per-od 50 --output-path output_max50.geojson

## Scraped 7 zones from data/zones.geojson
## Scraped 5073 subpoints from data/road_network.geojson
## Disaggregating OD data
## Wrote output_max50.geojson

Try running it with a different max-per-od value (10 in the command below):

odjitter --od-csv-path data/od.csv \
  --zones-path data/zones.geojson \
  --subpoints-path data/road_network.geojson \
  --max-per-od 10 --output-path output_max10.geojson

## Scraped 7 zones from data/zones.geojson
## Scraped 5073 subpoints from data/road_network.geojson
## Disaggregating OD data
## Wrote output_max10.geojson

Outputs

The figure below shows the output of the jitter commands above. The left image shows unjittered results, with origins and destinations going to zone centroids (as in many if not most visualisations of desire lines between zones); the central image shows the result after setting the max-per-od argument to 50; and the right-hand image shows the result after setting max-per-od to 10.

Note: odjitter uses a random number generator to sample points, so the output will change each time you run it, unless you set the rng-seed, as documented in the next section.
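The following sketch illustrates why a fixed seed gives repeatable output: a seeded generator always produces the same sequence of draws, so the same subpoints are picked. The SplitMix64 generator and the sample_subpoints function are stand-ins chosen so the example needs no external crates; they are assumptions for illustration and say nothing about which random number generator odjitter uses internally.

```rust
/// Minimal SplitMix64 generator (a stand-in for a real RNG crate).
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Pick `n` subpoints (with replacement) from `subpoints`.
/// The same `seed` always yields the same picks.
/// Hypothetical helper: the modulo step has a tiny bias, fine for a sketch.
fn sample_subpoints(subpoints: &[(f64, f64)], n: usize, seed: u64) -> Vec<(f64, f64)> {
    assert!(!subpoints.is_empty(), "need at least one subpoint");
    let mut rng = SplitMix64(seed);
    (0..n)
        .map(|_| subpoints[(rng.next_u64() % subpoints.len() as u64) as usize])
        .collect()
}
```

Calling sample_subpoints twice with the same points and the same seed returns identical vectors, which is the behaviour the rng-seed option is documented to provide for the tool as a whole.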

Details

For full details on odjitter’s arguments run odjitter --help which gives the following output:

odjitter --help

## odjitter 0.1.0
## Dustin Carlino <[email protected]
## Disaggregate origin/destination data from zones to points
## 
## USAGE:
##     odjitter [OPTIONS] --od-csv-path <OD_CSV_PATH> --zones-path <ZONES_PATH> --output-path <OUTPUT_PATH> --max-per-od <MAX_PER_OD>
## 
## OPTIONS:
##         --all-key <ALL_KEY>
##             Which column in the OD row specifies the total number of trips to disaggregate?
##             [default: all]
## 
##         --destination-key <DESTINATION_KEY>
##             Which column in the OD row specifies the zone where trips ends? [default: geo_code2]
## 
##     -h, --help
##             Print help information
## 
##         --max-per-od <MAX_PER_OD>
##             What's the maximum number of trips per output OD row that's allowed? If an input OD row
##             contains less than this, it will appear in the output without transformation. Otherwise,
##             the input row is repeated until the sum matches the original value, but each output row
##             obeys this maximum
## 
##         --min-distance-meters <MIN_DISTANCE_METERS>
##             Guarantee that jittered points are at least this distance apart [default: 1.0]
## 
##         --od-csv-path <OD_CSV_PATH>
##             The path to a CSV file with aggregated origin/destination data
## 
##         --origin-key <ORIGIN_KEY>
##             Which column in the OD row specifies the zone where trips originate? [default:
##             geo_code1]
## 
##         --output-path <OUTPUT_PATH>
##             The path to a GeoJSON file where the disaggregated output will be written
## 
##         --rng-seed <RNG_SEED>
##             By default, the output will be different every time the tool is run, based on a
##             different random number generator seed. Specify this to get deterministic behavior,
##             given the same input
## 
##         --subpoints-path <SUBPOINTS_PATH>
##             The path to a GeoJSON file with subpoints to sample from. If this isn't specified,
##             random points within each zone will be used instead
## 
##     -V, --version
##             Print version information
## 
##         --zone-name-key <ZONE_NAME_KEY>
##             In the zones GeoJSON file, which property is the name of a zone [default: InterZone]
## 
##         --zones-path <ZONES_PATH>
##             The path to a GeoJSON file with named zones
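As an illustration of the --min-distance-meters option listed above, the sketch below checks whether two longitude/latitude points are at least a given number of metres apart using the haversine formula. Both the choice of haversine and the idea that the check is applied to a pair of points are assumptions; the help text does not say how odjitter measures distance or which points it compares. The names haversine_m and at_least_apart are hypothetical.

```rust
/// Mean Earth radius in metres (spherical approximation).
const EARTH_RADIUS_M: f64 = 6_371_000.0;

/// Great-circle distance in metres between two (longitude, latitude)
/// points given in degrees, as in the GeoJSON inputs.
fn haversine_m(a: (f64, f64), b: (f64, f64)) -> f64 {
    let (lon1, lat1) = (a.0.to_radians(), a.1.to_radians());
    let (lon2, lat2) = (b.0.to_radians(), b.1.to_radians());
    let h = ((lat2 - lat1) / 2.0).sin().powi(2)
        + lat1.cos() * lat2.cos() * ((lon2 - lon1) / 2.0).sin().powi(2);
    // Clamp guards against rounding pushing `h` slightly above 1.
    2.0 * EARTH_RADIUS_M * h.sqrt().min(1.0).asin()
}

/// ASSUMPTION: a candidate pair is accepted only if the two points are at
/// least `min_distance_m` apart; otherwise a caller would sample again.
fn at_least_apart(a: (f64, f64), b: (f64, f64), min_distance_m: f64) -> bool {
    haversine_m(a, b) >= min_distance_m
}
```

For example, the two endpoints of the Albyn Place segment shown in the road network sample are a few tens of metres apart, so they would pass the default threshold of 1.0 but fail a threshold of 100.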

References

Lovelace, Robin, Rosa Félix, and Dustin Carlino. Under review. Jittering: A Computationally Efficient Method for Generating Realistic Route Networks from Origin-Destination Data. TBC.

Wickham, Hadley. 2016. ggplot2: Elegant Graphics for Data Analysis. 2nd ed. New York, NY: Springer.

Comments

Add new `subpoint_origins` and `subpoint_destinations` optional arguments
In many OD datasets the locations of destinations (e.g. work places, shops, schools) are different than the locations of the origins (e.g. residential buildings). Some destinations attract more trips than others, so weighting values are probably also needed.

Based on input data in #8, I imagine this could work something like this:

odjitter --od-csv-path data/od_school.csv \
  --zones-path data/zones.geojson \
  --subpoints_destinations data/schools.geojson \
  --weight_key_destinations weight \
  --all-key car \
  --max-per-od 10 --output-path output_to_schools_max_10.geojson

A naive approach, that I think should at least provide an output (but errors when I try it) is as follows:

odjitter --od-csv-path data/od_school.csv \
  --zones-path data/zones.geojson \
  --subpoints-path data/schools.geojson \
  --max-per-od 50 --output-path output_max50.geojson

Illustration of what the output could look like (with --max-per-od 1000 in this case):
opened by Robinlovelace 17

R odjitter in Windows

Hi,

On Windows 10, there appears to be an error reading or writing the temporary od_jittered.geojson file.

library(od) #just to get a small data input
library(odjitter)
#> 
#> Attaching package: 'odjitter'
#> The following object is masked from 'package:base':
#> 
#>     jitter

od_all_jittered = odjitter::jitter(
  od = od_data_df,
  zones = od_data_zones_min,
  subpoints = sf::st_sample(od_data_zones_min, 200)
)
#> Warning in system(msg): 'odjitter' not found
#> Error: Cannot open "C:\Users\UTILIZ~1\AppData\Local\Temp\RtmpeWTALa/od_jittered.geojson"; The file doesn't seem to exist.

Created on 2022-03-25 by the reprex package (v2.0.1)

The error is persistent between runs/data input.

Error: Cannot open "C:\Users\UTILIZ~1\AppData\Local\Temp\RtmpC08uIU/od_jittered.geojson"; The file doesn't seem to exist.

opened by temospena 13

Update README

This should probably be seen in context of broader meta-issue on documentation but opening this after changes in #11. My plan is to:

[x] Switch from .Rmd source to .qmd for source to reduce dependencies
[x] Tidy-up (no leftover files)
[x] Explain and demonstrate with a reproducible example the use of new subpoints arguments using the schools dataset

Can work on this later today but happy to hold horses if other features, e.g. addition of weight_key_origins and weight_key_destinations arguments, are in the pipeline. Note to self: most recent version of the pkg has the following arguments:

odjitter --help
odjitter 0.1.0
Dustin Carlino <[email protected]
Disaggregate origin/destination data from zones to points

USAGE:
    odjitter [OPTIONS] --od-csv-path <OD_CSV_PATH> --zones-path <ZONES_PATH> --output-path <OUTPUT_PATH> --disaggregation-threshold <DISAGGREGATION_THRESHOLD>

OPTIONS:
        --destination-key <DESTINATION_KEY>
            Which column in the OD row specifies the zone where trips ends? [default: geo_code2]

        --disaggregation-key <DISAGGREGATION_KEY>
            Which column in the OD row specifies the total number of trips to disaggregate?
            [default: all]

        --disaggregation-threshold <DISAGGREGATION_THRESHOLD>
            What's the maximum number of trips per output OD row that's allowed? If an input OD row
            contains less than this, it will appear in the output without transformation. Otherwise,
            the input row is repeated until the sum matches the original value, but each output row
            obeys this maximum

    -h, --help
            Print help information

        --min-distance-meters <MIN_DISTANCE_METERS>
            Guarantee that jittered points are at least this distance apart [default: 1.0]

        --od-csv-path <OD_CSV_PATH>
            The path to a CSV file with aggregated origin/destination data

        --origin-key <ORIGIN_KEY>
            Which column in the OD row specifies the zone where trips originate? [default:
            geo_code1]

        --output-path <OUTPUT_PATH>
            The path to a GeoJSON file where the disaggregated output will be written

        --rng-seed <RNG_SEED>
            By default, the output will be different every time the tool is run, based on a
            different random number generator seed. Specify this to get deterministic behavior,
            given the same input

        --subpoints-destinations-path <SUBPOINTS_DESTINATIONS_PATH>
            The path to a GeoJSON file to use for sampling subpoints for destination zones. If this
            isn't specified, random points within each zone will be used instead

        --subpoints-origins-path <SUBPOINTS_ORIGINS_PATH>
            The path to a GeoJSON file to use for sampling subpoints for origin zones. If this isn't
            specified, random points within each zone will be used instead

    -V, --version
            Print version information

        --zone-name-key <ZONE_NAME_KEY>
            In the zones GeoJSON file, which property is the name of a zone [default: InterZone]

        --zones-path <ZONES_PATH>
            The path to a GeoJSON file with named zones

opened by Robinlovelace 9

Command to fully disaggregate

cargo run -- disaggregate --od-csv-path data/od.csv --zones-path data/zones.geojson --output-path output_individual.geojson

Output looks like this:

{"geometry":{"coordinates":[[-3.203857147334649,55.95213138764797],[-3.222935941651701,55.95172951209746]],"type":"LineString"},"properties":{"mode":"bus"},"type":"Feature"},
{"geometry":{"coordinates":[[-3.221598670781587,55.951891527310494],[-3.2243560653816594,55.947639333117095]],"type":"LineString"},"properties":{"mode":"bus"},"type":"Feature"},
{"geometry":{"coordinates":[[-3.2315398898250978,55.94935689855381],[-3.22343974294507,55.9478142626001]],"type":"LineString"},"properties":{"mode":"bus"},"type":"Feature"},

opened by dabreegster 5

Support weighted subpoints #7

Adds flags --weight-key-destinations and --weight-key-origins. I haven't tested this manually with the schools dataset or figured out how to sanely unit test it (fix the RNG seed and plug in a very high weight for one school and tiny for the others?). So up to you if we proceed, or if you have some idea for validation
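One way to reason about validating weighted subpoints is to separate the weighted choice from the random draw. The sketch below maps a uniform draw in [0, 1) to a subpoint index with probability proportional to its weight (cumulative-weight sampling). The function pick_weighted is hypothetical and is not taken from this PR; it only illustrates the idea that a very high weight on one school should attract nearly all trips.

```rust
/// Map a uniform draw `u` in [0, 1) to an index, with probability
/// proportional to `weights[i]`. Returns None if there is nothing to pick
/// (no weights, a negative weight, or a total that is not positive).
/// Hypothetical helper for illustration only.
fn pick_weighted(weights: &[f64], u: f64) -> Option<usize> {
    let total: f64 = weights.iter().sum();
    if weights.is_empty() || !(total > 0.0) || weights.iter().any(|w| *w < 0.0) {
        return None;
    }
    let target = u * total;
    let mut cumulative = 0.0;
    for (i, w) in weights.iter().enumerate() {
        cumulative += w;
        if target < cumulative {
            return Some(i);
        }
    }
    // Rounding can leave `target` just above the final cumulative sum;
    // fall back to the last entry that has a positive weight.
    weights.iter().rposition(|w| *w > 0.0)
}
```

Because the draw is passed in, a test can feed evenly spaced values of u instead of fixing an RNG seed: with weights of 1000, 1 and 1, almost every draw should land on the first entry.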

opened by dabreegster 5

`odjitter` crashing with big OD data and `--max-per-od 1`

Hi, @dabreegster .

I am trying to use odjitter with a subset of the São Paulo OD data and it is crashing when I set --max-per-od 1. It works fine when I try with --max-per-od 100 and --max-per-od 10. My PC freezes in the process, so it is probably a RAM usage related problem -- I have a core i5 6th gen with 8GB running Ubuntu 20.04.3 LTS.

Here is a reproducible example (using R):

piggyback::pb_download(file = "zones_sp_center.geojson", 
                       repo = "spstreets/OD2017"
                       )

piggyback::pb_download(file = "od_sp_center.csv",
                       repo = "spstreets/OD2017"
                       )

system("odjitter --od-csv-path ./od_sp_center.csv --zones-path ./zones_sp_center.geojson --max-per-od 1 --output-path result.geojson")

# Scraped 114 zones from ./zones_sp_center.geojson
# Disaggregating OD data
# Killed

opened by lucasccdias 3

Document jittering of OD data in which origins and destinations are different
Sometimes origin zones are different from destinations. odjitter will still work in this case, but pre-processing is needed.

[ ] Description of the data

[ ] Example input dataset

[ ] Reproducible example

[ ] Tests?
opened by Robinlovelace 2

Port R interface into this repo

Currently we have a basic but seemingly effective and tested (by me) R interface based on system calls: https://github.com/atumworld/odrust

At some point it would be good to switch from using system calls to using the rextendr interface framework, a separate issue #22

This issue aims at cohesion, so that all odjitter code, in any language, is in one easy to find place.

I'm happy to do the port and first thought is to put the R bindings/package into a new subfolder called simply r/, mirroring the approach in https://github.com/apache/arrow/tree/master/r and with a view to there being at some point Python bindings in a py/ subfolder, linking to #23.

opened by Robinlovelace 1

Separate origin/destination subpoints

This splits the --subpoints-path flag into separate --subpoints-origins-path and --subpoints-destinations-path flags. If either one isn't specified, the tool falls back to picking random points instead.
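For readers wondering what "picking random points" within a zone can look like, here is a minimal sketch using rejection sampling: draw a point in the zone's bounding box and keep it only if a ray-casting test says it is inside the polygon. It handles a single ring with planar coordinates and takes the uniform draws from a caller-supplied closure. The function names and the approach are assumptions for illustration; the PR does not describe how the fallback is implemented, and real zones may be MultiPolygons with holes.

```rust
/// Ray-casting test: is point `p` inside the polygon ring `ring`
/// (a list of (x, y) vertices)?
fn point_in_ring(p: (f64, f64), ring: &[(f64, f64)]) -> bool {
    let mut inside = false;
    let mut j = ring.len().wrapping_sub(1);
    for i in 0..ring.len() {
        let (xi, yi) = ring[i];
        let (xj, yj) = ring[j];
        // Does the edge (j, i) cross the horizontal ray going right from p?
        if (yi > p.1) != (yj > p.1) && p.0 < (xj - xi) * (p.1 - yi) / (yj - yi) + xi {
            inside = !inside;
        }
        j = i;
    }
    inside
}

/// Rejection sampling: try up to `max_tries` bounding-box points and
/// return the first one inside the ring. `uniform` must return draws
/// in [0, 1). Hypothetical helper, not odjitter's implementation.
fn random_point_in_ring(
    ring: &[(f64, f64)],
    mut uniform: impl FnMut() -> f64,
    max_tries: usize,
) -> Option<(f64, f64)> {
    if ring.len() < 3 {
        return None;
    }
    let (mut min_x, mut min_y) = (f64::INFINITY, f64::INFINITY);
    let (mut max_x, mut max_y) = (f64::NEG_INFINITY, f64::NEG_INFINITY);
    for &(x, y) in ring {
        min_x = min_x.min(x);
        max_x = max_x.max(x);
        min_y = min_y.min(y);
        max_y = max_y.max(y);
    }
    for _ in 0..max_tries {
        let x = min_x + uniform() * (max_x - min_x);
        let y = min_y + uniform() * (max_y - min_y);
        if point_in_ring((x, y), ring) {
            return Some((x, y));
        }
    }
    None
}
```

Passing the random draws in as a closure keeps the sketch free of external crates and makes it easy to test with a fixed sequence of values.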

No support for weighted subpoints yet; I'll do that separately. There was some other cleanup to do first, so this PR is already big

opened by dabreegster 1

Usage from R on PowerShell

Just tested on a new installation on Windows and it fails:

> library(odjitter)

Attaching package: ‘odjitter’

The following object is masked from ‘package:base’:

    jitter

> od = readr::read_csv("https://github.com/dabreegster/odjitter/raw/main/data/od.csv")
Rows: 49 Columns: 11                                                   
── Column specification ─────────────────────────────────────────────────
Delimiter: ","
chr (2): geo_code1, geo_code2
dbl (9): all, from_home, train, bus, car_driver, car_passenger, bicyc...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
> zones = sf::read_sf("https://github.com/dabreegster/odjitter/raw/main/data/zones.geojson")
> names(zones)[1] = "geo_code"
> road_network = sf::read_sf("https://github.com/dabreegster/odjitter/raw/main/data/road_network.geojson")
> od_unjittered = od::od_to_sf(od, zones)
0 origins with no match in zone ids
0 destinations with no match in zone ids
 points not in od data removed.
> set.seed(42) # for reproducibility
> od_jittered = jitter(od, zones, subpoints = road_network)
Error in system("odjitter --help", intern = TRUE) : 'odjitter' not found

Solution: something like this:

system(r"(powershell C:\Users\geoevid\.cargo\bin\odjitter.exe)")

opened by eugenividal 0

Rename arguments?

Currently there are two potentially misleading argument names:

## OPTIONS:
##         --all-key <ALL_KEY>
##             Which column in the OD row specifies the total number of trips to disaggregate?
##             [default: all]
## 
...
## 
##         --max-per-od <MAX_PER_OD>
##             What's the maximum number of trips per output OD row that's allowed? If an input OD row
##             contains less than this, it will appear in the output without transformation. Otherwise,
##             the input row is repeated until the sum matches the original value, but each output row
##             obeys this maximum

Misleading because it's too specific. Really the first is the name of the column used to determine if an OD pair should be disaggregated and into how many 'sub-OD pairs'. The second can be described simply as the disaggregation threshold. I'm not precious about this and open minded to other options including keeping it as is. Plan to put in a PR as the basis for informed conversation on this.

opened by Robinlovelace 0

Example code fails

Hi. The code given in the vignette is not working for me (running R 4.2.0, RStudio 2022.07.1).

> library(odjitter)
Attaching package: ‘odjitter’
The following object is masked from ‘package:base’:
    jitter
> od <- readr::read_csv("https://github.com/dabreegster/odjitter/raw/main/data/od.csv")
Rows: 49 Columns: 11                                                                                                                     
── Column specification ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (2): geo_code1, geo_code2
dbl (9): all, from_home, train, bus, car_driver, car_passenger, bicycle, foot, other
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
> zones = sf::read_sf("https://github.com/dabreegster/odjitter/raw/main/data/zones.geojson")
> names(zones)[1] = "geo_code"
> road_network = sf::read_sf("https://github.com/dabreegster/odjitter/raw/main/data/road_network.geojson")
> od_unjittered = od::od_to_sf(od, zones)
0 origins with no match in zone ids
0 destinations with no match in zone ids
 points not in od data removed.
> set.seed(42) # for reproducibility
> od_jittered <- jitter(od, zones, subpoints = road_network)
Error in system(paste0(odjitter_location, " --help"), intern = TRUE) : 
  'odjitter' not found

Thanks for any help.

opened by blackburnstat 1

Jittering fails when input zones.geojson file contains mixed geometry types

This was the cause of errors that were driving me insane, no need for a fix on the Rust side and it's an edge case that can be seen as an issue with dodgy input data my side. Still wanted to document it here while it's fresh in my head.

Input zone object that failed looked broadly like this:

Simple feature collection with 14 features and 2 fields
Geometry type: GEOMETRY
Dimension:     XY
Bounding box:  xmin: -6.918001 ymin: 53.00297 xmax: -6.71583 ymax: 53.27975
Geodetic CRS:  WGS 84
# A tibble: 14 × 3
   geo_code  social                                                                                             geometry
 * <chr>      <dbl>                                                                                       <GEOMETRY [°]>
 1 o06065     1118. MULTIPOLYGON (((-6.71583 53.16379, -6.716134 53.16175, -6.728203 53.15357, -6.731625 53.15206, -6...
 2 o06029      337. MULTIPOLYGON (((-6.828214 53.01617, -6.834301 53.0142, -6.846945 53.0165, -6.851927 53.02025, -6....
 3 o06075      865. MULTIPOLYGON (((-6.82495 53.27751, -6.821514 53.27175, -6.82003 53.27001, -6.821732 53.2688, -6.8...
 4 d35402922     1  POLYGON ((-6.747552 53.1533, -6.747565 53.15278, -6.747606 53.15226, -6.747675 53.15174, -6.74777...

Note that MULTIPOLYGON objects suddenly switch to POLYGON objects. Attached is a .zip of a reproducible example that worked after simply changing the geometry type. test-data.zip

opened by Robinlovelace 2

Python interface

We already have a simple R interface, one that relies on system calls rather than the more sophisticated 'rextendr' approach #22: https://github.com/atumworld/odrust

It would be good to have a Python interface. Any Python developers out there very welcome to help out with this!
help wanted

opened by Robinlovelace 3

Consider an rextendr interface
@Robinlovelace, I want to understand the current friction of R calling odjitter by command-line. https://github.com/atumworld/odrust/blob/main/R/odr_jitter.R is how it works today, correct?

Is the problem...

Having to compile the Rust tool on a target system? (#6 solves if so)

Packaging for CRAN and having a dependency on any extra binary tool?

Slow to write input CSV or zone geojson?

Slow to read the output geojson? (We can look at geopackage, flatgeobuf, etc if so)

or something else?
opened by dabreegster 2

Consider adding departure time
If the CSV input has something like a departure_seconds column, the jittered output could have this too. For each output row, the departure time would be jittered somehow -- maybe a uniform or normal distribution centered around the input time? We would need extra config and flags (with default values) to specify all of this.

@lucasccdias, is my understanding correct? How specifically would you want to jitter departure_seconds?

I'm hesitant to add this feature, because I'm not convinced that it will be easy for the user to learn and specify a bunch of extra command-line flags to say how they want to transform time. Instead, why couldn't they add departure time on their end? In other words:

Write the desire line CSV file

Call odjitter on it

Read the output GeoJSON file and add a departure time property, using whatever logic they want
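As a rough idea of what the last step could involve on the user's side, the sketch below shifts an input departure time by a uniform offset. The function jitter_departure, the half-width parameter and the clamp at zero are all assumptions for illustration; nothing like this exists in odjitter, and a normal distribution or any other logic would work just as well.

```rust
/// Shift `departure_seconds` by a uniform offset in
/// [-half_width_seconds, +half_width_seconds), never going below zero.
/// `u` is a uniform draw in [0, 1) supplied by the caller's own RNG.
/// Hypothetical post-processing helper, not part of odjitter.
fn jitter_departure(departure_seconds: f64, half_width_seconds: f64, u: f64) -> f64 {
    let offset = (2.0 * u - 1.0) * half_width_seconds;
    (departure_seconds + offset).max(0.0)
}
```

With an input time of 3600 seconds and a half-width of 600 seconds, draws of 0.0, 0.5 and 0.75 give 3000, 3600 and 3900 seconds respectively.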

One question is how departure time is determined. Does any of the input desire line data have something like this? How is it specified -- maybe just the hour range that a bunch of trips go from zone1 to zone2? If so, maybe what we should instead do is make it easy to match up the jittered GeoJSON output with the original input, and have some kind of lookup key, or just copy over the departure time property, and do the jittering on that elsewhere.

And backing up a little more, I think the motivation for this feature request was to generate A/B Street scenarios, either with abstr or not. If so, it could be helpful to understand how we want to do that. Part of odjitter input is weighted subpoints, and there's more in-progress code within A/B Street to generate these weights for the exact purpose of creating scenarios. If our ultimate aim is to create scenarios from raw desire line data, we have a spectrum of options how to do it -- some of them using odjitter, some of them directly calling this other pipeline.
opened by dabreegster 2

Sanity checking of weighted results

There are some statistical sanity checks in #7 but I've just done some more sanity checks and the results are good. Summary of them below. Three schools in one zone:

With weights, how many trips have destinations at each?

Strong near-linear positive relationship between n_trips and weight:

opened by Robinlovelace 4
