Blazing fast Rust JSONPath query engine.

Overview

rsonpath – SIMD-powered JSONPath 🚀

Experimental JSONPath engine for querying massive streamed datasets.

Features

The rsonpath crate provides a JSONPath parser and a query execution engine, which utilizes SIMD instructions to provide massive throughput improvements over conventional engines.

Benchmarks of rsonpath against a reference no-SIMD engine on the Pison dataset. NOTE: Scale is logarithmic!

[Figure: main throughput plot]

Supported selectors

The project is actively developed and currently supports only a subset of the JSONPath query language.

Selector                      Syntax                             Supported  Since   Tracking Issue
Root                          $                                  ✔️         v0.1.0
Dot                           .<label>                           ✔️         v0.1.0
Index (object member)         [<label>]                          ✔️         v0.1.0
Index (array index)           [<index>]                          ❌         -       #64
Index (array index from end)  [-<index>]                         ❌         -
Descendant                    ..                                 ✔️         v0.1.0
Child wildcard                .*, .[*]                           ✔️         v0.3.0
Descendant wildcard           ..*, ..[*]                         ✔️         v0.4.0
Slice                         [<start>:<end>:<step>]             ❌         -
List                          [<sel1>, <sel2>, ..., <selN>]      ❌         -
Filter                        [?(<expr>)]                        ❌         -

Installation

See Releases for precompiled binaries for all first-class support targets.

The easiest way to install is via cargo.

cargo install rsonpath

This might fail with the following error:

Target architecture is not supported by SIMD features of this crate. Disable the default `simd` feature.

This means the SIMD features of the engine are not implemented for the machine's CPU. You can still use rsonpath, but the speed will be limited (see the reference engine in the chart above). To install without SIMD, run cargo install rsonpath --no-default-features -F default-optimizations.

Alternatively, you can download the source code and manually run just install (requires just) or cargo install --path ./crates/rsonpath.

Native CPU optimizations

If maximum speed is paramount, you should install rsonpath with native CPU instructions support. This will result in a binary that is not portable and might work incorrectly on any other machine, but will squeeze out every last bit of throughput.

To do this, run the following cargo install variant:

RUSTFLAGS="-C target-cpu=native" cargo install rsonpath

Usage

To run a JSONPath query on a file, execute:

rsonpath '$..a.b' ./file.json

If the file is omitted, the engine reads standard input.

For details, consult rsonpath --help.

Results

The results are presented as an array of indices. Each index points at the delimiter that directly precedes a matched value: the colon after the label of a matching record, the comma directly preceding the matched record in a list, or the opening bracket of the list if the match is its first element. Alternatively, passing --result count returns only the number of matches. Work to support more useful result reports is ongoing.
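To make the index format concrete, here is a minimal Rust sketch that takes a document and one reported index and slices out the value following that delimiter. It is not part of rsonpath: `value_after` is a hypothetical helper, it assumes the input is valid JSON, and it treats the index as a byte offset pointing at a `:`, `,` or `[` (this README does not say whether indices are byte or character offsets; the two coincide for ASCII input). The document and the indices 6 and 20 are taken from the duplicate keys example further down.

```rust
/// Hypothetical helper, not part of rsonpath: returns the JSON value that
/// starts right after the delimiter (':', ',' or '[') at byte offset `idx`.
/// Assumes `doc` is valid JSON.
fn value_after(doc: &str, idx: usize) -> Option<&str> {
    let bytes = doc.as_bytes();
    // Skip the delimiter itself, then any whitespace after it.
    let mut start = idx.checked_add(1)?;
    while start < bytes.len() && bytes[start].is_ascii_whitespace() {
        start += 1;
    }
    if start >= bytes.len() {
        return None;
    }

    let (mut depth, mut in_string, mut escaped) = (0usize, false, false);
    let mut end = start;
    while end < bytes.len() {
        let b = bytes[end];
        if in_string {
            if escaped {
                escaped = false;
            } else if b == b'\\' {
                escaped = true;
            } else if b == b'"' {
                in_string = false;
                if depth == 0 {
                    end += 1; // include the closing quote of a top-level string
                    break;
                }
            }
        } else {
            match b {
                b'"' => in_string = true,
                b'{' | b'[' => depth += 1,
                b'}' | b']' if depth == 0 => break, // closes the parent container
                b'}' | b']' => {
                    depth -= 1;
                    if depth == 0 {
                        end += 1; // include the closing bracket of the value
                        break;
                    }
                }
                b',' if depth == 0 => break, // the next sibling begins here
                _ => {}
            }
        }
        end += 1;
    }
    Some(doc[start..end].trim_end())
}

fn main() {
    let doc = r#"{"key":"value","key":"other value"}"#;
    for idx in [6, 20] {
        println!("{idx}: {:?}", value_after(doc, idx));
    }
}
```

A real consumer would more likely hand the slice to a JSON parser; the manual scan here only serves to show where a matched value begins and ends relative to the reported index.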

Engine

By default, the main SIMD engine is used. On machines not supporting SIMD, the recursive implementation might be faster in some cases. To change the engine, use --engine recursive.

Supported platforms

The crate is continuously built for all Tier 1 Rust targets, and tests are continuously run for targets that can be run with GitHub action images. SIMD is supported only on x86-64 platforms with AVX2, while nosimd builds are always available for all targets.

Target triple              nosimd build  SIMD support       Continuous testing  Tracking issues
aarch64-unknown-linux-gnu  ✔️            ❌                 ❌                  #21, #115
i686-unknown-linux-gnu     ✔️            ❌                 ✔️                  #14
x86_64-unknown-linux-gnu   ✔️            ✔️ avx2+pclmulqdq  ✔️
x86_64-apple-darwin        ✔️            ❌                 ✔️
i686-pc-windows-gnu        ✔️            ❌                 ✔️                  #14
i686-pc-windows-msvc       ✔️            ❌                 ✔️                  #14
x86_64-pc-windows-gnu      ✔️            ✔️ avx2+pclmulqdq  ✔️
x86_64-pc-windows-msvc     ✔️            ✔️ avx2+pclmulqdq  ✔️
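
If you are unsure whether a given x86-64 machine has the two CPU features named in the table, a small diagnostic like the one below can tell you. This is only a sketch using the standard library's runtime detection macro; it is not how rsonpath itself chooses its implementation, and the function name `cpu_has_avx2_and_pclmulqdq` is made up for this example.

```rust
/// Runtime check for the CPU features listed in the table (x86-64 only).
#[cfg(target_arch = "x86_64")]
fn cpu_has_avx2_and_pclmulqdq() -> bool {
    std::is_x86_feature_detected!("avx2") && std::is_x86_feature_detected!("pclmulqdq")
}

/// On every other architecture the table lists no SIMD support.
#[cfg(not(target_arch = "x86_64"))]
fn cpu_has_avx2_and_pclmulqdq() -> bool {
    false
}

fn main() {
    if cpu_has_avx2_and_pclmulqdq() {
        println!("avx2 and pclmulqdq detected");
    } else {
        println!("avx2 and/or pclmulqdq not detected on this machine");
    }
}
```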

Caveats and limitations

JSONPath

Not all selectors are supported, see the support table above.

Duplicate keys

The engine assumes that every object in the input JSON has no duplicate keys. Behavior on duplicate keys is not guaranteed to be stable, but currently the engine will simply match the first such key.

> rsonpath '$.key'
{"key":"value","key":"other value"}
[6]

This behavior can be overridden with a custom installation of rsonpath, disabling the default unique-labels feature. This will hurt performance.

> cargo install rsonpath --no-default-features -F simd -F head-skip -F tail-skip
> rsonpath '$.key'
{"key":"value","key":"other value"}
[6, 20]

Unicode

The engine does not parse Unicode escape sequences in labels. This means that a label "a" is different from a label "\u0061", even though semantically they represent the same string. Parsing Unicode sequences is costly, so the support for this was postponed in favour of high performance. It would be possible for a flag to exist to trigger this behaviour, but it is not currently worked on.
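
The difference between comparing labels as raw text and comparing them after decoding can be sketched in a few lines of Rust. The helper `decode_unicode_escapes` below is hypothetical and not part of rsonpath. It handles only four-digit `\uXXXX` escapes for code points outside the surrogate range; surrogate pairs, the other JSON escapes, and an escaped backslash followed by `u` are deliberately ignored to keep the example short.

```rust
/// Hypothetical helper: decodes `\uXXXX` escapes in a label.
/// Malformed escapes and surrogate halves are kept verbatim.
fn decode_unicode_escapes(label: &str) -> String {
    let mut out = String::with_capacity(label.len());
    let mut rest = label;
    while let Some(pos) = rest.find("\\u") {
        out.push_str(&rest[..pos]);
        // Try to read exactly four hex digits after the `\u` marker.
        let decoded = rest
            .get(pos + 2..pos + 6)
            .filter(|hex| hex.bytes().all(|b| b.is_ascii_hexdigit()))
            .and_then(|hex| u32::from_str_radix(hex, 16).ok())
            .and_then(char::from_u32);
        match decoded {
            Some(c) => {
                out.push(c);
                rest = &rest[pos + 6..];
            }
            None => {
                // Not a well-formed escape: copy the marker and move on.
                out.push_str("\\u");
                rest = &rest[pos + 2..];
            }
        }
    }
    out.push_str(rest);
    out
}

fn main() {
    let plain = "a";
    let escaped = r"\u0061"; // U+0061 is the letter "a"
    println!("raw text equal: {}", plain == escaped);
    println!("decoded equal:  {}", plain == decode_unicode_escapes(escaped));
}
```

A byte-for-byte comparison, which is what the engine effectively does today, sees two different labels; only the decoding step makes them equal, and that step is the cost the paragraph above refers to.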

Build & test

The dev workflow utilizes just. Use the included Justfile. It will automatically install Rust for you using the rustup tool if it detects there is no Cargo in your environment.

just build
just test

Benchmarks

Benchmarks for rsonpath are located in a separate repository, included as a git submodule in this main repository.

The easiest way to run all the benchmarks is just bench. For details, look at the README in the submodule.

Background

This project is the result of my thesis. You can read it for the theoretical background of the engine and details of its implementation.

Dependencies

Showing direct dependencies only; for the full graph, see below.

cargo tree --package rsonpath --edges normal --depth 1
rsonpath v0.4.0 (/home/mat/rsonpath/crates/rsonpath)
├── clap v4.1.11
├── color-eyre v0.6.2
├── eyre v0.6.8
├── log v0.4.17
├── rsonpath-lib v0.4.0 (/home/mat/rsonpath/crates/rsonpath-lib)
└── simple_logger v4.1.0
cargo tree --package rsonpath-lib --edges normal --depth 1
rsonpath-lib v0.4.0 (/home/mat/rsonpath/crates/rsonpath-lib)
├── aligners v0.0.10
├── cfg-if v1.0.0
├── log v0.4.17
├── memchr v2.5.0
├── nom v7.1.3
├── replace_with v0.1.7
├── smallvec v1.10.0
├── thiserror v1.0.40
└── vector-map v1.0.1

Justification

  • clap – standard crate to provide the CLI.

  • color-eyre, eyre – more accessible error messages for the parser.

  • log, simple_logger – diagnostic logs during compilation and execution.

  • aligners – SIMD operations require correct input data alignment; putting those requirements at the type level makes our code more robust.

  • cfg-if – used to support SIMD and no-SIMD versions.

  • memchr – rapid, SIMDified substring search for fast-forwarding to labels.

  • nom – for parser implementation.

  • replace_with – for safe handling of internal classifier state when switching classifiers.

  • smallvec – crucial for small-stack performance.

  • thiserror – idiomatic Error implementations.

  • vector-map – used in the query compiler for measurably better performance.

Full dependency tree

cargo tree --package rsonpath --edges normal
rsonpath v0.4.0 (/home/mat/rsonpath/crates/rsonpath)
├── clap v4.1.11
│   ├── bitflags v2.0.2
│   ├── clap_derive v4.1.9 (proc-macro)
│   │   ├── heck v0.4.1
│   │   ├── proc-macro-error v1.0.4
│   │   │   ├── proc-macro-error-attr v1.0.4 (proc-macro)
│   │   │   │   ├── proc-macro2 v1.0.52
│   │   │   │   │   └── unicode-ident v1.0.6
│   │   │   │   └── quote v1.0.26
│   │   │   │       └── proc-macro2 v1.0.52 (*)
│   │   │   ├── proc-macro2 v1.0.52 (*)
│   │   │   ├── quote v1.0.26 (*)
│   │   │   └── syn v1.0.107
│   │   │       ├── proc-macro2 v1.0.52 (*)
│   │   │       ├── quote v1.0.26 (*)
│   │   │       └── unicode-ident v1.0.6
│   │   ├── proc-macro2 v1.0.52 (*)
│   │   ├── quote v1.0.26 (*)
│   │   └── syn v1.0.107 (*)
│   ├── clap_lex v0.3.1
│   │   └── os_str_bytes v6.4.1
│   ├── is-terminal v0.4.3
│   │   ├── io-lifetimes v1.0.5
│   │   │   └── libc v0.2.139
│   │   └── rustix v0.36.8
│   │       ├── bitflags v1.3.2
│   │       ├── io-lifetimes v1.0.5 (*)
│   │       ├── libc v0.2.139
│   │       └── linux-raw-sys v0.1.4
│   ├── once_cell v1.17.0
│   ├── strsim v0.10.0
│   ├── termcolor v1.2.0
│   └── terminal_size v0.2.3
│       └── rustix v0.36.8 (*)
├── color-eyre v0.6.2
│   ├── backtrace v0.3.67
│   │   ├── addr2line v0.19.0
│   │   │   └── gimli v0.27.1
│   │   ├── cfg-if v1.0.0
│   │   ├── libc v0.2.139
│   │   ├── miniz_oxide v0.6.2
│   │   │   └── adler v1.0.2
│   │   ├── object v0.30.3
│   │   │   └── memchr v2.5.0
│   │   └── rustc-demangle v0.1.21
│   ├── eyre v0.6.8
│   │   ├── indenter v0.3.3
│   │   └── once_cell v1.17.0
│   ├── indenter v0.3.3
│   ├── once_cell v1.17.0
│   └── owo-colors v3.5.0
├── eyre v0.6.8 (*)
├── log v0.4.17
│   └── cfg-if v1.0.0
├── rsonpath-lib v0.4.0 (/home/mat/rsonpath/crates/rsonpath-lib)
│   ├── aligners v0.0.10
│   │   ├── cfg-if v1.0.0
│   │   ├── lazy_static v1.4.0
│   │   └── page_size v0.4.2
│   │       └── libc v0.2.139
│   ├── cfg-if v1.0.0
│   ├── log v0.4.17 (*)
│   ├── memchr v2.5.0
│   ├── nom v7.1.3
│   │   ├── memchr v2.5.0
│   │   └── minimal-lexical v0.2.1
│   ├── replace_with v0.1.7
│   ├── smallvec v1.10.0
│   ├── thiserror v1.0.40
│   │   └── thiserror-impl v1.0.40 (proc-macro)
│   │       ├── proc-macro2 v1.0.52 (*)
│   │       ├── quote v1.0.26 (*)
│   │       └── syn v2.0.4
│   │           ├── proc-macro2 v1.0.52 (*)
│   │           ├── quote v1.0.26 (*)
│   │           └── unicode-ident v1.0.6
│   └── vector-map v1.0.1
│       ├── contracts v0.4.0 (proc-macro)
│       │   ├── proc-macro2 v1.0.52 (*)
│       │   ├── quote v1.0.26 (*)
│       │   └── syn v1.0.107 (*)
│       └── rand v0.7.3
│           ├── getrandom v0.1.16
│           │   ├── cfg-if v1.0.0
│           │   └── libc v0.2.139
│           ├── libc v0.2.139
│           ├── rand_chacha v0.2.2
│           │   ├── ppv-lite86 v0.2.17
│           │   └── rand_core v0.5.1
│           │       └── getrandom v0.1.16 (*)
│           └── rand_core v0.5.1 (*)
└── simple_logger v4.1.0
    ├── colored v2.0.0
    │   ├── atty v0.2.14
    │   │   └── libc v0.2.139
    │   └── lazy_static v1.4.0
    ├── log v0.4.17 (*)
    └── time v0.3.17
        ├── itoa v1.0.5
        ├── libc v0.2.139
        ├── num_threads v0.1.6
        ├── time-core v0.1.0
        └── time-macros v0.2.6 (proc-macro)
            └── time-core v0.1.0
Comments
  • Shell.nix created

    Issue

    This change includes support for nix-based environments in the project via a nix shell file.

    There is not currently an issue for this item.

    I am not sure if the submodules are updated correctly. Thanks!

    Checklist

    All of these should be ticked off before you submit the PR.

    • [X] I ran just verify locally and it succeeded.
    • [x] Issue was given go ahead and is linked above OR I have included justification for a minor change.
    • [X] Unit tests for my changes are not included because no functionality was changed.
    opened by ZwerdDPU 6
  • Any Descendant

    Issue

    This PR implements descendant wildcard search.

    Resolves: #68 #69 #70 #71 #72

    Checklist

    • [X] I ran just verify locally and it succeeded.
    • [X] Issue was given go ahead and is linked above
    • [X] Unit tests for my changes are included
    opened by ZwerdDPU 4
  • Facing an error while running Rson vs Jsonski benchmark

    Describe the bug

    I am trying to run the rsonpath vs JSONSki benchmark and am facing an error that I am unable to debug, since I am not experienced with Rust. I have attached an image.

    Desktop (please complete the following information):

    • MacOS - i9
    • rustc 1.64.0 (a55dd71d5 2022-09-19)

    type: bug type: question closed: user error area: benchmarks 
    opened by gandharva26 4
  • Bug (panic) parsing incorrect Json

    Describe the bug Panic message when reading an incorrect value.

    The application panicked (crashed).
    Message:  index out of bounds: the len is 128 but the index is 18446744073709551615
    Location: /home/cha/git/rust/rsonpath/rsonpath/src/stackless.rs:297

    MRE Example of an incorrectly formatted document:

    {'a': {'b': 'c', 'e': 'f'}, 'g': 'h'}

    (A Python dict, not serialized as JSON.)

    Expected behavior If the formatting error is detectable, throw an incorrect input error.
    mod: engine type: reliability area: library 
    opened by charles-paperman 3
  • Build for `aarch64-unknown-linux-gnu` fails due to "Target architecture is not supported by SIMD features of this crate. Disable the default `simd` feature" in some circumstances

    Describe the bug Reported in https://github.com/NixOS/nixpkgs/pull/223333

    Build fails on architectures not supporting SIMD because aligners is compiled with the simd feature unconditionally.

    The real riddle here is why the CI pipeline for aarch64-unknown-linux-gnu succeeds. No idea as of now.

    MRE Repro is hard currently, but it can be seen here: https://github.com/NixOS/nixpkgs/pull/223333#discussion_r1150897852

    Expected behavior

    1. Code compiles with aligners/simd disabled.
    2. Issues like these are caught during release.

    Workarounds (optional) I don't think there is one? You can enable features of a dependency by passing --features dep/feat, but there doesn't seem to be a flag to disable default features of a dependency.

    Proposed solution (optional) Fix is simple, pass the feature flag to aligners in Cargo.toml, BUT before merging that we absolutely need to get a repro in the CI pipeline. That this wasn't caught earlier is weird and unacceptable.

    Desktop (please complete the following information):

    • Rust version: 1.68.1
    • Target triple: aarch64-unknown-linux-gnu
    • Features: --no-default-features --features=default-optimizations
    • Version: v0.3.2
    type: bug area: devops 
    opened by V0ldek 2
  • Compile-only CLI flag

    Is your feature request related to a problem? Please describe. When implementing new selectors or just diagnosing bugs with a particular query, we often have to look at the compilation itself and the resulting automaton. It would be nice to have a way to run the binary only to the point of compilation, and not involve any of the engines.

    Describe the solution you'd like A new variant to the --engine switch sounds ideal. Example run:

    just r '$..a.b' -e compiler -v
    
    DEBUG [rsonpath_lib::query::parser] Parsed tokens: $(Descendant("a"))(Child("b"))
    INFO  [rsonpath] Preparing query: `$..['a']['b']`
    
    DEBUG [rsonpath_lib::query::automaton] NFA: r1 --a-> d1 --b-> acc
    DEBUG [rsonpath_lib::query::automaton::minimizer] New superstate created: {NfaStateId(0)} DFA(1)
    DEBUG [rsonpath_lib::query::automaton::minimizer] Expanding superstate: {NfaStateId(0)}, last checkpoint is Some(NfaStateId(0))
    DEBUG [rsonpath_lib::query::automaton::minimizer] Considering transition NFA(0) --"a"-> NFA(1)
    DEBUG [rsonpath_lib::query::automaton::minimizer] Raw transitions: {"a": {NfaStateId(1)}}
    DEBUG [rsonpath_lib::query::automaton::minimizer] New superstate created: {NfaStateId(0), NfaStateId(1)} DFA(2)
    DEBUG [rsonpath_lib::query::automaton::minimizer] Normalized transitions: {"a": {NfaStateId(0), NfaStateId(1)}}
    DEBUG [rsonpath_lib::query::automaton::minimizer] Translated transitions: [("a", State(2))]
    DEBUG [rsonpath_lib::query::automaton::minimizer] Expanding superstate: {NfaStateId(0), NfaStateId(1)}, last checkpoint is Some(NfaStateId(0))
    DEBUG [rsonpath_lib::query::automaton::minimizer] Considering transition NFA(0) --"a"-> NFA(1)
    DEBUG [rsonpath_lib::query::automaton::minimizer] Considering transition NFA(1) --"b"-> NFA(2)
    DEBUG [rsonpath_lib::query::automaton::minimizer] Raw transitions: {"a": {NfaStateId(1)}, "b": {NfaStateId(2)}}
    DEBUG [rsonpath_lib::query::automaton::minimizer] New superstate created: {NfaStateId(0), NfaStateId(2)} DFA(3)
    DEBUG [rsonpath_lib::query::automaton::minimizer] Normalized transitions: {"a": {NfaStateId(0), NfaStateId(1)}, "b": {NfaStateId(0), NfaStateId(2)}}
    DEBUG [rsonpath_lib::query::automaton::minimizer] Translated transitions: [("a", State(2)), ("b", State(3))]
    DEBUG [rsonpath_lib::query::automaton::minimizer] Expanding superstate: {NfaStateId(0), NfaStateId(2)}, last checkpoint is Some(NfaStateId(0))
    DEBUG [rsonpath_lib::query::automaton::minimizer] Considering transition NFA(0) --"a"-> NFA(1)
    DEBUG [rsonpath_lib::query::automaton::minimizer] Raw transitions: {"a": {NfaStateId(1)}}
    DEBUG [rsonpath_lib::query::automaton::minimizer] Normalized transitions: {"a": {NfaStateId(0), NfaStateId(1)}}
    DEBUG [rsonpath_lib::query::automaton::minimizer] Translated transitions: [("a", State(2))]
    DEBUG [rsonpath_lib::stackless] DFA:
     digraph {
      0 -> 0 [label="*"]
      1 -> 2 [label="a"]
      1 -> 1 [label="*"]
      2 -> 2 [label="a"]
      2 -> 3 [label="b"]
      2 -> 1 [label="*"]
      3 -> 2 [label="a"]
      3 -> 1 [label="*"]
    }
    INFO  [rsonpath] Compilation finished.
    

    Describe alternatives you've considered A dedicated flag could be used, --compile (short -c). It would be mutually exclusive with -e. Since the run would be completely different we could have the binary just output the compiled automaton to stdout in .dot format, which would make inspecting it easier.
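
    For readers who want to check the automaton from the log by hand, the transition table in the DOT output can be replayed with a short Rust sketch. Two things here are inferred rather than stated in the log: that state 1 is the initial state (it is the first superstate created, from {NfaStateId(0)}) and that state 3 is the accepting one (its superstate contains NfaStateId(2), the end of the NFA chain). The function names are made up for this example, and the input is modelled as the list of labels on the path from the document root to a node.

```rust
/// One step of the DFA from the DOT output. A label without an explicit
/// edge follows the state's `*` fallback edge.
fn step(state: u8, label: &str) -> u8 {
    match (state, label) {
        (1 | 2 | 3, "a") => 2,
        (2, "b") => 3,
        (0, _) => 0, // state 0 only loops on itself
        _ => 1,      // the `*` edges of states 1, 2 and 3 all lead to 1
    }
}

/// Assumption: 1 is the initial state and 3 is the accepting state.
fn accepts(path: &[&str]) -> bool {
    const START: u8 = 1;
    const ACCEPTING: u8 = 3;
    path.iter().fold(START, |state, &label| step(state, label)) == ACCEPTING
}

fn main() {
    // Label paths from the root, checked against the query $..a.b
    for path in [vec!["a", "b"], vec!["x", "a", "b"], vec!["a", "x", "b"]] {
        println!("{:?} -> {}", path, accepts(&path));
    }
}
```

    Under those assumptions the table accepts exactly the paths that end in a followed directly by b, which is what $..a.b asks for.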

    type: feature mod: cli area: app 
    opened by V0ldek 2
  • Parse the wildcard descendant selector `..*`/`..[*]`

    Is your feature request related to a problem? Please describe. The wildcard descendant selector should be recognized by the parser and parsed into an appropriate JsonPathQueryNode. Note that this means both the ..* and ..[*] patterns. The index and non-index version have identical semantics.

    Describe the solution you'd like This should follow the same approach as #6, which should be completed first.

    Additional context Find the syntax for the selectors in the RFC.

    type: feature contribute: help wanted contribute: good first issue mod: parser area: selector 
    opened by V0ldek 2
  • Include datasets and queries from JSONSki into benchmarks

    Is your feature request related to a problem? Please describe. The datasets described in the JSONSki paper are available on Google Drive, but are not incorporated into our benchmarks.

    Describe the solution you'd like We need the datasets included and experiments configured for those datasets. If wildcard selectors get implemented (#9), we can take the queries as-is. However, it is still beneficial to consider rewrites of these queries using the descendant selector not available to JSONSki for potential performance gains (and easier query formulations). This needs to be investigated.

    type: reliability area: benchmarks 
    opened by V0ldek 2
  • dep: bump test-case from 3.0.0 to 3.1.0

    Bumps test-case from 3.0.0 to 3.1.0.

    Release notes

    Sourced from test-case's releases.

    Test Case - v3.1.0

    Full Changelog: https://github.com/frondeus/test-case/compare/v3.0.0...v3.1.0

    Changelog

    Sourced from test-case's changelog.

    3.1.0

    New features

    • Copy attribute span to generated test functions so that IDEs recognize them properly as individual tests

    Changes

    • Added LICENSE file to child crates
    Commits
    • ea03c58 Bump version to 3.1.0
    • fa1c648 Set the span of test function Idents to the span of the origin attribute (#119)
    • 39faeaa Update tests to most recent nightly
    • 6a2540a Set MSRV in Cargo.toml to 1.58.1 (#116)
    • See full diff in compare view

    dependencies rust 
    opened by dependabot[bot] 1
  • Add `jsonpath-rust` as a benchmark target

    Is your feature request related to a problem? Please describe. The jsonpath-rust crate is arguably the most potent competitor as a Rust-native JSONPath crate. Other JSONPath crates for Rust exist, but are unmaintained. This one is frequently updated and has a lot of users. We need to integrate it with the benches and run performance comparisons.

    type: reliability area: benchmarks 
    opened by V0ldek 1
  • Add `serde_json_path` as a benchmark target

    Is your feature request related to a problem? Please describe. TIL there exists serde_json_path, a crate for JSONPath integrated with serde, which makes it a viable Rust-native, DOM-building competitor. We need to add it as a benchmark target and gather performance results.

    type: reliability area: benchmarks 
    opened by V0ldek 1
  • Properly handle UTF-8 labels

    Is your feature request related to a problem? Please describe. The engine currently violates the JSON spec by not normalizing Unicode escapes. We do this for performance purposes, since ordinal comparison can be easily SIMDified, but it's not correct.

    For a simple example, the Unicode code point for the letter "a" is U+0061. These JSONs are equivalent under RFC 8259:

    {"a":42}
    
    {"\u0061":42}
    

    Therefore the query $["a"] should in both cases match the value 42.

    Quite sensibly, and indeed officially under the current JSONPath RFC Draft, the queries $["a"] and $["\u0061"] must also be equivalent. All four combinations of the two documents above and the two queries must yield the same result -- the value 42.

    Describe the solution you'd like The tradeoff here is important. We expect the difference in performance to be staggering, especially since the head-skip optimisation is by design incompatible with this. We need a flag that will toggle this behaviour. I propose we make this the optional behaviour – we expect the vast majority of labels to be ASCII, if a user wants to match unicode they can use the flag.

    type: feature acceptance: go ahead mod: engine mod: cli mod: compiler area: app 
    opened by V0ldek 1
  • Complete JSONPath parser

    Is your feature request related to a problem? Please describe. So far, our approach has been to bundle parsing of a JSONPath selector with enabling support for it entirely, as the first step in the process.

    Pros of the current approach

    The advantage of that approach is that we're rather immune to any syntax changes that could be proposed in further refinements of the JSONPath RFC, and the parser is relatively small.

    Cons of the current approach

    The cons are threefold: engineering, UX, and the usability of rsonpath-lib as an API.

    UX

    As of this writing we don't have descendant wildcards yet (#68), but we do have child wildcards. What happens if a user tries to query for $.data[*]..*.value?

    Error: 0: Could not parse JSONPath query. 1: One or more syntax errors occurred. Parse error: $.data[*]..*.value ^ invalid tokens

    That's technically correct, but utterly useless. Without that one dot token it compiles, yes, but the true issue here is that this is valid JSONPath, we just don't support the ..* construct yet. Writing proper error messages for this would require the parser to understand ..*, and then an error from the compiler saying "we can't handle this yet".

    Engineering

    We basically need to maintain a second JSONPath grammar. While it's essentially a subset, we need to document the differences, and any additions will result in an effective change of the public API and the CLI experience. It's also hard to keep the parser's error messages consistent, since what is now an "invalid token" with no further explanation could be a valid token later.

    Library

    The query and engine modules are separate, and query has a clearly designated compiler part (automaton). We strive to be a comprehensive JSONPath library, which means that the query module should recognise as much of JSONPath as it can without relying on the engine to understand that. That way we make extensibility easier – say we wanted to roll out a DOM-based engine for all of JSONPath tomorrow, because it's vastly easier than a streaming one and still useful. It makes no modular sense to restrict the query API based on what the consumer of that API, automaton and engine, support at the moment.

    Describe the solution you'd like

    1. Take the newest RFC proposal and produce a syntax file. Unfortunately the grammar is scattered around the doc and needs to be assembled together. We want the explicit EBNF grammar as part of our docs.

    2. Implement the parser to fulfill that grammar and extend JsonPathQuery to support all constructs. Extensive testing is needed here.

    3. Make error messages nice.

    Describe alternatives you've considered We can just do nothing and take these in stride as new selectors come, but with all the cons it entails (described above).

    type: feature contribute: help wanted acceptance: go ahead mod: parser area: app 
    opened by V0ldek 0
  • Add support for NEON (128-bit wide SIMD for ARM) for 64-bit architectures

    Is your feature request related to a problem? Please describe. SIMD acceleration is implemented for x86 and tracked in #21 for 32-bit ARM. We also need support for 64-bit ARM.

    Describe the solution you'd like We expect most of the design around extracting architecture-specific bits to be done in #14. After that, a similar approach can be used here and in #21.

    Additional context Find NEON intrinsics documentation here.

    I am not knowledgeable in NEON and I don't even know how to emulate an ARM system locally, so help here is really needed.

    type: feature contribute: help wanted acceptance: go ahead contribute: simd area: performance mod: classifier 
    opened by V0ldek 1
  • Make the compiler produce minimal DFAs

    Is your feature request related to a problem? Please describe. After #7 the compiler no longer necessarily outputs a minimal DFA. It's still a correct DFA, just not minimal.

    Describe the solution you'd like The algorithm in query::automaton::minimizer needs to be rewritten to minimize the automaton.

    Additional context Example query that produces a non-minimal automaton: $..a.*.*..b.*.*. In the resulting automaton below states 3 and 4 are equivalent.

    [Figure: graph of the resulting automaton]

    type: feature contribute: help wanted acceptance: go ahead area: performance mod: compiler 
    opened by V0ldek 3
  • Parse the negative index selector

    Is your feature request related to a problem? Please describe. Extend the support for the index selector introduced in #60 to negative values. A negative value selects from the end of the list.

    Describe the solution you'd like The error message introduced in #60 should be replaced with actual parsing code.

    Additional context Syntax formulation in the RFC.

    type: feature acceptance: triage 
    opened by V0ldek 1
  • Introduce the index selector (non-negative)

    Tracking issue for the list Index Selector [n], where n is a non-negative constant. Support for negative indices (selecting from array end) is tracked at TODO.

    • [ ] Parser #60
    • [ ] Compiler #61
    • [ ] Engine (recursive) #62
    • [ ] Engine (main) #63
    acceptance: go ahead type: tracking area: selector 
    opened by V0ldek 1
Releases(v0.4.0)
Owner
V0ldek
Mateusz Gienieczko. C# .NET developer. Programming language aficionado.
A JSON Query Language CLI tool built with Rust 🦀

JQL A JSON Query Language CLI tool built with Rust. Core philosophy: stay lightweight, keep its features as simple as possible, avoid redun

Davy Duperron 872 Jan 1, 2023
A fast and simple command-line tool for common operations over JSON-lines files

rjp: Rapid JSON-lines processor A fast and simple command-line tool for common operations over JSON-lines files, such as: converting to and from text

Ales Tamchyna 3 Jul 8, 2022
A fast way to minify JSON

COMPACTO (work in progress) A fast way to minify JSON. Usage/Examples # Compress # Input example (~0.11 KB) # { # "id": "123", # "name": "Edua

Eduardo Stuart 4 Feb 27, 2022
JSON parser which picks up values directly without performing tokenization in Rust

Pikkr JSON parser which picks up values directly without performing tokenization in Rust Abstract Pikkr is a JSON parser which picks up values directl

Pikkr 615 Dec 29, 2022
Strongly typed JSON library for Rust

Serde JSON   Serde is a framework for serializing and deserializing Rust data structures efficiently and generically. [dependencies] serde_json = "1.0

null 3.6k Jan 5, 2023
Rust port of simdjson

SIMD Json for Rust   Rust port of extremely fast simdjson JSON parser with serde compatibility. readme (for real!) simdjson version Currently tracking

null 737 Dec 30, 2022
JSON implementation in Rust

json-rust Parse and serialize JSON with ease. Changelog - Complete Documentation - Cargo - Repository Why? JSON is a very loose format where anything

Maciej Hirsz 500 Dec 21, 2022
Rust port of gjson,get JSON value by dotpath syntax

A-JSON Read JSON values quickly - Rust JSON Parser change name to AJSON, see issue Inspiration comes from gjson in golang Installation Add it to your

Chen Jiaju 90 Dec 6, 2022
Get JSON values quickly - JSON parser for Rust

get json values quickly GJSON is a Rust crate that provides a fast and simple way to get values from a json document. It has features such as one line

Josh Baker 160 Dec 29, 2022
This library is a pull parser for CommonMark, written in Rust

This library is a pull parser for CommonMark, written in Rust. It comes with a simple command-line tool, useful for rendering to HTML, and is also designed to be easy to use as a library.

Raph Levien 1.5k Jan 1, 2023
A rust script to convert a better bibtex json file from Zotero into nice organised notes in Obsidian

Zotero to Obsidian script This is a script that takes a better bibtex JSON file exported by Zotero and generates an organised collection of reference

Sashin Exists 3 Oct 9, 2022
Fontdue - The fastest font renderer in the world, written in pure rust.

Fontdue is a simple, no_std (does not use the standard library for portability), pure Rust, TrueType (.ttf/.ttc) & OpenType (.otf) font rasterizer and layout tool. It strives to make interacting with fonts as fast as possible, and currently has the lowest end to end latency for a font rasterizer.

Joe C 1k Jan 2, 2023
CLI tool to convert HOCON into valid JSON or YAML written in Rust.

{hocon:vert} CLI Tool to convert HOCON into valid JSON or YAML. Under normal circumstances this is mostly not needed because hocon configs are parsed

Mathias Oertel 23 Jan 6, 2023
Typify - Compile JSON Schema documents into Rust types.

Typify Compile JSON Schema documents into Rust types. This can be used ... via the macro import_types!("types.json") to generate Rust types directly i

Oxide Computer Company 73 Dec 27, 2022
A easy and declarative way to test JSON input in Rust.

assert_json A easy and declarative way to test JSON input in Rust. assert_json is a Rust macro heavily inspired by serde json macro. Instead of creati

Charles Vandevoorde 8 Dec 5, 2022
Hjson for Rust

hjson-rust for serde { # specify rate in requests/second (because comments are helpful!) rate: 1000 // prefer c-style comments? /* feeling ol

Hjson 83 Oct 5, 2022
A small rust database that uses json in memory.

Tiny Query Database (TQDB) TQDB is a small library for creating a query-able database that is encoded with json. The library is well tested (~96.30% c

Kace Cottam 2 Jan 4, 2022
An HCL serializer/deserializer for rust

hcl-rs This crate provides functionality to deserialize, serialize and manipulate HCL data. The main types are Deserializer for deserializing data, Se

null 56 Dec 31, 2022
Rust libraries and tools to help with interoperability and testing of serialization formats based on Serde.

The repository zefchain/serde-reflection is based on Facebook's repository novifinancial/serde-reflection. We are now maintaining the project here and

Zefchain Labs 46 Dec 22, 2022