Query textual streams with a PromQL-like language

Overview

pq - query textual streams with PromQL

Glossary

  • Time Series - a stream of timestamped values (aka samples) sharing the same metric name and, optionally, the same set of labels (i.e. a unique combination of key-value pairs).
  • Metric name - a human-readable name of a measurement (e.g. http_requests_total, content_length, etc.).
  • Metric type - counter, gauge, histogram, and summary.
  • Label - a dimension of the measurement. E.g. method, url, etc.
  • Sample (aka data point) - a (value, timestamp) tuple. The value is always a float64, and the timestamp always has millisecond precision.
  • Instant vector - a type of expression evaluation - a set of time series (vector) containing a single sample for each time series, all sharing the same timestamp.
  • Range vector - a type of expression evaluation - a set of time series containing a range of data points over time for each time series.
  • Scalar and string - two other expression evaluation results.
  • Vector selector - an expression of the form <metric_name>[{label1=value1[, label2=value2, ...]}][[time_duration]].
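
The glossary terms can be pictured as plain Rust types. The following is a minimal sketch, not pq's actual code: the type names, the field names, and the `labels` and `instant_at` helpers are all assumptions made for illustration. The metric name is stored under the `__name__` label, as in the expected output of the run example. `instant_at` keeps only samples stamped exactly at the requested instant, which is a simplification; a real engine would typically apply a lookback window.

```rust
use std::collections::BTreeMap;

/// Label set. The metric name lives under the reserved `__name__` key.
type Labels = BTreeMap<String, String>;

/// A data point: a float64 value with a millisecond-precision timestamp.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Sample {
    value: f64,
    timestamp_ms: i64,
}

/// One sample per time series, all sharing the same timestamp.
#[derive(Debug, Clone, PartialEq)]
struct InstantVector {
    instant_ms: i64,
    samples: Vec<(Labels, f64)>,
}

/// A range of samples over time for each time series.
#[derive(Debug, Clone, PartialEq)]
struct RangeVector {
    series: Vec<(Labels, Vec<Sample>)>,
}

/// Hypothetical helper: builds a label set for a metric name plus label pairs.
fn labels(name: &str, pairs: &[(&str, &str)]) -> Labels {
    let mut out = Labels::new();
    out.insert("__name__".to_string(), name.to_string());
    for (key, value) in pairs {
        out.insert(key.to_string(), value.to_string());
    }
    out
}

/// Hypothetical helper: collapses a range vector into an instant vector by
/// keeping, per series, the sample stamped exactly at `instant_ms`.
fn instant_at(range: &RangeVector, instant_ms: i64) -> InstantVector {
    let samples = range
        .series
        .iter()
        .filter_map(|(labels, points)| {
            points
                .iter()
                .find(|s| s.timestamp_ms == instant_ms)
                .map(|s| (labels.clone(), s.value))
        })
        .collect();
    InstantVector { instant_ms, samples }
}

fn main() {
    let range = RangeVector {
        series: vec![(
            labels("age", &[("name", "sarah")]),
            vec![
                Sample { value: 25.0, timestamp_ms: 1609545598000 },
                Sample { value: 26.0, timestamp_ms: 1609545603000 },
            ],
        )],
    };
    println!("{:?}", instant_at(&range, 1609545598000));
}
```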

Run

$ cargo test

$ cargo run -- -d '([^\s]+)\s(\w+)\s(\d+)' -t '0:%Y-%m-%dT%H:%M:%S' -l 1:name -m 2:age -- '-age{name=~"(bob|sarah)", name!~"b.*"}' <<EOF
2021-01-01T05:40:41 bob 42
2021-01-01T23:59:58 sarah 25
2021-01-02T00:00:02 bob 42
2021-01-02T00:00:03 sarah 26
EOF

# Expected output:
InstantVector(InstantVector { instant: 1609545598000, samples: [({"name": "sarah", "__name__": "age"}, -25.0)] })
InstantVector(InstantVector { instant: 1609545599000, samples: [] })
InstantVector(InstantVector { instant: 1609545600000, samples: [] })
InstantVector(InstantVector { instant: 1609545601000, samples: [] })
InstantVector(InstantVector { instant: 1609545602000, samples: [] })
InstantVector(InstantVector { instant: 1609545603000, samples: [({"name": "sarah", "__name__": "age"}, -26.0)] })
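
To see why only the sarah samples survive, the query can be traced by hand. The sketch below is an illustration, not pq's implementation: `Record`, `sample_records`, `matches_include`, `matches_exclude`, and `evaluate` are hypothetical names, and the two regex matchers are replaced by plain string checks. That replacement assumes the regexes are fully anchored, as they are in Prometheus. The timestamps are the input lines converted to UTC milliseconds.

```rust
/// One decoded input line: timestamp, the `name` label, and the `age` value.
struct Record {
    timestamp_ms: i64,
    name: &'static str,
    age: f64,
}

/// The four input lines of the example, as UTC milliseconds since the epoch.
fn sample_records() -> Vec<Record> {
    vec![
        Record { timestamp_ms: 1609479641000, name: "bob", age: 42.0 },
        Record { timestamp_ms: 1609545598000, name: "sarah", age: 25.0 },
        Record { timestamp_ms: 1609545602000, name: "bob", age: 42.0 },
        Record { timestamp_ms: 1609545603000, name: "sarah", age: 26.0 },
    ]
}

/// Stand-in for `name=~"(bob|sarah)"` (assumes anchored matching).
fn matches_include(name: &str) -> bool {
    name == "bob" || name == "sarah"
}

/// Stand-in for `name!~"b.*"`: true when the name does NOT start with "b".
fn matches_exclude(name: &str) -> bool {
    !name.starts_with('b')
}

/// Keeps records that pass both matchers, then applies the unary minus.
fn evaluate(records: &[Record]) -> Vec<(i64, &'static str, f64)> {
    records
        .iter()
        .filter(|r| matches_include(r.name) && matches_exclude(r.name))
        .map(|r| (r.timestamp_ms, r.name, -r.age))
        .collect()
}

fn main() {
    for (timestamp_ms, name, value) in evaluate(&sample_records()) {
        println!("{} {} {}", timestamp_ms, name, value);
    }
}
```

Both bob records are rejected by the second matcher, so only the sarah samples at 1609545598000 and 1609545603000 remain, with values -25 and -26.
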
Comments
  • Fix clippy warnings & errors

    A follow-up to the issue I opened: https://github.com/iximiuz/pq/issues/3. This PR is a step towards a working CI/CD; it only fixes clippy warnings and errors.

    Edit: I fixed the warning below. @iximiuz, please take a careful look there to make sure I preserved the logic.

    Note:

    One error still remains that I'm not sure how to solve, and I would welcome any help!

    error: this loop never actually loops
       --> src/query/binary.rs:273:24
        |
    273 |           let (lv, rv) = loop {
        |  ________________________^
    274 | |             let (lv, rv) = match (self.left.peek(), self.right.peek()) {
    275 | |                 (Some(InstantVector(lv)), Some(InstantVector(rv))) => (lv, rv),
    276 | |                 (None, _) | (_, None) => return None,
    ...   |
    301 | |             )));
    302 | |         };
        | |_________^
        |
        = note: `#[deny(clippy::never_loop)]` on by default
        = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#never_loop
    
    opened by yonipeleg33 7
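
The `never_loop` lint in the issue above fires when every path through a `loop` body leaves it on the first pass. The sketch below reproduces that shape on a hypothetical, much simplified pair of peekable iterators and shows one common rewrite: drop the `loop` and bind the `match` result directly. The function names and the `i64` items are assumptions, and the elided lines of `src/query/binary.rs` are not reconstructed here.

```rust
use std::iter::Peekable;

/// Shape that trips `clippy::never_loop`: the body either returns or breaks
/// on its first pass, so the `loop` keyword adds nothing.
#[allow(clippy::never_loop)]
fn next_pair_with_loop<L, R>(
    left: &mut Peekable<L>,
    right: &mut Peekable<R>,
) -> Option<(i64, i64)>
where
    L: Iterator<Item = i64>,
    R: Iterator<Item = i64>,
{
    let (lv, rv) = loop {
        let pair = match (left.peek(), right.peek()) {
            (Some(lv), Some(rv)) => (*lv, *rv),
            (None, _) | (_, None) => return None,
        };
        break pair;
    };
    left.next();
    right.next();
    Some((lv, rv))
}

/// Same behavior without the loop: the `match` is bound directly.
fn next_pair<L, R>(
    left: &mut Peekable<L>,
    right: &mut Peekable<R>,
) -> Option<(i64, i64)>
where
    L: Iterator<Item = i64>,
    R: Iterator<Item = i64>,
{
    let (lv, rv) = match (left.peek(), right.peek()) {
        (Some(lv), Some(rv)) => (*lv, *rv),
        (None, _) | (_, None) => return None,
    };
    left.next();
    right.next();
    Some((lv, rv))
}

fn main() {
    let mut left = vec![1_i64, 2].into_iter().peekable();
    let mut right = vec![10_i64, 20].into_iter().peekable();
    println!("{:?}", next_pair_with_loop(&mut left, &mut right));
    println!("{:?}", next_pair(&mut left, &mut right));
}
```

Whether this rewrite applies to the real code depends on the elided part of the loop body; if some branch there is meant to retry, it needs a `continue` instead.
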
  • pq programs with arbitrary number of filters

    At the moment, a valid pq program must always start with a decoder step, then have 0 or 1 mapping (transform) steps, and then 0 or 1 query steps, potentially followed by a formatter. While I don't see how multiple query steps can be added, a more generic format like decode | map | map | ... | map | query | format would make the pq language much more expressive. Actually, the very first decode step would just become syntactic sugar for map { .0: decode }.

    The change would require some moderate refactoring, of course.

    opened by iximiuz 0
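
The generic format proposed above can be sketched as a chain of boxed closures, where decoding is just the first map step applied to field 0. This is an illustration under assumptions, not pq's design: `Record`, `MapStep`, `decode_whitespace`, `uppercase_field`, and `run_pipeline` are hypothetical names, and the query and format stages are left out.

```rust
/// A record flowing through the pipeline: an ordered list of string fields.
type Record = Vec<String>;

/// A map step turns one record into another, or drops it by returning None.
type MapStep = Box<dyn Fn(Record) -> Option<Record>>;

/// Decoding as an ordinary map step: splits field 0 on whitespace.
fn decode_whitespace() -> MapStep {
    Box::new(|rec: Record| -> Option<Record> {
        let line = rec.first()?;
        Some(line.split_whitespace().map(str::to_string).collect())
    })
}

/// An example transform: uppercases the field at `index`, if it exists.
fn uppercase_field(index: usize) -> MapStep {
    Box::new(move |mut rec: Record| -> Option<Record> {
        let field = rec.get_mut(index)?;
        *field = field.to_uppercase();
        Some(rec)
    })
}

/// Runs any number of map steps in order; stops early if a step drops the record.
fn run_pipeline(steps: &[MapStep], line: &str) -> Option<Record> {
    steps
        .iter()
        .try_fold(vec![line.to_string()], |rec, step| step(rec))
}

fn main() {
    let steps = vec![decode_whitespace(), uppercase_field(0)];
    println!("{:?}", run_pipeline(&steps, "bob 42"));
}
```

Because every stage has the same signature, adding another map step is a matter of pushing one more closure onto the list.
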
  • Add Grok format support

    ELK's Grok is a powerful and quite concise format for parsing arbitrary text and structuring it. pq partially reinvents it by combining a regex decoder with a map filter. But having native Grok support would help adoption, since there are likely people already familiar with Grok.

    opened by iximiuz 1
  • Improve string literal parser

    At the moment, the string literal parser is naive. For instance, it doesn't handle escape sequences properly. nom has an example of a more robust string literal parser, so it can be adapted to suit pq use cases. However, the licensing aspect needs to be researched first. Can nom's example simply be copied, does it have to be copied with the license notice, or can it not be copied at all?

    good first issue 
    opened by iximiuz 0
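
For reference, escape handling of the kind described above can also be written by hand with the standard library alone. The sketch below is neither nom's example nor pq's parser: `parse_string_literal` is a hypothetical function, and the supported escapes (`\n`, `\t`, `\r`, `\\`, `\"`) are an assumed minimal set.

```rust
/// Parses a double-quoted string literal at the start of `input`.
/// Returns the unescaped contents and the rest of the input after the
/// closing quote, or an error message.
fn parse_string_literal(input: &str) -> Result<(String, &str), String> {
    let mut chars = input.char_indices();
    match chars.next() {
        Some((_, '"')) => {}
        _ => return Err("expected opening quote".to_string()),
    }

    let mut out = String::new();
    while let Some((idx, ch)) = chars.next() {
        match ch {
            // Closing quote: a quote is one byte, so `idx + 1` is a valid boundary.
            '"' => return Ok((out, &input[idx + 1..])),
            // Escape sequence: consume the next character and translate it.
            '\\' => {
                let (_, esc) = chars.next().ok_or("unterminated escape sequence")?;
                out.push(match esc {
                    'n' => '\n',
                    't' => '\t',
                    'r' => '\r',
                    '\\' => '\\',
                    '"' => '"',
                    other => return Err(format!("unknown escape: \\{}", other)),
                });
            }
            other => out.push(other),
        }
    }
    Err("unterminated string literal".to_string())
}

fn main() {
    let input = r#""(bob|sarah)\t\"quoted\"" and the rest"#;
    println!("{:?}", parse_string_literal(input));
}
```

The function reports unknown escapes and unterminated literals as errors instead of passing them through.
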
  • Implement missing functions

    opened by iximiuz 0
  • Add Basic CI/CD

    Hello there!

    1. Currently, running cargo clippy yields:
    ...
    error: aborting due to 4 previous errors; 56 warnings emitted

    2. No formatting is enforced on new PRs.
    3. Currently, there's no way of using pq without building it locally afaik, which can be a huge drawback.

    Suggestion:

    1. Fix all clippy warnings / errors
    2. Add a GitHub Action that runs cargo clippy, cargo fmt, make test-all and make test-e2e on every PR and push to master
    3. Add a GitHub Action that publishes to crates.io so that one can install it using cargo install pq (and consider packaging it for apt and such; I suggested crates.io because I think it's the easiest option)
    opened by yonipeleg33 11
Releases(v0.1-pre-alpha.2)
Owner
Ivan Velichko
Code for your life!