PRQL is a modern language for transforming data — a simpler and more powerful SQL

Overview

PRQL


Pipelined Relational Query Language, pronounced "Prequel".

PRQL is a modern language for transforming data — a simpler and more powerful SQL. Like SQL, it's readable, explicit and declarative. Unlike SQL, it forms a logical pipeline of transformations, and supports abstractions such as variables and functions. It can be used with any database that uses SQL, since it transpiles to SQL.

PRQL was discussed on Hacker News and Lobsters earlier this year when it was just a proposal.

Here's a short example of the language; for more examples, visit prql-lang.org. To experiment with PRQL in the browser, check out PRQL Playground.

from employees                                # Each line transforms the previous result.
filter start_date > @2021-01-01               # Clear date syntax.
derive [                                      # `derive` adds columns / variables.
  gross_salary = salary + payroll_tax,
  gross_cost = gross_salary + benefits_cost   # Variables can use other variables.
]
filter gross_cost > 0
group [title, country] (                      # `group` runs a pipeline over each group.
  aggregate [                                 # `aggregate` reduces each group to a row.
    average salary,
    sum     salary,
    average gross_salary,
    sum     gross_salary,
    average gross_cost,
    sum_gross_cost = sum gross_cost,          # `=` sets a column name.
    ct = count,
  ]
)
sort [sum_gross_cost, -country]               # `-country` means descending order.
filter ct > 200
take 20

Resources

To learn more, check out the PRQL Website.

For specific resources, check out:

Contributors

Many thanks to those who've made our progress possible:


Core developers

We have a few core developers who are responsible for reviewing code, making decisions on the direction of the language, and project administration:

We welcome others to join who have a track record of contributions.

Comments
  • Orthogonal treatment of groups (nee Window functions)


    Edit — this issue evolved into a much broader discussion of how we do groups. The original message is below, and I've tried to summarize the current state towards the end of the issue so it's easier for others to engage.


    More than half the people I speak with about PRQL tell me that window functions are painful in SQL and would be a great feature to add to PRQL. I put window functions in the examples in the Readme at first, since I had the same experience in SQL.

    For clarity: a window / analytic / analytical function is a function that takes other rows as an input, but instead of aggregating over many rows, it creates a new value for every row. For example, a moving average or lag. In SQL it's represented by the OVER keyword.
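
To make the definition concrete, here is a minimal Python sketch of a `lag` window function: rows are partitioned, ordered within each partition, and every input row gets exactly one output row. The helper name `lag_by_group` and the sample `prices` rows are assumptions for illustration only; this is not how the PRQL compiler works.

```python
from collections import defaultdict

def lag_by_group(rows, by, sort, column, offset=1):
    """Return one output row per input row, adding `<column>_lag`.

    Hypothetical helper: partitions by `by`, orders by `sort`, then looks
    `offset` rows back within the same partition.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(row)

    out = []
    for group in groups.values():
        ordered = sorted(group, key=lambda r: r[sort])
        for i, row in enumerate(ordered):
            # The first `offset` rows of each partition have no earlier row.
            previous = ordered[i - offset][column] if i >= offset else None
            out.append({**row, f"{column}_lag": previous})
    return out

# Made-up sample data.
prices = [
    {"sec_id": "A", "date": "2022-01-01", "price": 10},
    {"sec_id": "A", "date": "2022-01-02", "price": 11},
    {"sec_id": "B", "date": "2022-01-01", "price": 20},
    {"sec_id": "A", "date": "2022-01-03", "price": 12},
]
result = lag_by_group(prices, by="sec_id", sort="date", column="price")
```

Unlike an aggregate, the result has as many rows as the input; only the first row of each `sec_id` partition has no previous price.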

    I haven't thought about this enough, but wanted to post an issue so we could start thinking about how to do it. A couple of initial options:

    • A separate pipeline, with the column at the end, and the function (lag in this case) in the pipeline:
    from prices
    derive price_yesterday: (
      window
      by sec_id
      sort date
      lag 1
      price
    )
    
    • A function with lots of named args, with the function lag as a named arg (but how would this work for functions with 0 or >1 parameters?):
    from prices
    derive price_yesterday: (window by:sec_id sort:date lag:1 price)
    
    • Eliminate window — because we have aggregate, we can't get confused about sum(foo) OVER and sum(foo), so possibly we don't need to have window:
    from prices
    derive price_yesterday: (price | lag rows:1 sort:date by:sec_id)
    

    I don't think this is sufficiently thought through, and I'm sure these examples have inadequacies.

    Here are some example queries we can use as cases: https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts#get_the_most_popular_item_in_each_category

    enhancement language-design 
    opened by max-sixty 33
  • Functions - syntax, scope and limitations


    In issue #437 we ended up with quite a lot of discussion around the syntax for functions. For anyone browsing the issue log, that might be quite difficult to find, so I'm opening a separate issue for it here. The syntax also gets updated in PR #442.

    The current syntax is described in: https://github.com/prql/prql/blob/main/reference/src/functions.md

    @aljazerzen also gave some descriptions on Discord which I've reproduced below:

    Currently they are limited to "any expression", but there are plans for them to allow:

    • "late binding variables"
    func net gross = gross * (1 - tax)
    
    from finances
    select [net gross_salaries, net gross_materials, net gross_something]
    

    where tax is a column from finances, but is not resolved until after the function is materialised

    • currying
    func add a b = a + b
    func add_one = add 1
    
    from employees
    select [ age_next_year: add_one age ]
    
    • transform functions
    func take_first col = (sort col | take 1)
    
    from employees
    take_first age
    
    opened by snth 29
  • Same column name in multiple tables causes buggy behaviour


    When multiple PRQL tables share an identical column name (e.g. id): joining them together causes the compiler to reuse the meaning of one id in multiple places (even where it's incorrect).

    This reduced example should illustrate it best:

    table x = (
      from x_orig # table with columns: id
      derive [
        x_id = id
      ]
      select [x_id]
    )
    
    table y = (
      from y_orig # table with columns: id, x_id
      derive [
        y_id = id
      ]
      select [y_id, x_id]
    )
    
    from x
    join y [x_id]
    

    compiled SQL:

    WITH x AS (
      SELECT
        id AS x_id
      FROM
        x_orig
    ),
    y AS (
      SELECT
        x_id AS y_id, --------- this should instead be `id AS y_id`
        x_id
      FROM
        y_orig
    )
    SELECT
      x.*,
      y.*,
      x_id
    FROM
      x
      JOIN y USING(x_id)
    

    Note that if the PRQL derive statement of y_id = id is changed to y_id = y_orig.id, then the buggy output SQL line correctly becomes id AS y_id. But we can't expect users to always add these (unless this becomes an explicit requirement).

    bug compiler priority 
    opened by mklopets 24
  • Merge prql-tool


    @snth has created a CLI tool that utilizes PRQL queries to read data from various sources and write them as CSV or parquet files.

    Compared to prql-compiler CLI, it is actually useful, which is why I think it should be the binary we are promoting to users to try out.

    But I would suggest a name change, because PRQL is the language and it may be confusing to use prql my_query.prql. I suggest the name pipe.

    TODO:

    • [ ] merge snth's fork (or split it into prql/pipe repo to avoid long compilation times)
    • [ ] setup CI/CD
    • [ ] change homebrew recipe to install pipe instead of prql-compiler
    • [ ] change website to promote pipe
    opened by aljazerzen 22
  • Precedence issue


    Currently:

    from cities
    select temp_c = (temp - 32) * 3
    

    compiles to:

    SELECT
      temp - 32 * 3 AS temp_c
    FROM
      cities
    

    ...which has the wrong precedence — we should have (temp - 32) * 3
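
One common way to avoid this class of bug is to render SQL from an expression tree and add parentheses whenever a child binds less tightly than its parent. The sketch below illustrates that technique in Python; the `Col`, `Num`, `BinOp` and `to_sql` names and the precedence table are assumptions for the example, not the PRQL compiler's actual types.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical binding strengths: higher binds tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

@dataclass
class Col:
    name: str

@dataclass
class Num:
    value: int

@dataclass
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"

Expr = Union[Col, Num, BinOp]

def to_sql(expr, parent_prec=0):
    """Render an expression, parenthesizing children that bind more loosely."""
    if isinstance(expr, Col):
        return expr.name
    if isinstance(expr, Num):
        return str(expr.value)
    prec = PRECEDENCE[expr.op]
    left = to_sql(expr.left, prec)
    # The right operand is rendered one level stricter, so an operand of
    # equal precedence on the right also gets parentheses.
    right = to_sql(expr.right, prec + 1)
    text = f"{left} {expr.op} {right}"
    return f"({text})" if prec < parent_prec else text

temp_c = BinOp("*", BinOp("-", Col("temp"), Num(32)), Num(3))
print(to_sql(temp_c))  # (temp - 32) * 3
```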

    bug compiler priority 
    opened by max-sixty 22
  • Variable replacement doesn't respect precedence


    In this example, the resulting code for diff should be c - a - b or c - (a + b)

    from foo
    select [
       sum = (a+b),
       diff = c-sum  # actually outputs "c - a + b AS diff"
    ]
    

    (Tested with current VScode plugin and Playground)
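
The output reported above is what plain text substitution would produce. A sketch of the alternative, assuming expressions are kept as trees: inline the variable as a subtree, then parenthesize at render time. The tuple-based tree and the `inline` and `render` helpers are hypothetical and only handle one level of variable references.

```python
# Hypothetical binding strengths: higher binds tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def inline(expr, variables):
    """Replace variable names with their definitions, keeping the tree shape."""
    if isinstance(expr, str):
        return variables.get(expr, expr)
    op, left, right = expr
    return (op, inline(left, variables), inline(right, variables))

def render(expr, parent_prec=0, is_right=False):
    """Render a tree, adding parentheses where precedence or associativity needs them."""
    if isinstance(expr, str):
        return expr
    op, left, right = expr
    prec = PRECEDENCE[op]
    text = f"{render(left, prec)} {op} {render(right, prec, is_right=True)}"
    # A right operand of equal precedence must be wrapped, because
    # c - (a + b) is not the same as c - a + b.
    needs_parens = prec < parent_prec or (is_right and prec == parent_prec)
    return f"({text})" if needs_parens else text

variables = {"sum": ("+", "a", "b")}
diff = inline(("-", "c", "sum"), variables)
print(render(diff))  # c - (a + b)
```

For comparison, `"c-sum".replace("sum", "a+b")` yields `c-a+b`, which loses the grouping.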

    bug compiler 
    opened by richb-hanover 21
  • Implicit vs Explicit column arguments to `join`


    I was planning to release 0.3 as discussed, with the change in join such that supplying columns to an equi-join requires an additional character to make explicit columns vs. conditions:

    from e=employees
    -join salaries [id]   # implicit
    +join salaries [~id]  # explicit
    join l=locations [e.office_address == l.address]
    

    This is because a bare id is treated as a bool condition, just as e.office_address == l.address is. (In theory, id could even be a bool column.)

    I'm still fine to do the release and assess feedback, as discussed. But I wanted to raise whether we should:

    • consider making a change back to the previous implicit behavior before the release, given that this would be a breaking change to a fundamental part of the language. If we're 80/20 on allowing the previous implicit behavior, making the change now would avoid the breaking change without that much cost — as others pointed out yesterday, many of the immediate benefits come from better development on main, rather than big user-facing changes in a version number.
    • consider adding something to help users in making the transition; e.g. an error on the existing approach

    On the specific language change, I see it as a tradeoff between syntactic simplicity and semantic simplicity:

    • Syntactic simplicity / brevity — joining on a shared column is very frequent, particularly in a well-designed schema. Things that are expressed frequently should have small[^1] representations.
      • Almost never are we actually joining based on a bool column. (I hadn't even realized that the previous design was ambiguous). It would be doing a cross-join based on a bool column from one of the tables — so unlikely.
      • What do folks think about the extra complexity for users? Possibly it's actually not that much burden to understand "Represent [USING / an equi-join / a join between identically named columns] with ~"? (Maybe it feels bigger to us because it's a change and it's so prominent in our docs and examples?)
    • Semantic simplicity / generality — having [id] mean something different from [id==true] in a join breaks the encapsulation of the expression. The compiler needs to understand what's inside the expression; uncorrelated concepts become correlated, the language becomes less general and less orthogonal.
      • For example, an unlikely but possible case: is bar a condition or a column in the join? I guess it's a condition because we know it's baz==bax. But if bar were materialized in a column in the DB, then the behavior suddenly changes.

        from foo
        derive bar = baz==bax
        join x [bar]  # is `bar` an implicit column in both tables? Or `baz==bax`?
        
      • I've been supportive of #919, which increases generality; this would go against that theme.


    If anyone has ideas for an alternative representation rather than ~, then feel free to suggest! Though I actually think that ~ is pretty good.

    One alternative would be to have a different parameter; e.g. using:[id], but then given the conditions parameter would still be required, we'd have an awkward join locations using:[id] [].


    If we do go the explicit route, is there something we can do to make this clearer for users? I would find this quite confusing if I weren't watching the releases and all of a sudden this compiles to something completely different:

    from foo
    join x [bar]
    
     SELECT
       foo.*,
    -  x.*,
    -  bar
    +  x.*
     FROM
       foo
    -  JOIN x USING(bar)
    +  JOIN x ON bar
    

    PRQL has a higher ratio of expectations&excitement vs. users than most projects, so it's fine to make breaking changes atm. But this is potentially quite severe. Assuming we go the explicit route, should we raise an error for a bare column name for a few versions so it's at least obvious when people do this?


    Without wanting to zoom out too far, possibly it's worth considering this in the context of overall joins; e.g. #716 & #723


    Where do folks end up? As I said prior, @aljazerzen has full rights to respond with 😫, and I'll do the release. semantic was really Herculean, and we're still young enough that we probably underrate velocity.

    For transparency, if we do decide to make the change, I'm flat-out with non-PRQL stuff until mid-week, after which I have more time and would be happy to work on this. I'm quite excited to get into working with the new compiler!

    [^1]: "small" here means both in character-count and syntactic complexity; in this case ~ is small in character-count but adds to syntactic complexity. For theory around compression, check out source-coding; I can find better references if folks are interested.

    language-design priority 
    opened by max-sixty 19
  • feat: add prql library generation to be used in other platforms (e.g. Golang)


    I've added a small shim that exports to_sql and to_json and generates static and dynamic libraries that can be used through FFI in other platforms. I am using it in Go.

    opened by sigxcpu76 17
  • Dialect-specific SQL


    Even with the current code, take 20 will produce TOP(20), which is incompatible with some DBs.

    The original plan was to allow a db named arg to queries, which I still think works well, but open to alternatives.
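
As a rough illustration of what a dialect argument would change, here is a Python sketch that renders a row limit either as `TOP` (SQL Server style) or as `LIMIT` (PostgreSQL, MySQL and SQLite style). The function name `render_take` and the dialect strings are assumptions, not the compiler's API.

```python
def render_take(select_list, table, n, dialect="generic"):
    """Render `take n` for a given dialect (hypothetical helper).

    "mssql" uses TOP in the SELECT clause; every other dialect in this
    sketch falls back to a trailing LIMIT.
    """
    if dialect == "mssql":
        return f"SELECT TOP ({n}) {select_list} FROM {table}"
    return f"SELECT {select_list} FROM {table} LIMIT {n}"

print(render_take("*", "employees", 20, dialect="mssql"))
print(render_take("*", "employees", 20, dialect="postgres"))
```

Keeping the dialect as an explicit parameter means the same query tree can be rendered for different databases without changing the query itself.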

    enhancement help wanted 
    opened by max-sixty 16
  • Can't build `playground` locally


    I forked the prql repo to my own account, then cloned it to my machine (macOS 10.15.7). I then tried to build the playground so I could try it locally. I got an error (sh: wasm-pack: command not found), so I added "wasm-pack": "^0.10.3" to the devDependencies in the package.json file.

    I then tried the following. I saw a couple warnings about deprecations, but I don't know what to do about the cargo metadata file. What other troubleshooting information could I provide? Many thanks.

    √ playground % node --version
    v16.14.0
    √ playground % npm --version
    8.3.1
    √ playground % npm install
    npm WARN deprecated [email protected]: Modern JS already guarantees Array#sort() is a stable sort, so this library is deprecated. See the compatibility table on MDN: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/sort#browser_compatibility
    npm WARN deprecated [email protected]: This SVGO version is no longer supported. Upgrade to v2.x.x.
    
    > [email protected] preinstall
    > cd ../prql-js && npm run build-bundler
    
    
    > [email protected] build-bundler
    > wasm-pack build --target bundler --release --out-dir dist/bundler
    
    Error: Error during execution of `cargo metadata`: error: failed to parse lock file at: /Users/richb/github/prql/Cargo.lock
    
    To learn more, run the command again with --verbose.
    
    npm ERR! code 1
    npm ERR! path /Users/richb/github/prql/playground
    npm ERR! command failed
    npm ERR! command sh -c cd ../prql-js && npm run build-bundler
    
    npm ERR! A complete log of this run can be found in:
    npm ERR!     /Users/richb/.npm/_logs/2022-09-14T17_44_20_052Z-debug-0.log
    ?1 playground %
    
    opened by richb-hanover 15
  • Named args vs assignments


    One issue https://github.com/max-sixty/prql/pull/52 raises is how assignments are parsed; specifically:

    derive gross_salary: salary + payroll_tax,
    

    Should parse into:

    • func: derive
    • assign:
      • lvalue: gross_salary
      • rvalue: salary + payroll_tax

    But the syntax for named args conflicts with this. For example:

    aggregate by:[title] [sum salary]
    

    should parse into:

    • func: aggregate
    • assign:
      • lvalue: by
      • rvalue: [title]
    • arg: [sum salary]

    ...and not

    • func: aggregate
    • assign:
      • lvalue: by
      • rvalue:[title] [sum salary]

    ...which is what the assign implies.

    Options:

    • change syntax so only one of these uses :
      • Though trying to avoid using = so we can keep the SQL (and human!) use of this
    • force the assign to always take one arg, so gross_salary: salary + payroll_tax needs to be gross_salary: (salary + payroll_tax)
    • restore the rules around whitespace, such that passing named args disallows whitespace (i.e. by:[title] and not by: [title]). But this makes semantics very dependent on small changes in whitespace, which could get confusing. An autoformatter isn't going to be able to deal with this
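
The whitespace-sensitive option in the list above can be sketched in a few lines of Python: an argument counts as named only when it has the form `name:value` with no whitespace around the colon. The `split_args` and `parse_call` helpers are hypothetical and handle only flat `[...]` lists, not the real grammar.

```python
def split_args(text):
    """Split on whitespace that is not inside [...] brackets."""
    args, depth, current = [], 0, ""
    for ch in text:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch.isspace() and depth == 0:
            if current:
                args.append(current)
                current = ""
        else:
            current += ch
    if current:
        args.append(current)
    return args

def parse_call(text):
    """Parse `func name:value positional ...` into its parts."""
    func, *rest = split_args(text)
    named, positional = {}, []
    for arg in rest:
        name, sep, value = arg.partition(":")
        # Named only when written as name:value with nothing in between.
        if sep and name.isidentifier() and value:
            named[name] = value
        else:
            positional.append(arg)
    return {"func": func, "named": named, "args": positional}

parsed = parse_call("aggregate by:[title] [sum salary]")
```

With this rule `by:[title]` binds only `[title]`, and `[sum salary]` stays a positional argument, which is the parse the issue asks for. The cost is the one the issue names: meaning depends on small whitespace differences.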

    References:

    language-design 
    opened by max-sixty 15
  • Alternative Backends | Support open table projects like Apache Iceberg


    Hi,

    First of all, thank you for the great project !

    I was wondering whether, under "Alternative Backends", integrations with open table formats like Apache Iceberg could be considered or added to the roadmap?

    opened by ramkumarkb 1
  • docs: Update roadmap


    This is now ready for review. I'm happy to discuss things here. It's also fine if we merge without complete agreement and discuss on the next dev call.

    opened by max-sixty 1
  • Additional CTE when using an alias


    This is extremely minor, to the extent I hesitated to put an issue in, but generally I think the bar should be quite low, so:

    Here we get a CTE which I think we could avoid:

    from employees
    select fname = first_name
    filter fname == "Fred"
    
    WITH table_1 AS (
      SELECT
        first_name AS fname
      FROM
        employees
    )
    SELECT
      fname
    FROM
      table_1
    WHERE
      fname = 'Fred'
    

    Possibly this could be:

    SELECT
      first_name as fname
    FROM
      employees
    WHERE
      first_name = 'Fred'
    

    ...since in SQL we can still use first_name within that select statement.

    friendliness 
    opened by max-sixty 2
  • Python API doesn't match documentation


    Thanks for creating prql! It seems like a great idea.

    I tried prql for the first time today. I followed the installation instructions and installed prql-python 0.3.1 in my python 3.10 environment. Then I tried to run the example code. On

    sql = prql.compile(prql_query)
    

    I got AttributeError: module 'prql_python' has no attribute 'compile'. Looking closer, and comparing to the tests

    https://github.com/PRQL/prql/blob/eedc7fa3ac915ac79cf5a3de25fde4416fa49827/prql-python/python/tests/test_all.py#L14-L21

    ...I found that my API is different. dir(prql) shows

    ['__all__',
     '__builtins__',
     '__cached__',
     '__doc__',
     '__file__',
     '__loader__',
     '__name__',
     '__package__',
     '__path__',
     '__spec__',
     '__version__',
     'prql_python',
     'to_json',
     'to_sql']
    

    This is a confusing experience for me as a first-time prql user. Am I doing something wrong?
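
Based only on the `dir()` output above, the installed 0.3.1 module exposes `to_sql` rather than `compile`. A defensive sketch that works with either name is shown below; the `compile_to_sql` helper and the `fake_prql` stand-in are assumptions for illustration, and with the real package you would pass the imported module instead.

```python
from types import SimpleNamespace

def compile_to_sql(module, query):
    """Call whichever entry point the module exposes (hypothetical helper)."""
    for name in ("to_sql", "compile"):
        fn = getattr(module, name, None)
        if callable(fn):
            return fn(query)
    raise AttributeError("module exposes neither to_sql nor compile")

# Stand-in for the installed package, so the sketch runs without it.
# It does not produce real SQL.
fake_prql = SimpleNamespace(to_sql=lambda q: f"-- compiled from: {q}")
print(compile_to_sql(fake_prql, "from employees"))
```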

    opened by rabernat 3
  • prql-js syntax suggestion/sugar


    It sounds very exciting that you are revamping the outdated SQL language with a pipeline-based approach. Perhaps at the beginning of the project, there is still an opportunity to incorporate modern syntax into the language. I have two suggestions:

    First is the const sql = compile(`from employees | select first_name`); instead of the const sql = prql`from employees | select first_name`; use, which immediately shows what it is about.

    The second is that expressions can be formatted much better with line breaks:

    const sql = prql`
       from employees
       select first_name
    `;
    
    integrations 
    opened by BCsabaEngine 1
Releases(0.3.1)
  • 0.3.1(Dec 4, 2022)

    0.3.1 brings a couple of small improvements and fixes.

    Features:

    • Support for using s-strings for from (#1197, @aljazerzen)
      from s"SELECT * FROM employees WHERE foo > 5"
      
    • Helpful error message when referencing a table in an s-string (#1203, @aljazerzen)

    Fixes:

    • Multiple columns with same name created (#1211, @aljazerzen)
    • Renaming via select breaks preceding sorting (#1204, @aljazerzen)
    • Same column gets selected multiple times (#1186, @mklopets)

    Internal:

    • Update Github Actions and Workflows to current version numbers (and avoid using Node 12) (#1201)
  • 0.3.0(Nov 30, 2022)

    🎉 0.3.0 is the biggest ever change in PRQL's compiler 🎉. It rewrites much of the internals: the compiler now has a semantic understanding of expressions, including resolving names & building a DAG of column lineage.

    While the immediate changes to the language are modest — some long-running bugs are fixed — this unlocks beginning development of many of the broad features we've had ambitions for, such as type-checking & auto-complete. And it simplifies building our next language features, such as match-case expressions, unions & table expressions.

    @aljazerzen has (mostly single-handedly) done this work over the past few months. The project owes him immense appreciation.

    Breaking changes:

    We've had to make some modest breaking changes for 0.3:

    • Pipelines must start with from. For example, a pipeline with only derive foo = 5, with no from transform, is no longer valid. Depending on demand for this feature, it would be possible to add this back.

    • Shared column names now require == in a join. For example:

      from employees
      -join positions [id]
      +join positions [==id]
      

      The existing approach is ambiguous to the compiler — id could be a boolean column.

    • Table references containing periods must be surrounded by backticks. For example, when referencing a schema name:

      -from public.sometable
      +from `public.sometable`
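
The join change above (`[==id]` versus a bare `[id]`) can be sketched as a small rendering rule: entries prefixed with `==` name a column shared by both tables, anything else is treated as a boolean condition. The `join_clause` helper is hypothetical, and it assumes that plain conditions are already valid SQL text.

```python
def join_clause(left, right, conditions):
    """Render a JOIN from a list of PRQL-style join entries (sketch only)."""
    if not conditions:
        raise ValueError("at least one join condition is required")
    shared = [c[2:] for c in conditions if c.startswith("==")]
    if len(shared) == len(conditions):
        # Every entry is a shared column name, so USING applies.
        return f"JOIN {right} USING({', '.join(shared)})"
    # Mixed case: expand shared columns into explicit equalities.
    parts = [
        f"{left}.{c[2:]} = {right}.{c[2:]}" if c.startswith("==") else c
        for c in conditions
    ]
    return f"JOIN {right} ON {' AND '.join(parts)}"

print(join_clause("employees", "positions", ["==id"]))  # JOIN positions USING(id)
print(join_clause("foo", "x", ["bar"]))                 # JOIN x ON bar
```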
      

    Features:

    • Change self equality op to == (#1176, @aljazerzen)
    • Add logging (@aljazerzen)
    • Add clickhouse dialect (#1090, @max-sixty)
    • Allow namespaces & tables to contain . (#1079, @aljazerzen)

    Fixes:

    • Deduplicate column appearing in SELECT multiple times (#1186, @aljazerzen)
    • Fix uppercase table names (#1184, @aljazerzen)
    • Omit table name when only one ident in SELECT (#1094, @aljazerzen)

    Documentation:

    • Add chapter on semantics' internals (@aljazerzen, #1028)
    • Add note about nesting variables in s-strings (@max-sixty, #1163)

    Internal changes:

    • Flatten group and window (#1120, @aljazerzen)
    • Split ast into expr and stmt (@aljazerzen)
    • Refactor associativity (#1156, @aljazerzen)
    • Rename Ident constructor to from_name (#1084, @aljazerzen)
    • Refactor rq folding (#1177, @aljazerzen)
    • Add tests for reported bugs fixes in semantic (#1174, @aljazerzen)
    • Bump duckdb from 0.5.0 to 0.6.0 (#1132)
    • Bump once_cell from 1.15.0 to 1.16.0 (#1101)
    • Bump pest from 2.4.0 to 2.5.0 (#1161)
    • Bump pest_derive from 2.4.0 to 2.5.0 (#1179)
    • Bump sqlparser from 0.25.0 to 0.27.0 (#1131)
    • Bump trash from 2.1.5 to 3.0.0 (#1178)
  • 0.2.11(Nov 20, 2022)

    0.2.11 contains a few helpful fixes.

    Work continues on our semantic refactor — look out for 0.3.0 soon! Many thanks to @aljazerzen for his continued contributions to this.

    Note: 0.2.10 was skipped due to this maintainer's inability to read his own docs on bumping versions...

    Features:

    • Detect when compiler version is behind query version (@MarinPostma, #1058)
    • Add __version__ to prql-python package (@max-sixty, #1034)

    Fixes:

    • Fix nesting of expressions with equal binding strength and left associativity, such as a - (b - c) (@max-sixty, #1136)
    • Retain floats without significant digits as floats (@max-sixty, #1141)

    Documentation:

    • Add documentation of prqlr bindings (@eitsupi, #1091)
    • Add a 'Why PRQL' section to the website (@max-sixty, #1098)
    • Add @snth to core-devs (@max-sixty, #1050)

    Internal changes:

    • Use workspace versioning (@max-sixty, #1065)
  • 0.2.9(Oct 14, 2022)

  • 0.2.8(Oct 11, 2022)

    0.2.8 is another modest release with some fixes, doc improvements, bindings improvements, and lots of internal changes. Note that one of the fixes causes the behavior of round and cast to change slightly — though it's handled as a fix rather than a breaking change in semantic versioning.

    Fixes:

    • Change order of the round & cast function parameters to have the column last; for example round 2 foo_col / cast int foo. This is consistent with other functions, and makes piping possible:

      derive [
        gross_salary = (salary + payroll_tax | as int),
        gross_salary_rounded = (gross_salary | round 0),
      ]
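
Why putting the column last makes piping possible can be shown with partial application: a pipe supplies its left-hand value as the final argument of the next step. This is a Python analogy only; `round_`, `cast` and `pipe` are hypothetical names, not PRQL's implementation.

```python
from functools import partial, reduce

def round_(digits, value):
    """Column-last variant of round: configuration first, data last."""
    return round(value, digits)

def cast(type_, value):
    """Column-last variant of cast."""
    return type_(value)

def pipe(value, *steps):
    """Feed `value` through each step, passing it as the last argument."""
    return reduce(lambda acc, step: step(acc), steps, value)

gross_salary = pipe(1000.4 + 250.3, partial(cast, int))
gross_salary_rounded = pipe(1250.678, partial(round_, 1))
```

If the column came first, each step would need its data argument filled before its configuration, so it could not be written as a ready-made one-argument function.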
      

    Documentation:

    • Split DEVELOPMENT.md from CONTRIBUTING.md (@richb-hanover, #1010)
    • Make s-strings more prominent in website intro (@max-sixty, #982)

    Web:

    • Add GitHub star count to website (@max-sixty, #990)

    Integrations:

    • Expose a shortened error message, in particular for the VSCode extension (@aljazerzen, #1005)

    Internal changes:

    • Specify 1.60.0 as minimum rust version (@max-sixty, #1011)
    • Remove old wee-alloc code (@max-sixty, #1013)
    • Upgrade clap to version 4 (@aj-bagwell, #1004)
    • Improve book-building script in Taskfile (@max-sixty, #989)
    • Publish website using an artifact rather than a long-lived branch (@max-sixty, #1009)
  • 0.2.7(Sep 17, 2022)

    0.2.7 is a fairly modest release, six weeks after 0.2.6. We have some more significant features, including a union operator and an overhaul of our type system, as open PRs which will follow in future releases.

    We also have new features in the VSCode extension, courtesy of @jiripospisil, including a live output panel.

    Fixes:

    • range_of_ranges checks the Range end is smaller than its start (@shuozeli, #946)

    Documentation:

    • Improve various docs (@max-sixty, #974, #971, #972, #970, #925)
    • Add reference to EdgeDB's blog post in our FAQ (@max-sixty, #922)
    • Fix typos (@kianmeng, #943)

    Integrations:

    • Add prql-lib, enabling language bindings with go (@sigxcpu76, #923)
    • Fix line numbers in JS exceptions (@charlie-sanders, #929)

    Internal changes:

    • Lock the version of the rust-toolchain, with auto-updates (@max-sixty, #926, #927)
  • 0.2.6(Aug 6, 2022)

    0.2.6 is a modest release following 0.2.5 a week ago. It includes lots of small fixes and some docs:

    Fixes

    • Adjust fmt to only escape names when needed (@aljazerzen, #907)
    • Fix quoting on upper case table names (@max-sixty, #893)
    • Fix scoping of identical column names from multiple tables (@max-sixty, #908)
    • Fix parse error on newlines in a table (@sebastiantoh 🆕, #902)

    Documentation

    • Add docs on Architecture (@aljazerzen, #904)
    • Add Changelog (@max-sixty, #890 #891)

    Internal changes

    • Start trial using Conventional Commits (@max-sixty, #889)
    • Add crates.io release workflow, docs (@max-sixty, #887)

    Thanks to @sebastiantoh for his first PR into PRQL!

  • 0.2.5(Jul 29, 2022)

    0.2.5 is a very small release following 0.2.4 yesterday. It includes:

    • Add the ability to represent single brackets in an s-string, with two brackets (#752, @max-sixty )
    • Fix the "Copy to Clipboard" command in the Playground, for Firefox (#880, @mklopets )
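
The doubled-bracket escape in the first item follows the same convention as Python format strings. A minimal sketch, assuming "brackets" means curly braces and that an s-string interpolates `{name}` placeholders; `render_s_string` is a hypothetical helper, not part of any PRQL package.

```python
import re

def render_s_string(template, columns):
    """Interpolate {name} placeholders; {{ and }} become literal braces."""
    def replace(match):
        token = match.group(0)
        if token == "{{":
            return "{"
        if token == "}}":
            return "}"
        return columns[match.group(1)]
    # Doubled braces are tried first so they are never read as placeholders.
    return re.sub(r"\{\{|\}\}|\{(\w+)\}", replace, template)

sql = render_s_string("json_extract({col}, '{{key}}')", {"col": "data"})
print(sql)  # json_extract(data, '{key}')
```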
  • 0.2.4(Jul 28, 2022)

    0.2.4 is a small release following 0.2.3 a few days ago. The 0.2.4 release includes:

    • Enrich our CLI, adding commands to get different stages of the compilation process (@aljazerzen , #863)
    • Fix multiple take n statements in a query, leading to duplicate proxy columns in generated SQL (@charlie-sanders )
    • Fix BigQuery quoting of identifiers in SELECT statements (@max-sixty )
    • Some internal changes — reorganize top-level functions (@aljazerzen ), add a workflow to track our rust compilation time (@max-sixty ), simplify our simple prql-to-sql tests (@max-sixty )

    Thanks to @ankane, prql-compiler is now available from homebrew core; brew install prql-compiler[^1].

    Reiterating our plans from the 0.2.3 release notes:

    From here, we're planning to continue squashing bugs (albeit more minor than those in this release), adding some features like union, while working on bigger issues such as type-inference.

    We're also going to document and modularize the compiler further. It's important that we give more people an opportunity to contribute to the guts of PRQL, especially given the number and enthusiasm of contributions to the project in general, and it's not that easy to do so at the moment. While this is ongoing, if anyone has something they'd like to work on in the more difficult parts of the compiler, let us know on GitHub or Discord, and we'd be happy to work together on it.

    [^1]: we still need to update docs and add a release workflow for this: https://github.com/prql/prql/issues/866

  • 0.2.3(Jul 25, 2022)

    A couple of weeks since the 0.2.2 release: we've squashed a few bugs, added some mid-sized features to the language, and made a bunch of internal improvements.

    The 0.2.3 release includes:

    • Allow for escaping otherwise-invalid identifiers (@aljazerzen & @max-sixty )
    • Fix a bug around operator precedence (@max-sixty )
    • Add a section to the book on the language bindings (@charlie-sanders )
    • Add tests for our Display representation while fixing some existing bugs. This is gradually becoming our code formatter (@arrizalamin )
    • Add a "copy to clipboard" button in the Playground (@mklopets )
    • Add lots of guidance to our CONTRIBUTING.md around our tests and process for merging (@max-sixty )
    • Add a prql! macro for parsing a prql query at compile time (@aljazerzen )
    • Add tests for prql-js (@charlie-sanders )
    • Add a from_json method for transforming json to a PRQL string (@arrizalamin )
    • Add a workflow to release prql-java to Maven (@doki23 )
    • Enable running all tests from a PR by adding a pr-run-all-tests label (@max-sixty )
    • Have cargo-release to bump all crate & npm versions (@max-sixty )
    • Update prql-js to use the bundler build of prql-js (@mklopets )

    As well as those contribution changes, thanks to those who've reported issues, such as @mklopets @huw @mm444 @ajfriend.

    From here, we're planning to continue squashing bugs (albeit more minor than those in this release), adding some features like union, while working on bigger issues such as type-inference.

    We're also going to document and modularize the compiler further. It's important that we give more people an opportunity to contribute to the guts of PRQL, especially given the number and enthusiasm of contributions to the project in general, and it's not that easy to do so at the moment. While this is ongoing, if anyone has something they'd like to work on in the more difficult parts of the compiler, let us know on GitHub or Discord, and we'd be happy to work together on it.

    Thank you!

  • 0.2.2(Jul 10, 2022)

    It's been a couple of weeks since our 0.2.0 release. Thanks for the surge in interest and contributions! 0.2.2[^1] has some fixes & some internal improvements:

    • We now test against SQLite & DuckDB on every commit, to ensure we're producing correct SQL. (@aljazerzen)
    • We have the beginning of Java bindings! (@doki23)
    • Idents surrounded by backticks are passed through to SQL (@max-sixty)
    • More examples on the homepage, e.g. join & window, and lots of small docs improvements
    • Automated releases to Homebrew (@roG0d)
    • prql-js is now a single package for node, browsers & webpack (@charlie-sanders)
    • Parsing has some fixes, including >= and leading underscores in idents (@mklopets)
    • Ranges receive correct syntax highlighting (@max-sixty)
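
    As a rough illustration of what testing generated SQL against a real database involves, here is a minimal Python sketch using the standard-library sqlite3 module. It is not the project's test harness, which is not shown in these notes. The SQL string is hand-written as a stand-in for compiler output, and the employees table, its columns and the sample rows are assumptions made up for this example.

```python
import sqlite3

# Hypothetical stand-in for SQL a compiler might emit; hand-written for this sketch.
GENERATED_SQL = """
SELECT
  title,
  AVG(salary) AS avg_salary,
  COUNT(*) AS ct
FROM employees
WHERE salary > 0
GROUP BY title
ORDER BY title
"""

# Made-up sample data: (title, salary).
SAMPLE_ROWS = [
    ("engineer", 100.0),
    ("engineer", 200.0),
    ("analyst", 80.0),
    ("analyst", 0.0),  # filtered out by the WHERE clause
]


def run_against_sqlite(sql, rows):
    """Load `rows` into an in-memory SQLite table and execute `sql` against it."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE employees (title TEXT, salary REAL)")
        conn.executemany("INSERT INTO employees VALUES (?, ?)", rows)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


result = run_against_sqlite(GENERATED_SQL, SAMPLE_ROWS)
print(result)
```

    Executing the query, rather than only comparing SQL strings, catches output that looks plausible but that the database rejects or evaluates differently than intended.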

    Thanks to Aljaž Mur Eržen @aljazerzen, George Roldugin @roldugin, Jasper McCulloch @Jaspooky, Jie Han @doki23, Marko Klopets @mklopets, Maximilian Roos @max-sixty, Rodrigo Garcia @roG0d, Ryan Russell @ryanrussell, Steven Maude @StevenMaude, Charlie Sanders @charlie-sanders.

    We're planning to continue collecting bugs & feature requests from users, as well as working on some of the bigger features, like type-inference.

    For those interested in joining, we also have a new Contributing page.

    [^1]: Think of 0.2.1 like C+ :)

  • 0.2.0(Jun 27, 2022)

    🎉 🎉 After several months of building, PRQL is ready to use! 🎉 🎉


    How we got here:

    At the end of January, we published a proposal for a better language for data transformation: PRQL. The reception was better than I could have hoped for: we were no. 2 on Hacker News for a day, and gained 2.5K GitHub stars over the next few days.

    But man cannot live on GitHub Stars alone — we had to do the work to build it. So over the next several months, during many evenings & weekends, a growing group of us gradually built the compiler, evolved the language, and wrote some integrations.

    We want to double down on the community and its roots in open source. It's incredible that a few of us from all over the globe have collaborated on a project without ever having met. We decided early on that PRQL would always be open source and would never have a commercial product (despite lots of outside interest in funding a seed round!). Because languages are so deep in the stack, and the data stack has so many players, the best chance of building a great language is to build an open language.


    We still have a long way to go. While PRQL is usable, it has lots of missing features, and an incredible amount of unfulfilled potential, including a language server, cohesion with databases, and type inference. Over the coming weeks, we'd like to grow the number of intrepid users experimenting with PRQL in their projects, prioritize features that will unblock them, and then start fulfilling PRQL's potential by working through our roadmap.

    The best way to experience PRQL is to try it. Check out our website and the Playground. Start using PRQL for your own projects in dbt, Jupyter notebooks, and Prefect workflows.

    Keep in touch with PRQL by following the project on Twitter, joining us on Discord, and starring the repo.

    Contribute to the project. We're a really friendly community, whether you're a recent SQL user or an advanced Rust programmer. We need bug reports, documentation tweaks & feature requests just as much as we need compiler improvements written in Rust.


    I especially want to credit Aljaž Mur Eržen (@aljazerzen), who has contributed the majority of the difficult work of building out the compiler. Much credit also goes to Charlie Sanders (@charlie-sanders), one of PRQL's earliest supporters and the author of PyPrql, and Ryan Patterson-Cross (@rbpatt2019), who built the Jupyter integration among other Python contributions.

    Other contributors who deserve a special mention include: @roG0d, @snth, @kwigley


    Thank you, and we look forward to your feedback!

  • 0.1.1(Apr 2, 2022)

    0.1.1 contains big improvements to the compiler.

    Thank you to the contributors to this release, in particular @aljazerzen, who is responsible for an internal rewrite of much of the compiler: Aljaž Mur Eržen (@aljazerzen), Maximilian Roos (@max-sixty), shota kizawa (@kiibo382).

    The internal rewrite fixes bugs in more complicated queries — such as variables that span multiple CTEs in the generated SQL — and introduces a more coherent & extensible approach to the compilation. It also adds options to prql compile --format to understand how variables and functions are being compiled.

    It radically improves many error messages:

    $ echo '
    from employees
    join side:[left] managers [id]
    ' | prql compile
    
    Error:
       ╭─[-:2:11]
       │
     2 │ join side:[left] managers [id]
       ·           ───┬───
       ·              ╰───── expected ident, but found List
    

    We have the beginning of a language reference. Improving its content is a priority over the coming days.

    Install with cargo install prql. If you're interested in joining, check out the Contributing section in the Readme.

  • 0.1.0(Mar 21, 2022)

    PRQL just hit 0.1!

    A huge thank you to the contributors to this release:

    Aljaž Mur Eržen (@aljazerzen), Andrey Popp (@andreypopp), Daniel Gasienica (@gasi), Daniel Jones (@danjones1618), Ian Fisher (@iafisher), Kyle Wigley (@kwigley), Maximilian Roos (@max-sixty), Peder Bergebakken Sundt (@pbsds), charlie sando (@qorrect), and shota kizawa (@kiibo382).

    0.1 means:

    • It works™, for basic transformations such as filter, select, aggregate, take, sort, & join. Variables (derive), functions (func) and CTEs (table) work.
      • More advanced language features are forthcoming, like better inline pipelines, window clauses, and arrays.
    • It's not friendly at the moment:
      • It runs from a CLI only, taking input from a file or stdin and writing to a file or stdout.
      • Error messages are bad.
      • For an interactive experience, combine with a tool like Up.
    • The documentation is lacking.
      • Our current top priority is to have some decent documentation (#233).
    • It doesn't support changing the dialect.
    • It has bugs. Please report them!
    • It has sharp corners. Please report grazes!
    • We'll release backward-incompatible changes. The versioning system for the language is not yet implemented.

    Give it a try and let us know your feedback — cargo install prql.

    There's much more on the Readme, including examples, how to contribute, and a Roadmap.

A Rust SQL query builder with a pleasant fluent API closely imitating actual SQL

Scooby An SQL query builder with a pleasant fluent API closely imitating actual SQL. Meant to comfortably build dynamic queries with a little bit of s

Aleksei Voronov 100 Nov 11, 2022
Gh-sql - Query GitHub Projects (beta) with SQL

gh-sql: Query GitHub Projects (beta) with SQL Installation gh extension install KOBA789/gh-sql Features SELECT items DELETE items UPDATE item fields

Hidekazu Kobayashi 108 Dec 7, 2022
SQL validator tool for BigQuery standard SQL.

bqvalid What bqvalid does bqvalid is the SQL validator tool for BigQuery standard SQL. bqvalid fails with error message if there's the expression that

null 10 Dec 25, 2022
A Modern Real-Time Data Processing & Analytics DBMS with Cloud-Native Architecture, built to make the Data Cloud easy

A Modern Real-Time Data Processing & Analytics DBMS with Cloud-Native Architecture, built to make the Data Cloud easy

Datafuse Labs 5k Jan 9, 2023
Scalable and fast data store optimised for time series data such as financial data, events, metrics for real time analysis

OnTimeDB Scalable and fast data store optimised for time series data such as financial data, events, metrics for real time analysis OnTimeDB is a time

Stuart 2 Apr 5, 2022
An object-relational in-memory cache, supports queries with an SQL-like query language.

qlcache An object-relational in-memory cache, supports queries with an SQL-like query language. Warning This is a rather low-level library, and only p

null 3 Nov 14, 2021
RisingWave is a cloud-native streaming database that uses SQL as the interface language.

RisingWave is a cloud-native streaming database that uses SQL as the interface language. It is designed to reduce the complexity and cost of building real-time applications. RisingWave consumes streaming data, performs continuous queries, and updates results dynamically. As a database system, RisingWave maintains results inside its own storage and allows users to access data efficiently.

Singularity Data 3.7k Jan 2, 2023
ReefDB is a minimalistic, in-memory and on-disk database management system written in Rust, implementing basic SQL query capabilities and full-text search.

ReefDB ReefDB is a minimalistic, in-memory and on-disk database management system written in Rust, implementing basic SQL query capabilities and full-

Sacha Arbonel 75 Jun 12, 2023
Query LDAP and AD with SQL

SQLDAP Ever wanted to query AD or LDAP with SQL like queries ? I'm going to answer this question myself: yes ! Why ? Because I never could remember al

null 9 Nov 15, 2022
FeOphant - A SQL database server written in Rust and inspired by PostgreSQL.

A PostgreSQL inspired SQL database written in Rust.

Christopher Hotchkiss 27 Dec 7, 2022
Distributed, version controlled, SQL database with cryptographically verifiable storage, queries and results. Think git for postgres.

SDB - SignatureDB Distributed, version controlled, SQL database with cryptographically verifiable storage, queries and results. Think git for postgres

Fremantle Industries 5 Apr 26, 2022
Rust library to parse, deparse and normalize SQL queries using the PostgreSQL query parser

This Rust library uses the actual PostgreSQL server source to parse SQL queries and return the internal PostgreSQL parse tree.

pganalyze 37 Dec 18, 2022
ReadySet is a lightweight SQL caching engine written in Rust that helps developers enhance the performance and scalability of existing applications.

ReadySet is a SQL caching engine designed to help developers enhance the performance and scalability of their existing database-backed applications. W

ReadySet 1.7k Jan 8, 2023
SQL database to read and write "discord"

GlueSQL Discord Storage After discussing how CI testing will be managed, we plan to move it upstream. Precautions for use discord ToS https://discord.

Jiseok CHOI 9 Feb 28, 2023
Execute SQL now and get the results later.

pg_later Execute SQL now and get the results later. A postgres extension to execute queries asynchronously. Installation Add pg_later to shared_preloa

Tembo 12 Jul 25, 2023
Rust client for Timeplus Proton, a fast and lightweight streaming SQL engine

Rust Client for Timeplus Proton Rust client for Timeplus Proton. Proton is a streaming SQL engine, a fast and lightweight alternative to Apache Flink,

Timeplus 4 Feb 27, 2024
TDS 7.2+ (mssql / Microsoft SQL Server) async driver for rust

Tiberius A native Microsoft SQL Server (TDS) client for Rust. Supported SQL Server versions Version Support level Notes 2019 Tested on CI 2017 Tested

Prisma 189 Dec 25, 2022
GlueSQL is a SQL database library written in Rust

GlueSQL is a SQL database library written in Rust. It provides a parser (sqlparser-rs), execution layer, and optional storage (sled) packaged into a single library.

GlueSQL 2.1k Jan 8, 2023
Run SQL queries on CSV files

zsql run SQL queries on csv files A terminal utility to easily run SQL queries on CSV files. zsql is shipped as a small single binary powered by rust

Zizaco 9 Jul 9, 2022