PRQL is a modern language for transforming data — a simpler and more powerful SQL

PRQL

Last update: Jan 5, 2023

Overview

PRQL

Pipelined Relational Query Language, pronounced "Prequel".

PRQL is a modern language for transforming data — a simpler and more powerful SQL. Like SQL, it's readable, explicit and declarative. Unlike SQL, it forms a logical pipeline of transformations, and supports abstractions such as variables and functions. It can be used with any database that uses SQL, since it transpiles to SQL.

PRQL was discussed on Hacker News and Lobsters earlier this year when it was just a proposal.

Here's a short example of the language; for more examples, visit prql-lang.org. To experiment with PRQL in the browser, check out PRQL Playground.

from employees                                # Each line transforms the previous result.
filter start_date > @2021-01-01               # Clear date syntax.
derive [                                      # `derive` adds columns / variables.
  gross_salary = salary + payroll_tax,
  gross_cost = gross_salary + benefits_cost   # Variables can use other variables.
]
filter gross_cost > 0
group [title, country] (                      # `group` runs a pipeline over each group.
  aggregate [                                 # `aggregate` reduces each group to a row.
    average salary,
    sum     salary,
    average gross_salary,
    sum     gross_salary,
    average gross_cost,
    sum_gross_cost = sum gross_cost,          # `=` sets a column name.
    ct = count,
  ]
)
sort [sum_gross_cost, -country]               # `-country` means descending order.
filter ct > 200
take 20

Resources

To learn more, check out the PRQL Website.

For specific resources, check out:

PRQL Playground — experiment with PRQL in the browser.
PRQL Book — the language documentation.
Contributing — join us in building PRQL, through writing code or inspiring others to use it.
PyPRQL Docs — the PyPRQL documentation, the python bindings to PRQL, including Jupyter magic.
dbt-prql — write PRQL in dbt models.
PRQL VSCode Extension
PRQL-js — JavaScript bindings for PRQL.

Contributors

Many thanks to those who've made our progress possible:

Core developers

We have a few core developers who are responsible for reviewing code, making decisions on the direction of the language, and project administration:

@aljazerzen — Aljaž Mur Eržen
@max-sixty — Maximilian Roos
@charlie-sanders — Charlie Sanders

We welcome others to join who have a track record of contributions.

Comments

Orthogonal treatment of groups (nee Window functions)
Edit — this issue evolved into a much broader discussion of how we do groups. The original message is below, and I've tried to summarize the current state towards the end of the issue so it's easier for others to engage.

More than half the people I speak with about PRQL tell me that window functions are painful in SQL and would be a great feature to add to PRQL. I put window functions in the examples in the Readme at first, since I had the same experience in SQL.

For clarity: a window / analytic / analytical function is a function that takes other rows as an input, but instead of aggregating over many rows, it creates a new value for every row. For example, a moving average or lag. In SQL it's represented by the OVER keyword.
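
To make the distinction above concrete, here is a minimal sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The `prices` table and `sec_id` column echo the proposals in this issue; the data values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (sec_id TEXT, date TEXT, price REAL)")
conn.executemany(
    "INSERT INTO prices VALUES (?, ?, ?)",
    [
        ("A", "2022-01-01", 10.0),
        ("A", "2022-01-02", 11.0),
        ("B", "2022-01-01", 20.0),
        ("B", "2022-01-02", 19.0),
    ],
)

# Aggregation: many rows in, one row per group out.
aggregated = conn.execute(
    "SELECT sec_id, AVG(price) FROM prices GROUP BY sec_id ORDER BY sec_id"
).fetchall()

# Window function: every input row is kept and gains a value
# computed from other rows in its partition.
windowed = conn.execute(
    """
    SELECT sec_id, date, price,
           LAG(price, 1) OVER (PARTITION BY sec_id ORDER BY date) AS price_yesterday
    FROM prices
    ORDER BY sec_id, date
    """
).fetchall()

print(aggregated)
print(windowed)
```

The aggregate query returns two rows (one per `sec_id`), while the windowed query returns all four input rows, each with a `price_yesterday` that is NULL for the first date in its partition.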

I haven't thought about this enough, but wanted to post an issue so we could start thinking about how to do it. A couple of initial options:

A separate pipeline, with the column at the end, and the function (lag in this case) in the pipeline:

from prices
derive price_yesterday: ( window by sec_id sort date lag 1 price )

A function with lots of named args, with the function lag as a named arg (but how would this work for functions with 0 or >1 parameters?):

from prices
derive price_yesterday: (window by:sec_id sort:date lag:1 price)

Eliminate window — because we have aggregate, we can't get confused about sum(foo) OVER and sum(foo), so possibly we don't need to have window:

from prices
derive price_yesterday: (price | lag rows:1 sort:date by:sec_id)

I don't think this is sufficiently thought through, and I'm sure these examples have inadequacies.

Here are some example queries we can use as cases: https://cloud.google.com/bigquery/docs/reference/standard-sql/analytic-function-concepts#get_the_most_popular_item_in_each_category
enhancement language-design
opened by max-sixty 33
Functions - syntax, scope and limitations
In issue #437 we ended up with quite a lot of discussion around the syntax for functions. For anyone browsing the issue log that might be quite difficult to find so I'm opening a separate issue for it here. The syntax also gets updated in PR #442 .

The current syntax is described in: https://github.com/prql/prql/blob/main/reference/src/functions.md

@aljazerzen also gave some descriptions on Discord which I've reproduced below:

Currently they are limited to "any expression", but there are plans for them to allow:

"late binding variables"

func net gross = gross * (1 - tax)

from finances
select [net gross_salaries, net gross_materials, net gross_something]

where tax is column from the finances, but is not resolved until after the function is materialised

currying

func add a b = a + b
func add_one = add 1

from employees
select [
  age_next_year: add_one age
]

transform functions

func take_first col = (sort col | take 1)

from employees
take_first age
opened by snth 29
Same column name in multiple tables causes buggy behaviour
When multiple PRQL tables share an identical column name (e.g. id): joining them together causes the compiler to reuse the meaning of one id in multiple places (even where it's incorrect).

This reduced example should illustrate it best:

table x = (
  from x_orig  # table with columns: id
  derive [
    x_id = id
  ]
  select [x_id]
)

table y = (
  from y_orig  # table with columns: id, x_id
  derive [
    y_id = id
  ]
  select [y_id, x_id]
)

from x
join y [x_id]

compiled SQL:

WITH x AS (
  SELECT
    id AS x_id
  FROM
    x_orig
),
y AS (
  SELECT
    x_id AS y_id, --------- this should instead be `id AS y_id`
    x_id
  FROM
    y_orig
)
SELECT
  x.*,
  y.*,
  x_id
FROM
  x
  JOIN y USING(x_id)

Note that if the PRQL derive statement of y_id = id is changed to y_id = y_orig.id, then the buggy output SQL line correctly becomes id AS y_id. But, we can't expect users to always add these (unless this becomes an explicit requirement)
bug compiler priority
opened by mklopets 24
Merge prql-tool
@snth has created a CLI tool that utilizes PRQL queries to read data from various sources and write them as CSV or parquet files.

Compared to prql-compiler CLI, it is actually useful, which is why I think it should be the binary we are promoting to users to try out.

But I would suggest a name change, because PRQL is the language and it may be confusing to use prql my_query.prql. I suggest the name pipe.

TODO:

[ ] merge snth's fork (or split it into prql/pipe repo to avoid long compilation times)

[ ] setup CI/CD

[ ] change homebrew recipe to install pipe instead of prql-compiler

[ ] change website to promote pipe
opened by aljazerzen 22
Precedence issue
Currently:

from cities
select temp_c = (temp - 32) * 3

compiles to:

SELECT temp - 32 * 3 AS temp_c FROM cities

...which has the wrong precedence — we should have (temp - 32) * 3
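
As a quick check of why the missing parentheses matter, the sketch below runs both forms of the expression through SQLite via Python's built-in sqlite3 module. The `cities` table and `temp` column come from the example above; the value 50 is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cities (temp REAL)")
conn.execute("INSERT INTO cities VALUES (50)")

# What the compiler currently emits: multiplication binds tighter than subtraction.
(emitted,) = conn.execute("SELECT temp - 32 * 3 AS temp_c FROM cities").fetchone()

# What the PRQL query asked for.
(intended,) = conn.execute("SELECT (temp - 32) * 3 AS temp_c FROM cities").fetchone()

print(emitted, intended)
```

With `temp = 50`, the emitted SQL evaluates `50 - 96` while the intended expression evaluates `18 * 3`, so the two results differ.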
bug compiler priority
opened by max-sixty 22
Variable replacement doesn't respect precedence
In this example, the resulting code for diff should be c - a - b or c - (a + b)

from foo
select [
  sum = (a+b),
  diff = c-sum   # actually outputs "c - a + b AS diff"
]

(Tested with current VScode plugin and Playground)
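
One way to avoid this class of bug is to inline variables as expression trees rather than as text, and to add parentheses only when emitting SQL. The following is a minimal sketch of that idea, not the PRQL compiler's actual implementation; the `Col`, `BinOp`, `inline` and `to_sql` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Col:
    name: str


@dataclass
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Col, BinOp]

# Binding strength: higher binds tighter. All four operators are left-associative.
STRENGTH = {"+": 1, "-": 1, "*": 2, "/": 2}


def inline(expr, variables):
    """Replace references to derived variables with their defining expression trees."""
    if isinstance(expr, Col):
        return variables.get(expr.name, expr)
    return BinOp(expr.op, inline(expr.left, variables), inline(expr.right, variables))


def to_sql(expr, parent_strength=0, is_right=False):
    """Emit SQL text, parenthesizing a child only where the tree shape requires it."""
    if isinstance(expr, Col):
        return expr.name
    strength = STRENGTH[expr.op]
    text = f"{to_sql(expr.left, strength)} {expr.op} {to_sql(expr.right, strength, True)}"
    # Parenthesize when the parent binds tighter, or binds equally and this
    # node is the right operand of a left-associative operator.
    if strength < parent_strength or (strength == parent_strength and is_right):
        return f"({text})"
    return text


variables = {"sum": BinOp("+", Col("a"), Col("b"))}
diff = BinOp("-", Col("c"), Col("sum"))
print(to_sql(inline(diff, variables)))
```

Because the substitution happens on the tree, `diff` is emitted as `c - (a + b)`, which is one of the two correct forms named in this issue. The sketch is conservative: it also parenthesizes `a + (b + c)`, which is harmless but not minimal.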
bug compiler
opened by richb-hanover 21
Implicit vs Explicit column arguments to `join`
I was planning to release 0.3 as discussed, with the change in join such that supplying columns to an equi-join requires an additional character to make explicit columns vs. conditions:

from e=employees
-join salaries [id]   # implicit
+join salaries [~id]  # explicit
join l=locations [e.office_address == l.address]

This is because a bare id is treated as a bool condition, like e.office_address == l.address is treated. (and in theory, id could be a bool column)

I'm still fine to do the release and assess feedback, as discussed. But I wanted to raise whether we should:

consider making a change back to the previous implicit behavior before the release, given that this would be a breaking change to a fundamental part of the language. If we're 80/20 on allowing the previous implicit behavior, making the change now would avoid the breaking change without that much cost — as others pointed out yesterday, many of the immediate benefits come from better development on main, rather than big user-facing changes in a version number.

consider adding something to help users in making the transition; e.g. an error on the existing approach

On the specific language change, I see it as a tradeoff between syntactic simplicity and semantic simplicity:

Syntactic simplicity / brevity — joining on a shared column is very frequent, particularly in a well-designed schema. Things that are expressed frequently should have small[^1] representations.

Almost never are we actually joining based on a bool column. (I hadn't even realized that the previous design was ambiguous). It would be doing a cross-join based on a bool column from one of the tables — so unlikely.

What do folks think about the extra complexity for users? Possibly it's actually not that much burden to understand "Represent [USING / an equi-join / a join between identically named columns] with ~"? (Maybe it feels bigger to us because it's a change and it's so prominent in our docs and examples?)

Semantic simplicity / generality — having [id] mean something different from [id==true] in a join breaks the encapsulation of the expression. The compiler needs to understand what's inside the expression; uncorrelated concepts become correlated, the language becomes less general and less orthogonal.

For example, an unlikely but possible example — is bar a condition or a column in the join? I guess it's a condition because we know it's baz==bax. But if bar were materialized in a column in the DB, then the behavior suddenly changes.

from foo
derive bar = baz==bax
join x [bar]   # is `bar` an implicit column in both tables? Or `baz==bax`?

I've been supportive of #919, which increases generality; this would go against that theme.

If anyone has ideas for an alternative representation rather than ~, then feel free to suggest! Though I actually think that ~ is pretty good.

One alternative would be to have a different parameter; e.g. using:[id], but then given the conditions parameter would still be required, we'd have an awkward join locations using:[id] [].

If we do go the explicit route, is there something we can do to make this clearer for users? I would find this quite confusing if I weren't watching the releases and all of a sudden this compiles to something completely different:

from foo
join x [bar]

SELECT
  foo.*,
- x.*,
- bar
+ x.*
FROM
  foo
- JOIN x USING(bar)
+ JOIN x ON bar

PRQL has a higher ratio of expectations&excitement vs. users than most projects, so it's fine to make breaking changes atm. But this is potentially quite severe. Assuming we go the explicit route, should we raise an error for a bare column name for a few versions so it's at least obvious when people do this?

Without wanting to zoom out too far, possibly it's worth considering this in the context of overall joins; e.g. #716 & #723

Where do folks end up? As I said prior, @aljazerzen has full rights to respond with 😫, and I'll do the release. semantic was really Herculean, and we're still young enough that we probably underrate velocity.

For transparency, if we do decide to make the change, I'm flat-out with non-PRQL stuff until mid-week, after which I have more time and would be happy to work on this. I'm quite excited to get into working with the new compiler!

[^1]: "small" here means both in character-count and syntactic complexity, in this case ~ is small in character-count but adds to syntactic complexity. For theory around compression, check out source-coding, and I can find better references if folks are interested
language-design priority
opened by max-sixty 19
feat: add prql library generation to be used in other platforms (e.g. Golang)

I've added a small shim that exports to_sql and to_json and generates static and dynamic libraries that can be used through FFI in other platforms. I am using it in Go.

opened by sigxcpu76 17
Dialect-specific SQL

Even with the current code, take 20 will produce TOP(20), which is incompatible with some DBs.

The original plan was to allow a db named arg to queries, which I still think works well, but open to alternatives.
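
To illustrate the kind of switch a `db` argument implies, here is a minimal sketch of rendering `take n` per dialect. It is not the compiler's code: the `render_take` function and the dialect names are hypothetical. The only facts relied on are that T-SQL limits rows with `TOP` while SQLite, PostgreSQL and MySQL use `LIMIT`.

```python
def render_take(columns, table, n, dialect="generic"):
    """Render a row-limited SELECT for a (hypothetical) dialect name."""
    if dialect == "mssql":
        # T-SQL places the limit right after SELECT.
        return f"SELECT TOP ({n}) {columns} FROM {table}"
    if dialect in ("generic", "sqlite", "postgres", "mysql"):
        # These databases place the limit at the end of the statement.
        return f"SELECT {columns} FROM {table} LIMIT {n}"
    raise ValueError(f"unknown dialect: {dialect}")


print(render_take("*", "employees", 20, "mssql"))
print(render_take("*", "employees", 20, "sqlite"))
```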
enhancement help wanted

opened by max-sixty 16

Can't build `playground` locally

I forked the prql repo to my own account, then cloned it to my machine (macOS 10.15.7). I then tried to build the playground so I could try it locally. I got an error (sh: wasm-pack: command not found), so I added "wasm-pack": "^0.10.3" to the devDependencies in the package.json file.

I then tried the following. I saw a couple warnings about deprecations, but I don't know what to do about the cargo metadata file. What other troubleshooting information could I provide? Many thanks.

√ playground % node --version
v16.14.0
√ playground % npm --version
8.3.1
√ playground % npm install
npm WARN deprecated [email protected]: Modern JS already guarantees Array#sort() is a stable sort, so this library is deprecated. See the compatibility table on MDN: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/sort#browser_compatibility
npm WARN deprecated [email protected]: This SVGO version is no longer supported. Upgrade to v2.x.x.

> [email protected] preinstall
> cd ../prql-js && npm run build-bundler


> [email protected] build-bundler
> wasm-pack build --target bundler --release --out-dir dist/bundler

Error: Error during execution of `cargo metadata`: error: failed to parse lock file at: /Users/richb/github/prql/Cargo.lock

To learn more, run the command again with --verbose.

npm ERR! code 1
npm ERR! path /Users/richb/github/prql/playground
npm ERR! command failed
npm ERR! command sh -c cd ../prql-js && npm run build-bundler

npm ERR! A complete log of this run can be found in:
npm ERR!     /Users/richb/.npm/_logs/2022-09-14T17_44_20_052Z-debug-0.log
?1 playground %

opened by richb-hanover 15

Named args vs assignments
One issue https://github.com/max-sixty/prql/pull/52 raises is how assignments are parsed; specifically:

derive gross_salary: salary + payroll_tax,

Should parse into:

func: derive
  assign:
    lvalue: gross_salary
    rvalue: salary + payroll_tax

But the syntax for named args conflicts with this. For example:

aggregate by:[title] [sum salary]

should parse into:

func: aggregate
  assign:
    lvalue: by
    rvalue: [title]
  arg: [sum salary]

...and not

func: aggregate
  assign:
    lvalue: by
    rvalue: [title] [sum salary]

...which is what the assign implies.

Options:

change syntax so only one of these uses :

Though trying to avoid using = so we can keep the SQL (and human!) use of this

force the assign to always take one arg, so gross_salary: salary + payroll_tax needs to be gross_salary: (salary + payroll_tax)

restore the rules around whitespace, such that passing named args disallows whitespace (i.e. by:[title] and not by: [title]). But this makes semantics very dependent on small changes in whitespace, which could get confusing. An autoformatter isn't going to be able to deal with this

References:

OCaml approach

F# approach (broadly similar)

Swift approach (broadly similar)

language-design
opened by max-sixty 15
Alternative Backends | Support open table projects like Apache Iceberg

Hi,

First of all, thank you for the great project !

I was wondering whether, under "Alternative Backends", integrations with open table formats like Apache Iceberg could be considered or added to the roadmap?

opened by ramkumarkb 1
docs: Update roadmap

This is now ready for review. I'm happy to discuss things here. It's also fine if we merge without complete agreement and discuss on the next dev call.

opened by max-sixty 1
Additional CTE when using an alias
This is extremely minor, to the extent I hesitated to put an issue in, but generally I think the bar should be quite low, so:

Here we get a CTE which I think we could avoid:

from employees
select fname = first_name
filter fname == "Fred"

WITH table_1 AS (
  SELECT
    first_name AS fname
  FROM
    employees
)
SELECT
  fname
FROM
  table_1
WHERE
  fname = 'Fred'

Possibly this could be:

SELECT
  first_name AS fname
FROM
  employees
WHERE
  first_name = 'Fred'

...since in SQL we can still use first_name within that select statement.
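
As a sanity check on the proposed simplification, the sketch below runs both SQL forms against an in-memory SQLite database using Python's built-in sqlite3 module and compares the results. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?)", [("Fred",), ("Wilma",), ("Fred",)]
)

# Current output: the alias is introduced in a CTE, then filtered on.
with_cte = conn.execute(
    """
    WITH table_1 AS (SELECT first_name AS fname FROM employees)
    SELECT fname FROM table_1 WHERE fname = 'Fred'
    """
).fetchall()

# Proposed output: filter on the underlying column in the same SELECT.
inlined = conn.execute(
    "SELECT first_name AS fname FROM employees WHERE first_name = 'Fred'"
).fetchall()

print(with_cte == inlined)
```

Both queries return the same two rows here, which supports the claim that the CTE is avoidable in this case.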
friendliness
opened by max-sixty 2
Python API doesn't match documentation
Thanks for creating prql! It seems like a great idea.

I tried prql for the first time today. I followed the installation instructions and installed prql-python 0.3.1 in my python 3.10 environment. Then I tried to run the example code. On

sql = prql.compile(prql_query)

I got AttributeError: module 'prql_python' has no attribute 'compile'. Looking closer, and comparing to the tests

https://github.com/PRQL/prql/blob/eedc7fa3ac915ac79cf5a3de25fde4416fa49827/prql-python/python/tests/test_all.py#L14-L21

...I found that my API is different. dir(prql) shows

['__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__path__', '__spec__', '__version__', 'prql_python', 'to_json', 'to_sql']

This is a confusing experience for me as a first-time prql user. Am I doing something wrong?
opened by rabernat 3
prql-js syntax suggestion/sugar
It sounds very exciting that you are revamping the outdated SQL language with a pipeline-based approach. Perhaps at the beginning of the project, there is still an opportunity to incorporate modern syntax into the language. I have two suggestions:

The first is to use const sql = compile(`from employees | select first_name`); instead of const sql = prql`from employees | select first_name`; since the function name immediately shows what it is about.

The second is that expressions can be formatted much better with line breaks:

const sql = prql`
  from employees
  select first_name
`;
integrations
opened by BCsabaEngine 1

Releases(0.3.1)

0.3.1(Dec 4, 2022)
0.3.1 brings a couple of small improvements and fixes.

Features:

Support for using s-strings for from (#1197, @aljazerzen)
from s"SELECT * FROM employees WHERE foo > 5"

Helpful error message when referencing a table in an s-string (#1203, @aljazerzen)

Fixes:

Multiple columns with same name created (#1211, @aljazerzen)

Renaming via select breaks preceding sorting (#1204, @aljazerzen)

Same column gets selected multiple times (#1186, @mklopets)

Internal:

Update Github Actions and Workflows to current version numbers (and avoid using Node 12) (#1201)

0.3.0(Nov 30, 2022)
🎉 0.3.0 is the biggest ever change in PRQL's compiler 🎉. It rewrites much of the internals: the compiler now has a semantic understanding of expressions, including resolving names & building a DAG of column lineage.

While the immediate changes to the language are modest — some long-running bugs are fixed — this unlocks beginning development of many of the broad features we've had ambitions for, such as type-checking & auto-complete. And it simplifies building our next language features, such as match-case expressions, unions & table expressions.

@aljazerzen has (mostly single-handedly) done this work over the past few months. The project owes him immense appreciation.

Breaking changes:

We've had to make some modest breaking changes for 0.3:

Pipelines must start with from. For example, a pipeline with only derive foo = 5, with no from transform, is no longer valid. Depending on demand for this feature, it would be possible to add this back.

Shared column names now require == in a join. For example:

from employees
-join positions [id]
+join positions [==id]

The existing approach is ambiguous to the compiler — id could be a boolean column.

Table references containing periods must be surrounded by backticks. For example, when referencing a schema name:

-from public.sometable
+from `public.sometable`
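
On the join change above: the two readings of a bare column name really do produce different results in SQL. The sketch below, using Python's built-in sqlite3 module, contrasts an equi-join on a shared column with a join whose condition is a boolean column. The table contents and the `is_active` column are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, is_active INTEGER)")
conn.execute("CREATE TABLE positions (id INTEGER, title TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)", [(1, 1), (2, 0)])
conn.executemany(
    "INSERT INTO positions VALUES (?, ?)", [(1, "dev"), (2, "ops"), (3, "qa")]
)

# Equi-join on the shared column: rows pair up where the ids match.
equi = conn.execute(
    "SELECT e.id, p.title FROM employees AS e JOIN positions AS p USING (id) "
    "ORDER BY e.id, p.title"
).fetchall()

# Boolean column as the condition: each active employee is paired
# with every position, and inactive employees are dropped.
cond = conn.execute(
    "SELECT e.id, p.title FROM employees AS e JOIN positions AS p ON e.is_active "
    "ORDER BY e.id, p.title"
).fetchall()

print(equi)
print(cond)
```

The first query yields one row per matching id; the second yields a per-employee cross join, which is why the compiler needs the two spellings to be distinguishable.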

Features:

Change self equality op to == (#1176, @aljazerzen)

Add logging (@aljazerzen)

Add clickhouse dialect (#1090, @max-sixty)

Allow namespaces & tables to contain . (#1079, @aljazerzen)

Fixes:

Deduplicate column appearing in SELECT multiple times (#1186, @aljazerzen)

Fix uppercase table names (#1184, @aljazerzen)

Omit table name when only one ident in SELECT (#1094, @aljazerzen)

Documentation:

Add chapter on semantics' internals (@aljazerzen, #1028)

Add note about nesting variables in s-strings (@max-sixty, #1163)

Internal changes:

Flatten group and window (#1120, @aljazerzen)

Split ast into expr and stmt (@aljazerzen)

Refactor associativity (#1156, @aljazerzen)

Rename Ident constructor to from_name (#1084, @aljazerzen)

Refactor rq folding (#1177, @aljazerzen)

Add tests for reported bugs fixes in semantic (#1174, @aljazerzen)

Bump duckdb from 0.5.0 to 0.6.0 (#1132)

Bump once_cell from 1.15.0 to 1.16.0 (#1101)

Bump pest from 2.4.0 to 2.5.0 (#1161)

Bump pest_derive from 2.4.0 to 2.5.0 (#1179)

Bump sqlparser from 0.25.0 to 0.27.0 (#1131)

Bump trash from 2.1.5 to 3.0.0 (#1178)

0.2.11(Nov 20, 2022)
0.2.11 contains a few helpful fixes.

Work continues on our semantic refactor — look out for 0.3.0 soon! Many thanks to @aljazerzen for his continued contributions to this.

Note: 0.2.10 was skipped due to this maintainer's inability to read his own docs on bumping versions...

Features:

Detect when compiler version is behind query version (@MarinPostma, #1058)

Add __version__ to prql-python package (@max-sixty, #1034)

Fixes:

Fix nesting of expressions with equal binding strength and left associativity, such as a - (b - c) (@max-sixty, #1136)

Retain floats without significant digits as floats (@max-sixty, #1141)

Documentation:

Add documentation of prqlr bindings (@eitsupi, #1091)

Add a 'Why PRQL' section to the website (@max-sixty, #1098)

Add @snth to core-devs (@max-sixty, #1050)

Internal changes:

Use workspace versioning (@max-sixty, #1065)

0.2.9(Oct 14, 2022)
0.2.9 is a small release containing a bug fix for empty strings.

Fixes:

Fix parsing of empty strings (@aljazerzen, #1024)

0.2.8(Oct 11, 2022)
0.2.8 is another modest release with some fixes, doc improvements, bindings improvements, and lots of internal changes. Note that one of the fixes causes the behavior of round and cast to change slightly — though it's handled as a fix rather than a breaking change in semantic versioning.

Fixes:

Change order of the round & cast function parameters to have the column last; for example round 2 foo_col / cast int foo. This is consistent with other functions, and makes piping possible:

derive [
  gross_salary = (salary + payroll_tax | as int),
  gross_salary_rounded = (gross_salary | round 0),
]
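
A rough Python analogy for why "column last" makes piping possible: if a pipe supplies the left-hand value as the final argument, then a function with its leading parameters already filled in becomes a one-argument pipeline step. The `pipe`, `round_` and `cast` helpers below are hypothetical stand-ins, not PRQL functions.

```python
from functools import partial, reduce


def round_(n_digits, value):
    """Value-last rounding, mirroring `round 2 foo_col`."""
    return round(value, n_digits)


def cast(target_type, value):
    """Value-last cast, mirroring `cast int foo`."""
    return target_type(value)


def pipe(value, *steps):
    """Feed `value` through each step; every step receives it as its last argument."""
    return reduce(lambda acc, step: step(acc), steps, value)


salary, payroll_tax = 1000.4, 234.3
gross_salary = pipe(salary + payroll_tax, partial(cast, int))
gross_salary_rounded = pipe(gross_salary, partial(round_, 0))
print(gross_salary, gross_salary_rounded)
```

If the value came first instead, `partial` could not fix the other parameters positionally, and each step would need a wrapper.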

Documentation:

Split DEVELOPMENT.md from CONTRIBUTING.md (@richb-hanover, #1010)

Make s-strings more prominent in website intro (@max-sixty, #982)

Web:

Add GitHub star count to website (@max-sixty, #990)

Integrations:

Expose a shortened error message, in particular for the VSCode extension (@aljazerzen, #1005)

Internal changes:

Specify 1.60.0 as minimum rust version (@max-sixty, #1011)

Remove old wee-alloc code (@max-sixty, #1013)

Upgrade clap to version 4 (@aj-bagwell, #1004)

Improve book-building script in Taskfile (@max-sixty, #989)

Publish website using an artifact rather than a long-lived branch (@max-sixty, #1009)

0.2.7(Sep 17, 2022)
0.2.7 is a fairly modest release, six weeks after 0.2.6. We have some more significant features, including a union operator and an overhaul of our type system, as open PRs which will follow in future releases.

We also have new features in the VSCode extension, courtesy of @jiripospisil, including a live output panel.

Fixes:

range_of_ranges checks the Range end is smaller than its start (@shuozeli, #946)

Documentation:

Improve various docs (@max-sixty, #974, #971, #972, #970, #925)

Add reference to EdgeDB's blog post in our FAQ (@max-sixty, #922)

Fix typos (@kianmeng, #943)

Integrations:

Add prql-lib, enabling language bindings with go (@sigxcpu76, #923)

Fix line numbers in JS exceptions (@charlie-sanders, #929)

Internal changes:

Lock the version of the rust-toolchain, with auto-updates (@max-sixty, #926, #927)

0.2.6(Aug 6, 2022)
0.2.6 is a modest release following 0.2.5 a week ago. It includes lots of small fixes and some docs:

Fixes

Adjust fmt to only escape names when needed (@aljazerzen, #907)

Fix quoting on upper case table names (@max-sixty, #893)

Fix scoping of identical column names from multiple tables (@max-sixty, #908)

Fix parse error on newlines in a table (@sebastiantoh 🆕, #902)

Documentation

Add docs on Architecture (@aljazerzen, #904)

Add Changelog (@max-sixty, #890 #891)

Internal changes

Start trial using Conventional Commits (@max-sixty, #889)

Add crates.io release workflow, docs (@max-sixty, #887)

Thanks to @sebastiantoh for his first PR into PRQL!
0.2.5(Jul 29, 2022)
0.2.5 is a very small release following 0.2.4 yesterday. It includes:

Add the ability to represent single brackets in an s-string, with two brackets (#752, @max-sixty )

Fix the "Copy to Clipboard" command in the Playground, for Firefox (#880, @mklopets )

0.2.4(Jul 28, 2022)
0.2.4 is a small release following 0.2.3 a few days ago. The 0.2.4 release includes:

Enrich our CLI, adding commands to get different stages of the compilation process (@aljazerzen , #863)

Fix multiple take n statements in a query, leading to duplicate proxy columns in generated SQL (@charlie-sanders )

Fix BigQuery quoting of identifiers in SELECT statements (@max-sixty )

Some internal changes — reorganize top-level functions (@aljazerzen ), add a workflow to track our rust compilation time (@max-sixty ), simplify our simple prql-to-sql tests (@max-sixty )

Thanks to @ankane, prql-compiler is now available from homebrew core; brew install prql-compiler[^1].

Reiterating our plans from the 0.2.3 release notes:

From here, we're planning to continue squashing bugs (albeit more minor than those in this release), adding some features like union, while working on bigger issues such as type-inference.

We're also going to document and modularize the compiler further. It's important that we give more people an opportunity to contribute to the guts of PRQL, especially given the number and enthusiasm of contributions to the project in general, and it's not that easy to do so at the moment. While this is ongoing, if anyone has something they'd like to work on in the more difficult parts of the compiler, let us know on GitHub or Discord, and we'd be happy to work together on it.

[^1]: we still need to update docs and add a release workflow for this: https://github.com/prql/prql/issues/866
0.2.3(Jul 25, 2022)
A couple of weeks since the 0.2.2 release: we've squashed a few bugs, added some mid-sized features to the language, and made a bunch of internal improvements.

The 0.2.3 release includes:

Allow for escaping otherwise-invalid identifiers (@aljazerzen & @max-sixty )

Fix a bug around operator precedence (@max-sixty )

Add a section to the book on the language bindings (@charlie-sanders )

Add tests for our Display representation while fixing some existing bugs. This is gradually becoming our code formatter (@arrizalamin )

Add a "copy to clipboard" button in the Playground (@mklopets )

Add lots of guidance to our CONTRIBUTING.md around our tests and process for merging (@max-sixty )

Add a prql! macro for parsing a prql query at compile time (@aljazerzen )

Add tests for prql-js (@charlie-sanders )

Add a from_json method for transforming json to a PRQL string (@arrizalamin )

Add a workflow to release prql-java to Maven (@doki23 )

Enable running all tests from a PR by adding a pr-run-all-tests label (@max-sixty )

Have cargo-release to bump all crate & npm versions (@max-sixty )

Update prql-js to use the bundler build of prql-js (@mklopets )

As well as those contribution changes, thanks to those who've reported issues, such as @mklopets @huw @mm444 @ajfriend.

From here, we're planning to continue squashing bugs (albeit more minor than those in this release), adding some features like union, while working on bigger issues such as type-inference.

We're also going to document and modularize the compiler further. It's important that we give more people an opportunity to contribute to the guts of PRQL, especially given the number and enthusiasm of contributions to the project in general, and it's not that easy to do so at the moment. While this is ongoing, if anyone has something they'd like to work on in the more difficult parts of the compiler, let us know on GitHub or Discord, and we'd be happy to work together on it.

Thank you!
0.2.2(Jul 10, 2022)
We're a couple of weeks since our 0.2.0 release. Thanks for the surge in interest and contributions! 0.2.2[^1] has some fixes & some internal improvements:

We now test against SQLite & DuckDB on every commit, to ensure we're producing correct SQL. (@aljazerzen )

We have the beginning of Java bindings! (@doki23 )

Idents surrounded by backticks are passed through to SQL (@max-sixty )

More examples on homepage; e.g. join & window, lots of small docs improvements

Automated releases to homebrew (@roG0d )

prql-js is now a single package for node, browsers & webpack (@charlie-sanders )

Parsing has some fixes, including >= and leading underscores in idents (@mklopets )

Ranges receive correct syntax highlighting (@max-sixty )

Thanks to Aljaž Mur Eržen @aljazerzen , George Roldugin @roldugin , Jasper McCulloch @Jaspooky , Jie Han @doki23 , Marko Klopets @mklopets , Maximilian Roos @max-sixty , Rodrigo Garcia @roG0d , Ryan Russell @ryanrussell , Steven Maude @StevenMaude , Charlie Sanders @charlie-sanders .

We're planning to continue collecting bugs & feature requests from users, as well as working on some of the bigger features, like type-inference.

For those interested in joining, we also have a new Contributing page.

[^1]: Think of 0.2.1 like C+ :)
0.2.0(Jun 27, 2022)

🎉 🎉 After several months of building, PRQL is ready to use! 🎉 🎉

How we got here:

At the end of January, we published a proposal of a better language for data transformation: PRQL. The reception was better than I could have hoped for — we were no. 2 on HackerNews for a day, and gained 2.5K GitHub stars over the next few days.

But man cannot live on GitHub Stars alone — we had to do the work to build it. So over the next several months, during many evenings & weekends, a growing group of us gradually built the compiler, evolved the language, and wrote some integrations.

We want to double-down on the community and its roots in open source — it's incredible that a few of us from all over the globe have collaborated on a project without ever having met. We decided early-on that PRQL would always be open-source and would never have a commercial product (despite lots of outside interest to fund a seed round!). Because languages are so deep in the stack, and the data stack has so many players, the best chance of building a great language is to build an open language.

We still have a long way to go. While PRQL is usable, it has lots of missing features, and an incredible amount of unfulfilled potential, including a language server, cohesion with databases, and type inference. Over the coming weeks, we'd like to grow the number of intrepid users experimenting with PRQL in their projects, prioritize features that will unblock them, and then start fulfilling PRQL's potential by working through our roadmap.

The best way to experience PRQL is to try it. Check out our website and the Playground. Start using PRQL for your own projects in dbt, Jupyter notebooks, and Prefect workflows.

Keep in touch with PRQL by following the project on Twitter, joining us on Discord, starring the repo.

Contribute to the project — we're a really friendly community, whether you're a recent SQL user or an advanced rust programmer. We need bug reports, documentation tweaks & feature requests — just as much as we need compiler improvements written in rust.

I especially want to give Aljaž Mur Eržen (@aljazerzen) the credit he deserves, who has contributed the majority of the difficult work of building out the compiler. Much credit also goes to Charlie Sanders (@charlie-sanders), one of PRQL's earliest supporters and the author of PyPrql, and Ryan Patterson-Cross (@rbpatt2019), who built the Jupyter integration among other Python contributions.

Other contributors who deserve a special mention include: @roG0d, @snth, @kwigley

Thank you, and we look forward to your feedback!
0.1.1(Apr 2, 2022)
0.1.1 contains big improvements to the compiler

Thank you to the contributors to this release, in particular @aljazerzen, who is responsible for an internal rewrite of much of the compiler: Aljaž Mur Eržen (@aljazerzen), Maximilian Roos (@max-sixty), shota kizawa (@kiibo382).

The internal rewrite fixes bugs in more complicated queries — such as variables that span multiple CTEs in the generated SQL — and introduces a more coherent & extensible approach to the compilation. It also adds options to prql compile --format to understand how variables and functions are being compiled.

It radically improves many error messages:

$ echo '
from employees
join side:[left] managers [id]
' | prql compile

Error:
   ╭─[-:2:11]
   │
 2 │ join side:[left] managers [id]
   ·           ───┬───
   ·              ╰───── expected ident, but found List

We have the beginning of a language reference. Improving its content is a priority over the coming days.

Install with cargo install prql. If you're interested in joining, check out the Contributing section in the Readme.
0.1.0(Mar 21, 2022)
PRQL just hit 0.1!

A huge thank you to the contributors to this release:

Aljaž Mur Eržen (@aljazerzen), Andrey Popp (@andreypopp), Daniel Gasienica (@gasi), Daniel Jones (@danjones1618), Ian Fisher (@iafisher), Kyle Wigley (@kwigley), Maximilian Roos (@max-sixty), Peder Bergebakken Sundt (@pbsds), charlie sando (@qorrect), and shota kizawa (@kiibo382).

0.1 means:

It works™, for basic transformations such as filter, select, aggregate, take, sort, & join. Variables (derive), functions (func) and CTEs (table) work.

More advanced language features are forthcoming, like better inline pipelines, window clauses, and arrays.

It's not friendly at the moment:

It runs from a CLI only, taking input from a file or stdin and writing to a file or stdout.

Error messages are bad.

For an interactive experience, combine with a tool like Up.

The documentation is lacking.

Our current top priority is to have some decent documentation #233.

It doesn't support changing the dialect.

It has bugs. Please report them!

It has sharp corners. Please report grazes!

We'll release backward-incompatible changes. The versioning system for the language is not yet implemented.

Give it a try and let us know your feedback — cargo install prql.

There's much more on the Readme, including examples, how to contribute, and a Roadmap.