Generate type-checked Rust from your PostgreSQL.

Last update: Dec 25, 2022

Overview

Cornucopia

Generate type checked Rust from your SQL

Install | Example

Cornucopia is a small CLI utility resting on postgres designed to facilitate PostgreSQL workflows in Rust.

Cornucopia aims to get out of your way, transpiling your PostgreSQL queries to Rust on demand. Each query is prepared against your schema, ensuring that the query statements are valid SQL. These prepared statements are then used to generate properly type-checked Rust code for each query.

Features

SQL-first. Your database schema is the source of truth. No ORM.
Custom user types (composites, enums and domains).
Async (tokio_postgres) and sync drivers (postgres).
Ergonomic type mapping.
One-dimensional array types.
Granular nullity control.
Optional migration management.
Build your queries against your own live database, or let Cornucopia manage that for you.
Use the connection type that you want (pooled or not, transaction or not). You can mix and match them freely.
Compatible with build scripts to generate whenever your SQL changes.

Keep reading for more info, or take a look at the basic example for a quickstart 🚀.

Install

Container manager

Cornucopia spawns a postgres container when it generates your Rust modules, so you'll need a working docker or podman command. Note: If you only work in live mode, you may not need a container manager since you'll manage the database yourself.

To use docker on Linux, non-sudo users need to be in the docker group. For a step-by-step guide, please read the official docker installation and post-installation docs.

No special installation steps are needed for podman, but note that you will need to pass a CLI flag to cornucopia (-p or --podman) as it defaults to docker.

Dependencies

Required

Client code: cornucopia_client.
Postgres type utils: postgres_types.

(Optional) Async

Runtime: tokio.
Database driver: tokio_postgres.
Async tools: futures.

(Optional) Connection pooling

Pooled connections: deadpool-postgres.

(Optional) Extra types using `tokio_postgres` features

Crate	available types	`tokio_postgres` feature
`serde_json`	`Value`	`with-serde_json-1`
`time`	`Date` `Time` `PrimitiveDateTime` `OffsetDateTime`	`with-time-0_3`
`uuid`	`Uuid`	`with-uuid-1`
`eui48`	`MacAddress`	`with-eui48-1`

Full dependencies

The code block below shows what your dependencies might look like with every feature that cornucopia supports enabled:

# Cargo.toml
[dependencies]
tokio = { version = "1.18.1", features = ["full"] }
deadpool-postgres = { version = "0.10.2" }
cornucopia_client = "0.2.2"
futures = "0.3.21"
tokio-postgres = { version = "0.7.6", features = [
    "with-serde_json-1",
    "with-time-0_3",
    "with-uuid-1",
    "with-eui48-1",
] }
serde = { version = "1.0.137", features = ["derive"] }
serde_json = "1.0.81"
time = "0.3.9"
uuid = "1.0.0"
eui48 = "1.1.0"

You can omit tokio-postgres feature flags for json, time, uuid, eui48 and their corresponding crates if you don't need them.

Cornucopia CLI

Aside from the code dependencies, you will need the cornucopia CLI to generate your Rust modules. This can be done via a simple cargo install cornucopia which will pull the latest binary and install it in your cargo path. Note that once the queries have been generated, they build and run standalone without requiring the CLI.

Concepts

This section explains a bit more about how Cornucopia works. If you just want to get started, you should take a look at the basic example.

Cornucopia is pretty simple to use. In the next sections, we'll explore the basic usage, but feel free to look at the CLI's whole interface using the --help option at any point. For convenience, the CLI's reference document is also available in this repository.

Migrations

The basic cornucopia generate command creates a new container, runs your migrations, generates your queries and cleans up the container. If you want to manage the database and migrations yourself, use the cornucopia generate live command to connect to an arbitrary live database. Keep in mind that your queries must still be otherwise compatible with Cornucopia (e.g. with regard to supported types and annotation syntax).

New migrations can be added using the command cornucopia migration new.

Finally, as a convenience, you can use cornucopia migration run to run migrations on your database too if you so desire. This feature works for simple cases, but it is not yet thoroughly tested, so it's advisable to use a more robust migration system.

Queries

Each .sql file in your queries directory will be converted into a Rust module containing your generated queries. Each query is actually prepared against your database, ensuring as many errors as possible will be caught before production. The generated functions are fully typed, giving you insight into your SQL and pretty strong guards against runtime errors.

Generated modules

Assuming you have the following migration

CREATE TABLE Author (
    Id SERIAL NOT NULL,
    Name VARCHAR(70) NOT NULL,
    Country VARCHAR(100) NOT NULL,
    PRIMARY KEY(Id)
);

and the following query

--! author_name_starting_with
SELECT
    *
FROM
    Author
WHERE
    name LIKE CONCAT(:start_str::text, '%');

After generating your queries with cornucopia, you could use it like this

let authors = author_name_starting_with(client, &"Joh").vec().await?;
// Print all the authors whose name starts with "Joh"
for author in authors {
    println!("{}", author.name);
}

The generated code covers a lot more than that, but the above should be fairly representative of how you could use the generated code. Head over to the examples if you want to see more features in action.

The diagram below shows a very high level representation of the items generated by Cornucopia. It's not exhaustive, but it can help you wrap your head around the generated code if you're starting out.

Query annotation syntax

--! example_query
select * from authors
where first_name = :first_name and last_name = :last_name

Notice that bind parameters are specified by name with the :colon_identifier notation, instead of by index. This allows queries to be very concise while also being more expressive.
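
To illustrate what the named notation replaces, here is a minimal sketch of how :name parameters could be rewritten into PostgreSQL's positional $n placeholders. This is not Cornucopia's actual parser: the function name to_positional is hypothetical, and the sketch deliberately ignores string literals and comments.

```rust
/// Rewrites `:name` bind parameters into `$1`, `$2`, ... placeholders.
/// Returns the rewritten SQL and the parameter names in positional order.
/// Hypothetical helper: it does not handle string literals or comments.
fn to_positional(sql: &str) -> (String, Vec<String>) {
    let chars: Vec<char> = sql.chars().collect();
    let mut out = String::new();
    let mut names: Vec<String> = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c == ':' {
            // `::` is a PostgreSQL type cast, not a bind parameter.
            if i + 1 < chars.len() && chars[i + 1] == ':' {
                out.push_str("::");
                i += 2;
                continue;
            }
            // Collect the identifier that follows the colon.
            let start = i + 1;
            let mut end = start;
            while end < chars.len() && (chars[end].is_alphanumeric() || chars[end] == '_') {
                end += 1;
            }
            if end > start {
                let name: String = chars[start..end].iter().collect();
                // Reuse the same index when a name appears more than once.
                let existing = names.iter().position(|n| *n == name);
                let index = match existing {
                    Some(pos) => pos + 1,
                    None => {
                        names.push(name);
                        names.len()
                    }
                };
                out.push('$');
                out.push_str(&index.to_string());
                i = end;
                continue;
            }
        }
        out.push(c);
        i += 1;
    }
    (out, names)
}
```

Applied to the example query above, the two parameters would become $1 and $2, with first_name and last_name recorded in that order. A cast such as :start_str::text keeps its :: untouched.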

Annotations are whitespace-insignificant and can be split across multiple lines too

--! authors (
--!
--! )

Comments that do not start with --! (e.g. -- This) are simply ignored by Cornucopia, so feel free to use them as you usually would.

Nullable columns

--! authors_named_john(first_name?): (name?)
select name from authors 
where first_name = :first_name

Query parameters and columns can specify their nullity by using the (hello?, world?) notation. Fields that are not present are assumed to be non-null.
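
As a rough illustration of what the ? marker means on the Rust side, the sketch below parses a list such as hello?, world and wraps nullable fields in Option. The Field struct and both function names are hypothetical and are not part of Cornucopia's API; the real code generator derives the base Rust type from the database.

```rust
/// A field named in a query annotation, with its declared nullity.
#[derive(Debug, PartialEq)]
struct Field {
    name: String,
    nullable: bool,
}

/// Parses the inside of an annotation list such as `hello?, world`.
/// Fields without a trailing `?` are treated as non-null.
fn parse_fields(list: &str) -> Vec<Field> {
    list.split(',')
        .map(str::trim)
        .filter(|item| !item.is_empty())
        .map(|item| match item.strip_suffix('?') {
            Some(name) => Field {
                name: name.trim().to_string(),
                nullable: true,
            },
            None => Field {
                name: item.to_string(),
                nullable: false,
            },
        })
        .collect()
}

/// Wraps the base Rust type in `Option` when the field is nullable.
fn field_rust_type(field: &Field, base: &str) -> String {
    if field.nullable {
        format!("Option<{}>", base)
    } else {
        base.to_string()
    }
}
```

Under these assumptions, a text column marked name? would surface as Option<String>, while an unmarked one stays a plain String.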

Transactions

Generated queries take a GenericClient as parameter, which accepts both Clients and Transactions. That means you can use the same generated queries for both single statements and transactions.
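
The idea can be sketched with mock types and no database. Everything below is a stand-in: the GenericClient trait, Client and Transaction here are simplified mocks written for this example (the driver's real trait has a different, asynchronous interface), and insert_author is a hypothetical query function.

```rust
/// Mock stand-in for the driver's `GenericClient` trait (not the real one).
trait GenericClient {
    /// Records a statement and returns how many statements have run so far.
    fn execute(&mut self, sql: &str, params: &[&str]) -> usize;
}

/// Mock connection that only logs what it is asked to run.
struct Client {
    log: Vec<String>,
}

/// Mock transaction that borrows the connection it runs on.
struct Transaction<'a> {
    client: &'a mut Client,
}

impl GenericClient for Client {
    fn execute(&mut self, sql: &str, params: &[&str]) -> usize {
        self.log.push(format!("{} {:?}", sql, params));
        self.log.len()
    }
}

impl<'a> GenericClient for Transaction<'a> {
    fn execute(&mut self, sql: &str, params: &[&str]) -> usize {
        // A real transaction would run between BEGIN and COMMIT;
        // the mock simply forwards to the underlying connection.
        self.client.execute(sql, params)
    }
}

/// Shaped like a generated query: generic over anything implementing the trait.
fn insert_author<C: GenericClient>(client: &mut C, name: &str, country: &str) -> usize {
    client.execute(
        "INSERT INTO Author (Name, Country) VALUES ($1, $2)",
        &[name, country],
    )
}
```

Because insert_author only asks for the trait, the same function body serves both a plain connection and a transaction, which is the property the generated queries rely on.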

Automatically generate queries

You can use Rust's build script feature to automatically regenerate your Cornucopia queries when building your crate, but only when your SQL has changed. The simplest way to achieve this is to call Cornucopia's CLI inside your build.rs file. You can learn more about this feature in this example.
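
A minimal sketch of such a build script follows. Only the cornucopia generate command comes from this document; the watched directory names (queries and migrations) are assumptions about your project layout, and the helper names are hypothetical. The cargo:rerun-if-changed lines are what restrict regeneration to SQL changes.

```rust
use std::process::Command;

/// Directories whose changes should trigger regeneration (assumed names).
const WATCHED: [&str; 2] = ["queries", "migrations"];

/// Builds the CLI invocation. Add flags such as `-p` for podman as needed.
fn cornucopia_command() -> Command {
    let mut cmd = Command::new("cornucopia");
    cmd.arg("generate");
    cmd
}

/// Intended as the body of `main` in build.rs: tells Cargo what to watch,
/// then runs the CLI and reports whether it exited successfully.
fn run_build_step() -> std::io::Result<bool> {
    for dir in WATCHED.iter() {
        // Cargo reruns the build script only when these paths change.
        println!("cargo:rerun-if-changed={}", dir);
    }
    let status = cornucopia_command().status()?;
    Ok(status.success())
}
```

In an actual build.rs, main would call run_build_step and fail the build (for example by panicking) when it returns an error or false, so that invalid SQL stops compilation early.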

Supported types

Base types

PostgreSQL type	Rust type
`bool`, `boolean`	`bool`
`"char"`	`i8`
`smallint`, `int2`, `smallserial`, `serial2`	`i16`
`int`, `int4`, `serial`, `serial4`	`i32`
`bigint`, `int8`, `bigserial`, `serial8`	`i64`
`real`, `float4`	`f32`
`double precision`, `float8`	`f64`
`text`	`String`
`varchar`	`String`
`bytea`	`Vec<u8>`
`timestamp without time zone`, `timestamp`	`time::PrimitiveDateTime`
`timestamp with time zone`, `timestamptz`	`time::OffsetDateTime`
`date`	`time::Date`
`time`	`time::Time`
`json`	`serde_json::Value`
`jsonb`	`serde_json::Value`
`uuid`	`uuid::Uuid`
`inet`	`std::net::IpAddr`
`macaddr`	`eui48::MacAddress`

Custom types

Cornucopia also supports user-defined enums, composites and domains. Just like base types, custom types will be generated automatically by inspecting your database. The only requirement for your custom types is that they be based on other supported types (base or custom). Cornucopia is also aware of your types' namespaces (what PostgreSQL calls schemas), so it will correctly handle custom types like my_schema.my_custom_type.

Array types

Cornucopia supports one-dimensional arrays whose element type is itself supported. That is, Cornucopia supports example_elem_type[] if example_elem_type is a type supported by Cornucopia (base or custom).
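
The rule can be sketched as a small lookup. The function names below are hypothetical, only a subset of the base types table is covered, and rejecting the [][] spelling is an illustrative choice for this sketch rather than a description of Cornucopia's internals.

```rust
/// Maps a few PostgreSQL base types from the table above to Rust type names.
fn base_rust_type(pg_type: &str) -> Option<&'static str> {
    match pg_type {
        "bool" | "boolean" => Some("bool"),
        "smallint" | "int2" => Some("i16"),
        "int" | "int4" => Some("i32"),
        "bigint" | "int8" => Some("i64"),
        "real" | "float4" => Some("f32"),
        "double precision" | "float8" => Some("f64"),
        "text" | "varchar" => Some("String"),
        "bytea" => Some("Vec<u8>"),
        _ => None,
    }
}

/// Resolves `elem[]` to `Vec<Elem>` when `elem` is itself supported.
/// Multi-dimensional spellings such as `elem[][]` yield `None`.
fn rust_type_for(pg_type: &str) -> Option<String> {
    match pg_type.strip_suffix("[]") {
        Some(elem) if elem.ends_with("[]") => None,
        Some(elem) => base_rust_type(elem).map(|t| format!("Vec<{}>", t)),
        None => base_rust_type(pg_type).map(str::to_string),
    }
}
```

With this sketch, text[] resolves to Vec<String>, while an unsupported element type makes the whole array type unsupported.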

MSRV

This crate uses Rust 2021 edition, which requires at least version 1.56.

License

Licensed under either of

Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Comments

New query architecture
This PR tracks the progress of the new query architecture for the upcoming v0.8. This version is a significant re-architecture, so there are quite a few things to do to ensure a smooth release. Some of the relevant discussions are in #45, #46, and #47.

[x] Overhaul parser

[x] Overhaul type registration

[x] Implement query generation

[x] Update client code

[x] Comment/Improve source (not perfect, but it's barely ok)

[x] Update examples

[x] Update tests

[x] Update README

Closes #25 #39 #40 #41 #45 #46 #47 #48 #56
opened by LouisGariepy 36
Ergonomic params exploration
I am exploring using a lightweight solution using a trait :

pub trait StringSql: std::fmt::Debug + ToSql + Sync {}
impl StringSql for String {}
impl StringSql for &str {}
impl StringSql for Cow<'_, str> {}
impl StringSql for Box<str> {}

This way we do not use any wrapper or dynamic dispatch, we support all types supported by postgres and we stay strongly typed.

It works well with Option and Vec but can be less user-friendly in some cases:

insert_book()
    .bind(client, &None::<&str>, &"Necronomicon")
    .unwrap()

Will fix #140

TODO:

Generate a generic params structure for composite types? (This requires keeping field info in CornucopiaType.)
opened by Virgiel 28

Borrowing raw queries with mapper

Using a borrowing raw query with a mapper lambda could give a very ergonomic way to deserialize structs with minimal allocation.

This code:

--! authors() #
SELECT name, country FROM Author

Would generate an additional function:

pub async fn authors_map<T: GenericClient, R>(
    client: &T,
    mut mapper: impl FnMut((&str, &str)) -> R,
) -> Result<impl Stream<Item = Result<R, Error>>, Error> {
    let stmt = client.prepare("SELECT name, country FROM Author;").await?;
    let stream = client
        .query_raw(&stmt, std::iter::empty::<u32>())
        .await?
        .map(move |res| {
            res.map(|res| {
                let return_value_1: &str = res.get(0);
                let return_value_2: &str = res.get(1);
                mapper((return_value_1, return_value_2))
            })
        });
    Ok(stream.into_stream())
}

Which can be used like this:

pub struct Person {
    name: String,
    country: String,
}

impl From<(&str, &str)> for Person {
    fn from((name, country): (&str, &str)) -> Self {
        Self {
            name: name.trim().to_owned(),
            country: country
                .split('|')
                .map(|s| s.trim())
                .nth(2)
                .unwrap_or_default()
                .to_owned(),
        }
    }
}

pub async fn get_authors<T: GenericClient>(client: &T) -> Result<Vec<Person>, Error> {
    authors_map(client, |raw| Person::from(raw)).await?.try_collect().await
}

I tried a function generic over types implementing From, but it is a lifetime nightmare.

opened by Virgiel 19

CLI says column does not exist on insert sql query

Hi I am trying to generate the following query:

--! insert_user()
INSERT INTO users (username) VALUES (username)

Error while preparing query "insert_user" [file: "queries/module_3.sql", line: 1] (column "username" does not exist).

I am able to retrieve users but for some reason the above is not working. Also if I put :username that throws another error.

Am using with Postgres.

On another note I was wondering how this may perform over using the Postgres driver natively or vs Diesel or sqlx?

It seems your approach is native as it's essentially taking SQL queries and turning them into functions that engage the native postgres driver directly. Closer to metal vs an ORM, a time saving function generation solution.

opened by AppyCat 16
Null everywhere
Currently, we can declare the nullity of the returned rows fields:

CREATE TYPE composite AS (
    name TEXT,
    age INT NOT NULL
);
CREATE TABLE items (
    composite composite,
    name TEXT NOT NULL,
    data BYTEA,
    datas BYTEA[]
);

--! items ?{composite, data, datas}
SELECT * FROM items;

This is enough to cover the nullity of data, datas and composite but not the values of datas and composite.name.

Nullable array value

We need a way to declare the nullity of the array and the values of the array independently which current syntax does not allow.

We could get back to something close to the old syntax:

datas? means nullable array

datas[?] means nullable array values

datas?, datas[?] or datas?[?] means nullable array and array values

--! items {composite?, data?, datas?[?]}
SELECT * FROM items;
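
To make the proposal concrete, here is a small sketch of how the three spellings could translate into Rust types. The struct and function names are hypothetical, and the syntax itself is only the proposal discussed in this issue, not necessarily what Cornucopia ships.

```rust
/// Nullity parsed from a proposed marker such as `datas?[?]`.
#[derive(Debug, PartialEq)]
struct ArrayNullity {
    name: String,
    array_nullable: bool,
    elements_nullable: bool,
}

/// Parses `datas`, `datas?`, `datas[?]` or `datas?[?]`.
fn parse_array_nullity(spec: &str) -> ArrayNullity {
    let spec = spec.trim();
    // A trailing `[?]` marks the array values as nullable.
    let (rest, elements_nullable) = match spec.strip_suffix("[?]") {
        Some(rest) => (rest, true),
        None => (spec, false),
    };
    // A `?` right after the name marks the array itself as nullable.
    let (name, array_nullable) = match rest.strip_suffix('?') {
        Some(name) => (name, true),
        None => (rest, false),
    };
    ArrayNullity {
        name: name.to_string(),
        array_nullable,
        elements_nullable,
    }
}

/// Renders the Rust type for an array of `elem` with the given nullity.
fn rust_array_type(nullity: &ArrayNullity, elem: &str) -> String {
    let elem = if nullity.elements_nullable {
        format!("Option<{}>", elem)
    } else {
        elem.to_string()
    };
    let array = format!("Vec<{}>", elem);
    if nullity.array_nullable {
        format!("Option<{}>", array)
    } else {
        array
    }
}
```

For the BYTEA[] column above, datas?[?] would then read as Option<Vec<Option<Vec<u8>>>>, which shows why the two levels of nullity need to be declared independently.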

Nullable composite fields

We could declare it in functions declaration:

composite? means nullable composite

composite.name? means nullable composite's name

composite?, composite.name? or composite?.name? means nullable composite and composite's name

--! items {composite?.name?, data?, datas?}
SELECT * FROM items;

But this implies a redeclaration each time we use this composite type. We could add a new syntax to declare the nullity of fields of database types :

--:db composite{name?}

Nullable parameters

We could use the same syntax for parameters and rows:

--! new_items (composite?, data?, datas?[?]) {data?}
INSERT INTO items (composite, name, data, datas)
VALUES (:composite, :name, :data, :datas)
RETURNING name, data;

Nullable named generated types fields

Depends on #63

We could also use this syntax for named generated type :

--:params ItemParams{composite?, data?, datas?[?]}
--:row Item{composite?, data?, datas?[?]}
--! new_items (IdParams) -> Item
INSERT INTO items (composite, name, data, datas)
VALUES (:composite, :name, :data, :datas)
RETURNING *;
feature
opened by Virgiel 10
unresolved import `cornucopia`

The following error message occurs in VS Code: use of undeclared crate or module `cornucopia` (rustc E0432)

The above error was traced back to a 'build.rs' file with code to auto build for a docker container
support

opened by iodapson 9

Test suite

The test runner is a separate binary that can be run as a CLI with formatted colored output or as a cargo unit test. Two types of tests are available, error tests that check that errors are caught and return a well-formatted message, and example tests that check that all the examples work.

Error tests are described in TOML files:

[[test]]
name = 'ColumnNameAlreadyTaken'
query = '''
--! items
SELECT name, price AS name FROM item;
'''
migration = '''
CREATE TABLE item (
  name TEXT,
  price INTEGER
);
'''
error = '''
Error while preparing query "items" [file: "queries/module_1.sql", line: 1]:
Two or more columns have the same name: `name`. Consider disambiguing the column names with `AS` clauses.'''

Closes #60

opened by Virgiel 9

error: an inner attribute is not permitted in this context

I'm getting the following error from the generated code. If I edit cornucopia.rs directly and make the changes the compiler suggests it works again.

error: an inner attribute is not permitted in this context
  --> /workspace/target/debug/build/app-62a31f47f074433e/out/cornucopia.rs:5:5
   |
5  |       #![allow(dead_code)]
   |       ^^^^^^^^^^^^^^^^^^^^
6  | /     pub mod types { pub mod public { #[derive( Debug, postgres_types::ToSql, postgres_types::FromSql, Clone, Copy, PartialEq, Eq)]
7  | |                         #[postgres(name = "audit_action")]
8  | |                         pub enum AuditAction { AddMember,DeleteMember,AddSecret,DeleteSecret,AccessSecrets,NewServiceAccount,DeleteServic...
9  | | #[derive( Debug, postgres_types::ToSql, postgres_types::FromSql, Clone, Copy, PartialEq, Eq)]
...  |
16 | |                         #[postgres(name = "permission")]
17 | |                         pub enum Permission { ManageTeam } } }pub mod queries { pub mod audit { use futures::{{StreamExt, TryStreamExt}};...
   | |______________________________________________________________- the inner attribute doesn't annotate this module
   |
   = note: inner attributes, like `#![no_std]`, annotate the item enclosing them, and are usually found at the beginning of source files
help: to annotate the module, change the attribute from inner to outer style
   |
5  -     #![allow(dead_code)]
5  +     #[allow(dead_code)]

bug

opened by ianpurton 7

Problematic build scripts.
Our build scripts have some pretty heavy side effects (i.e. spawning a whole managed run of cornucopia). Because of this, things are starting to get weird:

Rust Analyzer runs build scripts in background, which can cause unexpected IDE crashes/slowdowns.

Errors in build scripts sometimes leave our managed container open? I'm not sure why/how this happens, but it is annoying since you then have to manually remove the container.

I'm not 100% sure I got the causes right, but I'd like to investigate this.

Related to #98 .
bug CI
opened by LouisGariepy 7
Stop formatting generated code

Originally, the CLI called rustfmt to format the generated code, and I proposed an alternative using prettyplease, which would not shell out to a command or rely on rustfmt being present. However, as we are not using a token stream, we have to parse the generated code before formatting it, and this has a significant cost: about 21% of our test time.

I suggest not formatting the generated code as it is usable without being formatted. For people who need all their code formatted, they should call rustfmt themselves (our integration test does this to keep the CI happy).
feature

opened by Virgiel 7
Provide a way for users to customize `derive`s on custom types

I don't want to presuppose too much about how the generated code will be used, thus, so far, I've refrained from adding any superfluous derives to the generated types.

This might be limiting to some users wishing to add derives to their types (like Serialize or Clone). I'm thinking of flags like --custom-struct-derive, --custom-enum-derive and --custom-derive which could be used to add the derive only to structs, enums, or both.

This will require a bit more thought though, so if you're reading this thread because you want this feature, feel free to ping me and explain how you'd like this feature to work in practice.
feature help wanted

opened by LouisGariepy 6
Add a changelog.
Since our main branch is currently sitting at v0.9, and there's a bunch of incoming PRs to merge, now is a good time to add a CHANGELOG.md for future releases.

There are a number of different ways to go about this, but notably:

Write the changelog manually, following https://keepachangelog.com/en/1.0.0/.

Use conventional commits to automatically generate a changelog.

Option 1 (manual logging) could be done at the release level, or as @tbillington suggested, we could require each PR to maintain relevant changelog entries.

Note that if we go with option 2, I won't be enforcing a pre-commit hook for conventional commits. In my opinion, it's the responsibility of the maintainers (not the casual committer) to ensure that proper conventional commits are used. Git provides tools for maintainers to achieve this, notably interactive rebasing.

I'd like to distinguish between "release notes" and "changelog".

The changelog is meant for developers or power users who want to know on a fine-grained level what's happening in the repository and look at the relevant PRs.

Release notes are meant for the average user, explaining at a high level what's new, what's changed and how to upgrade. By definition they must be written manually, and I'm committed to writing these out myself when release time comes, regardless of how we decide to generate the changelog.
meta
opened by LouisGariepy 6
Builder or similar pattern for Cornucopia configuration.

As mentioned in this comment, every configuration and codegen setting incurs a breaking change. We could avoid this by using a builder pattern to ensure less breakage and more ergonomic configuration for our users.
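
A minimal sketch of what such a builder could look like is shown below. All names, fields and default values are hypothetical and chosen only for illustration; they are not Cornucopia's actual configuration API.

```rust
/// Hypothetical codegen settings; the field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct CodegenSettings {
    queries_path: String,
    destination: String,
    is_async: bool,
}

/// Builder holding only the values the caller chose to override.
#[derive(Debug, Default)]
struct CodegenSettingsBuilder {
    queries_path: Option<String>,
    destination: Option<String>,
    is_async: Option<bool>,
}

impl CodegenSettingsBuilder {
    fn new() -> Self {
        Self::default()
    }

    fn queries_path(mut self, path: &str) -> Self {
        self.queries_path = Some(path.to_string());
        self
    }

    fn destination(mut self, path: &str) -> Self {
        self.destination = Some(path.to_string());
        self
    }

    fn is_async(mut self, value: bool) -> Self {
        self.is_async = Some(value);
        self
    }

    /// Fills every unset option with a default (defaults assumed here).
    fn build(self) -> CodegenSettings {
        CodegenSettings {
            queries_path: self.queries_path.unwrap_or_else(|| "queries".to_string()),
            destination: self
                .destination
                .unwrap_or_else(|| "src/cornucopia.rs".to_string()),
            is_async: self.is_async.unwrap_or(true),
        }
    }
}
```

The point of the pattern is that a new setting becomes a new optional method with a default, so existing callers such as CodegenSettingsBuilder::new().is_async(false).build() keep compiling unchanged.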

Issue #177 is kind of stalled on this currently.
feature breaking configuration

opened by LouisGariepy 0
Workspace improvements
This project has grown both in size and complexity more than I ever imagined it would. To keep up with this growth, and to prepare for the future, I think a little workspace cleanup is warranted.

This PR does not change any functionality of Cornucopia. In fact, it doesn't even modify the code of the main crates beyond some moving/renaming.

The notable changes are:

Removing some unused non-code files.

Rename the bench folder to benches (it's the correct name according to the standard package layout)

Add README files to some of our internal crates to clarify their use.

Made it possible to run integration tests with podman (docker is really annoying on some systems)

Added comments in our Cargo.toml manifests to document the various dependencies.

Moved all published crates into a crates folder. This makes publishing easier and cleanly separates the internal vs external crates.

Prefix the test crates with test_ to better group them in the filesystem.

Various other minor fixes

I noticed that a couple error messages could be reworded, and some need clarifications, but I'll keep that for another PR.

I also noticed that our main crate crashes with obscure error messages in some cases, notably when there are no queries in the designated folder, or no schema file, etc. Again, this is for another PR.
opened by LouisGariepy 0
RFC: A new codegen architecture.
It was brought to my attention over a discussion on Discord that we could improve our current architecture by generating a crate instead of a single file for our codegen.

The main benefit of this is that it would allow us to automatically generate a Cargo.toml file customized to support all the necessary dependencies and features required by the user's queries, without polluting their own manifest. This would hide all the complexity of the current setup while automatically ensuring that compatible versions and features are used. This is all done without tampering with the user's own Cargo.toml, so there is less risk of unwanted side effects. This would be a neat solution to #163. We also wouldn't have to read and update the toml files; we could simply generate the whole manifest, which is much simpler.

Another benefit is that it would allow us more freedom to organize the generated code in a readable/understandable way instead of having to cram everything into a single file (e.g. db types, sync/async #176 ). Taking this further, we could even generate the client crates directly into this generated crate which would save us from having to publish the 3 client crates.

As a side-effect benefit, cargo can parallelize work over crates, so if you have a lot of queries/generated code, this will improve your compile times.

So, in summary,

Much better user experience

Better compile times

More readable/scalable codegen.

Gives us more flexibility over generated code and its dependencies in general (future-proof).

Drawback

Cargo won't publish crates with path dependencies (yet, and the maintainers seem to think it's too large of a change to ever be merged), so the user either has to

Not publish their crate.

Publish the generated crate separately. This is true for any internal crate, so I don't think it's such a big drawback, but some users might find that annoying nonetheless.

We could keep the current workflow under a "manual configuration" option (where the user must manually declare the generated modules and manage the correct dependencies) for cases where this is absolutely unwanted.
feature dependencies breaking codegen
opened by LouisGariepy 24
Allow quoted identifiers in certain places
Quoted identifiers are now allowed in places where it makes sense.

-- In type annotations (both name and fields).
--: "foo.bar"(a?, "b.c"?)
-- [...]

-- In query data structs (both input and output).
--! foo_bar Foo("a"?): Bar("b.c"?)
-- [...]

-- In composite type names/fields.
CREATE TYPE "named_composite.with_dot" AS (
    "this.is.inconceivable" "enum.with_dot"
);

-- In enum type names/variants.
CREATE TYPE "enum.with_dot" AS Enum('variant.with_dot');

When identifiers are converted to Rust names, they are normalized by replacing all non-alphanumeric characters with an underscore (_) in addition to escaping them with a raw identifier prefix r# if they happen to be a reserved keyword in Rust.
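
The normalization described above can be sketched as follows. The function name and the abbreviated keyword list are assumptions made for this example; the PR's actual implementation may differ, and the sketch does not handle names that start with a digit.

```rust
/// A partial list of Rust keywords, enough for the sketch.
const KEYWORDS: [&str; 8] = ["type", "match", "fn", "struct", "enum", "mod", "use", "loop"];

/// Turns a (possibly quoted) SQL identifier into a usable Rust name.
fn normalize_ident(sql_name: &str) -> String {
    // Replace every non-alphanumeric character with an underscore.
    let replaced: String = sql_name
        .chars()
        .map(|c| if c.is_alphanumeric() { c } else { '_' })
        .collect();
    // Escape reserved words with the raw identifier prefix.
    if KEYWORDS.iter().any(|k| *k == replaced) {
        format!("r#{}", replaced)
    } else {
        replaced
    }
}
```

Under these assumptions, enum.with_dot becomes enum_with_dot, and a column literally named type becomes r#type.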

Fixes #184.

Implementation notes

For whatever reason, I couldn't get the quoted identifier parser working with delimited_by, but perhaps that's just my unfamiliarity with chumsky. I got it to compile, but it would keep telling me it's missing a closing quotation mark, even when it was present. I managed to work around it with a number of alternative combinators, but I'll happily replace those with the proper function if anyone knows how to get it to work.
opened by skreborn 8
Dots in type names cause parse error
Dots in type names, while uncommon and generally advised against, are perfectly valid. PostgreSQL allows using quotation marks to escape names containing otherwise illegal characters.

As this is a clearly uncommon edge case, I didn't actually expect this to work, but I believe it should not be a difficult issue to fix either.

What works

Type definitions like the following work flawlessly as-is.

--: bar(a?, b?) -- Should be translated to "bar"

What does not

But adding a dot in the name...

--: foo.bar(a?, b?) -- Should be translated to "foo.bar"

...results in a parse error.

--- stderr
Error: Couldn't parse queries
feature parser
opened by skreborn 3
