Generate type-checked Rust from your PostgreSQL.

Last update: Jun 19, 2022

Cornucopia

Generate type checked Rust from your SQL


Cornucopia is a small CLI utility resting on postgres designed to facilitate PostgreSQL workflows in Rust.

Cornucopia aims to get out of your way, transpiling your PostgreSQL queries to Rust on demand. Each query is prepared against your schema, ensuring that the query statements are valid SQL. These prepared statements are then be used to generate properly type-checked Rust code for this query.

Features
  • SQL-first. Your database schema is the source of truth. No ORM.
  • Custom user types (composites, enums and domains).
  • Async (tokio_postgres) and sync drivers (postgres).
  • Ergonomic type mapping.
  • One-dimensional array types.
  • Granular nullity control.
  • Optional migration management.
  • Build your queries against your own live database, or let Cornucopia manage that for you.
  • Use the connection type that you want (pooled or not, transaction or not). You can mix and match them freely.
  • Compatible with build scripts to generate whenever your SQL changes.

Keep reading for more info, or take a look at the basic example for a quickstart 🚀 .


Install

Container manager

Cornucopia spawns a postgres container when it generates your Rust modules, so, you'll need a working docker or podman command. Note: If you only work in live mode, you may not need a container manager since you'll manage the database yourself.

To use docker on Linux, non-sudo users need to be in the docker group. For a step-by-step guide, please read the official docker installation and post-installation docs.

No special installation steps are needed for podman, but note that you will need to pass a CLI flag to cornucopia (-p or --podman) as it defaults to docker.

Dependencies

Required

  • Client code: cornucopia_client.
  • Postgres type utils: postgres_types.

(Optional) Async

  • Runtime: tokio.
  • Database driver: tokio_postgres .
  • Async tools: futures.

(Optional) Connection pooling

  • Pooled connections: deadpool-postgres.

(Optional) Extra types using tokio_postgres features

Crate available types tokio_postgres feature
serde_json Value with-serde_json-1
time Date Time PrimitiveDateTime OffsetDateTime with-time-0_3
uuid Uuid with-uuid-1
eui48 MacAddress with-eui48-1

Full dependencies

The code block below shows what your dependencies might look like with every feature that cornucopia supports enabled:

# Cargo.toml
[dependencies]
tokio = { version = "1.18.1", features = ["full"] }
deadpool-postgres = { version = "0.10.2" }
cornucopia_client = "0.2.2"
futures = "0.3.21"
tokio-postgres = { version = "0.7.6", features = [
    "with-serde_json-1",
    "with-time-0_3",
    "with-uuid-1",
    "with-eui48-1",
] }
serde = { version = "1.0.137", features = ["derive"] }
serde_json = "1.0.81"
time = "0.3.9"
uuid = "1.0.0"
eui48 = "1.1.0"

You can omit tokio-postgres feature flags for json, time, uuid, eui48 and their corresponding crates if you don't need them.

Cornucopia CLI

Aside from the code dependencies, you will need the cornucopia CLI to generate your Rust modules. This can be done via a simple cargo install cornucopia which will pull the latest binary and install it in your cargo path. Note that once the queries have been generated, they build and run standalone without requiring the CLI.

Concepts

This section explain a bit more about how Cornucopia works. If you just want to get started, you should take a look at the basic example.

Cornucopia is pretty simple to use. In the next sections, we'll explore the basic usage, but feel free to look the CLI's whole interface using the --help option at any point. For convenience, the CLI's reference document is also available in this repository.

Migrations

The basic cornucopia generate command creates a new container, runs your migrations, generates your queries and cleanups the container. If you want to manage the database and migrations yourself, use the cornucopia generate live command to connect to an arbitrary live database. Keep in mind that your queries must still be otherwise compatible with Cornucopia (e.g. with regards to supported types and annotation syntax).

New migrations can be added using the command cornucopia migration new.

Finally, as a convenience, you can use cornucopia migration run to run migrations on your database too if you so desire. This feature worksfor simple cases, but is not yet thoroughly tested and it's advisable that you use a more robust migration system.

Queries

Each .sql file in your queries directory will be converted into a Rust module containing your generated queries. Each query is actually prepared against your database, ensuring as many errors as possible will be caught before production. The generated functions are fully typed, giving you insight into your SQL and pretty strong guards against runtime errors.

Generated modules

Assuming you have the following migration

CREATE TABLE Author (
    Id SERIAL NOT NULL,
    Name VARCHAR(70) NOT NULL,
    Country VARCHAR(100) NOT NULL,
    PRIMARY KEY(Id)
);

and the following query

--! author_name_starting_with
SELECT
    *
FROM
    Author
WHERE
    name LIKE CONCAT(:start_str::text, '%');

After generating your queries with cornucopia, you could use it like this

let authors = author_name_starting_with(client, &"Joh").vec().await?;
// Print all the authors whose name starts with "Joh"
for author in authors {
  println!("{}" author.name)
}

The generated code covers lot more than that, but the above should be fairly representative how you could use the generated code. Head over to the examples if you want to see more features in action.

The diagram below shows a very high level representation of the items generated by Cornucopia. It's not exhaustive, but it can help you wrap your head around the generated code if you're starting out.

Query annotation syntax

--! example_query
select * from authors
where first_name = :first_name and last_name = :last_name

Notice that bind parameters are specified by name with the :colon_identifier notation, instead of by index. This allows queries to be very concise while also being more expressive.

Annotations are whitespace insignificant and can be split accross multiple lines too

--! authors (
--!
--! )

Comments that do not start with --! (e.g. -- This) are simply ignored by Cornucopia, so feel free to use them as you usually would.

Nullable columns

--! authors_named_john(first_name?): (name?)
select name from authors 
where first_name = :first_name

Query parameters and columns can specify their nullity by using the (hello?, world?) notation. Fields that are not present are assumed to be non-null.

Transactions

Generated queries take a GenericClient as parameter, which accepts both Clients and Transactions. That means you can use the same generated queries for both single statements and transactions.

Automatically generate queries

You can make use of Rust's build script feature to automatically regenerate your Cornucopia queries upon building your crate, only when your SQL has changed. The simplest way to achieve this is simply to call Cornucopia's CLI inside your build.rs file. You can learn more about this feature in this example.

Supported types

Base types

PostgrsQL type Rust type
bool, boolean bool
"char" i8
smallint, int2, smallserial, serial2 i16
int, int4, serial, serial4 i32
bigint, int8, bigserial, serial8 i64
real, float4 f32
double precision, float8 f64
text String
varchar String
bytea Vec<u8>
timestamp without time zone, timestamp time::PrimitiveDateTime
timestamp with time zone, timestamptz time::OffsetDateTime
date time::Date
time time::Time
json serde_json::Value
jsonb serde_json::Value
uuid uuid::Uuid
inet std::net::IpAddr
macaddr eui48::MacAddress

Custom types

Cornucopia also supports user-defined enums, composites and domains. Just like base types, custom types will be generated automatically by inspecting your database. The only requirement for your custom types is that they be based on other supported types (base or custom). Cornucopia is also aware of your types' namespaces (what PostgreSQL calls schemas), so it will correctly handle custom types like my_schema.my_custom_type.

Array types

Cornucopia supports one-dimensionnal arrays for which the element type is also a type supported . That is, Cornucopia supports example_elem_type[] if example_elem_type is itself a type supported by Cornucopia (base or custom).

MSRV

This crate uses Rust 2021 edition, which requires at least version 1.56.

License

Licensed under either of

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

GitHub

https://github.com/cornucopia-rs/cornucopia
Comments
  • 1. New query architecture

    This PR tracks the progress the new query architecture for the upcoming v0.8. This version is a significant re-architecture so there are quite a few things to do to ensure a smooth release. Some of the relevant discussions here #45, #46, and #47.

    • [x] Overhaul parser
    • [x] Overhaul type registration
    • [x] Implement query generation
    • [x] Update client code
    • [x] Comment/Improve source (not perfect but its barely ok)
    • [x] Update examples
    • [x] Update tests
    • [x] Update README

    Closes #25 #39 #40 #41 #45 #46 #47 #48 #56

    Reviewed by LouisGariepy at 2022-05-17 20:49
  • 2. Borrowing raw queries with mapper

    Using borrowing raw query with a mapper lambda could give a very ergonomic way to deserialize struct with minimal allocation.

    This code:

    --! authors() #
    SELECT name, country FROM Author
    

    Would generate an additional function:

    pub async fn authors_map<T: GenericClient, R>(
        client: &T,
        mut mapper: impl FnMut((&str, &str)) -> R,
    ) -> Result<impl Stream<Item = Result<R, Error>>, Error> {
        let stmt = client.prepare("SELECT name, country FROM Author;").await?;
        let stream = client
            .query_raw(&stmt, std::iter::empty::<u32>())
            .await?
            .map(move |res| {
                res.map(|res| {
                    let return_value_1: &str = res.get(0);
                    let return_value_2: &str = res.get(1);
                    mapper((return_value_1, return_value_2))
                })
            });
        Ok(stream.into_stream())
    }
    

    Which can be used like that:

    pub struct Person {
        name: String,
        country: String,
    }
    
    impl From<(&str, &str)> for Person {
        fn from((name, country): (&str, &str)) -> Self {
            Self {
                name: name.trim().to_owned(),
                country: country
                    .split('|')
                    .map(|s| s.trim())
                    .nth(2)
                    .unwrap_or_default()
                    .to_owned(),
            }
        }
    }
    
    pub async fn get_authors<T: GenericClient>(client: &T) -> Result<Vec<Person>, Error> {
        authors_map(client, |raw| Person::from(raw)).await?.try_collect().await
    }
    

    I tried a function generic over type implementing From but it is a lifetime nightmare.

    Reviewed by Virgiel at 2022-05-10 11:49
  • 3. CLI says column does not exist on insert sql query

    Hi I am trying to generate the following query:

    --! insert_user() INSERT INTO users (username) VALUES (username)

    Error while preparing query "insert_user" [file: "queries/module_3.sql", line: 1] (column "username" does not exist).

    I am able to retrieve users but for some reason the above is not working. Also if I put :username that throws another error.

    Am using with Postgres.

    On another note I was wondering how this may perform over using the Postgres driver natively or vs Diesel or sqlx?

    It seems your approach is native as it's essentially taking SQL queries and turning them into functions that engage the native postgres driver directly. Closer to metal vs an ORM, a time saving function generation solution.

    Reviewed by AppyCat at 2022-05-29 06:39
  • 4. Null everywhere

    Currently, we can declare the nullity of the returned rows fields:

    CREATE TYPE composite AS (
        name TEXT,
        age INT NOT NULL
    );
    
    CREATE TABLE items (
        composite composite,
        name TEXT NOT NULL,
        data BYTEA,
        datas BYTEA[]
    );
    
    --! items ?{composite, data, datas}
    SELECT * FROM items;
    

    This is enough to cover the nullity of data, datas and composite but not the values of datas and composite.name.

    Nullable array value

    We need a way to declare the nullity of the array and the values of the array independently which current syntax does not allow.

    We could get back to something close to the old syntax:

    • datas? means nullable array
    • datas[?] means nullable array values
    • datas?, datas[?] or datas?[?] means nullable array and array values
    --! items {composite?, data?, datas?[?]}
    SELECT * FROM items;
    

    Nullable composite fields

    We could declare it in functions declaration:

    • composite? means nullable composite
    • composite.name? means nullable composite's name
    • composite?, composite.name? or composite?.name? means nullable composite and composite's name
    --! items {composite?.name?, data?, datas?}
    SELECT * FROM items;
    

    But this implies a redeclaration each time we use this composite type. We could add a new syntax to declare the nullity of fields of database types :

    --:db composite{name?}
    

    Nullable parameters

    We could use the same syntax for parameters and rows:

    --! new_items (composite?, data?, datas?[?]) {data?}
    INSERT INTO items (composite, name, data, datas) 
      VALUES (:composite, :name, :data, :datas) 
      RETURNING name, data;
    

    Nullable named generated types fields

    Depends on #63

    We could also use this syntax for named generated type :

    --:params ItemParams{composite?, data?, datas?[?]}
    --:row Item{composite?, data?, datas?[?]}
    
    --! new_items (IdParams) -> Item
    INSERT INTO items (composite, name, data, datas) 
      VALUES (:composite, :name, :data, :datas) 
      RETURNING *;
    
    Reviewed by Virgiel at 2022-05-24 11:59
  • 5. Test suite

    The test runner is a separate binary that can be run as a CLI with formatted colored output or as a cargo unit test. Two types of tests are available, error tests that check that errors are caught and return a well-formatted message, and example tests that check that all the examples work.

    Error tests are described in TOML files:

    [[test]]
    name = 'ColumnNameAlreadyTaken'
    query = '''
    --! items
    SELECT name, price AS name FROM item;
    '''
    migration = '''
    CREATE TABLE item (
      name TEXT,
      price INTEGER
    );
    '''
    error = '''
    Error while preparing query "items" [file: "queries/module_1.sql", line: 1]:
    Two or more columns have the same name: `name`. Consider disambiguing the column names with `AS` clauses.'''
    

    Closes #60

    Reviewed by Virgiel at 2022-05-22 15:54
  • 6. Problematic build scripts.

    Our build scripts have some pretty heavy side effects (i.e. spawning a whole managed run of cornucopia). Because of this, things are starting to get weird:

    1. Rust Analyzer runs build scripts in background, which can cause unexpected IDE crashes/slowdowns.

    2. Errors in build scripts sometimes leave our managed container open? I'm not sure why/how this happens, but it is annoying since you then have to manually remove the container.

    I'm not 100% if I got the causes right, but I'd like to investigate this.

    Related to #98 .

    Reviewed by LouisGariepy at 2022-06-16 15:25
  • 7. Provide a way for users to cutomize `derive`s on custom types

    I don't want to presuppose too much about how the generated code will be used, thus, so far, I've refrained from adding any superfluous derives to the generated types.

    This might be limiting to some users wishing to add derives to their types (like Serialize or Clone). I'm thinking of flags like --custom-struct-derive, --custom-enum-derive and --custom-derive which could be used to add the derive only to structs, enums, or both.

    This will require a bit more thought though, so if you're reading this thread because you want this feature, feel free to ping me and explain how you'd like this feature to work in practice.

    Reviewed by LouisGariepy at 2022-04-26 21:20
  • 8. Add bench

    Inspired by diesel benchmark but modified to prepare the database once per benchmark (more accurate and faster) and only against Postgres sync for the moment. Results are subject to variation but show that cornucopia adds a little bit of overhead.

    For insertion, diesel generates multi-rows insertion on the fly and we can't compete with that yet. We could add a feature for these types of bulk insertion.

    On the bright side, the cornucopia benchmark is immensely more succinct and easy to use!

    Close #97

    Reviewed by Virgiel at 2022-06-15 16:00
  • 9. Better codegen

    In my quest to improve code generation, I saw that we were doing a lot of string allocations. While we are not very performance sensitive, I would still prefer to limit them before it gets out of hand. I managed to replace most of the format! with write! while simplifying some of the logic.

    Reviewed by Virgiel at 2022-05-25 19:09
  • 10. feature request: support for arrays

    If we add such modifications:

    1. migrations/1648515066_second.sql:
    CREATE TABLE CustomTable (
        col1 custom_composite,
        col2 spongebob_character,
        col3 custom_composite[]
    );
    
    INSERT INTO CustomTable (col1, col2, col3)
        VALUES (ROW('incredible', 42, 'Patrick'), 'Bob', ARRAY[ROW('incredible', 42, 'Patrick')]::custom_composite[]);
    
    1. queries/module_2.sql
    --! return_custom_type_array()
    SELECT
        col3
    FROM
        CustomTable;
    

    Then cornucopia generate returns: unsupported type _custom_composite

    1. Will be possible to add support for arrays?
    2. It would be nice to support RUST_LOG flag or --verbose sometimes it's hard to find source of error.
    Reviewed by xoac at 2022-04-29 08:54
  • 11. Dependencies

    Hi, I'm trying out cornucopia and so far I really like it.

    I have to add cornucopia to my Cargo.toml, this then adds a lot of dependencies (40+) but we only use cornucopia::GenericClient?

    so

    --! example_query(id) {id, email, reset_password_selector?}*
    SELECT 
        id, email, reset_password_selector
    FROM 
        users
    WHERE
        users.id < $1
    

    generates

    pub mod types {
                
    } 
     pub mod queries { pub mod user {
    
    use cornucopia::GenericClient;
    use tokio_postgres::Error;
    
    #[derive(Debug, Clone, PartialEq)]
    pub struct ExampleQuery {pub id : i32,pub email : String,pub reset_password_selector : Option<String>}
        pub async fn example_query<T: GenericClient>(client:&T, id : &i32) -> Result<Vec<super::super::queries::user::ExampleQuery>,Error> {let stmt = client.prepare("SELECT
    id, email, reset_password_selector
    FROM
    users
    WHERE
    users.id < $1
    ").await?;
    let res = client.query(&stmt, &[&id]).await?;
    
    let return_value = res.iter().map(|res| { let return_value_0: i32 = res.get(0); let return_value_1: String = res.get(1); let return_value_2: Option<String> = res.get(2); super::super::queries::user::ExampleQuery { id : return_value_0,email : return_value_1,reset_password_selector : return_value_2 } }).collect::<Vec<super::super::queries::user::ExampleQuery>>(); Ok(return_value)}
    } }
    

    Could we make it so that use cornucopia::GenericClient isn't needed and then a project has less dependencies as we would not have cornucopia in the cargo.toml?

    Perhaps take a Pool as a parameter?

    Reviewed by ianpurton at 2022-04-20 08:01
  • 12. Inline named type

    By default, we use the name of the function as a base for the name of generated types. This is a good default but it can generate long and weird names. At the same time, using two lines to declare named params and row is cumbersome if they are only used once.

    I propose the following:

    --! insert_everything_array EveryArrParams(): EveryArr(date?)
    INSERT INTO EverythingArray (...) VALUES (...);
    

    Having the same meaning as this:

    --: EveryArrParams
    --: EveryArr(date?)
    --! insert_everything_array EveryArrParams: EveryArr
    INSERT INTO EverythingArray (...) VALUES (...);
    

    At the exception that those named type cannot be reused in other queries.

    Reviewed by Virgiel at 2022-06-16 09:40
  • 13. Don't create row type for queries returning a single column

    Currently, every query returning columns has a dedicated type. For simple queries returning a single column, this creates a lot of API noise for minimal ergonomic improvements. How about we return the row type without a wrapping struct ?

    --! insert
    INSERT INTO table (...) VALUES (...) RETURNING id
    

    Would return an i32 instead of Insert {id: i32}.

    Reviewed by Virgiel at 2022-06-16 07:52
  • 14. Batch insertion performance.

    Right now there's no way to perform multi-inserts. That means we have to call this

    INSERT INTO table(...) VALUES (...)
    

    multiple times, instead of calling that

    INSERT INTO table(...) VALUES 
    (...),
    ...
    (...)
    

    once.

    There are cases where the former is better, notably when you are streaming your parameters, and/or they can't all fit in memory at once. But, typically, multi-row inserts are faster.

    We should find a way to use multi-row statements where approriate.

    From the benchmarks added here #98, it seems like this is our one major performance gap with hand-tuned postgres or diesel.

    Reviewed by LouisGariepy at 2022-06-16 01:58
  • 15. Outdated docs.

    Our READMEs are falling behind compared to the actual code. There's no huge discrepancies, mostly tiny errors sprinkled here and there. Notably, the examples could be explained better, and our main page is missing some feature explanations and has typos. Code snippets should be checked and corrected if need be.

    Our code itself is not particularly well documented either, but that can happen in another issue/PR.

    English is not my first language, so please bear in mind this is a best effort for me :smile:.

    Reviewed by LouisGariepy at 2022-06-13 14:11
postgres-ical - a PostgreSQL extension that adds features related to parsing RFC-5545 « iCalendar » data from within a PostgreSQL database

postgres-ical - a PostgreSQL extension that adds features related to parsing RFC-5545 « iCalendar » data from within a PostgreSQL database

Feb 23, 2022
Native PostgreSQL driver for the Rust programming language

Rust-Postgres PostgreSQL support for Rust. postgres Documentation A native, synchronous PostgreSQL client. tokio-postgres Documentation A native, asyn

Jun 23, 2022
A Rust crate for writing servers that speak PostgreSQL's wire protocol

Convergence A Rust crate for writing servers that speak PostgreSQL's wire protocol. Additionally, the experimental convergence-arrow crate enables con

Jun 16, 2022
Rust library to parse, deparse and normalize SQL queries using the PostgreSQL query parser

This Rust library uses the actual PostgreSQL server source to parse SQL queries and return the internal PostgreSQL parse tree.

Jun 23, 2022
Materialize simplifies application development with streaming data. Incrementally-updated materialized views - in PostgreSQL and in real time. Materialize is powered by Timely Dataflow.
Materialize simplifies application development with streaming data. Incrementally-updated materialized views - in PostgreSQL and in real time. Materialize is powered by Timely Dataflow.

Materialize is a streaming database for real-time applications. Get started Check out our getting started guide. About Materialize lets you ask questi

Jun 22, 2022
PostgreSQL procedural language handler for Clojure via SCI

pl/sci Status This is very much an experiment and I'm open to feedback on where to take this next. Build Requirements lein GraalVM CE 20.3.0 Java 11 c

Jul 22, 2021
A tool for automated migrations for PostgreSQL, SQLite and MySQL.

Models Models is an implementation for a SQL migration management tool. It supports PostgreSQL, MySQL, and SQLite. Quick Start install the CLI by runn

Jun 19, 2022
Zenith substitutes PostgreSQL storage layer and redistributes data across a cluster of nodes

Zenith substitutes PostgreSQL storage layer and redistributes data across a cluster of nodes

Jun 18, 2022
The Solana AccountsDb plugin for PostgreSQL database.

The solana-accountsdb-plugin-postgres crate implements a plugin storing account data to a PostgreSQL database to illustrate how a plugin can be develo

Jun 16, 2022
RedisJSON - a JSON data type for Redis

RedisJSON RedisJSON is a Redis module that implements ECMA-404 The JSON Data Interchange Standard as a native data type. It allows storing, updating a

Jun 21, 2022
Type-safe SQL query wrappers

fnsql   The fnsql crate provides simple type-safe optional wrappers around SQL queries. Instead of calling type-less .query() and .execute(), you call

Apr 29, 2022
Some bunch of test scripts to generate a SQLite DB with 1B rows in fastest possible way

To find out the fastest way to create an SQLite DB with 1B random rows.

Jun 15, 2022
⚡🦀 🧨 make your rust types fit DynamoDB and visa versa

?? ?? dynomite dynomite makes DynamoDB fit your types (and visa versa) Overview Goals ⚡ make writing dynamodb applications in rust a productive experi

May 11, 2022
Provides a Rust-based SQLite extension for using Hypercore as the VFS for your databases.

SQLite and Hypercore A Rust library providing SQLite with an virtual file system to enable Hypercore as a means of storage. Contributing The primary r

May 5, 2022
asynchronous and synchronous interfaces and persistence implementations for your OOD architecture

OOD Persistence Asynchronous and synchronous interfaces and persistence implementations for your OOD architecture Installation Add ood_persistence = {

Feb 15, 2022
Visualize your database schema

dbviz Visualize your database schema. The tool loads database schema and draws it as a graph. Usage $ dbviz -d database_name | dot -Tpng > schema.png

Feb 13, 2022
Macros that allow for implicit await in your async code.

suspend fn Disclaimer: this was mostly made as a proof of concept for the proposal below. I haven't tested if there is a performance cost to this macr

Dec 22, 2021
SubZero - a standalone web server that turns your database directly into a REST/GraphQL api

What is this? This is a demo repository for the new subzero codebase implemented in Rust. subZero is a standalone web server that turns your database

Jun 3, 2022
Replibyte - a powerful tool to seed your databases
Replibyte - a powerful tool to seed your databases

Seed Your Development Database With Real Data ⚡️ Replibyte is a powerful tool to seed your databases with real data and other cool features ?? Feature

Jun 17, 2022