A Toy Query Engine & SQL interface

谭巍

Last update: Dec 21, 2022

Overview

Naive Query Engine (Toy for Learning) 😄

This is a query engine that supports a SQL interface. It is a toy project meant only for learning how query engines work. Check the TODO section to see the current progress.

It is simple enough to learn from (although there is still a lot of work to finish... TAT 😭). For now it has only a basic architecture, and most operators and planners are not implemented yet (they will be added in the future).

This project is inspired by how-query-engines-work, where most of the ideas come from, and it exists purely for learning purposes. Many ideas are also inspired by arrow-datafusion.

It uses arrow as the in-memory columnar format and sqlparser as the SQL parser.
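To make "columnar" concrete, here is a minimal std-only sketch. It does not use the arrow crate; `Row`, `ColumnBatch`, `from_rows` and `age_plus` are hypothetical names for illustration, and the sample values are made up. The point is only that each column is stored as its own contiguous vector, so an expression such as `age + 100` scans a single column.

```rust
// Hypothetical illustration only: arrow's real arrays and RecordBatch are far richer.

/// Row-oriented record, the layout a CSV line naturally maps to.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    id: i64,
    name: String,
    age: i64,
}

/// Column-oriented batch: one contiguous vector per column.
#[derive(Debug, Default, PartialEq)]
struct ColumnBatch {
    id: Vec<i64>,
    name: Vec<String>,
    age: Vec<i64>,
}

impl ColumnBatch {
    /// Transposes a slice of rows into per-column vectors.
    fn from_rows(rows: &[Row]) -> Self {
        let mut batch = ColumnBatch::default();
        for row in rows {
            batch.id.push(row.id);
            batch.name.push(row.name.clone());
            batch.age.push(row.age);
        }
        batch
    }

    /// Evaluates `age + delta` by scanning only the `age` column.
    fn age_plus(&self, delta: i64) -> Vec<i64> {
        self.age.iter().map(|a| a + delta).collect()
    }
}
```

With this layout, a projection like `age + 100` never touches the `name` strings, which is one reason analytical engines tend to prefer columnar batches.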

how to use

For now, NaiveDB can be used as shown below, with CSV files as table storage.

use naive_db::print_result;
use naive_db::CsvConfig;
use naive_db::NaiveDB;
use naive_db::Result;

fn main() -> Result<()> {
    let mut db = NaiveDB::default();

    db.create_csv_table("t1", "data/test_data.csv", CsvConfig::default())?;

    // select
    let ret = db.run_sql("select id, name, age + 100 from t1 where id < 9 limit 3 offset 2")?;
    print_result(&ret)?;

    // Inner Join
    db.create_csv_table("employee", "data/employee.csv", CsvConfig::default())?;
    db.create_csv_table("rank", "data/rank.csv", CsvConfig::default())?;
    db.create_csv_table("department", "data/department.csv", CsvConfig::default())?;

    let ret = db.run_sql(
        "
        select id, name, rank_name, department_name
        from employee
        join rank on 
            employee.rank = rank.id  
        join department on
            employee.department_id = department.id
    ",
    )?;
    print_result(&ret)?;

    // cross join
    let ret = db.run_sql("select * from employee join rank")?;
    print_result(&ret)?;

    // aggregate
    let ret = db.run_sql(
        "
        select count(id), sum(age), sum(score), avg(score), max(score), min(score) 
        from t1 group by id % 3",
    )?;
    print_result(&ret)?;

    Ok(())
}

The output will be:

+----+-------+-----------+
| id | name  | age + 100 |
+----+-------+-----------+
| 4  | lynne | 118       |
| 5  | alice | 119       |
| 6  | bob   | 120       |
+----+-------+-----------+
+----+-------+-------------+-----------------+
| id | name  | rank_name   | department_name |
+----+-------+-------------+-----------------+
| 2  | lynne | master      | IT              |
| 1  | vee   | diamond     | IT              |
| 3  | Alex  | master      | Marketing       |
| 4  | jack  | diamond     | Marketing       |
| 5  | mike  | grandmaster | Human Resource  |
+----+-------+-------------+-----------------+
+----+-------+---------------+------+----+-------------+
| id | name  | department_id | rank | id | rank_name   |
+----+-------+---------------+------+----+-------------+
| 1  | vee   | 1             | 1    | 1  | master      |
| 2  | lynne | 1             | 0    | 2  | diamond     |
| 3  | Alex  | 2             | 0    | 3  | grandmaster |
| 4  | jack  | 2             | 1    | 4  | master      |
| 5  | mike  | 3             | 2    | 5  | diamond     |
| 1  | vee   | 1             | 1    | 1  | grandmaster |
| 2  | lynne | 1             | 0    | 2  | master      |
| 3  | Alex  | 2             | 0    | 3  | diamond     |
| 4  | jack  | 2             | 1    | 4  | grandmaster |
| 5  | mike  | 3             | 2    | 5  | master      |
| 1  | vee   | 1             | 1    | 1  | diamond     |
| 2  | lynne | 1             | 0    | 2  | grandmaster |
| 3  | Alex  | 2             | 0    | 3  | master      |
| 4  | jack  | 2             | 1    | 4  | diamond     |
| 5  | mike  | 3             | 2    | 5  | grandmaster |
+----+-------+---------------+------+----+-------------+
+-----------+----------+--------------------+-------------------+------------+------------+
| count(id) | sum(age) | sum(score)         | avg(score)        | max(score) | min(score) |
+-----------+----------+--------------------+-------------------+------------+------------+
| 3         | 61       | 255.6              | 85.2              | 90.1       | 81.1       |
| 3         | 62       | 243.29000000000002 | 81.09666666666668 | 99.99      | 60         |
| 2         | 43       | 167.7              | 83.85             | 85.5       | 82.2       |
+-----------+----------+--------------------+-------------------+------------+------------+
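The difference between the inner join and the cross join in the example can be sketched with a plain nested loop. This is an assumption made for illustration, not necessarily the join algorithm NaiveDB implements; the `Employee` and `Rank` structs and both functions are hypothetical, and the data in any usage is sample data.

```rust
// Hypothetical types; the real engine operates on arrow record batches.
#[derive(Debug, Clone, PartialEq)]
struct Employee {
    id: i64,
    name: String,
    rank: i64,
}

#[derive(Debug, Clone, PartialEq)]
struct Rank {
    id: i64,
    rank_name: String,
}

/// Cross join: every left row is paired with every right row.
fn cross_join<'a>(left: &'a [Employee], right: &'a [Rank]) -> Vec<(&'a Employee, &'a Rank)> {
    let mut out = Vec::with_capacity(left.len() * right.len());
    for l in left {
        for r in right {
            out.push((l, r));
        }
    }
    out
}

/// Inner join on `employee.rank = rank.id`: the cross product
/// filtered by the join condition.
fn inner_join<'a>(left: &'a [Employee], right: &'a [Rank]) -> Vec<(&'a Employee, &'a Rank)> {
    cross_join(left, right)
        .into_iter()
        .filter(|(l, r)| l.rank == r.id)
        .collect()
}
```

A cross join of n and m rows always yields n * m pairs, while the inner join keeps only the pairs that satisfy the condition. Real engines usually avoid materializing the full product, for example by hashing one side.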

architecture

NaiveDB is simple, and a query goes through a clear sequence of steps:

impl NaiveDB {
    pub fn run_sql(&self, sql: &str) -> Result<Vec<RecordBatch>> {
        // 1. sql -> statement
        let statement = SQLParser::parse(sql)?;
        // 2. statement -> logical plan
        let sql_planner = SQLPlanner::new(&self.catalog);
        let logical_plan = sql_planner.statement_to_plan(statement)?;
        // 3. optimize
        let optimizer = Optimizer::default();
        let logical_plan = optimizer.optimize(logical_plan);
        // 4. logical plan -> physical plan
        let physical_plan = QueryPlanner::create_physical_plan(&logical_plan)?;
        // 5. execute
        physical_plan.execute()
    }
}
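The same staged flow can be imitated in a few lines of std-only Rust. Everything in this sketch is hypothetical: the `LogicalPlan` enum has just two nodes, `plan` accepts only `select * from <table> [limit <n>]`, the single optimizer rule merges nested limits, and physical planning is folded into `execute` for brevity. It is a toy under those assumptions, not the project's implementation.

```rust
use std::collections::HashMap;

/// Hypothetical catalog: table name -> a single integer column.
type Catalog = HashMap<String, Vec<i64>>;

/// Toy logical plan with two node types.
#[derive(Debug, Clone, PartialEq)]
enum LogicalPlan {
    Scan { table: String },
    Limit { input: Box<LogicalPlan>, n: usize },
}

/// Steps 1-2: "parse" and plan. Accepts only `select * from <table> [limit <n>]`.
fn plan(sql: &str) -> Result<LogicalPlan, String> {
    let tokens: Vec<&str> = sql.split_whitespace().collect();
    match tokens.as_slice() {
        ["select", "*", "from", table] => Ok(LogicalPlan::Scan { table: table.to_string() }),
        ["select", "*", "from", table, "limit", n] => {
            let n = n.parse::<usize>().map_err(|e| e.to_string())?;
            let scan = LogicalPlan::Scan { table: table.to_string() };
            Ok(LogicalPlan::Limit { input: Box::new(scan), n })
        }
        _ => Err(format!("unsupported sql: {}", sql)),
    }
}

/// Step 3: one rewrite rule, Limit(Limit(x, m), n) => Limit(x, min(n, m)).
fn optimize(node: LogicalPlan) -> LogicalPlan {
    match node {
        LogicalPlan::Limit { input, n } => match optimize(*input) {
            LogicalPlan::Limit { input: inner, n: m } => {
                LogicalPlan::Limit { input: inner, n: n.min(m) }
            }
            other => LogicalPlan::Limit { input: Box::new(other), n },
        },
        scan => scan,
    }
}

/// Steps 4-5: interpret the plan directly (no separate physical plan here).
fn execute(node: &LogicalPlan, catalog: &Catalog) -> Result<Vec<i64>, String> {
    match node {
        LogicalPlan::Scan { table } => catalog
            .get(table)
            .cloned()
            .ok_or_else(|| format!("unknown table: {}", table)),
        LogicalPlan::Limit { input, n } => {
            Ok(execute(input, catalog)?.into_iter().take(*n).collect())
        }
    }
}

/// Chains the stages in the same order as the steps above.
fn run_sql(sql: &str, catalog: &Catalog) -> Result<Vec<i64>, String> {
    let logical = plan(sql)?;
    let logical = optimize(logical);
    execute(&logical, catalog)
}
```

Keeping each stage a separate function means a new optimizer rule or operator can be added without touching the others, which is the main appeal of this layering.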

TODO

Comments

Support Aggregate Function Group by

Signed-off-by: Veeupup

Support group by with aggregate functions, such as select count(id), sum(age), sum(score) from t1 group by id % 3
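A grouped aggregate of this shape can be sketched as one pass that keeps an accumulator per group key. The sketch below is an assumption about the general technique, not the code from this change; `Agg` and `group_by_id_mod_3` are hypothetical names, rows are simplified to `(id, age)` pairs, and a `BTreeMap` is used instead of a hash map only to make the group order deterministic.

```rust
use std::collections::BTreeMap;

/// Running state for `count(id)` and `sum(age)` within one group.
#[derive(Debug, Default, Clone, PartialEq)]
struct Agg {
    count: usize,
    sum_age: i64,
}

/// Sketch of `select count(id), sum(age) from t group by id % 3`.
/// Each input row is an `(id, age)` pair.
fn group_by_id_mod_3(rows: &[(i64, i64)]) -> BTreeMap<i64, Agg> {
    let mut groups: BTreeMap<i64, Agg> = BTreeMap::new();
    for &(id, age) in rows {
        // Evaluate the group key, then update that group's accumulator.
        let acc = groups.entry(id % 3).or_default();
        acc.count += 1;
        acc.sum_age += age;
    }
    groups
}
```

Functions such as `avg` can be derived from the same state (sum divided by count), while `max` and `min` need one extra field each.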

opened by Veeupup 0
Roadmap 1.0

The naive-query-engine is designed for learning how query engines work, so the code and logic should always stay simple and clear! We want this project to support basic SQL grammar and run it with a basic MPP executor.

We can split the work into two milestones. The first is basic SQL grammar; the related issues are listed below (some finished items have no related issue): #15 #16 #17 #26 #28 #29 #32 #43 #44 #47 #48 #50 #64 #65

The second milestone is the MPP executor: #18

Any contributions are welcome!

opened by Veeupup 0
SQL Test Framework

We need a test framework for SQL. It should test systematically and show which SQL we already support.

The framework would take a SQL file and an expected result file, and compare the actual result against the expected result file.
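One possible shape for such a comparison, as a rough sketch: each case pairs a SQL string with its expected text, and a runner reports the statements whose actual output differs. `SqlCase` and `failing_cases` are hypothetical names, the engine is passed in as a closure so the sketch stays independent of NaiveDB, and reading the SQL and result files from disk is left out.

```rust
/// One test case: a SQL statement and the expected textual result.
struct SqlCase<'a> {
    sql: &'a str,
    expected: &'a str,
}

/// Runs every case through `run` and returns the SQL of the cases whose
/// output differs from the expectation (ignoring surrounding whitespace).
fn failing_cases<'a, F>(cases: &[SqlCase<'a>], run: F) -> Vec<&'a str>
where
    F: Fn(&str) -> String,
{
    cases
        .iter()
        .filter(|case| run(case.sql).trim() != case.expected.trim())
        .map(|case| case.sql)
        .collect()
}
```

An empty return value means every statement produced its expected result, and the list of cases doubles as a record of which SQL is supported.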

opened by Veeupup 0
Support `In` list and subquery

Support `in` in SQL:

[ ] in list, like select * from t where id in (1, 2, 4)

[ ] in subquery, like select * from t where id in (select id from t2 where age < 10)

opened by Veeupup 0
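Both forms reduce to a membership test, which the following sketch illustrates with a `HashSet`. It assumes an uncorrelated subquery and ignores NULL semantics; `filter_in_list` and `filter_in_subquery` are hypothetical helpers, and `t2` is simplified to `(id, age)` pairs.

```rust
use std::collections::HashSet;

/// Sketch of `select id from t where id in (<list>)`.
fn filter_in_list(ids: &[i64], list: &[i64]) -> Vec<i64> {
    // Build the set once so each row needs a single lookup.
    let set: HashSet<i64> = list.iter().copied().collect();
    ids.iter().copied().filter(|id| set.contains(id)).collect()
}

/// Sketch of `select id from t where id in (select id from t2 where age < 10)`.
/// The subquery is evaluated first, then reused as an IN list.
fn filter_in_subquery(t_ids: &[i64], t2: &[(i64, i64)]) -> Vec<i64> {
    let sub: Vec<i64> = t2
        .iter()
        .filter(|(_, age)| *age < 10)
        .map(|(id, _)| *id)
        .collect();
    filter_in_list(t_ids, &sub)
}
```

Treating the subquery result as a list keeps the two features on one code path; a planner could also rewrite the subquery form as a semi join.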

Implement fmt::Display for all expressions

use std::fmt;

impl fmt::Display for Operator {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Map each operator to its SQL spelling.
        let display = match self {
            Operator::Eq => "=",
            Operator::NotEq => "!=",
            Operator::Lt => "<",
            Operator::LtEq => "<=",
            Operator::Gt => ">",
            Operator::GtEq => ">=",
            Operator::Plus => "+",
            Operator::Minus => "-",
            Operator::Multiply => "*",
            Operator::Divide => "/",
            Operator::Modulo => "%",
            Operator::And => "AND",
            Operator::Or => "OR",
            Operator::Like => "LIKE",
            Operator::NotLike => "NOT LIKE",
            Operator::RegexMatch => "~",
            Operator::RegexIMatch => "~*",
            Operator::RegexNotMatch => "!~",
            Operator::RegexNotIMatch => "!~*",
            Operator::IsDistinctFrom => "IS DISTINCT FROM",
            Operator::IsNotDistinctFrom => "IS NOT DISTINCT FROM",
            Operator::BitwiseAnd => "&",
            Operator::BitwiseOr => "|",
        };
        write!(f, "{}", display)
    }
}

See the corresponding code in datafusion (datafusion/expr/src/operator.rs).
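Extending this from operators to whole expressions is mostly a recursive `Display` impl. The sketch below uses a deliberately tiny, hypothetical `Expr` enum and a three-variant `Operator` rather than the project's real types, and it does not add parentheses for precedence.

```rust
use std::fmt;

/// Cut-down operator set for the sketch.
#[derive(Debug, Clone, Copy)]
enum Operator {
    Lt,
    Plus,
    And,
}

impl fmt::Display for Operator {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let display = match self {
            Operator::Lt => "<",
            Operator::Plus => "+",
            Operator::And => "AND",
        };
        write!(f, "{}", display)
    }
}

/// Hypothetical expression tree; the real one has many more variants.
#[derive(Debug, Clone)]
enum Expr {
    Column(String),
    Literal(i64),
    Binary { left: Box<Expr>, op: Operator, right: Box<Expr> },
}

impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Expr::Column(name) => write!(f, "{}", name),
            Expr::Literal(value) => write!(f, "{}", value),
            // Children format themselves, so nesting works recursively.
            Expr::Binary { left, op, right } => write!(f, "{} {} {}", left, op, right),
        }
    }
}
```

A `Display` impl like this is also what would produce readable column headers such as `age + 100` for computed expressions.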

good first issue

opened by ywqzzy 0

Owner

谭巍

keep coding and learning : D

GitHub

Gh-sql - Query GitHub Projects (beta) with SQL

gh-sql: Query GitHub Projects (beta) with SQL Installation gh extension install KOBA789/gh-sql Features SELECT items DELETE items UPDATE item fields

108 Dec 7, 2022

X-Engine: A SQL Engine built from scratch in Rust.

XNGIN (pronounced "X Engine") This is a personal project to build a SQL engine from scratch. The project name is inspired by Nginx, which is a very po

111 Dec 15, 2022

🧰 The Rust SQL Toolkit. An async, pure Rust SQL crate featuring compile-time checked queries without a DSL. Supports PostgreSQL, MySQL, SQLite, and MSSQL.

SQLx 🧰 The Rust SQL Toolkit Install | Usage | Docs Built with ❤️ by The LaunchBadge team SQLx is an async, pure Rust† SQL crate featuring compile-tim

7.6k Dec 31, 2022

SQL validator tool for BigQuery standard SQL.

bqvalid What bqvalid does bqvalid is the SQL validator tool for BigQuery standard SQL. bqvalid fails with error message if there's the expression that

10 Dec 25, 2022

Query LDAP and AD with SQL

SQLDAP Ever wanted to query AD or LDAP with SQL like queries ? I'm going to answer this question myself: yes ! Why ? Because I never could remember al

9 Nov 15, 2022

An object-relational in-memory cache, supports queries with an SQL-like query language.

qlcache An object-relational in-memory cache, supports queries with an SQL-like query language. Warning This is a rather low-level library, and only p

3 Nov 14, 2021

Fully typed SQL query builder for Rust [deprecated]

What is Deuterium? Deuterium is a fancy SQL builder for Rust. It's designed to provide a DSL to easily build SQL queries in safe and typed way. Like R

169 Nov 20, 2022

Rust library to parse, deparse and normalize SQL queries using the PostgreSQL query parser

This Rust library uses the actual PostgreSQL server source to parse SQL queries and return the internal PostgreSQL parse tree.

37 Dec 18, 2022

Type-safe SQL query wrappers

fnsql The fnsql crate provides simple type-safe optional wrappers around SQL queries. Instead of calling type-less .query() and .execute(), you call

9 Apr 29, 2022

Query system statistics with SQL.

pg_stat_sysinfo Collects system statistics. ---- CREATE EXTENSION pg_stat_sysinfo; CREATE EXTENSION ---- SELECT * FROM pg_stat_sysinfo_collect();

12 Jun 14, 2023

ReefDB is a minimalistic, in-memory and on-disk database management system written in Rust, implementing basic SQL query capabilities and full-text search.

ReefDB ReefDB is a minimalistic, in-memory and on-disk database management system written in Rust, implementing basic SQL query capabilities and full-

75 Jun 12, 2023

RisingWave is a cloud-native streaming database that uses SQL as the interface language.

RisingWave is a cloud-native streaming database that uses SQL as the interface language. It is designed to reduce the complexity and cost of building real-time applications. RisingWave consumes streaming data, performs continuous queries, and updates results dynamically. As a database system, RisingWave maintains results inside its own storage and allows users to access data efficiently.

3.7k Jan 2, 2023

ReadySet is a lightweight SQL caching engine written in Rust that helps developers enhance the performance and scalability of existing applications.

ReadySet is a SQL caching engine designed to help developers enhance the performance and scalability of their existing database-backed applications. W

1.7k Jan 8, 2023

SQL/JSON path engine in Rust.

Rust client for Timeplus Proton, a fast and lightweight streaming SQL engine

Bind the Prisma ORM query engine to any programming language you like ❤️

TDS 7.2+ (mssql / Microsoft SQL Server) async driver for rust

FeOphant - A SQL database server written in Rust and inspired by PostgreSQL.