A Toy Query Engine & SQL interface

Overview

Naive Query Engine (Toy for Learning) 😄

This is a Query Engine which support SQL interface. And it is only a Toy for learn query engine only. You can check TODO to check the progress now.

Simple enough to learn (Although it is simple...but with so much work to finish.. TAT 😭 ) and Now it only has a basic architecture and most operators and planners have not implemented (will be done in the future).

This is inspired(and most ideas come) by how-query-engines-work and it is just for learning purpose. And many ideas inspired by arrow-datafusion.

Use arrow to express in-memory columnar format and use sqlparser as SQL parser.

architecture

query_engine

how to use

for now, we can use NaiveDB like below, we can use csv as table storage.

use naive_db::print_result;
use naive_db::CsvConfig;
use naive_db::NaiveDB;
use naive_db::Result;

fn main() -> Result<()> {
    let mut db = NaiveDB::default();

    db.create_csv_table("t1", "data/test_data.csv", CsvConfig::default())?;

    // select
    let ret = db.run_sql("select id, name, age + 100 from t1 where id < 9 limit 3 offset 2")?;
    print_result(&ret)?;

    // Inner Join
    db.create_csv_table("employee", "data/employee.csv", CsvConfig::default())?;
    db.create_csv_table("rank", "data/rank.csv", CsvConfig::default())?;
    db.create_csv_table("department", "data/department.csv", CsvConfig::default())?;

    let ret = db.run_sql(
        "
        select id, name, rank_name, department_name
        from employee
        join rank on 
            employee.rank = rank.id  
        join department on
            employee.department_id = department.id
    ",
    )?;
    print_result(&ret)?;

    // cross join
    let ret = db.run_sql("select * from employee join rank")?;
    print_result(&ret)?;

    // aggregate
    let ret = db.run_sql(
        "
        select count(id), sum(age), sum(score), avg(score), max(score), min(score) 
        from t1 group by id % 3",
    )?;
    print_result(&ret)?;

    Ok(())
}

output will be:

+----+-------+-----------+
| id | name  | age + 100 |
+----+-------+-----------+
| 4  | lynne | 118       |
| 5  | alice | 119       |
| 6  | bob   | 120       |
+----+-------+-----------+
+----+-------+-------------+-----------------+
| id | name  | rank_name   | department_name |
+----+-------+-------------+-----------------+
| 2  | lynne | master      | IT              |
| 1  | vee   | diamond     | IT              |
| 3  | Alex  | master      | Marketing       |
| 4  | jack  | diamond     | Marketing       |
| 5  | mike  | grandmaster | Human Resource  |
+----+-------+-------------+-----------------+
+----+-------+---------------+------+----+-------------+
| id | name  | department_id | rank | id | rank_name   |
+----+-------+---------------+------+----+-------------+
| 1  | vee   | 1             | 1    | 1  | master      |
| 2  | lynne | 1             | 0    | 2  | diamond     |
| 3  | Alex  | 2             | 0    | 3  | grandmaster |
| 4  | jack  | 2             | 1    | 4  | master      |
| 5  | mike  | 3             | 2    | 5  | diamond     |
| 1  | vee   | 1             | 1    | 1  | grandmaster |
| 2  | lynne | 1             | 0    | 2  | master      |
| 3  | Alex  | 2             | 0    | 3  | diamond     |
| 4  | jack  | 2             | 1    | 4  | grandmaster |
| 5  | mike  | 3             | 2    | 5  | master      |
| 1  | vee   | 1             | 1    | 1  | diamond     |
| 2  | lynne | 1             | 0    | 2  | grandmaster |
| 3  | Alex  | 2             | 0    | 3  | master      |
| 4  | jack  | 2             | 1    | 4  | diamond     |
| 5  | mike  | 3             | 2    | 5  | grandmaster |
+----+-------+---------------+------+----+-------------+
+-----------+----------+--------------------+-------------------+------------+------------+
| count(id) | sum(age) | sum(score)         | avg(score)        | max(score) | min(score) |
+-----------+----------+--------------------+-------------------+------------+------------+
| 3         | 61       | 255.6              | 85.2              | 90.1       | 81.1       |
| 3         | 62       | 243.29000000000002 | 81.09666666666668 | 99.99      | 60         |
| 2         | 43       | 167.7              | 83.85             | 85.5       | 82.2       |
+-----------+----------+--------------------+-------------------+------------+------------+

architecture

The NaiveDB is just simple and has clear progress just like:

impl NaiveDB {
    pub fn run_sql(&self, sql: &str) -> Result<Vec<RecordBatch>> {
        // 1. sql -> statement
        let statement = SQLParser::parse(sql)?;
        // 2. statement -> logical plan
        let sql_planner = SQLPlanner::new(&self.catalog);
        let logical_plan = sql_planner.statement_to_plan(statement)?;
        // 3. optimize
        let optimizer = Optimizer::default();
        let logical_plan = optimizer.optimize(logical_plan);
        // 4. logical plan -> physical plan
        let physical_plan = QueryPlanner::create_physical_plan(&logical_plan)?;
        // 5. execute
        physical_plan.execute()
    }
}

TODO

  • type system
  • datasource
    • mem source
    • csv as datasource
    • empty datasource
  • logical plan & expressions
  • build logical plans
    • projection
    • filter
    • aggregate
    • limit
    • join
  • physical plan & expressions
    • physical scan
    • physical projection
    • physical filter
    • physical limit
    • join
      • algorithms
        • (dumb 😊 ) nested loop join
        • hash join
        • sort-merge join
      • inner join
      • cross join
    • physical expression
      • column expr
      • binary operation expr(add/sub/mul/div/and/or...)
      • literal expr
      • unary expr
      • aggr expr
      • so many work to do... TAT
  • query planner
    • scan
    • limit
    • join
    • aggregate
    • ...
  • query optimization
    • more rules needed
  • sql support
    • parser
    • SQL planner: statement -> logical plan
      • scan
      • projection
      • selection
      • limit
      • join
      • aggregate
        • group by
      • scalar function
Comments
  • Support Aggregate Function Group by

    Support Aggregate Function Group by

    Signed-off-by: Veeupup [email protected]

    Support group by aggregate function such as select count(id), sum(age), sum(score) from t1 group by id % 3"

    opened by Veeupup 0
  • Roadmap 1.0

    Roadmap 1.0

    The naive-query-engine is designed to learn the query engine and the code or logic should keep it always simple and clear! We want to make this project support basic SQL grammar and run it with a basic MPP executor.

    We can split the total process into two steps. First, we want to have the basic SQL grammar, and the issues are below (some have finished and do not have an issue related...) #15 #16 #17 #26 #28 #29 #32 #43 #44 #47 #48 #50 #64 #65

    and the second milestone is the MPP executor #18

    Any contributions are welcome!

    opened by Veeupup 0
  • SQL Test Framework

    SQL Test Framework

    We need a test framework to test SQL, it will be systematic testing to show which SQL we have supported.

    The framework would be that we have a SQL file and an excepted result file, and we can compare the result between the actual file and result file.

    opened by Veeupup 0
  • Support `In` list and subquery

    Support `In` list and subquery

    support in sql,

    • [ ] in list, like select * from t where id in (1, 2, 4)
    • [ ] in subquery like select * from t where id in (select id from t2 where age < 10)
    opened by Veeupup 0
  • Implement fmt::Display for all expressions

    Implement fmt::Display for all expressions

    impl fmt::Display for Operator {
        fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
            let display = match &self {
                Operator::Eq => "=",
                Operator::NotEq => "!=",
                Operator::Lt => "<",
                Operator::LtEq => "<=",
                Operator::Gt => ">",
                Operator::GtEq => ">=",
                Operator::Plus => "+",
                Operator::Minus => "-",
                Operator::Multiply => "*",
                Operator::Divide => "/",
                Operator::Modulo => "%",
                Operator::And => "AND",
                Operator::Or => "OR",
                Operator::Like => "LIKE",
                Operator::NotLike => "NOT LIKE",
                Operator::RegexMatch => "~",
                Operator::RegexIMatch => "~*",
                Operator::RegexNotMatch => "!~",
                Operator::RegexNotIMatch => "!~*",
                Operator::IsDistinctFrom => "IS DISTINCT FROM",
                Operator::IsNotDistinctFrom => "IS NOT DISTINCT FROM",
                Operator::BitwiseAnd => "&",
                Operator::BitwiseOr => "|",
            };
            write!(f, "{}", display)
        }
    }
    

    Ref to the code in datafusion (datafusion/expr/src/operator.rs)

    good first issue 
    opened by ywqzzy 0
Owner
谭巍
keep coding and learning : D
谭巍
Gh-sql - Query GitHub Projects (beta) with SQL

gh-sql: Query GitHub Projects (beta) with SQL Installation gh extension install KOBA789/gh-sql Features SELECT items DELETE items UPDATE item fields

Hidekazu Kobayashi 108 Dec 7, 2022
X-Engine: A SQL Engine built from scratch in Rust.

XNGIN (pronounced "X Engine") This is a personal project to build a SQL engine from scratch. The project name is inspired by Nginx, which is a very po

Jiang Zhe 111 Dec 15, 2022
🧰 The Rust SQL Toolkit. An async, pure Rust SQL crate featuring compile-time checked queries without a DSL. Supports PostgreSQL, MySQL, SQLite, and MSSQL.

SQLx ?? The Rust SQL Toolkit Install | Usage | Docs Built with ❤️ by The LaunchBadge team SQLx is an async, pure Rust† SQL crate featuring compile-tim

launchbadge 7.6k Dec 31, 2022
SQL validator tool for BigQuery standard SQL.

bqvalid What bqvalid does bqvalid is the SQL validator tool for BigQuery standard SQL. bqvalid fails with error message if there's the expression that

null 10 Dec 25, 2022
Query LDAP and AD with SQL

SQLDAP Ever wanted to query AD or LDAP with SQL like queries ? I'm going to answer this question myself: yes ! Why ? Because I never could remember al

null 9 Nov 15, 2022
An object-relational in-memory cache, supports queries with an SQL-like query language.

qlcache An object-relational in-memory cache, supports queries with an SQL-like query language. Warning This is a rather low-level library, and only p

null 3 Nov 14, 2021
Fully typed SQL query builder for Rust [deprecated]

What is Deuterium? Deuterium is a fancy SQL builder for Rust. It's designed to provide a DSL to easily build SQL queries in safe and typed way. Like R

Stanislav Panferov 169 Nov 20, 2022
Rust library to parse, deparse and normalize SQL queries using the PostgreSQL query parser

This Rust library uses the actual PostgreSQL server source to parse SQL queries and return the internal PostgreSQL parse tree.

pganalyze 37 Dec 18, 2022
Type-safe SQL query wrappers

fnsql   The fnsql crate provides simple type-safe optional wrappers around SQL queries. Instead of calling type-less .query() and .execute(), you call

Dan Aloni 9 Apr 29, 2022
Query system statistics with SQL.

pg_stat_sysinfo Collects system statistics. ---- CREATE EXTENSION pg_stat_sysinfo; CREATE EXTENSION ---- SELECT * FROM pg_stat_sysinfo_collect();

PostgresML 12 Jun 14, 2023
ReefDB is a minimalistic, in-memory and on-disk database management system written in Rust, implementing basic SQL query capabilities and full-text search.

ReefDB ReefDB is a minimalistic, in-memory and on-disk database management system written in Rust, implementing basic SQL query capabilities and full-

Sacha Arbonel 75 Jun 12, 2023
RisingWave is a cloud-native streaming database that uses SQL as the interface language.

RisingWave is a cloud-native streaming database that uses SQL as the interface language. It is designed to reduce the complexity and cost of building real-time applications. RisingWave consumes streaming data, performs continuous queries, and updates results dynamically. As a database system, RisingWave maintains results inside its own storage and allows users to access data efficiently.

Singularity Data 3.7k Jan 2, 2023
ReadySet is a lightweight SQL caching engine written in Rust that helps developers enhance the performance and scalability of existing applications.

ReadySet is a SQL caching engine designed to help developers enhance the performance and scalability of their existing database-backed applications. W

ReadySet 1.7k Jan 8, 2023
SQL/JSON path engine in Rust.

sql-json-path SQL/JSON Path implementation in Rust. ?? Under development ?? Features Compatible with SQL/JSON Path standard and PostgreSQL implementat

RisingWave Labs 3 Nov 22, 2023
Rust client for Timeplus Proton, a fast and lightweight streaming SQL engine

Rust Client for Timeplus Proton Rust client for Timeplus Proton. Proton is a streaming SQL engine, a fast and lightweight alternative to Apache Flink,

Timeplus 4 Feb 27, 2024
Bind the Prisma ORM query engine to any programming language you like ❤️

Prisma Query Engine C API Bind the Prisma ORM query engine to any programming language you like ❤️ Features Rust bindings for the C API Static link li

Prisma ORM for community 10 Dec 15, 2022
Bind the Prisma ORM query engine to any programming language you like ❤️

Prisma Query Engine C API Bind the Prisma ORM query engine to any programming language you like ❤️ Features Rust bindings for the C API Static link li

Odroe 6 Sep 9, 2022
TDS 7.2+ (mssql / Microsoft SQL Server) async driver for rust

Tiberius A native Microsoft SQL Server (TDS) client for Rust. Supported SQL Server versions Version Support level Notes 2019 Tested on CI 2017 Tested

Prisma 189 Dec 25, 2022
FeOphant - A SQL database server written in Rust and inspired by PostreSQL.

A PostgreSQL inspired SQL database written in Rust.

Christopher Hotchkiss 27 Dec 7, 2022