Engula empowers engineers to build reliable and cost-effective databases.

Overview

Engula is a storage engine that empowers engineers to build reliable and cost-effective databases with less effort and more confidence.

Engula is in the demo stage now. Please check the roadmap for more details.

You are welcome to review the design and join the room to discuss it with us. We also offer full-time jobs. For more information, please get in touch with [email protected].

Usage

use std::sync::Arc;

use engula::{
    Database, FileSystem, JobRuntime, LocalFileSystem, LocalJobRuntime, LocalJournal, LocalStorage,
    Options, StorageOptions,
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let options = Options {
        memtable_size: 1024,
    };
    let storage_options = StorageOptions::default();
    let dirname = "/tmp/engula";
    let fs = LocalFileSystem::new(dirname)?;
    let fs: Arc<Box<dyn FileSystem>> = Arc::new(Box::new(fs));
    let job = LocalJobRuntime::new(fs.clone());
    let job: Arc<Box<dyn JobRuntime>> = Arc::new(Box::new(job));
    let storage = LocalStorage::new(storage_options, fs, job)?;
    let journal = LocalJournal::new(dirname, false)?;
    let db = Database::new(options, Box::new(journal), Box::new(storage)).await;
    for i in 0..1024u64 {
        let v = i.to_be_bytes().to_vec();
        db.put(v.clone(), v.clone()).await?;
        let got = db.get(&v).await?;
        assert_eq!(got, Some(v.clone()));
    }
    Ok(())
}

You can run the example with:

cargo run --example hello
Comments
  • A file journal implementation

    We have mem and gRPC implementations now, but most applications will need persistent journal storage. A sophisticated implementation is not easy, so we can start with a naive one for now. Check RocksDB LogWriter for an example.
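To make the "naive implementation" idea concrete, here is a minimal sketch of an append-only file journal that stores length-prefixed records. The type `NaiveFileJournal` and its methods are hypothetical names chosen for illustration; they are not Engula's journal interface, and the record format is an assumption (RocksDB's log format is block-based and checksummed, which this sketch does not attempt).

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, BufReader, Read, Write};
use std::path::{Path, PathBuf};

/// Hypothetical naive journal: one append-only file of length-prefixed records.
pub struct NaiveFileJournal {
    path: PathBuf,
    file: File,
}

impl NaiveFileJournal {
    /// Opens (or creates) the journal file in append mode.
    pub fn open<P: AsRef<Path>>(path: P) -> io::Result<Self> {
        let path = path.as_ref().to_path_buf();
        let file = OpenOptions::new().create(true).append(true).open(&path)?;
        Ok(NaiveFileJournal { path, file })
    }

    /// Appends one record as a 4-byte big-endian length followed by the payload,
    /// then syncs the file data to disk.
    pub fn append(&mut self, record: &[u8]) -> io::Result<()> {
        if record.len() > u32::MAX as usize {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "record too large"));
        }
        let len = record.len() as u32;
        self.file.write_all(&len.to_be_bytes())?;
        self.file.write_all(record)?;
        self.file.sync_data()
    }

    /// Reads every complete record from the start of the file.
    /// A truncated tail (for example, after a crash mid-write) is ignored.
    pub fn read_all(&self) -> io::Result<Vec<Vec<u8>>> {
        let mut reader = BufReader::new(File::open(&self.path)?);
        let mut records = Vec::new();
        loop {
            let mut len_buf = [0u8; 4];
            match reader.read_exact(&mut len_buf) {
                Ok(()) => {}
                Err(ref e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
                Err(e) => return Err(e),
            }
            let mut payload = vec![0u8; u32::from_be_bytes(len_buf) as usize];
            match reader.read_exact(&mut payload) {
                Ok(()) => records.push(payload),
                Err(ref e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
                Err(e) => return Err(e),
            }
        }
        Ok(records)
    }
}
```

A real implementation would likely add checksums, batching of syncs, and file rotation; the sketch only shows the smallest format that can be replayed after a restart.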

    opened by huachaohuang 28
  • Roadmap v0.2 - Microunit

    Non-goals:

    • Data persistence
    • Leader election and failover
    • Resource isolation and management

    Tasks:

    • [ ] Design document
    • [ ] Commands
      • [ ] Start a node
      • [ ] Bootstrap a universe
      • [ ] Join a node to a universe
    opened by huachaohuang 25
  • Roadmap 0.2

    Overview

    Goals:

    • Set up basic project management
    • Present the fundamental ideas and usages of Engula

    Non-goals:

    • Reliability
    • Performance

    Project

    • [x] #59
    • [x] #144

    Modules

    • [x] #73
    • [x] #145
    • [x] #65
    • [x] #68
    • [x] ~~#66~~

    Documents

    • [x] #132
    • [x] https://github.com/engula/engula.github.io/pull/17
    • [x] https://github.com/engula/engula.github.io/issues/16
    opened by huachaohuang 25
  • A grpc kernel that stores everything in a remote grpc kernel service

    A grpc kernel integrates a grpc journal, storage, etc. It consists of a client and a server part. We need to provide a binary to start the kernel server.

    opened by huachaohuang 23
  • docs: rewrite the design document

    A rendered version of the document is here.

    Design documents of individual modules are also pulled together to provide a better landscape.

    Microunit's Design is omitted for now. As we are shifting to k8s, I will add related designs once I clear my mind.

    opened by huachaohuang 23
  • Roadmap 0.3

    Overview

    The primary goal of version 0.3 is to reflect the new design of Engula. The planned release date is Feb 28, 2022.

    Tasks

    APIs

    • [x] #364
    • [x] #366
    • [x] #376
    • [x] #446

    Client

    • [x] #368
    • [x] #377
    • [x] #381
    • [x] #383
    • [x] #389
    • [x] #391
    • [x] #396
    • [x] #398
    • [x] #369
    • [x] #402
    • [x] #405
    • [x] #407
    • [x] #447
    • [x] #472

    Server

    • [x] #371
    • [x] #379
    • [x] #394
    • [x] #420
    • [x] #422
    • [x] #442
    • [x] #448
    • [x] #450
    • [x] #452
    • [x] #455

    Stream Engine

    • [x] #412
    • [x] #415
    • [x] #429
    • [x] https://github.com/engula/engula/issues/426

    Object Engine

    • [x] #416
    • [x] #418
    • [x] #431
    • [x] #485

    Command line tool

    • [x] #373
    • [x] #424

    Documentation

    • [x] #360
    • [x] #363
    • [x] #362
    • [x] Release 0.3
    • [x] Tutorial 0.3

    Miscellaneous

    • [x] #444
    opened by huachaohuang 17
  • Roadmap 0.3 - Refactor Journal interfaces

    pub trait Journal {
        type StreamReader;
        type StreamWriter;
    
        async fn create_stream(&self, name: &str) -> Result<()>;
        async fn delete_stream(&self, name: &str) -> Result<()>;
        async fn new_stream_reader(&self, name: &str) -> Result<Self::StreamReader>;
        async fn new_stream_writer(&self, name: &str) -> Result<Self::StreamWriter>;
    }
    

    Introduce stream reader/writer to separate concerns. Most use cases will not read and write the same stream at the same time. Separating stream reader/writer has another benefit, allowing different compositions of reader and writer.
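The reader/writer split can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: the names `MemJournal`, `MemStreamReader`, and `MemStreamWriter` are hypothetical, errors are plain strings, and the methods are synchronous for brevity even though the proposed trait is async.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Events of one stream, shared between its readers and writers.
type Stream = Arc<Mutex<Vec<Vec<u8>>>>;

/// Hypothetical in-memory journal (synchronous, unlike the proposed trait).
#[derive(Default)]
pub struct MemJournal {
    streams: Mutex<HashMap<String, Stream>>,
}

/// Write-only handle to a stream.
pub struct MemStreamWriter {
    stream: Stream,
}

/// Read-only handle that tracks its own position in the stream.
pub struct MemStreamReader {
    stream: Stream,
    next: usize,
}

impl MemJournal {
    pub fn create_stream(&self, name: &str) -> Result<(), String> {
        let mut streams = self.streams.lock().unwrap();
        if streams.contains_key(name) {
            return Err(format!("stream {} already exists", name));
        }
        streams.insert(name.to_owned(), Stream::default());
        Ok(())
    }

    fn lookup(&self, name: &str) -> Result<Stream, String> {
        let streams = self.streams.lock().unwrap();
        streams
            .get(name)
            .cloned()
            .ok_or_else(|| format!("stream {} not found", name))
    }

    pub fn new_stream_reader(&self, name: &str) -> Result<MemStreamReader, String> {
        Ok(MemStreamReader { stream: self.lookup(name)?, next: 0 })
    }

    pub fn new_stream_writer(&self, name: &str) -> Result<MemStreamWriter, String> {
        Ok(MemStreamWriter { stream: self.lookup(name)? })
    }
}

impl MemStreamWriter {
    /// Appends one event to the end of the stream.
    pub fn append(&self, event: Vec<u8>) {
        self.stream.lock().unwrap().push(event);
    }
}

impl MemStreamReader {
    /// Returns the next unread event, or None if the reader has caught up.
    pub fn try_next(&mut self) -> Option<Vec<u8>> {
        let event = self.stream.lock().unwrap().get(self.next).cloned();
        if event.is_some() {
            self.next += 1;
        }
        event
    }
}
```

Because the reader keeps its own cursor, several readers can consume the same stream independently, and a component that only writes never needs read access.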

    opened by huachaohuang 14
  • Add a simple command to do local checks before sending PRs

    We need to run several commands to check tests, style, license, unused dependencies before sending a PR, which is tedious. Maybe we can add a simple way to check everything at once locally.
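One possible shape for such a command is a small Rust program that runs each check in turn and reports the ones that failed. This is only a sketch: the helper names `failed_checks` and `run_command` are hypothetical, and the command list is taken from the draft CONTRIBUTING.md elsewhere on this page; the license and unused-dependency checks are left out because the tools for them are not named here.

```rust
use std::process::Command;

/// Checks to run, copied from the draft CONTRIBUTING.md (an assumption about
/// what the project wants; extend as needed).
pub const CHECKS: &[&[&str]] = &[
    &["cargo", "build", "--release"],
    &["cargo", "test"],
    &["cargo", "clippy", "--", "-D", "warnings"],
    &["cargo", "fmt", "--all", "--", "--check"],
];

/// Runs every check through `run` and returns the command lines that failed.
/// All checks are run even if an earlier one fails, so one pass shows everything.
pub fn failed_checks<F>(checks: &[&[&str]], mut run: F) -> Vec<String>
where
    F: FnMut(&[&str]) -> bool,
{
    let mut failed = Vec::new();
    for argv in checks {
        if !run(*argv) {
            failed.push(argv.join(" "));
        }
    }
    failed
}

/// Spawns one command and reports whether it exited successfully.
/// A command that cannot be started counts as a failure.
pub fn run_command(argv: &[&str]) -> bool {
    match argv.split_first() {
        Some((program, args)) => Command::new(program)
            .args(args)
            .status()
            .map(|status| status.success())
            .unwrap_or(false),
        None => false,
    }
}
```

A `main` function could call `failed_checks(CHECKS, run_command)` and exit with a non-zero status if the returned list is not empty. Taking the runner as a closure keeps the selection logic testable without spawning cargo.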

    help wanted 
    opened by huachaohuang 14
  • Roadmap v0.2 - Storage

    Tasks:

    • [x] Design document
    • [x] Storage abstraction
    • [x] A mem storage implementation
    • [x] A file storage implementation
    • [x] A grpc storage implementation

    Most tasks have been done, but we still need to refactor some implementations before releasing v0.2.

    opened by huachaohuang 12
  • Write down the first version CONTRIBUTING.md

    So as not to disturb your prototyping and rapid development, I'm glad to write this file and propose a PR, with your help in collecting the necessary information.

    The draft looks like this:


    How to Contribute

    I'm really glad you're reading this, because we need volunteer developers to help this project come to fruition.

    If you haven't already, come find us on gitter. We want you working on things you're excited about.

    You are welcome to review our design or participate in discussions about the roadmap!

    Get Started

    We develop Engula with the stable Rust toolchain.

    You can get started with Engula in three steps:

    1. Set up the environment with rustup.
    2. Build Engula via cargo build.
    3. Run the example via cargo run --example hello.

    Report an Issue

    If you think you have found an issue in Engula, you can report it to the issue tracker.

    Before filing an issue, check whether the problem has already been reported. You can use the search bar to search existing issues. This doesn't always work, and sometimes it's hard to know what to search for, so consider this extra credit. We won't mind if you accidentally file a duplicate report, and don't blame yourself if your issue is closed as a duplicate.

    If the problem you're reporting is not already in the issue tracker, you can open a GitHub issue with your GitHub account.

    Submitting a Pull Request

    Please send a GitHub Pull Request to Engula with a clear list of what you've done (read more about pull requests). When you send a pull request, we look forward to an expressive description, clear commit messages, and more test coverage if it is a code contribution.

    Before submitting the pull request, please make sure all tests pass locally:

    cargo build --release
    cargo test
    cargo clippy -- -D warnings
    cargo fmt --all -- --check
    

    Thank you for your participation!


    The questions are:

    1. Do we protect main and only modify it by PR?
    2. Which merge strategy do we use: merge with commit, rebase and merge, or squash and merge? I highly recommend one of the latter two, since merge commits make the history hard to read. However, I'm not deeply involved in your project, so it's your choice.
    3. Shall we adopt a code of conduct? If so, I'd suggest Contributor Covenant Code of Conduct and the project should provide a contact method.
    4. Any other concern on the draft above?
    opened by tisonkun 12
  • Apply license header to engula work

    First of all, I should clarify that this project still seems to be at an early stage, so this issue is a suggestion to consider as the project grows.

    According to APL 2.0, it is common practice to apply the license explicitly as follows:

    To apply the Apache License to specific files in your work, attach the following boilerplate declaration, replacing the fields enclosed by brackets "[]" with your own identifying information. (Don't include the brackets!) Enclose the text in the appropriate comment syntax for the file format. We also recommend that you include a file or class name and description of purpose on the same "printed page" as the copyright notice for easier identification within third-party archives.

    Copyright [yyyy] [name of copyright owner]
    
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
    

    If we decide to stick with APL 2.0, we'll face this proposal sooner or later. Contributors may complain that a template is too heavy to carry while we're still at an early stage and don't care how others make use of the demo. But once you think your work has taken shape, that is a signal to consider this proposal.

    Also, deciding on the name of the "copyright owner" is essential for the project.

    Lastly, without audit tools, license headers can easily be missing or contain typos. skywalking-eyes would be a great language-agnostic solution that integrates with GitHub Actions.

    opened by tisonkun 10
  • root: refine replica & leader balance policy

    Refine the node balance logic (balance replicas and leaders by count).

    Compared with the previous logic, it will:

    1. avoid reallocating a replica or transferring a leader to a disk-full node, and try to move data out of disk-full nodes as much as possible
    2. pass the epoch (term) through from the allocator to the executor (previously, the allocator could make a decision in epoch=1 but send the transfer-leader RPC with epoch=2 in group_client)
    3. try to reuse the replica-count and leader-count balance logic, and open the door for QPS-based balancing.
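As a rough illustration of count-based balancing that respects disk-full nodes, the sketch below plans a single replica move. It is not Engula's allocator: `NodeStats` and `plan_replica_move` are hypothetical names, epochs are ignored, and the imbalance threshold of more than one replica is an assumption.

```rust
/// Hypothetical per-node statistics as the allocator might see them.
pub struct NodeStats {
    pub node_id: u64,
    pub replica_count: usize,
    pub disk_full: bool,
}

/// Plans one replica move as (source node, target node), or None if the
/// cluster is already balanced or no node can accept a replica.
pub fn plan_replica_move(nodes: &[NodeStats]) -> Option<(u64, u64)> {
    // Never move data onto a disk-full node: the target is the least loaded
    // node that still has space.
    let target = nodes
        .iter()
        .filter(|n| !n.disk_full)
        .min_by_key(|n| n.replica_count)?;

    // Prefer draining disk-full nodes; otherwise take the most loaded node.
    let source = nodes
        .iter()
        .filter(|n| n.node_id != target.node_id && n.replica_count > 0)
        .max_by_key(|n| (n.disk_full, n.replica_count))?;

    // Move if the source is disk-full, or if the counts differ by more than one.
    if source.disk_full || source.replica_count > target.replica_count + 1 {
        Some((source.node_id, target.node_id))
    } else {
        None
    }
}
```

The same function could be reused for leader balancing by feeding it leader counts instead of replica counts, which is one way to read the "reuse" goal in point 3.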

    Known issue:

    1. in some corner cases, the leader count isn't balanced, because transferring a leader to a node without an existing replica is still not supported
    opened by zojw 0
  • group state carries two leader

    2022-08-22T11:14:17.355427Z TRACE engula_client::router: update event; group state GroupState { group_id: 100001, leader_id: Some(1000012), replicas: [ReplicaState { replica_id: 1000010, group_id: 100001, term: 3, voted_for: 1000010, role: Leader, node_id: 0 }, ReplicaState { replica_id: 1000011, group_id: 100001, term: 3, voted_for: 1000010, role: Follower, node_id: 1 }, ReplicaState { replica_id: 1000012, group_id: 100001, term: 2, voted_for: 1000012, role: Leader, node_id: 2 }] }       
    2022-08-22T11:14:17.393537Z TRACE engula_client::router: update event; group state GroupState { group_id: 100001, leader_id: Some(1000012), replicas: [ReplicaState { replica_id: 1000010, group_id: 100001, term: 3, voted_for: 1000010, role: Leader, node_id: 0 }, ReplicaState { replica_id: 1000011, group_id: 100001, term: 3, voted_for: 1000010, role: Follower, node_id: 1 }, ReplicaState { replica_id: 1000012, group_id: 100001, term: 2, voted_for: 1000012, role: Leader, node_id: 2 }] }
    2022-08-22T11:14:17.397597Z TRACE engula_client::router: update event; group state GroupState { group_id: 100001, leader_id: Some(1000012), replicas: [ReplicaState { replica_id: 1000010, group_id: 100001, term: 3, voted_for: 1000010, role: Leader, node_id: 0 }, ReplicaState { replica_id: 1000011, group_id: 100001, term: 3, voted_for: 1000010, role: Follower, node_id: 1 }, ReplicaState { replica_id: 1000012, group_id: 100001, term: 2, voted_for: 1000012, role: Leader, node_id: 2 }] }        
    

    Group 100001 has two members that each claim to be the leader:

    • ReplicaState { replica_id: 1000010, group_id: 100001, term: 3, voted_for: 1000010, role: Leader, node_id: 0 }
    • ReplicaState { replica_id: 1000012, group_id: 100001, term: 2, voted_for: 1000012, role: Leader, node_id: 2 }

    Obviously, replica 1000010 is the actual leader of this group, since its term 3 is larger than replica 1000012's term 2. But the leader_id of GroupState is Some(1000012).
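The expected behavior described above, preferring the leader claim with the highest term, can be sketched as follows. The struct mirrors a subset of the fields in the log output, but the types and the `resolve_leader` function are hypothetical and are not taken from engula_client.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
    Leader,
    Follower,
}

/// Simplified replica state, modeled on the fields printed in the log above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ReplicaState {
    pub replica_id: u64,
    pub term: u64,
    pub role: Role,
}

/// Among the replicas that claim leadership, returns the one with the highest
/// term. A stale leader from an older term loses to a newer one.
pub fn resolve_leader(replicas: &[ReplicaState]) -> Option<u64> {
    replicas
        .iter()
        .filter(|r| r.role == Role::Leader)
        .max_by_key(|r| r.term)
        .map(|r| r.replica_id)
}
```

Applied to the three replicas in the log, this returns replica 1000010 rather than 1000012.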

    opened by w41ter 0
  • engula-client receive error: no such shard exists

    2022-08-16T04:16:36.249335Z  INFO ThreadId(05) engula_client::router: start watching events...
    2022-08-16T04:16:58.337130Z  INFO ThreadId(01) engula_bench: spawn 50 workers
    2022-08-16T04:16:58.337449Z  INFO ThreadId(01) engula_bench: all workers are spawned, wait ...
    2022-08-16T04:17:02.498939Z  WARN ThreadId(02) engula_client::group_client: group client issue rpc err=InvalidArgument("no such shard 9 exists")
    2022-08-16T04:17:02.499022Z ERROR ThreadId(02) engula_bench::worker: get request InvalidArgument("no such shard 9 exists")
    2022-08-16T04:17:08.338180Z  INFO ThreadId(03) engula_bench: GET total 138209
    2022-08-16T04:17:08.338219Z  INFO ThreadId(03) engula_bench: GET success total 138208
    2022-08-16T04:17:08.338227Z  INFO ThreadId(03) engula_bench: GET failure total 1
    2022-08-16T04:17:08.338233Z  INFO ThreadId(03) engula_bench: PUT total 140727
    2022-08-16T04:17:08.338239Z  INFO ThreadId(03) engula_bench: PUT success total 140727
    2022-08-16T04:17:08.338251Z  INFO ThreadId(03) engula_bench: PUT failure total 0
    2022-08-16T04:17:17.549520Z  WARN ThreadId(04) engula_client::group_client: group client issue rpc err=InvalidArgument("no such shard 24 exists")
    2022-08-16T04:17:17.549566Z ERROR ThreadId(04) engula_bench::worker: get request InvalidArgument("no such shard 24 exists")
    2022-08-16T04:17:17.550239Z  WARN ThreadId(05) engula_client::group_client: group client issue rpc err=InvalidArgument("no such shard 24 exists")
    2022-08-16T04:17:17.551802Z  WARN ThreadId(03) engula_client::group_client: group client issue rpc err=InvalidArgument("no such shard 24 exists")
    2022-08-16T04:17:17.552821Z  WARN ThreadId(04) engula_client::group_client: group client issue rpc err=InvalidArgument("no such shard 24 exists")
    
    opened by w41ter 0
  • report `is_replica_updated` to root

    When revising the root allocator logic, it would be valuable for it to know whether a replica's log is up to date with the leader.

    For example, when we rebalance replicas between nodes, the allocator chooses which replica to remove from the source node. It is better not to remove an updated replica when the group's updated replica count is just equal to the quorum (e.g. with 3 replicas (a, b, c), 2 of which are updated (a, b), it is better to remove c instead of a or b, to avoid the group becoming unavailable). Root can also try to move another group's replica if there is no suitable replica in the current group. So the root allocator needs to know whether a replica is updated.
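The selection rule in the example can be sketched as a small function. The names `Replica` and `pick_replica_to_remove` are hypothetical, and the quorum is assumed to be a simple majority of the group's current replicas; the real allocator may compute it differently.

```rust
/// Hypothetical view of a replica as reported to root.
pub struct Replica {
    pub id: u64,
    pub is_updated: bool,
}

/// Picks a replica that can be removed without dropping the number of updated
/// replicas below the quorum. Returns None if no replica in this group is
/// safe to remove, in which case root could try another group.
pub fn pick_replica_to_remove(replicas: &[Replica]) -> Option<u64> {
    // Removing a lagging replica never reduces the updated count, so prefer it.
    if let Some(lagging) = replicas.iter().find(|r| !r.is_updated) {
        return Some(lagging.id);
    }

    // Every replica is updated: removing one is safe only while the updated
    // count stays strictly above the quorum (assumed to be a simple majority).
    let quorum = replicas.len() / 2 + 1;
    let updated = replicas.iter().filter(|r| r.is_updated).count();
    if updated > quorum {
        replicas.first().map(|r| r.id)
    } else {
        None
    }
}
```

With replicas a and b updated and c lagging, the function picks c, matching the example in the text.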

    opened by zojw 0
  • raftgroup panics because transport worker is disconnected

    panicked at 'transport worker lifetime should large that replicas: TrySendError { kind: Disconnected }', src/server/src/raftgroup/transport.rs:127:14
    note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
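For readers unfamiliar with this failure mode, the sketch below shows how a non-blocking send can distinguish a full queue from a disconnected receiver instead of panicking. It uses the standard library's `std::sync::mpsc` channel purely for illustration; the channel type used in Engula's transport is not shown on this page, and `SendOutcome` and `send_or_report` are hypothetical names.

```rust
use std::sync::mpsc::{SyncSender, TrySendError};

/// Result of a non-blocking send attempt.
#[derive(Debug, PartialEq, Eq)]
pub enum SendOutcome {
    /// The message was queued for the worker.
    Sent,
    /// The queue was full, so the message was dropped.
    Dropped,
    /// The receiving worker has shut down.
    WorkerGone,
}

/// Tries to send without blocking and reports what happened, leaving the
/// caller to decide how to react to a missing worker.
pub fn send_or_report(tx: &SyncSender<u64>, msg: u64) -> SendOutcome {
    match tx.try_send(msg) {
        Ok(()) => SendOutcome::Sent,
        Err(TrySendError::Full(_)) => SendOutcome::Dropped,
        Err(TrySendError::Disconnected(_)) => SendOutcome::WorkerGone,
    }
}
```

Whether a disconnected worker should be treated as fatal, as the panic message implies, or handled during shutdown is a design decision for the project.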
    
    opened by w41ter 0
Releases(v0.3.0)
  • v0.3.0(Feb 28, 2022)

    Engula has gone through a redesign in terms of its interface and architecture. The new design is based on the lessons we learned in the past few months, and it will provide a clearer guide for future development. You can check the new design document and discussion for more details.

    The most important feature in Engula 0.3 is a set of data structures. In this release, Engula provides five data types: Any, I64, Blob, List, and Map. Each data type provides a set of APIs to manipulate objects of that type. For example, you can add on I64 objects or push elements to List objects.

    Moreover, Engula supports ACID transactions across different data structures. For example, you can push an element to a list and insert an index to a map in the same transaction. This feature allows users to build more advanced applications on top of Engula.

    Engula 0.3 also comes with a new server and a Rust client. You can get started with this tutorial. Welcome to explore and have fun!

  • v0.2.0(Dec 17, 2021)

    We have published a design document to introduce Engula's concepts and architecture. For those who don't know much about Engula or haven't been around for a while, Engula has changed a lot, and the design document is the best place to get started.

    Engula 0.2 comes with one engine and three kernels:

    • A hash engine that supports simple key-value operations. This engine relies on a kernel to perform stateful operations. It can work with the three kernels below to tackle different use cases.
    • A memory kernel that stores everything in memory. This kernel integrates a memory journal, storage, and manifest for data storage.
    • A file kernel that stores everything in local files. This kernel integrates a file journal, storage, and manifest for data storage.
    • A gRPC kernel that stores data in remote gRPC servers. An engine uses a kernel client to communicate with a kernel server. The kernel server can further connect to a journal server and a storage server for data storage. Engula 0.2 provides a binary to start different kinds of servers easily.

    If you want to try it out, we have prepared a tutorial for you. Have fun!
