Engula empowers engineers to build reliable and cost-effective databases.

Engula

Last update: Jan 1, 2023

Overview

Engula is a storage engine that empowers engineers to build reliable and cost-effective databases with less effort and more confidence.

Engula is in the demo stage now. Please check the roadmap for more details.

You are welcome to review the design and join the chat room to discuss it with us. We also offer full-time jobs. For more information, please get in touch with [email protected].

Usage

use std::sync::Arc;

use engula::{
    Database, FileSystem, JobRuntime, LocalFileSystem, LocalJobRuntime, LocalJournal, LocalStorage,
    Options, StorageOptions,
};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let options = Options {
        memtable_size: 1024,
    };
    let storage_options = StorageOptions::default();
    let dirname = "/tmp/engula";
    let fs = LocalFileSystem::new(dirname)?;
    let fs: Arc<Box<dyn FileSystem>> = Arc::new(Box::new(fs));
    let job = LocalJobRuntime::new(fs.clone());
    let job: Arc<Box<dyn JobRuntime>> = Arc::new(Box::new(job));
    let storage = LocalStorage::new(storage_options, fs, job)?;
    let journal = LocalJournal::new(dirname, false)?;
    let db = Database::new(options, Box::new(journal), Box::new(storage)).await;
    for i in 0..1024u64 {
        let v = i.to_be_bytes().to_vec();
        db.put(v.clone(), v.clone()).await?;
        let got = db.get(&v).await?;
        assert_eq!(got, Some(v.clone()));
    }
    Ok(())
}

You can run the example with:

cargo run --example hello

Comments

A file journal implementation

We have mem and grpc implementations now, but most applications will need persistent journal storage. A sophisticated implementation is not easy, so we can start with a naive one for now. Check RocksDB LogWriter for an example.
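
To make the idea of a naive implementation concrete, here is a minimal sketch that appends length-prefixed records to a single file. It is only an illustration under stated assumptions: `NaiveFileJournal`, `open`, `append`, and `read_all` are hypothetical names and not Engula's journal API, and the sketch has no checksums or block framing of the kind RocksDB's LogWriter uses.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Write};
use std::path::{Path, PathBuf};

/// Hypothetical naive journal (not Engula's API): each record is stored
/// as a 4-byte little-endian length followed by the payload.
pub struct NaiveFileJournal {
    path: PathBuf,
    file: File,
}

impl NaiveFileJournal {
    /// Opens (or creates) the journal file in append mode.
    pub fn open(path: impl AsRef<Path>) -> io::Result<Self> {
        let path = path.as_ref().to_path_buf();
        let file = OpenOptions::new().create(true).append(true).open(&path)?;
        Ok(NaiveFileJournal { path, file })
    }

    /// Appends one record and syncs it to disk before returning.
    pub fn append(&mut self, record: &[u8]) -> io::Result<()> {
        if record.len() > u32::MAX as usize {
            return Err(io::Error::new(io::ErrorKind::InvalidInput, "record too large"));
        }
        let len = record.len() as u32;
        self.file.write_all(&len.to_le_bytes())?;
        self.file.write_all(record)?;
        self.file.sync_data()
    }

    /// Reads every complete record back, in append order.
    /// An incomplete record at the tail (e.g. after a crash) is ignored.
    pub fn read_all(&self) -> io::Result<Vec<Vec<u8>>> {
        let mut buf = Vec::new();
        File::open(&self.path)?.read_to_end(&mut buf)?;
        let mut records = Vec::new();
        let mut pos = 0;
        while pos + 4 <= buf.len() {
            let len =
                u32::from_le_bytes([buf[pos], buf[pos + 1], buf[pos + 2], buf[pos + 3]]) as usize;
            pos += 4;
            if pos + len > buf.len() {
                break; // truncated tail: stop at the last complete record
            }
            records.push(buf[pos..pos + len].to_vec());
            pos += len;
        }
        Ok(records)
    }
}
```

Syncing on every append keeps the sketch simple at the cost of throughput; a real implementation would likely batch writes and add per-record checksums.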

opened by huachaohuang 28
Roadmap v0.2 - Microunit
Non-goals:

Data persistence

Leader election and failover

Resource isolation and management

Tasks:

[ ] Design document

[ ] Commands

[ ] Start a node

[ ] Bootstrap a universe

[ ] Join a node to a universe
opened by huachaohuang 25
Roadmap 0.2
Overview

Goals:

Set up basic project management

Present the fundamental ideas and usages of Engula

Non-goals:

Reliability

Performance

Project

[x] #59

[x] #144

Modules

[x] #73

[x] #145

[x] #65

[x] #68

[x] ~~#66~~

Documents

[x] #132

[x] https://github.com/engula/engula.github.io/pull/17

[x] https://github.com/engula/engula.github.io/issues/16
opened by huachaohuang 25
A grpc kernel that stores everything in a remote grpc kernel service

A grpc kernel integrates a grpc journal, storage, etc. It consists of a client and a server part. We need to provide a binary to start the kernel server.

opened by huachaohuang 23
docs: rewrite the design document

A rendered version of the document is here.

Design documents of individual modules are also pulled together to provide a better landscape.

Microunit's design is omitted for now. Since we are shifting to k8s, I will add the related designs once I have a clearer picture.

opened by huachaohuang 23
Roadmap 0.3
Overview

The primary goal of version 0.3 is to reflect the new design of Engula. The planned release date is Feb 28, 2022.

Tasks

APIs

[x] #364

[x] #366

[x] #376

[x] #446

Client

[x] #368

[x] #377

[x] #381

[x] #383

[x] #389

[x] #391

[x] #396

[x] #398

[x] #369

[x] #402

[x] #405

[x] #407

[x] #447

[x] #472

Server

[x] #371

[x] #379

[x] #394

[x] #420

[x] #422

[x] #442

[x] #448

[x] #450

[x] #452

[x] #455

Stream Engine

[x] #412

[x] #415

[x] #429

[x] https://github.com/engula/engula/issues/426

Object Engine

[x] #416

[x] #418

[x] #431

[x] #485

Command line tool

[x] #373

[x] #424

Documentation

[x] #360

[x] #363

[x] #362

[x] Release 0.3

[x] Tutorial 0.3

Miscellaneous

[x] #444
opened by huachaohuang 17

Roadmap 0.3 - Refactor Journal interfaces

pub trait Journal {
    // Reader and writer types are chosen by each implementation.
    type StreamReader;
    type StreamWriter;

    async fn create_stream(&self, name: &str) -> Result<()>;
    async fn delete_stream(&self, name: &str) -> Result<()>;
    // Associated types must be referred to through `Self`.
    async fn new_stream_reader(&self, name: &str) -> Result<Self::StreamReader>;
    async fn new_stream_writer(&self, name: &str) -> Result<Self::StreamWriter>;
}

Introduce a stream reader and a stream writer to separate concerns. Most use cases will not read and write the same stream at the same time. Separating them has another benefit: it allows readers and writers to be composed in different ways.
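
As a rough illustration of the reader/writer split, the sketch below shows a synchronous, in-memory variant. It is an assumption-laden example rather than the proposed implementation: `MemJournal`, `MemStreamWriter`, `MemStreamReader`, `append`, and `try_next` are hypothetical names, the methods are not async, and errors are plain strings.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

type Streams = Arc<Mutex<HashMap<String, Vec<Vec<u8>>>>>;

/// Hypothetical in-memory journal; synchronous for brevity.
#[derive(Clone, Default)]
pub struct MemJournal {
    streams: Streams,
}

/// Write half: can only append events to one stream.
pub struct MemStreamWriter {
    name: String,
    streams: Streams,
}

/// Read half: keeps its own cursor into one stream.
pub struct MemStreamReader {
    name: String,
    next: usize,
    streams: Streams,
}

impl MemJournal {
    pub fn create_stream(&self, name: &str) -> Result<(), String> {
        let mut streams = self.streams.lock().unwrap();
        if streams.contains_key(name) {
            return Err(format!("stream {} already exists", name));
        }
        streams.insert(name.to_owned(), Vec::new());
        Ok(())
    }

    pub fn new_stream_writer(&self, name: &str) -> Result<MemStreamWriter, String> {
        self.check_exists(name)?;
        Ok(MemStreamWriter { name: name.to_owned(), streams: self.streams.clone() })
    }

    pub fn new_stream_reader(&self, name: &str) -> Result<MemStreamReader, String> {
        self.check_exists(name)?;
        Ok(MemStreamReader { name: name.to_owned(), next: 0, streams: self.streams.clone() })
    }

    fn check_exists(&self, name: &str) -> Result<(), String> {
        if self.streams.lock().unwrap().contains_key(name) {
            Ok(())
        } else {
            Err(format!("stream {} not found", name))
        }
    }
}

impl MemStreamWriter {
    /// Appends an event to the end of the stream.
    pub fn append(&mut self, event: Vec<u8>) {
        let mut streams = self.streams.lock().unwrap();
        if let Some(events) = streams.get_mut(&self.name) {
            events.push(event);
        }
    }
}

impl MemStreamReader {
    /// Returns the next unread event, or `None` if the reader has caught up.
    pub fn try_next(&mut self) -> Option<Vec<u8>> {
        let streams = self.streams.lock().unwrap();
        let event = streams.get(&self.name)?.get(self.next)?.clone();
        self.next += 1;
        Some(event)
    }
}
```

Because each reader owns its cursor, several readers can follow the same stream independently while a single writer appends to it.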

opened by huachaohuang 14

Add a simple command to do local checks before sending PRs

Before sending a PR, we need to run several commands to check tests, style, license, and unused dependencies, which is tedious. Maybe we can add a simple way to check everything at once locally.
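
One possible shape for such a command is a tiny Rust helper that runs each check in turn and reports the ones that fail. This is a sketch under assumptions: `checks` and `run_all` are hypothetical names, the command list only covers formatting, lints, and tests, and the license and unused-dependency checks are left out because the issue does not name the tools for them.

```rust
use std::process::Command;

/// The checks to run, in order. License and unused-dependency checks
/// are omitted because their tools are not specified here.
pub fn checks() -> Vec<Vec<&'static str>> {
    vec![
        vec!["cargo", "fmt", "--all", "--", "--check"],
        vec!["cargo", "clippy", "--", "-D", "warnings"],
        vec!["cargo", "test"],
    ]
}

/// Runs every check and returns the command lines that failed.
/// A command that cannot be started counts as a failure.
pub fn run_all() -> Vec<String> {
    let mut failed = Vec::new();
    for argv in checks() {
        let ok = Command::new(argv[0])
            .args(&argv[1..])
            .status()
            .map(|status| status.success())
            .unwrap_or(false);
        if !ok {
            failed.push(argv.join(" "));
        }
    }
    failed
}
```

A binary wrapping `run_all` could print the returned list and exit with a non-zero status when it is not empty, so the same helper works locally and in CI.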
help wanted

opened by huachaohuang 14
Roadmap v0.2 - Storage
Tasks:

[x] Design document

[x] Storage abstraction

[x] A mem storage implementation

[x] A file storage implementation

[x] A grpc storage implementation

Most tasks have been done, but we still need to refactor some implementations before releasing v0.2.
opened by huachaohuang 12
Write down the first version CONTRIBUTING.md
So as not to disturb your prototyping and rapid development, I'm glad to write this file and propose a PR, with your help collecting the necessary information.

The draft looks like this:

How to Contribute

I'm really glad you're reading this, because we need volunteer developers to help this project come to fruition.

If you haven't already, come find us on gitter. We want you working on things you're excited about.

You are welcome to review our design or participate in discussions about the roadmap!

Get Started

We develop Engula with the stable Rust toolchain.

You can get started with Engula in three steps:

Set up the environment with rustup.

Build Engula via cargo build.

Run the example via cargo run --example hello.

Report an Issue

If you think you have found an issue in Engula, you can report it to the issue tracker.

Before filing an issue report, check whether the problem has already been reported. You can use the search bar to search existing issues. This doesn't always work, and sometimes it's hard to know what to search for, so consider this extra credit. We won't mind if you accidentally file a duplicate report. Don't blame yourself if your issue is closed as a duplicate.

If the problem you're reporting is not already in the issue tracker, you can open a GitHub issue with your GitHub account.

Submitting a Pull Request

Please send a GitHub Pull Request to Engula with a clear list of what you've done (read more about pull requests). When you send a pull request, we look forward to an expressive description, clear commit messages, and more test coverage if it is a code contribution.

Before submitting the pull request, please make sure all tests pass locally:

cargo build --release
cargo test
cargo clippy -- -D warnings
cargo fmt --all -- --check

Thank you for your participation!

The questions are:

Do we protect main and only modify it by PR?

What merge strategy do we use: merge with commit, rebase and merge, or squash and merge? I highly recommend the latter two, since merge with commit makes history hard to read - however, I'm not deeply involved in your project, so it's your choice.

Shall we adopt a code of conduct? If so, I'd suggest Contributor Covenant Code of Conduct and the project should provide a contact method.

Any other concern on the draft above?
opened by tisonkun 12
Apply license header to engula work
First of all, I must clarify that this project still seems to be in an early stage, so this issue is a suggestion to consider as the project grows.

According to APL 2.0, it is common practice to apply the license explicitly:

To apply the Apache License to specific files in your work, attach the following boilerplate declaration, replacing the fields enclosed by brackets "[]" with your own identifying information. (Don't include the brackets!) Enclose the text in the appropriate comment syntax for the file format. We also recommend that you include a file or class name and description of purpose on the same "printed page" as the copyright notice for easier identification within third-party archives.

Copyright [yyyy] [name of copyright owner]

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

If we decide to stick to APL 2.0, we'll meet this proposal sooner or later. Contributors will probably complain that a template is too heavy to carry when we're still in an early stage - we don't care about how others make use of the demo. But once you think your work has taken shape, that is a signal that you should consider this proposal.

Also, thinking about the name of "copyright owner" is essential for the project.

Last, without audit tools, license headers can easily go missing or contain typos. skywalking-eyes would be a great language-agnostic solution that integrates with GitHub Actions.
opened by tisonkun 10

`watch_state_changes` executes too long

2022-09-29T02:25:18.538471Z  WARN engula_server::runtime::executor: future poll() execute total 1.137175ms: core::future::from_generator::GenFuture<engula_server::node::migrate::ctrl::MigrateController::watch_state_changes::{{closure}}::{{closure}}>
2022-09-29T02:25:18.550885Z  WARN engula_server::runtime::executor: future poll() execute total 1.494231ms: core::future::from_generator::GenFuture<engula_server::node::migrate::ctrl::MigrateController::watch_state_changes::{{closure}}::{{closure}}>

opened by w41ter 0

Refine `ServiceIsBusy` handling

The current method is a simple sleep of 200ms. For some frequent operations, such as transfer leader, this delay is too high. Instead of simply waiting a fixed amount of time, we need to introduce a flexible exponential backoff.
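
A minimal sketch of the exponential backoff idea follows. It is an illustration, not the code in engula-client: `Backoff` and `next_delay` are hypothetical names, the 5ms starting delay in the usage note is an assumption, and jitter is left out to keep the example deterministic.

```rust
use std::time::Duration;

/// Hypothetical retry policy: the delay doubles after every busy
/// response until it reaches `max`.
pub struct Backoff {
    next: Duration,
    max: Duration,
}

impl Backoff {
    /// Starts at `initial`, never exceeding `max`.
    pub fn new(initial: Duration, max: Duration) -> Self {
        Backoff { next: initial.min(max), max }
    }

    /// Returns the delay to wait now and prepares the next, longer one.
    pub fn next_delay(&mut self) -> Duration {
        let delay = self.next;
        self.next = self.next.saturating_mul(2).min(self.max);
        delay
    }
}
```

With `Backoff::new(Duration::from_millis(5), Duration::from_millis(200))`, successive calls to `next_delay` yield 5, 10, 20, 40, 80, 160, 200, 200 milliseconds, so frequent operations retry quickly at first while the old 200ms wait remains the upper bound.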

opened by w41ter 0

group state carries two leaders

2022-08-22T11:14:17.355427Z TRACE engula_client::router: update event; group state GroupState { group_id: 100001, leader_id: Some(1000012), replicas: [ReplicaState { replica_id: 1000010, group_id: 100001, term: 3, voted_for: 1000010, role: Leader, node_id: 0 }, ReplicaState { replica_id: 1000011, group_id: 100001, term: 3, voted_for: 1000010, role: Follower, node_id: 1 }, ReplicaState { replica_id: 1000012, group_id: 100001, term: 2, voted_for: 1000012, role: Leader, node_id: 2 }] }       
2022-08-22T11:14:17.393537Z TRACE engula_client::router: update event; group state GroupState { group_id: 100001, leader_id: Some(1000012), replicas: [ReplicaState { replica_id: 1000010, group_id: 100001, term: 3, voted_for: 1000010, role: Leader, node_id: 0 }, ReplicaState { replica_id: 1000011, group_id: 100001, term: 3, voted_for: 1000010, role: Follower, node_id: 1 }, ReplicaState { replica_id: 1000012, group_id: 100001, term: 2, voted_for: 1000012, role: Leader, node_id: 2 }] }
2022-08-22T11:14:17.397597Z TRACE engula_client::router: update event; group state GroupState { group_id: 100001, leader_id: Some(1000012), replicas: [ReplicaState { replica_id: 1000010, group_id: 100001, term: 3, voted_for: 1000010, role: Leader, node_id: 0 }, ReplicaState { replica_id: 1000011, group_id: 100001, term: 3, voted_for: 1000010, role: Follower, node_id: 1 }, ReplicaState { replica_id: 1000012, group_id: 100001, term: 2, voted_for: 1000012, role: Leader, node_id: 2 }] }

Group 100001 has two members that each claim to be the leader:

ReplicaState { replica_id: 1000010, group_id: 100001, term: 3, voted_for: 1000010, role: Leader, node_id: 0 }
ReplicaState { replica_id: 1000012, group_id: 100001, term: 2, voted_for: 1000012, role: Leader, node_id: 2 }

Replica 1000010 is obviously the actual leader of this group, since its term (3) is larger than replica 1000012's (2). But the leader_id of GroupState is Some(1000012).
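
The expected behavior can be sketched as picking, among the replicas that claim leadership, the one with the highest term. The types below are simplified stand-ins for the logged `ReplicaState` (only the fields needed here), and `pick_leader` is a hypothetical helper, not the router's actual code.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Role {
    Leader,
    Follower,
}

/// Simplified stand-in for the logged replica state.
#[derive(Clone, Debug)]
pub struct ReplicaState {
    pub replica_id: u64,
    pub term: u64,
    pub role: Role,
}

/// Among replicas that claim leadership, trust the one with the highest term.
/// Returns `None` if no replica claims to be the leader.
pub fn pick_leader(replicas: &[ReplicaState]) -> Option<u64> {
    replicas
        .iter()
        .filter(|r| r.role == Role::Leader)
        .max_by_key(|r| r.term)
        .map(|r| r.replica_id)
}
```

Applied to the three replicas in the log above, this returns `Some(1000010)`, because the stale leader 1000012 is still at term 2.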

opened by w41ter 0

engula-client receives error: no such shard exists

2022-08-16T04:16:36.249335Z  INFO ThreadId(05) engula_client::router: start watching events...
2022-08-16T04:16:58.337130Z  INFO ThreadId(01) engula_bench: spawn 50 workers
2022-08-16T04:16:58.337449Z  INFO ThreadId(01) engula_bench: all workers are spawned, wait ...
2022-08-16T04:17:02.498939Z  WARN ThreadId(02) engula_client::group_client: group client issue rpc err=InvalidArgument("no such shard 9 exists")
2022-08-16T04:17:02.499022Z ERROR ThreadId(02) engula_bench::worker: get request InvalidArgument("no such shard 9 exists")
2022-08-16T04:17:08.338180Z  INFO ThreadId(03) engula_bench: GET total 138209
2022-08-16T04:17:08.338219Z  INFO ThreadId(03) engula_bench: GET success total 138208
2022-08-16T04:17:08.338227Z  INFO ThreadId(03) engula_bench: GET failure total 1
2022-08-16T04:17:08.338233Z  INFO ThreadId(03) engula_bench: PUT total 140727
2022-08-16T04:17:08.338239Z  INFO ThreadId(03) engula_bench: PUT success total 140727
2022-08-16T04:17:08.338251Z  INFO ThreadId(03) engula_bench: PUT failure total 0
2022-08-16T04:17:17.549520Z  WARN ThreadId(04) engula_client::group_client: group client issue rpc err=InvalidArgument("no such shard 24 exists")
2022-08-16T04:17:17.549566Z ERROR ThreadId(04) engula_bench::worker: get request InvalidArgument("no such shard 24 exists")
2022-08-16T04:17:17.550239Z  WARN ThreadId(05) engula_client::group_client: group client issue rpc err=InvalidArgument("no such shard 24 exists")
2022-08-16T04:17:17.551802Z  WARN ThreadId(03) engula_client::group_client: group client issue rpc err=InvalidArgument("no such shard 24 exists")
2022-08-16T04:17:17.552821Z  WARN ThreadId(04) engula_client::group_client: group client issue rpc err=InvalidArgument("no such shard 24 exists")

opened by w41ter 0

report `is_replica_updated` to root

When revising the root allocator logic, it would be valuable for the allocator to know whether a replica's log is up to date with the leader.

For example, when we rebalance replicas between nodes, the allocator chooses which replica to remove from the source node. It is better not to remove an updated replica when the group's updated replica count is exactly equal to the quorum. With 3 replicas (a, b, c), of which 2 are updated (a, b), it is better to remove c rather than a or b, so that the group does not become unavailable. Root can also try to move another group's replica if the current group has no suitable one. So the root allocator needs to know whether a replica is updated.
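
The selection rule described above could look roughly like the following sketch. It assumes a simplified `Replica` type with an `is_updated` flag and a hypothetical `choose_replica_to_remove` helper; the real root allocator has more inputs (such as node load) that are ignored here.

```rust
/// Simplified replica descriptor; `is_updated` is the flag this issue
/// proposes reporting to root.
#[derive(Clone, Debug)]
pub struct Replica {
    pub id: u64,
    pub is_updated: bool,
}

/// Picks a replica to remove from a group without dropping the number of
/// up-to-date replicas below the quorum. Returns `None` when no replica
/// can be removed safely, so the caller can try another group.
pub fn choose_replica_to_remove(replicas: &[Replica]) -> Option<u64> {
    let quorum = replicas.len() / 2 + 1;
    let updated = replicas.iter().filter(|r| r.is_updated).count();
    // Prefer a lagging replica: removing it never reduces the updated count.
    if let Some(lagging) = replicas.iter().find(|r| !r.is_updated) {
        return Some(lagging.id);
    }
    // Every replica is up to date; removing one is safe only if more
    // than a quorum of updated replicas exists.
    if updated > quorum {
        replicas.first().map(|r| r.id)
    } else {
        None
    }
}
```

For replicas a and b updated and c lagging, the function picks c, which matches the example in the text.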

opened by zojw 1

Releases(v0.4.0)

v0.4.0(Oct 12, 2022)

Over the past few months, we have tweaked the architecture of Engula. We focused on enhancing its architectural simplicity and ease of use.

It is now an elastic and fault-tolerant distributed key-value store that can be used as a cache, database, and storage engine.

You can check the new design document for more details.

Welcome to explore and have fun!
v0.3.0(Feb 28, 2022)

Engula has gone through a redesign in terms of its interface and architecture. The new design is based on the lessons we learned in the past few months, and it will provide a clearer guide for future development. You can check the new design document and discussion for more details.

The most important feature in Engula 0.3 is a set of data structures. In this release, Engula provides five data types: Any, I64, Blob, List, and Map. Each data type provides a set of APIs to manipulate objects of that type. For example, you can add on I64 objects or push elements to List objects.

Moreover, Engula supports ACID transactions across different data structures. For example, you can push an element to a list and insert an index to a map in the same transaction. This feature allows users to build more advanced applications on top of Engula.

Engula 0.3 also comes with a new server and a Rust client. You can get started with this tutorial. Welcome to explore and have fun!
v0.2.0(Dec 17, 2021)
We have published a design document to introduce Engula's concepts and architecture. For those who don't know much about Engula or haven't been around for a while, Engula has changed a lot, and the design document is the best place to get started.

Engula 0.2 comes with one engine and three kernels:

A hash engine that supports simple key-value operations. This engine relies on a kernel to perform stateful operations. It can work with the three kernels below to tackle different use cases.

A memory kernel that stores everything in memory. This kernel integrates a memory journal, storage, and manifest for data storage.

A file kernel that stores everything in local files. This kernel integrates a file journal, storage, and manifest for data storage.

A gRPC kernel that stores data in remote gRPC servers. An engine uses a kernel client to communicate with a kernel server. The kernel server can further connect to a journal server and a storage server for data storage. Engula 0.2 provides a binary to start different kinds of servers easily.

If you want to try it out, we have prepared a tutorial for you. Have fun!

Owner

Engula

GitHub https://engula.com

Rust library and daemon for easily starting postgres databases per-test without Docker

pgtemp pgtemp is a Rust library and cli tool that allows you to easily create temporary PostgreSQL servers for testing without using Docker. The pgtem

165 Mar 22, 2024

CLI tool to work with Sled key-value databases.

sledtool CLI tool to work with Sled key-value databases. $ sledtool --help Usage: sledtool <dbpath> <command> [<args>] CLI tool to work with Sled da

27 Sep 26, 2022

Provides a Rust-based SQLite extension for using Hypercore as the VFS for your databases.

SQLite and Hypercore A Rust library providing SQLite with an virtual file system to enable Hypercore as a means of storage. Contributing The primary r

14 Dec 5, 2022

Sled - the champagne of beta embedded databases

key value buy a coffee for us to convert into databases documentation chat about databases with us sled - it's all downhill from here!!! An embedded d

6.6k Jan 8, 2023

Replibyte - a powerful tool to seed your databases

Seed Your Development Database With Real Data ⚡️ Replibyte is a powerful tool to seed your databases with real data and other cool features Feature

3.4k Jan 9, 2023

Optimistic multi-version concurrency control (MVCC) for main memory databases, written in Rust.

MVCC for Rust This is a work-in-progress the Hekaton optimistic multiversion concurrency control library in Rust. The aim of the project is to provide

32 Apr 20, 2023

This project provides a Rust-based solution for migrating MSSQL databases to MySQL.

MSSQL to MySQL Database Migration A Rust project to migrate MSSQL databases to MySQL, including table structures, column data types, and table data ro

2 Jul 10, 2023

Skybase is an extremely fast, secure and reliable real-time NoSQL database with automated snapshots and SSL

Skybase The next-generation NoSQL database What is Skybase? Skybase (or SkybaseDB/SDB) is an effort to provide the best of key/value stores, document

1.4k Dec 29, 2022

Skytable is an extremely fast, secure and reliable real-time NoSQL database with automated snapshots and TLS

Skytable is an effort to provide the best of key/value stores, document stores and columnar databases, that is, simplicity, flexibility and queryability at scale. The name 'Skytable' exemplifies our vision to create a database that has limitless possibilities. Skytable was previously known as TerrabaseDB (and then Skybase) and is also nicknamed "STable", "Sky" and "SDB" by the community.

1.4k Dec 29, 2022

Databend aimed to be an open source elastic and reliable serverless data warehouse,

An elastic and reliable Serverless Data Warehouse, offers Blazing Fast Query and combines Elasticity, Simplicity, Low cost of the Cloud, built to make the Data Cloud easy

5k Jan 3, 2023

Rust - Build a CRUD API with SQLX and PostgreSQL

In this article, you'll learn how to build a CRUD API in Rust using SQLX, Actix-web, and PostgreSQL. Learning how to build a CRUD API as a developer will equip you with valuable skills for building robust, maintainable, and scalable applications.

5 Feb 20, 2023

Build a Full Stack App with Rust, Yew.rs and Actix Web

In this article, I'll walk you through the process of building a backend API using the Actix web framework, SQLX, PostgreSQL, and Docker. Once we've created a powerful backend, we'll move on to building a single-page app using the Yew.rs framework.

6 Mar 29, 2023

Build SQLite virtual file systems (VFS) by implementing a simple Rust trait.

sqlite-vfs Build SQLite virtual file systems (VFS) by implementing a simple Rust trait. Documentation | Example This library is build for my own use-c

56 Dec 19, 2022

Backend service to build customer facing dashboards 10x faster. Written in Rust.

Frolic is an open source backend service (written in Rust) to build customer facing dashboards 10x faster. You can directly connect your database to t

133 Aug 22, 2023

Simple and handy btrfs snapshoting tool. Supports unattended snapshots, tracking, restoring, automatic cleanup and more. Backed with SQLite.

Description Simple and handy btrfs snapshoting tool. Supports unattended snapshots, tracking, restoring, automatic cleanup and more. Backed with SQLit

27 Nov 22, 2022

asynchronous and synchronous interfaces and persistence implementations for your OOD architecture

OOD Persistence Asynchronous and synchronous interfaces and persistence implementations for your OOD architecture Installation Add ood_persistence = {

1 Feb 15, 2022

rust_arango enables you to connect with ArangoDB server, access to database, execute AQL query, manage ArangoDB in an easy and intuitive way, both async and plain synchronous code with any HTTP ecosystem you love.

rust_arango enables you to connect with ArangoDB server, access to database, execute AQL query, manage ArangoDB in an easy and intuitive way, both async and plain synchronous code with any HTTP ecosystem you love.

3 Mar 24, 2022

Efficient and fast querying and parsing of GTDB's data

xgt xgt is a Rust tool that enables efficient querying and parsing of the GTDB database. xgt consists of a collection of commands mirroring the GTDB A

7 Apr 1, 2023

High-performance, lock-free local and concurrent object memory pool with automated allocation, cleanup, and verification.

Opool: Fast lock-free concurrent and local object pool Opool is a high-performance Rust library that offers a concurrent and local object pool impleme

8 Jun 3, 2023
