Optimistic multi-version concurrency control (MVCC) for main memory databases, written in Rust.

Overview

MVCC for Rust

This is a work-in-progress the Hekaton optimistic multiversion concurrency control library in Rust. The aim of the project is to provide a building block for implementing database management systems.

Features

  • Main memory architecture, rows are accessed via an index
  • Optimistic multi-version concurrency control

Experimental Evaluation

Single-threaded micro-benchmarks

Operations Throughput
begin_tx, read, and commit 2.2M ops/second
begin_tx, update, and commit 2.2M ops/second
read 12.9M ops/second
update 6.2M ops/second

(The cargo bench was run on a AMD Ryzen 9 3900XT 2.2 GHz CPU.)

Development

Run tests:

cargo test

Test coverage report:

cargo tarpaulin -o html

Run benchmarks:

cargo bench

Run benchmarks and generate flamegraphs:

echo -1 | sudo tee /proc/sys/kernel/perf_event_paranoid
cargo bench --bench my_benchmark -- --profile-time=5

References

Larson et al. High-Performance Concurrency Control Mechanisms for Main-Memory Databases. VLDB '11

Paper errata: The visibility check in Table 2 is wrong and causes uncommitted delete to become visible to transactions (fixed in commit 6ca3773).

Comments
  • Add persistent storage trait

    Add persistent storage trait

    This draft adds a persistent storage trait that can be used to store transaction logs and read the log for recovery purposes. Work in heavy progress, because ideally the design should also allow reading versions from the storage, so that data can be spilled from memory to disk if there's not enough RAM available.

    Fixes #3

    opened by psarna 2
  • Transaction log trait

    Transaction log trait

    The commit_tx() function needs to write to a transaction log and wait for it to persist on stable storage. Let's add a transaction log trait that mvcc-rs users can provide.

    enhancement help wanted good first issue 
    opened by penberg 1
  • Bug: Committed rows can become invisible to new transactions

    Bug: Committed rows can become invisible to new transactions

    The paper has a typo, which I have confirmed with one of the authors. Since the code closely follows the paper and the bug has crept in here.

    Let me explain with a diagram, following the same conventions as the paper:

    Untitled-2023-04-10-1232

    1. At time 60, we observe our initial state, where the value of Larry is 170. This is a committed row.
    2. At time 75, a transaction is started and is in Active state. It wants to update the value to 150, so it appended row 2
    3. At time 80, another new transaction, Tx80 is started, and it wants to read the value of Larry. Both Tx75 and Tx80 are in Active state.

    What is the value Tx80 going to read?

    Tx80 can't see row 2 because of Table 1 rules. If Tx75 is in Active state, only Tx75 can see row 2. This makes sense because row 2 is fresh, and Tx75 might drop the changes later. We don't want any other transactions to see this row.

    But can Tx80 see row 1? The rules from the paper contradict that since Tx75 is in Active state:

    Screenshot 2023-04-09 at 11 51 24 PM

    This is a typo, and Tx80 should be able to see row 1 since it is a committed row.

    Code

    This is the line which causes this bug - https://github.com/penberg/mvcc-rs/blob/44df6c/database/src/database.rs#L438

    We can reproduce this error:

    1. Lets first insert a new row and commit it
    2. Start a new transaction te, updating this row. Let te be in active state
    3. Start one more transaction tx. Let this be in the active state too. According to the code, tx won't see the committed value.

    Fix

    • tx should be able to see this row
    • te should not see this, instead the version it is updating (e.g. row2 from the previous diagram)
    • usual rules about the time ranges apply
    bug help wanted 
    opened by avinassh 0
  • Copy-on-write cloning

    Copy-on-write cloning

    Many database management systems implement "branching", which allows users to create a copy-on-write clones of a database. With a row manager (#20), we can make a copy of the in-memory index and bump the reference counts. This essentially gives us copy-on-write semantics with MVCC because updates create a new version rather than modify the rows in-place.

    For example, let's say we have a database A and we make a clone B. Initially, they both point to the same set of rows. If either one of them updates a row, the other one will not be affected because the updating clone just marks the shared row version as deleted, but the actual row is unchanged.

    As a future optimization, we could also use a copy-on-write hash table to even share the index between clones.

    enhancement help wanted good first issue 
    opened by penberg 0
  • Row manager

    Row manager

    Implement a row manager to decouple the in-memory index represented by Database and the actual stored rows. For example, let's use a log structured memory allocator to allocate records and make RowVersion point to the LSA-backed row. With reference counting, we can now return rows from the index using zero-copy.

    enhancement help wanted good first issue 
    opened by penberg 0
  • Switch to lockless data structures

    Switch to lockless data structures

    The current code uses a mutex to essentially make commit() and rollback() and other operations atomic. Let's switch to lockless data structures like Hekaton.

    enhancement help wanted good first issue 
    opened by penberg 0
Owner
Pekka Enberg
Founder and CTO at ChiselStrike. Software engineer interested in distributed systems, cloud/edge computing, and databases. Previously @ScyllaDB & Linux kernel.
Pekka Enberg
Provides a Rust-based SQLite extension for using Hypercore as the VFS for your databases.

SQLite and Hypercore A Rust library providing SQLite with an virtual file system to enable Hypercore as a means of storage. Contributing The primary r

Jacky Alciné 14 Dec 5, 2022
This project provides a Rust-based solution for migrating MSSQL databases to MySQL.

MSSQL to MySQL Database Migration A Rust project to migrate MSSQL databases to MySQL, including table structures, column data types, and table data ro

Bitalizer 2 Jul 10, 2023
Query is a Rust server for your remote SQLite databases and a CLI to manage them.

Query Query is a Rust server for your remote SQLite databases and a CLI to manage them. Table Of Contents Run A Query Server CLI Install Use The Insta

Víctor García 6 Oct 6, 2023
Rust library and daemon for easily starting postgres databases per-test without Docker

pgtemp pgtemp is a Rust library and cli tool that allows you to easily create temporary PostgreSQL servers for testing without using Docker. The pgtem

Harry Stern 165 Mar 22, 2024
CLI tool to work with Sled key-value databases.

sledtool CLI tool to work with Sled key-value databases. $ sledtool --help Usage: sledtool <dbpath> <command> [<args>] CLI tool to work with Sled da

Vitaly Shukela 27 Sep 26, 2022
Engula empowers engineers to build reliable and cost-effective databases.

Engula is a storage engine that empowers engineers to build reliable and cost-effective databases with less effort and more confidence. Engula is in t

Engula 706 Jan 1, 2023
Sled - the champagne of beta embedded databases

key value buy a coffee for us to convert into databases documentation chat about databases with us sled - it's all downhill from here!!! An embedded d

Tyler Neely 6.6k Jan 8, 2023
Replibyte - a powerful tool to seed your databases

Seed Your Development Database With Real Data ⚡️ Replibyte is a powerful tool to seed your databases with real data and other cool features ?? Feature

Qovery 3.4k Jan 9, 2023
A minecraft-like multi version client implemented in Rust.

Leafish Multi-version Minecraft-compatible client written in Rust, forked from Stevenarella. Chat Chat takes place on Matrix and Discord. The channels

null 617 Dec 27, 2022
Cassandra DB native client written in Rust language. Find 1.x versions on https://github.com/AlexPikalov/cdrs/tree/v.1.x Looking for an async version? - Check WIP https://github.com/AlexPikalov/cdrs-async

CDRS CDRS is looking for maintainers CDRS is Apache Cassandra driver written in pure Rust. ?? Looking for an async version? async-std https://github.c

Alex Pikalov 338 Jan 1, 2023
This is a small demo of how to transform a simple single-server RocksDB service written in Rust into a distributed version using OmniPaxos.

OmniPaxos Demo This is a small demo of how to transform a simple single-server RocksDB service into a distributed version using OmniPaxos. Related res

Harald Ng 6 Jun 28, 2023
A fast and simple in-memory database with a key-value data model written in Rust

Segment Segment is a simple & fast in-memory database with a key-value data model written in Rust. Features Dynamic keyspaces Keyspace level control o

Segment 61 Jan 5, 2023
Lightweight async Redis client with connection pooling written in pure Rust and 100% memory safe

redi-rs (or redirs) redi-rs is a Lightweight Redis client with connection pooling written in Rust and 100% memory safe redi-rs is a Redis client writt

Oğuz Türkay 4 May 20, 2023
ReefDB is a minimalistic, in-memory and on-disk database management system written in Rust, implementing basic SQL query capabilities and full-text search.

ReefDB ReefDB is a minimalistic, in-memory and on-disk database management system written in Rust, implementing basic SQL query capabilities and full-

Sacha Arbonel 75 Jun 12, 2023
PickleDB-rs is a lightweight and simple key-value store. It is a Rust version for Python's PickleDB

PickleDB PickleDB is a lightweight and simple key-value store written in Rust, heavily inspired by Python's PickleDB PickleDB is fun and easy to use u

null 155 Jan 5, 2023
Rust version of the Haskell ERD tool. Translates a plain text description of a relational database schema to dot files representing an entity relation diagram.

erd-rs Rust CLI tool for creating entity-relationship diagrams from plain text markup. Based on erd (uses the same input format and output rendering).

Dave Challis 32 Jul 25, 2022
Distributed, version controlled, SQL database with cryptographically verifiable storage, queries and results. Think git for postgres.

SDB - SignatureDB Distributed, version controlled, SQL database with cryptographically verifiable storage, queries and results. Think git for postgres

Fremantle Industries 5 Apr 26, 2022
A simplified version of a Redis server supporting SET/GET commands

This is a starting point for Rust solutions to the "Build Your Own Redis" Challenge. In this challenge, you'll build a toy Redis clone that's capable

Patrick Neilson 2 Nov 15, 2022
A multi-instance, Discord/Spacebar API-compatible chat client

Polyphony A multi-instance, Discord/Spacebar API-compatible chat client, written in Rust and Svelte (TypeScript) using Tauri. Explore the docs » Repor

null 5 Mar 30, 2023