Attribute-Level Caching in Heterogeneous In-Memory DBMS

Xiangpeng Hao

Last update: Jun 20, 2022


Alchemy

Attribute-Level Caching in Heterogeneous In-Memory DBMS

Alchemy is a DRAM-PM hybrid database engine built from scratch to achieve high performance at a low cost. The key idea is to leverage PM’s byte-addressability to cache only hot attributes in DRAM, which reduces the demand on PM bandwidth. Alchemy also uses Rust's coroutines to overlap data fetching and computation, effectively hiding CPU stalls.
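To make the idea of attribute-level caching concrete, here is a minimal sketch. It is not Alchemy's implementation: `PmTable`, `AttributeCache`, and the admit-on-first-miss policy are hypothetical names and simplifications chosen for illustration, and PM is simulated with an ordinary `Vec`. The point it shows is that the cache is keyed by (row, attribute), so a hot attribute can live in DRAM without the rest of its tuple.

```rust
use std::collections::HashMap;

/// Stand-in for a table stored in persistent memory (hypothetical type).
/// A real engine would read attributes from a DAX mapping instead of a Vec.
struct PmTable {
    rows: Vec<Vec<u64>>,
    reads: usize, // number of attribute reads served from "PM"
}

/// DRAM-side cache keyed by (row id, attribute index), so a single hot
/// attribute can be cached without copying the whole tuple into DRAM.
struct AttributeCache {
    hot: HashMap<(usize, usize), u64>,
    hits: usize, // number of reads served from DRAM
}

impl AttributeCache {
    fn new() -> Self {
        AttributeCache { hot: HashMap::new(), hits: 0 }
    }

    /// Read one attribute, going to "PM" only on a cache miss.
    /// Assumption: every missed attribute is admitted immediately; a real
    /// policy would track access frequency and evict cold attributes.
    fn read(&mut self, table: &mut PmTable, row: usize, attr: usize) -> u64 {
        if let Some(&value) = self.hot.get(&(row, attr)) {
            self.hits += 1;
            return value;
        }
        table.reads += 1;
        let value = table.rows[row][attr];
        self.hot.insert((row, attr), value);
        value
    }
}

fn main() {
    let mut table = PmTable {
        rows: vec![vec![10, 20, 30, 40], vec![50, 60, 70, 80]],
        reads: 0,
    };
    let mut cache = AttributeCache::new();

    // Attribute 1 of row 0 is hot: only the first access touches "PM".
    for _ in 0..3 {
        cache.read(&mut table, 0, 1);
    }
    // A single access to a different attribute of another row.
    cache.read(&mut table, 1, 3);

    println!(
        "PM reads: {}, DRAM hits: {}, cached attributes: {}",
        table.reads,
        cache.hits,
        cache.hot.len()
    );
}
```

In this sketch only two of the eight stored attributes end up in DRAM, which is the saving a tuple-level cache cannot offer.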

What is PM

PM (persistent memory) is a new type of persistent storage technology that is byte-addressable and offers high performance. There are multiple alternatives, but Intel Optane DCPMM, built on 3D XPoint, is currently the only commercially available option and is therefore our focus.

Compared to DRAM, Optane DCPMM scales to much higher capacity and has a lower price. For instance, a single server can be equipped with up to 12TB of Optane DCPMMs, making DCPMM appealing for building persistent and cost-effective memory-optimized OLTP engines. However, PM has lower bandwidth and higher access latency than DRAM.

Run Alchemy

Alchemy is written in Rust and requires nightly Rust to build. You can install Rust from https://rustup.rs

Alchemy currently supports only Linux servers with Intel Optane Persistent Memory Modules, which must be configured in AppDirect mode.

[Optional] For more advanced internal metrics, the kernel's msr module should be enabled and a pcm-sensor-server should be running in the background.

Build

cargo build [--release]

Test

Run unit tests

cargo test

To test with address sanitizers:

env ASAN_OPTIONS="detect_odr_violation=0" RUSTFLAGS="-Z sanitizer=address" cargo test -Zbuild-std --target x86_64-unknown-linux-gnu --features asans -- --skip test_alloc_oom

Benchmark

Alchemy implements YCSB-like microbenchmarks and the TPC-C benchmark.

Run microbenchmark

POOL_DIR=/mnt/pmem LOG_DIR=/mnt/ssd cargo bench --bench table

POOL_DIR specifies the persistent memory pool directory; it must point to an ext4 filesystem mounted with DAX mode enabled.

LOG_DIR specifies the logging directory; the underlying disk must have sufficient write bandwidth (>1GB/s), otherwise logging will be the bottleneck.
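As an illustration of how a benchmark binary might pick up these two variables, the sketch below reads `POOL_DIR` and `LOG_DIR` and fails early with a message that names the offending variable. `BenchDirs`, `resolve_dir`, and `dirs_from_env` are hypothetical helpers, not part of Alchemy. The check only confirms that each path is an existing directory; it does not verify DAX mode or disk bandwidth.

```rust
use std::env;
use std::path::PathBuf;

/// Directories the benchmarks are pointed at through environment variables.
/// (Hypothetical type for this sketch.)
#[derive(Debug, PartialEq)]
struct BenchDirs {
    pool_dir: PathBuf,
    log_dir: PathBuf,
}

/// Validate one directory setting. `name` is only used in error messages.
fn resolve_dir(name: &str, value: Option<String>) -> Result<PathBuf, String> {
    let raw = value.ok_or_else(|| format!("{name} is not set"))?;
    if raw.is_empty() {
        return Err(format!("{name} is empty"));
    }
    let path = PathBuf::from(raw);
    if !path.is_dir() {
        return Err(format!("{name}={} is not a directory", path.display()));
    }
    Ok(path)
}

/// Read both variables from the process environment.
fn dirs_from_env() -> Result<BenchDirs, String> {
    Ok(BenchDirs {
        pool_dir: resolve_dir("POOL_DIR", env::var("POOL_DIR").ok())?,
        log_dir: resolve_dir("LOG_DIR", env::var("LOG_DIR").ok())?,
    })
}

fn main() {
    match dirs_from_env() {
        Ok(dirs) => println!(
            "pool: {}, log: {}",
            dirs.pool_dir.display(),
            dirs.log_dir.display()
        ),
        Err(err) => eprintln!("configuration error: {err}"),
    }
}
```

Failing before the benchmark starts avoids a long run that later aborts because the pool or log directory is missing.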

Run TPC-C benchmark

POOL_DIR=/mnt/pmem LOG_DIR=/mnt/ssd cargo bench --bench tpcc

Alchemy uses the Shumai benchmark framework, which provides several convenient features: it captures a flamegraph of the benchmark function, records the disk bandwidth of the benchmark process, and collects PCM counter statistics.

All benchmark results (along with the statistics) are saved to a JSON file.

Alchemy benchmark results

While we have tried our best, we have not confirmed that we tested everything correctly. We are happy to work with the community to validate the results and improve the implementation.

We conduct experiments on a server equipped with a 16-core (32 hyperthreads) Intel Xeon Gold 5218 CPU clocked at 2.3GHz with 22 MB of on-chip SRAM cache. The machine has 32GB of DRAM (two channels, 16GB each) and 512GB of Intel Optane DCPMM (four channels, 128GB each); both DRAM and DCPMM DIMMs are distributed across memory channels to max out the available bandwidth.

Our experiments run on Ubuntu Linux 18.04 with Linux kernel 5.0.21; the source code is implemented and compiled using Rust 1.58. The database accesses PM using normal load and store instructions enabled by the mmap interface under the fsdax mode.

For each experiment, we run the program for a warmup period so that the performance numbers stabilize. We then measure performance over four runs of 15 seconds each and report the average of the four runs. Every experiment has logging enabled, which sends data to an NVMe SSD with a write bandwidth of ∼2.7GB/s; the maximum write traffic across all experiments is ∼700MB/s, indicating that logging to SSD is not a bottleneck.
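The measurement procedure described here (a discarded warmup, then several fixed-length runs whose throughput is averaged) can be sketched as follows. This is a simplified, single-threaded illustration and not the Shumai framework; `measure`, `benchmark`, and `log_utilization` are hypothetical names, and `main` uses millisecond durations so the example finishes quickly, where the experiments use 15-second runs.

```rust
use std::time::{Duration, Instant};

/// Run `op` repeatedly for roughly `duration` and return operations per second.
fn measure<F: FnMut()>(mut op: F, duration: Duration) -> f64 {
    let start = Instant::now();
    let mut ops: u64 = 0;
    while start.elapsed() < duration {
        op();
        ops += 1;
    }
    ops as f64 / start.elapsed().as_secs_f64()
}

/// Warm up once (result discarded), then average the throughput of `runs` timed runs.
fn benchmark<F: FnMut()>(mut op: F, warmup: Duration, run: Duration, runs: u32) -> f64 {
    measure(&mut op, warmup);
    let total: f64 = (0..runs).map(|_| measure(&mut op, run)).sum();
    total / runs as f64
}

/// Fraction of the log device's write bandwidth consumed at peak.
fn log_utilization(peak_mb_per_s: f64, device_mb_per_s: f64) -> f64 {
    peak_mb_per_s / device_mb_per_s
}

fn main() {
    // Toy workload: a wrapping counter increment stands in for a transaction.
    let mut counter: u64 = 0;
    let avg = benchmark(
        || counter = std::hint::black_box(counter.wrapping_add(1)),
        Duration::from_millis(10), // warmup (much longer in a real experiment)
        Duration::from_millis(20), // one run (15 seconds in the experiments)
        4,                         // four runs, as in the experiments
    );
    println!("average throughput: {avg:.0} ops/s");

    // Figures quoted above: ~700MB/s peak log traffic on a ~2.7GB/s SSD.
    let used = log_utilization(700.0, 2700.0);
    println!("peak log bandwidth utilization: {:.0}%", used * 100.0);
}
```

With the quoted figures, peak logging uses roughly a quarter of the SSD's write bandwidth, which is why logging is not considered the bottleneck.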

We open source all of our raw benchmark data in the experiments/ folder.

Related projects

Congee: A Rust implementation of a concurrent ART (adaptive radix tree). Alchemy uses Congee to implement range queries.

Shumai: A multi-threaded benchmarking framework that produces accurate and reproducible results. All of Alchemy's benchmarks are powered by Shumai.


