Sled - the champagne of beta embedded databases

Tyler Neely

Last update: Jan 8, 2023


Overview


sled - it's all downhill from here!!!

An embedded database.

let tree = sled::open("/tmp/welcome-to-sled")?;

// insert and get, similar to std's BTreeMap
let old_value = tree.insert("key", "value")?;

assert_eq!(
  tree.get(&"key")?,
  Some(sled::IVec::from("value")),
);

// range queries
for kv_result in tree.range("key_1".."key_9") {}

// deletion
let old_value = tree.remove(&"key")?;

// atomic compare and swap
tree.compare_and_swap(
  "key",
  Some("current_value"),
  Some("new_value"),
)?;

// block until all operations are stable on disk
// (flush_async also available to get a Future)
tree.flush()?;

If you would like to work with structured data without paying expensive deserialization costs, check out the structured example!

features

API similar to a threadsafe BTreeMap<[u8], [u8]>
serializable (ACID) transactions for atomically reading and writing to multiple keys in multiple keyspaces.
fully atomic single-key operations, including compare and swap
zero-copy reads
write batches
subscribe to changes on key prefixes
multiple keyspaces
merge operators
forward and reverse iterators over ranges of items
a crash-safe monotonic ID generator capable of generating 75-125 million unique IDs per second
zstd compression (use the compression build feature, disabled by default)
cpu-scalable lock-free implementation
flash-optimized log-structured storage
uses modern b-tree techniques such as prefix encoding and suffix truncation for reducing the storage costs of long keys with shared prefixes. If keys are the same length and sequential then the system can avoid storing 99%+ of the key data in most cases, essentially acting like a learned index
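
To make the prefix encoding mentioned in the last feature above more concrete, here is a minimal sketch using only the standard library. It illustrates the general technique and is not sled's actual node layout; the function names `common_prefix_len`, `prefix_encode`, and `prefix_decode` are hypothetical.

```rust
/// Length of the longest prefix shared by every key in `keys`.
fn common_prefix_len(keys: &[&[u8]]) -> usize {
    let first = match keys.first() {
        Some(k) => *k,
        None => return 0,
    };
    keys.iter()
        .map(|k| {
            first
                .iter()
                .zip(k.iter())
                .take_while(|(a, b)| a == b)
                .count()
        })
        .min()
        .unwrap_or(0)
}

/// Store the shared prefix once and keep only each key's remaining suffix.
fn prefix_encode(keys: &[&[u8]]) -> (Vec<u8>, Vec<Vec<u8>>) {
    let n = common_prefix_len(keys);
    let prefix = keys.first().map(|k| k[..n].to_vec()).unwrap_or_default();
    let suffixes = keys.iter().map(|k| k[n..].to_vec()).collect();
    (prefix, suffixes)
}

/// Rebuild a full key from the shared prefix and one stored suffix.
fn prefix_decode(prefix: &[u8], suffix: &[u8]) -> Vec<u8> {
    let mut key = Vec::with_capacity(prefix.len() + suffix.len());
    key.extend_from_slice(prefix);
    key.extend_from_slice(suffix);
    key
}
```

For keys such as `user:0001`, `user:0002`, and `user:0103`, the shared prefix `user:0` would be stored once and only the three-byte suffixes would be kept per key.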

expectations, gotchas, advice

One of the first things that may seem unusual is the IVec type. This is an inlinable, Arc-backed byte slice that makes some things more efficient.
Durability: sled automatically fsyncs every 500ms by default. The interval can be changed with the flush_every_ms configuration option, or you may call flush / flush_async manually after operations.
Transactions are optimistic - do not interact with external state or perform IO from within a transaction closure unless it is idempotent.
Internal tree node optimizations: sled performs prefix encoding on long keys with similar prefixes that are grouped together in a range, as well as suffix truncation to further reduce the indexing costs of long keys. Nodes will skip potentially expensive length and offset pointers if keys or values are all the same length (tracked separately, don't worry about making keys the same length as values), so using fixed-length keys or values may improve space usage slightly. This also makes structured access easier.
sled does not support multiple open instances for the time being. Please keep sled open for the duration of your process's lifespan. It's totally safe and often quite convenient to use a global lazy_static sled instance, modulo the normal global variable trade-offs. Every operation is threadsafe, and most are implemented under the hood with lock-free algorithms that avoid blocking in hot paths.
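
The advice above about optimistic transactions follows from how optimistic concurrency generally works: the closure runs against a snapshot, and it is re-run if the data changed before the commit. The sketch below models that with a single atomic integer from the standard library. It is an illustration under that assumption, not sled's transaction machinery, and `run_optimistic` is a hypothetical name.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Runs `f` on the current value and tries to commit the result with a
/// compare-and-swap. If another writer changed the value in the meantime,
/// `f` is run again on the fresh value.
/// Returns the committed value and the number of attempts it took.
fn run_optimistic<F>(cell: &AtomicU64, mut f: F) -> (u64, u32)
where
    F: FnMut(u64) -> u64,
{
    let mut attempts = 0;
    loop {
        attempts += 1;
        let seen = cell.load(Ordering::Acquire);
        let proposed = f(seen);
        // Commit only if nobody changed the value since we read it.
        if cell
            .compare_exchange(seen, proposed, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
        {
            return (proposed, attempts);
        }
    }
}
```

Because `f` may run more than once, any side effect inside it (sending a message, bumping an external counter) may also happen more than once, which is why such effects should be idempotent or moved outside the closure.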

performance

LSM tree-like write performance with traditional B+ tree-like read performance
over a billion operations in under a minute at 95% read 5% writes on 16 cores on a small dataset
measure your own workloads rather than relying on some marketing for contrived workloads

a note on lexicographic ordering and endianness

If you want to store numerical keys in a way that will play nicely with sled's iterators and ordered operations, please remember to store your numerical items in big-endian form. Little endian (the default of many things) will often appear to be doing the right thing until you start working with more than 256 items (more than 1 byte), causing lexicographic ordering of the serialized bytes to diverge from the lexicographic ordering of their deserialized numerical form.

Rust integral types have built-in to_be_bytes and from_be_bytes methods.
bincode can be configured to store integral types in big-endian form.
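
The ordering problem can be reproduced with the standard library alone. In this sketch a `BTreeMap` keyed by byte arrays stands in for an ordered byte-keyed tree (an assumption made so the example has no dependencies); `be_key`, `le_key`, and `iteration_order` are hypothetical helper names.

```rust
use std::collections::BTreeMap;

/// Big-endian encoding: byte-wise order matches numeric order.
fn be_key(n: u64) -> [u8; 8] {
    n.to_be_bytes()
}

/// Little-endian encoding, shown only to demonstrate the ordering problem.
fn le_key(n: u64) -> [u8; 8] {
    n.to_le_bytes()
}

/// Insert `nums` under keys produced by `encode`, then return them in the
/// order a byte-wise (lexicographic) key scan would visit them.
fn iteration_order(nums: &[u64], encode: fn(u64) -> [u8; 8]) -> Vec<u64> {
    let mut map: BTreeMap<[u8; 8], u64> = BTreeMap::new();
    for &n in nums {
        map.insert(encode(n), n);
    }
    map.values().copied().collect()
}
```

With the inputs 1, 255, 256, 257, and 2, the big-endian keys iterate in numeric order, while the little-endian keys come back as 256, 1, 257, 2, 255.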

interaction with async

If your dataset resides entirely in cache (achievable at startup by setting the cache to a large enough value and performing a full iteration) then all reads and writes are non-blocking and async-friendly, without needing to use Futures or an async runtime.

To suspend an async task until its writes are durable, use the flush_async method. It returns a Future that your async tasks can await if they require high durability guarantees and you are willing to pay the latency cost of fsync. Note that sled automatically tries to sync all data to disk several times per second in the background without blocking user threads.

We support async subscription to events that happen on key prefixes, because the Subscriber struct implements Future<Output=Option<Event>>:

let sled = sled::open("my_db").unwrap();

let mut sub = sled.watch_prefix("");

sled.insert(b"a", b"a").unwrap();

extreme::run(async move {
    while let Some(event) = (&mut sub).await {
        println!("got event {:?}", event);
    }
});

minimum supported Rust version (MSRV)

We support Rust 1.48.0 and up.

architecture

lock-free tree on a lock-free pagecache on a lock-free log. the pagecache scatters partial page fragments across the log, rather than rewriting entire pages at a time as B+ trees for spinning disks historically have. on page reads, we concurrently scatter-gather reads across the log to materialize the page from its fragments. check out the architectural outlook for a more detailed overview of where we're at and where we see things going!
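
The fragment idea described above can be pictured with a small single-threaded model. This is only a sketch of the concept: the types `Fragment` and `FragmentLog` are hypothetical, the "log" is an in-memory vector, and the gather step here is sequential rather than concurrent.

```rust
use std::collections::{BTreeMap, HashMap};

type PageId = u64;
type LogOffset = usize;

/// One partial update to a page, appended to the log instead of rewriting the page.
#[derive(Clone, Debug)]
enum Fragment {
    Set(Vec<u8>, Vec<u8>),
    Del(Vec<u8>),
}

#[derive(Default)]
struct FragmentLog {
    /// Append-only log; fragments belonging to different pages are interleaved.
    log: Vec<(PageId, Fragment)>,
    /// Page table: for each page, the log offsets of its fragments, oldest first.
    page_table: HashMap<PageId, Vec<LogOffset>>,
}

impl FragmentLog {
    /// Append a fragment at the tail of the log and record where it landed.
    fn append(&mut self, page: PageId, frag: Fragment) -> LogOffset {
        let offset = self.log.len();
        self.log.push((page, frag));
        self.page_table.entry(page).or_default().push(offset);
        offset
    }

    /// Gather a page's fragments from their scattered offsets and replay them
    /// in order to rebuild the page's current contents.
    fn materialize(&self, page: PageId) -> BTreeMap<Vec<u8>, Vec<u8>> {
        let mut state = BTreeMap::new();
        for &offset in self.page_table.get(&page).into_iter().flatten() {
            match &self.log[offset].1 {
                Fragment::Set(k, v) => {
                    state.insert(k.clone(), v.clone());
                }
                Fragment::Del(k) => {
                    state.remove(k);
                }
            }
        }
        state
    }
}
```

Each write costs one small append, and the price is paid at read time, when a page has to be reassembled from however many fragments it has accumulated.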

philosophy

don't make the user think. the interface should be obvious.
don't surprise users with performance traps.
don't wake up operators. bring reliability techniques from academia into real-world practice.
don't use so much electricity. our data structures should play to modern hardware's strengths.

known issues, warnings

if reliability is your primary constraint, use SQLite. sled is beta.
if storage price performance is your primary constraint, use RocksDB. sled uses too much space sometimes.
if you have a multi-process workload that rarely writes, use LMDB. sled is architected for use with long-running, highly-concurrent workloads such as stateful services or higher-level databases.
quite young, should be considered unstable for the time being.
the on-disk format is going to change in ways that require manual migrations before the 1.0.0 release!

priorities

rework the transaction API to eliminate surprises and limitations
reduce space and memory usage
the 1.0.0 release date is imminent! just putting the final touches on, while performing intensive testing
combine merge operators with subscribers in a way that plays nicely with transactions
typed trees for low-friction serialization
replication support for both strongly and eventually consistent systems
continue to improve testing and make certain bug classes impossible through construction
continue to optimize the hell out of everything
continue to improve documentation and examples
continue to reduce compilation latency

fund feature development

Like what we're doing? Help us out via GitHub Sponsors!

Comments

32-bit ARM compatibility
[x] add test to make sure that the system never wraps by writing to an offset > 2^32

[x] add additional concurrency verification tooling, as ARM has a weaker memory model than x86_64 (which makes it a great test platform in general)

feature
opened by spacejam 21
Transactions are deadlocking at insert_inner
Transactions are deadlocking inside of commit and never returning in my application. LLDB shows the thread as stuck on this line:

while self.tree.insert_inner(k, v_opt.clone(), &mut guard)?.is_err() { }

Source things: https://github.com/spacejam/sled/blob/master/src/transaction.rs#L364

expected result: I expect TransactionTree::commit to always return a result.

actual result: TransactionTree::commit intermittently blocks until I kill the application.

sled version: 0.34.2 + some small error handling experiments: https://github.com/D1plo1d/sled/tree/53d0e30a749f27fb808101730f1794a5f85b6216

rustc version: rustc 1.44.1 (c7087fe00 2020-06-17)

operating system: Ubuntu 20.04

minimal code sample that helps to reproduce the issue: Intermittent issue. TBH I could use a hand figuring out how to reliably reproduce my issue.

logs, panic messages, stack traces: Not sure what would be helpful here (please let me know!). I added a couple trace! macros to commit and observed that the commit started but never finished.
opened by D1plo1d 19

Cannot shutdown sled from Windows Service

sled version 0.28.0
rustc version 1.39.0
operating system Windows 10 Enterprise 1809 Build 17763.864
logs

When I try to join my main thread after receiving the stop signal, I have 2 sled threads that are not terminating.

The Db and Trees are stored using RwLock:

pub struct Table {
    pub arc_tree_lock: RwLock<Option<Arc<Tree>>>,
    name: String,
    key: String,
    key2_opt: Option<String>,
}

pub struct MyDb {
    pub my_db: Arc<RwLock<Option<Db>>>,
    db_name: String,
}

When I get the Windows service stop, for each Tree I call flush() and then I set the option to None to de-reference the Tree after taking the lock.

When done, I set my_db to None to de-reference the DB.

However, I have 2 threads still waiting that prevent the final join from completing, it was 3 including the log flusher, but after the code to flush then set option to None, at least the log flusher seems to have cleaned up.

The 2 are sled-io with the following stack trace (after break all):

 	[External Code]	
 	[Inline Frame] my.exe!!std::sys::windows::c::SleepConditionVariableSRW() Line 70	Unknown
 	[Inline Frame] my.exe!!std::sys::windows::condvar::Condvar::wait() Line 22	Unknown
 	[Inline Frame] my.exe!!std::sys_common::condvar::Condvar::wait() Line 41	Unknown
 	[Inline Frame] my.exe!!std::sync::condvar::Condvar::wait() Line 204	Unknown
 	my.exe!std::thread::park() Line 911	Unknown
 	my.exe!!crossbeam_channel::context::Context::wait_until(core::option::Option<std::time::Instant> self) Line 175	Unknown
 	my.exe!!crossbeam_channel::flavors::zero::{{impl}}::recv::{{closure}}<alloc::boxed::Box<FnOnce<()>>>(crossbeam_channel::flavors::zero::{{impl}}::recv::closure-1 cx, crossbeam_channel::context::Context *) Line 304	Unknown
 	my.exe!!crossbeam_channel::context::{{impl}}::with::{{closure}}<closure-1,core::result::Result<alloc::boxed::Box<FnOnce<()>>, crossbeam_channel::err::RecvTimeoutError>>(crossbeam_channel::context::{{impl}}::with::closure-0 * cx, crossbeam_channel::context::Context *) Line 51	Unknown
 	my.exe!!crossbeam_channel::context::{{impl}}::with::{{closure}}<closure-1,core::result::Result<alloc::boxed::Box<FnOnce<()>>, crossbeam_channel::err::RecvTimeoutError>>(crossbeam_channel::context::{{impl}}::with::closure-1 cell, core::cell::Cell<core::option::Option<crossbeam_channel::context::Context>> *) Line 58	Unknown
 	my.exe!!std::thread::local::LocalKey<core::cell::Cell<core::option::Option<crossbeam_channel::context::Context>>>::try_with<core::cell::Cell<core::option::Option<crossbeam_channel::context::Context>>,closure-1,core::result::Result<alloc::boxed::Box<FnOnce<()>>, crossbeam_channel::err::RecvTimeoutError>>(crossbeam_channel::context::{{impl}}::with::closure-1 self) Line 262	Unknown
>	my.exe!!crossbeam_channel::context::Context::with<closure-1,core::result::Result<alloc::boxed::Box<FnOnce<()>>, crossbeam_channel::err::RecvTimeoutError>>(crossbeam_channel::flavors::zero::{{impl}}::recv::closure-1 f) Line 53	Unknown
 	my.exe!!crossbeam_channel::flavors::zero::Channel<alloc::boxed::Box<FnOnce<()>>>::recv<alloc::boxed::Box<FnOnce<()>>>(core::option::Option<std::time::Instant> self) Line 293	Unknown
 	my.exe!!crossbeam_channel::channel::Receiver<alloc::boxed::Box<FnOnce<()>>>::recv<alloc::boxed::Box<FnOnce<()>>>() Line 718	Unknown
 	my.exe!!crossbeam_channel::channel::{{impl}}::next<alloc::boxed::Box<FnOnce<()>>>(crossbeam_channel::channel::Iter<alloc::boxed::Box<FnOnce<()>>> * self) Line 1105	Unknown
 	my.exe!!pagecache::threadpool::spawn::init_pool::{{closure}}(pagecache::threadpool::spawn::init_pool::closure-0) Line 54	Unknown
 	my.exe!!std::sys_common::backtrace::__rust_begin_short_backtrace<closure-0,()>(pagecache::threadpool::spawn::init_pool::closure-0 f) Line 127	Unknown
 	my.exe!!std::thread::{{impl}}::spawn_unchecked::{{closure}}::{{closure}}<closure-0,()>(std::thread::{{impl}}::spawn_unchecked::{{closure}}::closure-0) Line 471	Unknown
 	my.exe!!std::panic::{{impl}}::call_once<(),closure-0>(std::panic::AssertUnwindSafe<closure-0> self) Line 316	Unknown
 	my.exe!!std::panicking::try::do_call<std::panic::AssertUnwindSafe<closure-0>,()>(unsigned char * data) Line 292	Unknown
 	my.exe!!panic_unwind::__rust_maybe_catch_panic() Line 80	Unknown
 	my.exe!!std::panicking::try<(),std::panic::AssertUnwindSafe<closure-0>>(std::panic::AssertUnwindSafe<closure-0> f) Line 271	Unknown
 	my.exe!!std::panic::catch_unwind<std::panic::AssertUnwindSafe<closure-0>,()>(std::panic::AssertUnwindSafe<closure-0> f) Line 394	Unknown
 	my.exe!!std::thread::{{impl}}::spawn_unchecked::{{closure}}<closure-0,()>(std::thread::{{impl}}::spawn_unchecked::closure-0) Line 469	Unknown
 	my.exe!!core::ops::function::FnOnce::call_once<closure-0,()>(std::thread::{{impl}}::spawn_unchecked::closure-0 *) Line 227	Unknown
 	my.exe!!alloc::boxed::{{impl}}::call_once<(),FnOnce<()>>() Line 922	Unknown
 	[Inline Frame] my.exe!!alloc::boxed::{{impl}}::call_once() Line 922	Unknown
 	[Inline Frame] my.exe!!std::sys_common::thread::start_thread() Line 13	Unknown
 	my.exe!e!std::sys::windows::thread::{{impl}}::new::thread_start() Line 47	Unknown
 	[External Code]

opened by crusty-dave 17

Compilation for 32-bit architectures on nightly
Discussion points:

[x] Is a global nightly feature good in this case? Should I give it a better name?

[x] I didn't run the tests on an ARM 32-bit; would you want me to test that?

[x] pagecache::Lsn is now of type i64 when the nightly feature is enabled.

Let me know if you want me to change stuff! :-)

Issue #145 for reference.
opened by rubdos 17
Miri support
This PR will add enough conditional compilation to support running tests with miri. So far, I have disabled use of the process's PID, turned off the body of try_lock(), and ignored the quickcheck tests, since they would take far too long to run. Miri is slow as tar, so using it would only be useful for spot-checking for unsoundness, not a CI job or anything.

Miri requires additional shims for mkdir and rmdir before this will work, I have changes in progress for that as well.

The first issue I've discovered thus far is that crossbeam-epoch has an unsound use of std::mem::uninitialized, creating a std::mem::ManuallyDrop<T>. There is a PR at crossbeam-rs/crossbeam#458 that will fix this soon.

Relevant commands:

Run a test: cargo +nightly miri test --features=testing -- -Zmiri-disable-isolation -- bug_22

Install miri from source: cargo +nightly install --path . --force --locked --offline

Edit - updated commands:

Install miri from source (can use my merged-for-sled branch now): ./rustup-toolchain one time, then ./miri install

Run a test: cargo +miri miri test --features=testing -- -Zmiri-disable-isolation -Zmiri-ignore-leaks -- bug_22

Expanding macros, as seen by miri: RUSTFLAGS="--cfg miri" cargo +nightly expand --lib --tests serialization::qc

More commands:

Default test suite: cargo +miri miri test --features=testing -- -Zmiri-disable-isolation -Zmiri-ignore-leaks -- (set ulimit -v first or this may eat all your swap space)

Default test suite, with racing stripes: cargo +miri miri test --features=testing -- -Zmiri-disable-isolation -Zmiri-ignore-leaks -Zmiri-disable-stacked-borrows -Zmiri-disable-validation -- (takes ~1hr on my computer)

Try running a longer-running test: cargo +miri miri test --features=testing -- -Zmiri-disable-isolation -Zmiri-ignore-leaks -- --ignored tree_bug_01
opened by divergentdave 16
Tree::transaction API prevents error forwarding
The function passed to Tree::transaction must currently return a Result<_, TransactionError>.

This prevents users from forwarding business logic errors that occur during a transaction.

I can see two solutions:

make the error generic, so it is under the control of the user. diesel does it this way

Add a TransactionError::Custom(Box<dyn std::error::Error>) variant that can be used.

feature refactoring
opened by theduke 16
Use SHA3-256 for checksum

As of release 0.31.0, sled uses CRC32 for its checksum value. This is quite weak. Would you consider upgrading to the SHA3 family of functions? I'd suggest either SHA3-256 (https://docs.rs/sha3/0.8.2/sha3/struct.Sha3_256.html) or SHA3-512 (https://docs.rs/sha3/0.8.2/sha3/struct.Sha3_512.html).

opened by ckaran 15
Rewrite the Data split method

I tested and assumed that the ptrs are already sorted and that prefix-encoded keys sort the same way as non-encoded ones (is that called "ordering transitivity"?).

I will rebase on master once #592 is merged, if it is 😄

So in this rewrite I no longer need to construct a decoded, sorted Vec. I only need to decode the middle key (the split) and create a new Vec, half the size of ptrs, re-encoded using the previously extracted middle key.

I also rewrote the prefix_decode and prefix_encode functions, removing the unsafe code that doesn't seem to be needed, and introduced a new prefix_reencode function that decodes with the old prefix and encodes with the new one.

This update reduced the share of time spent in the Data::split method from 46.2% to 25.47% and cut my program's run time from 25 minutes (including the #592 patch) to about 16 minutes.

flamegraphs-before-after-data-split-patch.zip

opened by Kerollmops 14
Sled database is not really closed after dropping `sled::Db`.
Example code snippet (non-standalone):

let db: sled::Db = sled::open(&opts.database_path)?;
db.insert(b"qqq2", b"www3")?;
db.flush()?;
drop(db);
let db: sled::Db = sled::open(&opts.database_path)?;
drop(db);

expected result: always works. No extra threads or file descriptors or other remnants remain after dropping all sled::Db instances.

actual result: sometimes fails to close the file descriptor on drop, then fails flock on the next open.

sled version: 0.34.6

rustc version: rustc 1.49.0-nightly (8dae8cdcc 2020-10-12)

operating system: Linux
opened by vi 13
Question: Is Sled multi-process safe?

I have access to a cluster system, but the nodes on the cluster can only synchronize through the filesystem. flock does work on the system, so I've been using it to ensure that any shared files aren't corrupted, and I could continue to do the same when using sled, but if sled is already multi-process safe, then there's no need for all this code (and it would likely slow sled down a lot). So, is sled multi-process safe?

opened by ckaran 12
MIPS support
Hi there,

I've recently tried to compile a project that depends on sled to OpenWRT mips device using cross.

cross build --target mips-unknown-linux-musl

This led me to type errors similar to the ones found on async-std for 32bit devices.

I've tried to locally change the types from AtomicU64 to AtomicUsize as they did there, and there were many more errors to fix...

I'm curious whether there is interest in having the project work on MIPS devices, and whether I could give it a try to get it compiling. Cheers.
opened by bltavares 12
Lazy batch operations

Use Case:

The Db::apply_batch requires all arguments to be materialized into a batch before writing them to sled. This potentially means I need to duplicate and serialize all my data before I can begin inserting it.

Proposed Change:

It would be nice if I could pass a lazy iterator to sled that contains the operations. Personally, I am only interested in insertions.

Who Benefits From The Change(s)?

Anyone who needs to perform big atomic batch updates. For example, persisting all entries in a local hashmap.

Alternative Approaches

Calling Db::insert per key-value pair, though this is not atomic.
feature

opened by segeljakt 0
Derive `Serialize` and `Deserialize` for `Error`?

Use Case:

We use sled::Error inside our own error enumeration that must implement ResponseError from actix_web. Inside impl ResponseError, we call bincode::serialize on the error, which requires it to be Serialize. Since sled::Error doesn't implement Serialize, we have to store it simply as a string.

Proposed Change:

Introduce a new feature serde and derive Serialize and Deserialize for sled::Error.

Who Benefits From The Change(s)?

sled users who want to use sled::Error in a (de)serializable context, e.g., in a microservice architecture.

Alternative Approaches
feature

opened by Hirrolot 2
Documenting the story with `async` handling

Use Case:

Using sled with async is not well documented.

Proposed Change:

Sled should have a readme section or (better) an example of handling it in an async case when the database is not in-memory.

Who Benefits From The Change(s)?

People using sled as storage with webservices, which are mostly asynchronous at this moment.

Alternative Approaches

Add some special in-code handling of async, but that's unnecessary in my opinion.
feature

opened by Czocher 0
Improve `Config` docs: Document defaults

Hello, I started using sled in ab-av1 & it seems to work well for me thank you!

Having the default config values in documentation would have sped up my onboarding, saving me from searching in the source for the values. So this PR adds that info to the docs.

opened by alexheretic 0

Auto flusher available for all platforms

Here v0.34.7/src/db.rs#L75-L81 auto flush is restricted.

        #[cfg(all(
            not(miri),
            any(
                windows,
                target_os = "linux",
                target_os = "macos",
                target_os = "dragonfly",
                target_os = "freebsd",
                target_os = "openbsd",
                target_os = "netbsd",
            )
        ))]

@spacejam What is the original reason to restrict it?

I see here that you remove these restrictions. What are the implications? Is it possible to create a release including this commit?

opened by vincent-herlemont 0

Releases(v0.34.7)

v0.34.7(Sep 12, 2021)
Bug Fixes

#1314 Fix a bug in Subscriber's Future impl.

v0.34.6(Nov 13, 2020)
Improvements

documentation improved

v0.34.5(Nov 9, 2020)
Improvements

#1164 widen some trait bounds on trees and batches

v0.34.4(Sep 10, 2020)
New Features

#1151 Send is implemented for Iter

#1167 added Tree::first and Tree::last functions to retrieve the first or last items in a Tree, unless the Tree is empty.

Bug Fixes

#1159 dropping a Db instance will no longer prematurely shut down the background flusher thread.

#1168 fixed an issue that was causing panics during recovery in 32-bit code.

#1170 when encountering corrupted storage data, the recovery process will panic less often.

v0.34.3(Sep 10, 2020)
New Features

#1146 added TransactionalTree::generate_id

v0.34.2(Jul 29, 2020)
Improvements

#1133 transactions and writebatch performance has been significantly improved by removing a bottleneck in the atomic batch stability tracking code.

v0.34.1(Jul 28, 2020)
New Features

#1136 Added the TransactionalTree::flush method to flush the underlying database after the transaction commits and before the transaction returns.

v0.34(Jul 23, 2020)
Improvements

#1132 implemented From<sled::Error> for io::Error to reduce friction in some situations.

Breaking Changes

#1131 transactions performed on Trees from different Dbs will now safely fail.

#1131 transactions may now only be performed on tuples of up to 14 elements. For higher numbers, please use slices.

v0.33(Jul 15, 2020)
Breaking Changes

#1125 the backtrace crate has been made optional, which cuts several seconds off compilation time, but may cause breakage if you interacted with the backtrace field of corruption-related errors.

Bug Fixes

#1128 Tree::pop_min and Tree::pop_max had a bug where they were not atomic.

v0.32.1(Jul 12, 2020)
New Features

#1116 IVec::subslice has been added to facilitate creating zero-copy subsliced IVecs that are backed by the same data.

Bug Fixes

#1120 Fixed a use-after-free caused by missing ref keyword on a Copy type in a pattern match in IVec::as_mut.

#1108 conversions from Box<[u8]> to IVec are fixed.

v0.32(Jun 20, 2020)
New Features

#1079 Transactional is now implemented for [&Tree] and [Tree], so you can avoid the friction of using tuples that was previously necessary.

#1058 The minimum supported Rust version (MSRV) is now 1.39.0.

#1037 Subscriber now implements Future (non-fused) so prefix watching may now be iterated over via while let Some(event) = (&mut subscriber).await {}

Improvements

A large number of crash consistency issues have been discovered during intensive testing and fixed.

#965 concurrency control is now dynamically enabled for atomic point operations, so that it may be avoided unless transactional functionality is being used in the system. This significantly increases performance for workloads that do not use transactions.

A number of memory optimizations have been implemented.

Disk usage has been significantly reduced for many workloads.

#1016 On 64-bit systems, we can now store 1-2 trillion items.

#993 Added DerefMut and AsMut<[u8]> for IVec where it works similarly to a Cow, making a private copy if the backing Arc's strong count is not 1.

#1020 The sled wiki has been moved into the documentation itself, and is accessible through the doc module exported in lib.
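
The copy-on-write behavior described for #993 above resembles what the standard library offers through `Arc::make_mut`. The sketch below uses `Arc<Vec<u8>>` as a stand-in for IVec (an assumption made for illustration; IVec's internals are not shown here), and `set_first_byte` is a hypothetical helper.

```rust
use std::sync::Arc;

/// Overwrite the first byte of a shared buffer, copying the buffer first
/// only if another handle still points at the same allocation.
fn set_first_byte(buf: &mut Arc<Vec<u8>>, byte: u8) {
    // `Arc::make_mut` clones the inner data when other strong references
    // exist, and hands out a plain `&mut` to the existing data otherwise.
    let data = Arc::make_mut(buf);
    if let Some(first) = data.first_mut() {
        *first = byte;
    }
}
```

A second handle created with `Arc::clone` keeps seeing the old bytes after the write, while a uniquely owned buffer is mutated in place without any copy.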

Breaking Changes

#975 Changed the default segment_size from 8m to 512k. This will result in far smaller database files due to better file garbage collection granularity.

#975 deprecated several Config options that will be removed over time.

#1000 rearranged some transaction-related imports, and moved them to the transaction module away from the library root to keep the top level docs clean.

#1015 TransactionalTree::apply_batch now accepts its argument by reference instead of by value.

Event has been changed to make the inner fields named instead of anonymous.

#1057 read-only mode has been removed due to not having the resources to properly keep it tested while making progress on high priority issues. This may be correctly implemented in the future if resources permit.

v0.31(Jan 31, 2020)

Improvements

#947 dramatic read and recovery optimizations

#921 reduced the reliance on locks while performing multithreaded IO on windows.

#928 use sync_file_range on linux instead of a full fsync for most writes.

#946 io_uring support changed to the rio crate

#939 reduced memory consumption during zstd decompression

Breaking Changes

#927 use SQLite-style varints for serializing u64. This dramatically reduces the written bytes for databases that store small keys and values.

#943 use varints for most of the fields in message headers, causing an additional large space reduction. combined with #927, these changes reduce bytes written by 68% for workloads writing small items.
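
To show why varints shrink small values, here is a sketch of one well-known variable-length layout, the scheme documented for SQLite4. It is offered as an illustration of the idea; the exact byte layout sled writes may differ, and `encode_varint` / `decode_varint` are hypothetical names.

```rust
/// Append `v` to `out` using 1 to 9 bytes; small values take fewer bytes.
fn encode_varint(v: u64, out: &mut Vec<u8>) {
    if v <= 240 {
        // The value itself fits in the tag byte.
        out.push(v as u8);
    } else if v <= 2287 {
        let r = v - 240;
        out.push((r / 256 + 241) as u8);
        out.push((r % 256) as u8);
    } else if v <= 67823 {
        let r = v - 2288;
        out.push(249);
        out.push((r / 256) as u8);
        out.push((r % 256) as u8);
    } else {
        // Tag bytes 250..=255 announce 3..=8 big-endian payload bytes.
        let bytes = v.to_be_bytes();
        let skip = (v.leading_zeros() / 8) as usize;
        let n = 8 - skip;
        out.push(250 + (n as u8 - 3));
        out.extend_from_slice(&bytes[skip..]);
    }
}

/// Decode one varint from the front of `buf`.
/// Returns the value and the number of bytes consumed, or `None` if `buf` is too short.
fn decode_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let first = *buf.first()?;
    match first {
        0..=240 => Some((first as u64, 1)),
        241..=248 => {
            let b1 = *buf.get(1)? as u64;
            Some((240 + 256 * (first as u64 - 241) + b1, 2))
        }
        249 => {
            let b1 = *buf.get(1)? as u64;
            let b2 = *buf.get(2)? as u64;
            Some((2288 + 256 * b1 + b2, 3))
        }
        _ => {
            let n = (first - 250) as usize + 3;
            let body = buf.get(1..1 + n)?;
            let mut v = 0u64;
            for &b in body {
                v = (v << 8) | b as u64;
            }
            Some((v, 1 + n))
        }
    }
}
```

Under this layout a value up to 240 occupies a single byte instead of eight, which is where the savings for small keys, values, and header fields would come from.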

0.30.2(Dec 27, 2019)
New Features

Added the open function for quickly opening a database at a path with default configuration.

v0.30(Dec 13, 2019)
much faster compile speeds after aggressive dependency pruning

new storage format

bugfixes in cache invalidation and combining transactions with compression

v0.29(Oct 16, 2019)
New Features

The Config::open method has been added to give Config a similar feel to std's fs::OpenOptions. The Config::build and Db::start methods are now deprecated in favor of calling Config::open directly

A checksum method has been added to Tree and Db for use in verifying backups and migrations.

Transactions may now involve up to 69 different tables. Nice.

The TransactionError::Abort variant has had a generic member added that can be returned as a way to return information from a manually-aborted transaction. An abort helper function has been added to reduce the boilerplate required to return aborted results.
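
The shape of an abort variant with a generic payload, plus a small helper, can be modeled without sled at all. Everything in this sketch is hypothetical (`TxError`, this `abort` function, `TransferError`, `withdraw`); it only shows how a caller-defined error can travel out through a generic abort member and does not reproduce sled's actual type definitions.

```rust
/// Hypothetical stand-in for a transaction error with a generic abort payload.
#[derive(Debug, PartialEq)]
enum TxError<E> {
    /// The closure chose to stop, carrying a caller-defined reason.
    Abort(E),
    /// The storage layer failed (represented here by a plain message).
    Storage(String),
}

/// Helper that wraps a business-logic error as an abort, to cut boilerplate.
fn abort<T, E>(reason: E) -> Result<T, TxError<E>> {
    Err(TxError::Abort(reason))
}

/// Example of a caller-defined error type.
#[derive(Debug, PartialEq)]
enum TransferError {
    InsufficientFunds { needed: u64, available: u64 },
}

/// Business logic that aborts with its own error type when a rule is violated.
fn withdraw(balance: u64, amount: u64) -> Result<u64, TxError<TransferError>> {
    if amount > balance {
        return abort(TransferError::InsufficientFunds {
            needed: amount,
            available: balance,
        });
    }
    Ok(balance - amount)
}
```

The caller can then match on the abort variant to recover the original business error, and handle storage failures separately.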

Breaking Changes

The ConfigBuilder structure has been removed in favor of a simplified Config structure with the same functionality.

The way that sled versions are detected at initialization time is now independent of serde.

The cas method is deprecated in favor of the new compare_and_swap method which now returns the proposed value that failed to be applied.

Tree nodes now have constant prefix encoding lengths.

The io_buf_size configurable renamed to segment_size.

The io_buf_size configurable method has been removed from ConfigBuilder. This can be manually set by setting the attribute directly on the ConfigBuilder, but this is discouraged. Additionally, this must now be a power of 2.

The page_consolidation_threshold method has been removed from ConfigBuilder, and this is now a constant of 10.

0.27(Sep 8, 2019)
Breaking Changes

Event::Set has been renamed to Event::Insert and Event::Del has been renamed to Event::Remove. These names better align with the methods of BTreeMap from the standard library.

Bug Fixes

A deadlock was possible in very high write volume situations when the segment accountant lock was taken by all IO threads while a task was blocked trying to submit a file truncation request to the threadpool while holding the segment accountant lock.

New Features

flush_async has been added to perform time-intensive flushing in an asynchronous manner, returning a Future.

0.26(Aug 19, 2019)

New Features

Transactions! You may now call Tree::transaction and perform reads, writes, and deletes within a provided closure with a TransactionalTree argument. This closure may be called multiple times if the transaction encounters a concurrent update in the process of its execution. Transactions may also be used on tuples of Tree objects, where the closure will then be parameterized on TransactionalTree instances providing access to each of the provided Tree instances. This allows you to atomically read and modify multiple Tree instances in a single atomic operation. These transactions are serializable, fully ACID, and optimistic.

Tree::apply_batch allows you to apply a Batch

TransactionalTree::apply_batch allows you to apply a Batch from within a transaction.

Breaking Changes

Tree::batch has been removed. Now you can directly create a Batch with Batch::default() and then apply it to a Tree with Tree::apply_batch or during a transaction using TransactionalTree::apply_batch. This facilitates multi-Tree batches via transactions.

Event::Merge has been removed, and Tree::merge will now send a complete Event::Set item to be distributed to all listening subscribers.

0.25(Aug 15, 2019)

0.21.0(Apr 1, 2019)

This release brings big improvements to the underlying GC system for keeping file sizes under control, as well as some optimizations for the writepath.
0.20.0(Mar 25, 2019)

This release includes a large number of internal and external API cleanups. Some internal CAS-like operations will retry automatically when they fail due to changes that do not impact causality, like rewriting the same page to a new location. When CAS-like operations fail, they now return the new value that was created to insert in the first place, removing several double allocations. Lots of performance improvements.
0.17.1(Jan 7, 2019)
non-backwards compatible release

keyspace creation and deletion

several optimizations to the pagecache

v0.16(Oct 28, 2018)

This release introduces several optimizations and bugfixes. The biggest win is support for zero-copy reads when returning values to the user with the Tree's get, del, cas and scan commands. We also now have zero-allocation tree traversal, optimized node searching, and caching improvements.
