Part of #69
Motivation
Currently, RocksDB is linked dynamically. There are a few drawbacks to this approach:
- End users of applications built using this wrapper must install exactly the same version of RocksDB that the application was built with. RocksDB is not included in many package managers at the moment so this either needs to be built from source or provided by the application author.
- There's a problem with tcmalloc (used by RocksDB) clashing with jemalloc (used by Rust) that causes random segfaults
- A developer using rust-rocksdb can choose which version of RocksDB to link with. This makes rust-rocksdb harder to maintain, as it must support multiple versions of RocksDB. It also makes it difficult to implement features that only exist in a new version of RocksDB while older versions still have to be supported.
These issues can be solved by statically linking RocksDB instead.
- RocksDB is compiled into the application's binary. End users don't need to install anything.
- We can now control how RocksDB is built to make sure it doesn't pull in tcmalloc
- Only one version of RocksDB can be used by the wrapper. This version is controlled by the wrapper's authors.
Implementation
A new sub-crate, rocksdb-sys, has been added to this repo. It includes the FFI bindings and a build script for RocksDB/Snappy. It is based on the work in the Ethcore fork of this repo (https://github.com/ethcore/rust-rocksdb), which is itself based on the rocksdb-sys crate (https://github.com/jsgf/rocksdb-sys).
The build script is written in Rust and uses gcc-rs to talk to the C++ compiler. It works on Windows (tested with VC++ 2015) and Linux.
The source code for RocksDB and Snappy is pulled in with git submodules. This means that developers of rust-rocksdb will need to run git submodule init and git submodule update before developing. This does not affect users of the crate, as the submodules are bundled within the crate when it is packaged. For users who depend on the git repo directly in their Cargo dependencies, the first build will take a little longer; Cargo handles the submodules automatically and caches them as well.
Updating RocksDB
RocksDB is currently pinned to the latest commit of the 4.13 maintenance branch.
Instead of using the makefile, we pass a list of .cc files in RocksDB to gcc-rs. The makefile is not used so that we can perform the build for both Linux and Windows platforms from the same build script.
The build script loads a list of .cc files from a text file, rocksdb_lib_sources.txt. This file is generated using a Makefile (that calls RocksDB's makefile to get the sources list) and is committed into the git repo.
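As a rough illustration of how a build script could consume that sources list, here is a minimal sketch. It assumes rocksdb_lib_sources.txt holds whitespace-separated paths relative to the RocksDB checkout. The function name parse_lib_sources and the sample paths are hypothetical, and the hand-off to gcc-rs is only described in a comment because the exact calls depend on the gcc-rs version in use.

```rust
use std::path::PathBuf;

/// Turn the contents of the sources list into paths rooted at the RocksDB checkout.
/// Assumes whitespace-separated entries; only entries ending in `.cc` are kept.
fn parse_lib_sources(list: &str, rocksdb_root: &str) -> Vec<PathBuf> {
    list.split_whitespace()
        .filter(|entry| entry.ends_with(".cc"))
        .map(|entry| PathBuf::from(rocksdb_root).join(entry))
        .collect()
}

fn main() {
    // A real build script would read rocksdb_lib_sources.txt instead
    // (for example with include_str! or std::fs); a literal keeps this self-contained.
    let list = "db/builder.cc db/c.cc\nutil/env.cc";
    let sources = parse_lib_sources(list, "rocksdb");
    for source in &sources {
        // Each path would then be handed to the gcc-rs build configuration.
        println!("{}", source.display());
    }
}
```

Keeping the parsing in a plain function like this means the same code path runs on Linux and Windows, which is the point of not shelling out to the makefile.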
The process for updating RocksDB is as follows:
- cd into rocksdb-sys/rocksdb
- git checkout the commit hash of the new version
- cd into rocksdb-sys
- run make gen_lib_sources
- Change the commit sha and date in rocksdb-sys/build_version.cc (it would be nice to automate this in build.rs)
- Test and commit those changes into git
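The build_version.cc step in the list above could eventually be automated from build.rs. The following is only a sketch of the idea, not what this PR does: render_build_version is a hypothetical helper, the sha and date values are placeholders, and the symbol names in the template are an assumption that should be checked against the build_version.cc currently committed in rocksdb-sys.

```rust
/// Render the contents of build_version.cc for a given commit.
/// The symbol names in this template are an assumption; compare them with the
/// committed rocksdb-sys/build_version.cc before relying on this.
fn render_build_version(git_sha: &str, git_date: &str) -> String {
    format!(
        "#include \"build_version.h\"\n\
         const char* rocksdb_build_git_sha = \"rocksdb_build_git_sha:{}\";\n\
         const char* rocksdb_build_git_date = \"rocksdb_build_git_date:{}\";\n\
         const char* rocksdb_build_compile_date = __DATE__;\n",
        git_sha, git_date
    )
}

fn main() {
    // Placeholder values; a build script would obtain these from the submodule checkout.
    let contents = render_build_version("0123abc", "2016-11-01");
    // A build script would write this to a file; printing keeps the sketch self-contained.
    print!("{}", contents);
}
```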
There's a chance that the RocksDB maintainers will change the build process or makefile in a way that breaks the steps above. I think it would be best to tackle these issues as they come; I don't expect them to be very common.
Alternative solutions
Pulling in rocksdb source code with git submodules
I chose to use git-submodules, despite concerns about the impact on build time when having to download the history of RocksDB.
- RocksDB will be bundled in the crate that is uploaded to crates.io, so this won't affect most users
- Users who do reference the git repo from their Cargo.toml will have to download the RocksDB repo via the submodule, but this only happens on the first build as it is cached thereafter
- It takes less than a minute for me to download (10Mbit connection)
- I'm against making decisions based on performance concerns before actually trying it out
Alternative solution 1: Bundle the RocksDB source code in the repo
This is what the ethcore fork does
This would be my second choice if we find git-submodules to be a pain. I wanted to avoid this for the following reasons:
- Avoid bloat of this repo over time
- It's very tempting to make changes to the bundled version of RocksDB, but this will make maintainability a pain
Alternative solution 2: Download RocksDB during compile time
This is what the ngaut fork does.
I think this solution is horrible. You can find my thoughts about it in this comment: https://github.com/spacejam/rust-rocksdb/issues/69#issuecomment-256478979
Compiling RocksDB using a build.rs file
I decided to go with this as we can assume that all the platforms this project will be built on will have the ability to compile and run Rust code. The only dependency is a C++ compiler.
Alternative solution 1: Call the makefile from build.rs
I think this would be fine for Linux/Mac support but we will need something separate for Windows which uses CMake/MSBuild.
Alternative solution 2: Use a bash script instead of build.rs
This is an extension of "solution 1". It is what the ngaut fork does. I didn't like this for the following reasons:
- We would need to implement a separate build script for Windows
- We can already do a lot of the things that bash can do in Rust, but in a cross-platform way
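To make the cross-platform point concrete, here is a minimal sketch of how a single build script can branch on the target instead of needing one script per platform. The helper platform_defines is hypothetical, and the exact set of preprocessor defines is an assumption for illustration; the real list depends on the RocksDB version being built.

```rust
/// Pick preprocessor defines from a Cargo target triple.
/// The define names are illustrative assumptions, not the full set RocksDB needs.
fn platform_defines(target: &str) -> Vec<&'static str> {
    if target.contains("windows") {
        vec!["OS_WIN"]
    } else if target.contains("linux") {
        vec!["OS_LINUX", "ROCKSDB_PLATFORM_POSIX"]
    } else if target.contains("darwin") {
        vec!["OS_MACOSX", "ROCKSDB_PLATFORM_POSIX"]
    } else {
        Vec::new()
    }
}

fn main() {
    // Cargo sets TARGET when running build scripts; fall back to a sample triple elsewhere.
    let target = std::env::var("TARGET")
        .unwrap_or_else(|_| "x86_64-unknown-linux-gnu".to_string());
    for define in platform_defines(&target) {
        // A build script would pass each define on to the C++ compiler configuration.
        println!("{}", define);
    }
}
```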