Blobnet v0.2 — Design
Resolves: MOD-306, MOD-542
Blobnet: An embedded caching, content-addressed blob storage system with configurable sources and proxies.
Issues with Blobnet v0.1
- We have to download the entire file even if we only read a range, for caching reasons.
- There is no way of implementing local caching on the worker.
- The system isn’t flexible enough to allow for a gradual migration to S3 (or even just trying it out as another source of truth).
Proposal for Blobnet v0.2
The blobnet system is configured with an ordered set of sources, plus a cache.
Having two sources allows us to gradually transition from NFS to S3 without downtime.
let blobnet = Blobnet::builder()
    .source(provider::S3::new("modal-blobnet"))
    .source(provider::NFS::new("/efs/blobnet"))
    .cache("/var/tmp/.blobnet-cache", 1 << 21)
    .build();
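To illustrate what "ordered" could mean for reads, here is a minimal sketch in which each source is tried in turn and the first hit wins. The `Source` trait, `MemorySource`, and `read_from_sources` are hypothetical names, and the trait is a synchronous stand-in for the async provider API. Propagating an error from an earlier source instead of falling through to the next one is an assumption, not a decision made in this doc.

```rust
/// Hypothetical, synchronous stand-in for a provider (the real API is async).
trait Source {
    /// Returns Ok(None) when this source does not have the blob.
    fn read(&self, hash: &str) -> Result<Option<Vec<u8>>, String>;
}

/// Toy source backed by an in-memory list of (hash, data) pairs.
struct MemorySource {
    blobs: Vec<(String, Vec<u8>)>,
}

impl Source for MemorySource {
    fn read(&self, hash: &str) -> Result<Option<Vec<u8>>, String> {
        Ok(self
            .blobs
            .iter()
            .find(|(h, _)| h.as_str() == hash)
            .map(|(_, data)| data.clone()))
    }
}

/// Tries each source in order and returns the first hit.
/// Assumption: an error from any source aborts the read.
fn read_from_sources(
    sources: &[Box<dyn Source>],
    hash: &str,
) -> Result<Option<Vec<u8>>, String> {
    for source in sources {
        if let Some(data) = source.read(hash)? {
            return Ok(Some(data));
        }
    }
    Ok(None)
}
```

With S3 listed first and NFS second, a blob present in both is served from S3, while a blob that has not been migrated yet is still found on NFS.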
Each provider has the following API:
#[async_trait]
trait Provider {
    // Returns the data in the requested byte range of the blob with this hash.
    async fn read(&self, hash: &str, range: (u64, u64)) -> Result<Option<Bytes>>;
    // Stores the data and returns its hash.
    async fn write(&self, data: Bytes) -> Result<String>;
}
All functions are fallible. The blobnet server has a Blobnet struct, and the worker client also has a Blobnet struct, which allows them to share SSD (local instance storage) caching logic. The difference is that the worker client just uses its Blobnet struct to handle imagefs requests, while the blobnet server uses its struct to serve requests over HTTP.
S3 Provider
Takes the S3 bucket as an argument, and probably also an instance of aws_sdk_s3's Client object to interact with it. Writes complete files to S3 as objects keyed by their SHA-256 hash, such as /aa/bb/cc/dddddddddddddddddddddddddddddddddddddddddddddddddddddddddd.
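The key layout can be sketched as a small helper that splits a hex digest into three two-character prefixes plus the remainder. `blob_key` is a hypothetical name, and rejecting anything other than a 64-character lowercase hex string is an assumption about input validation.

```rust
/// Maps a 64-character hex SHA-256 digest to a key of the form /aa/bb/cc/dd...d.
/// Returns None if the input is not a lowercase hex digest of that length.
fn blob_key(hash: &str) -> Option<String> {
    let is_hex = hash
        .bytes()
        .all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
    if hash.len() != 64 || !is_hex {
        return None;
    }
    // Slicing by byte index is safe here because every byte is ASCII.
    Some(format!(
        "/{}/{}/{}/{}",
        &hash[0..2],
        &hash[2..4],
        &hash[4..6],
        &hash[6..]
    ))
}
```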
NFS Provider
Takes a local directory as input, checks on creation that it is a network file system, and writes complete files atomically to that directory. Otherwise it is similar to the S3 provider.
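One common way to get atomic writes is to write to a temporary file in the same directory and then rename it into place. The sketch below assumes that approach; `write_atomic` and the temporary-file naming are hypothetical, and how strictly rename atomicity holds on a particular NFS setup would need to be checked.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Writes `data` to `dir/name` so that readers never observe a partial file.
fn write_atomic(dir: &Path, name: &str, data: &[u8]) -> io::Result<PathBuf> {
    fs::create_dir_all(dir)?;
    let final_path = dir.join(name);
    // Hypothetical temp-file naming; the process id keeps writers in
    // different processes from clobbering each other's temp files.
    let tmp_path = dir.join(format!(".tmp-{}-{}", std::process::id(), name));
    {
        let mut file = fs::File::create(&tmp_path)?;
        file.write_all(data)?;
        // Flush to storage before the rename makes the file visible.
        file.sync_all()?;
    }
    // A rename within one directory replaces the destination in one step.
    fs::rename(&tmp_path, &final_path)?;
    Ok(final_path)
}
```

Because blobs are content-addressed, two writers racing on the same name write identical bytes, so whichever rename lands last leaves the same content in place.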
“Client Proxy” Provider
This acts as a client to a running blobnet server. It can be used on the worker itself to connect to the blobnet instance over HTTP while also including a cache, so the worker reuses the exact same caching logic on its local file system. Specifically:
let blobnet = Blobnet::builder()
    .source(provider::ClientProxy::new("http://blobnet.modal.internal"))
    .cache("/var/tmp/.blobnet-cache", 1 << 21)
    .build();
Cache
Whenever something is read, the cache saves the file data in chunks of a fixed page size. All file reads go through the cache first and only hit the provider on a miss. In the examples above the page size is set to 2 MB (1 << 21), but this is configurable, since it only affects the cache.
The cache is a local file system directory. The files saved are in the format /aa/bb/cc/ddd...dd/<chunk-num>, because we don’t want to download the entire file if our goal is to execute a small read.
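The chunk arithmetic behind this layout can be sketched as follows. The helper names are hypothetical, and treating the read range as half-open, [start, end), is an assumption, since the doc does not say whether the end offset is inclusive.

```rust
use std::path::{Path, PathBuf};

const PAGE_SIZE: u64 = 1 << 21; // 2 MB page size, matching the examples above

/// Returns the chunk numbers covering the half-open byte range [start, end).
/// Assumes `page_size` is non-zero.
fn chunks_for_range(start: u64, end: u64, page_size: u64) -> std::ops::Range<u64> {
    if start >= end {
        return 0..0; // an empty read touches no chunks
    }
    let first = start / page_size;
    let last = (end - 1) / page_size; // chunk holding the final byte
    first..last + 1
}

/// Builds <cache_dir>/aa/bb/cc/<rest-of-hash>/<chunk-num> for one chunk.
/// Returns None if the hash is too short to split or is not ASCII.
fn chunk_path(cache_dir: &Path, hash: &str, chunk: u64) -> Option<PathBuf> {
    if hash.len() <= 6 || !hash.is_ascii() {
        return None;
    }
    Some(
        cache_dir
            .join(&hash[0..2])
            .join(&hash[2..4])
            .join(&hash[4..6])
            .join(&hash[6..])
            .join(chunk.to_string()),
    )
}
```

A 100-byte read at offset 0 touches only chunk 0, so at most one 2 MB page is fetched from the provider instead of the whole file.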
Update (Oct. 26): I think I’ll first benchmark Sled as a cache instead of EBS. It has the potential to be a lot faster, especially given its built-in page cache and file defragmentation. Let’s see what happens. It doesn’t seem to be mature enough yet, though; maybe RocksDB is better?
Update 2: Okay, I did some quick measurements, and with reasonable confidence I think using the file system directly is better. However, there is still quite some latency here, and we might also want to include an in-memory LRU cache: this is the difference between 200-400 µs for a 2 MB read with a warm page cache and 700-2800 ns for the same 2 MB read from memory.
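As a rough idea of what such an in-memory layer might look like, here is a minimal LRU sketch keyed by (hash, chunk number). `LruCache` is a hypothetical type, it bounds the number of chunks and not the number of bytes, and it is written for clarity only: touching an entry is linear in the number of cached chunks.

```rust
use std::collections::{HashMap, VecDeque};

/// Minimal LRU cache for chunks, keyed by (hash, chunk number).
struct LruCache {
    capacity: usize, // maximum number of chunks held
    entries: HashMap<(String, u64), Vec<u8>>,
    order: VecDeque<(String, u64)>, // front = least recently used
}

impl LruCache {
    fn new(capacity: usize) -> Self {
        LruCache { capacity, entries: HashMap::new(), order: VecDeque::new() }
    }

    /// Moves `key` to the most-recently-used position.
    fn touch(&mut self, key: &(String, u64)) {
        if let Some(pos) = self.order.iter().position(|k| k == key) {
            self.order.remove(pos);
        }
        self.order.push_back(key.clone());
    }

    fn get(&mut self, hash: &str, chunk: u64) -> Option<&[u8]> {
        let key = (hash.to_string(), chunk);
        if !self.entries.contains_key(&key) {
            return None;
        }
        self.touch(&key);
        self.entries.get(&key).map(|data| data.as_slice())
    }

    fn insert(&mut self, hash: &str, chunk: u64, data: Vec<u8>) {
        if self.capacity == 0 {
            return;
        }
        let key = (hash.to_string(), chunk);
        self.touch(&key);
        self.entries.insert(key, data);
        // Evict least recently used chunks until we are back under capacity.
        while self.entries.len() > self.capacity {
            match self.order.pop_front() {
                Some(oldest) => {
                    self.entries.remove(&oldest);
                }
                None => break,
            }
        }
    }
}
```

In this sketch a read would check the LRU first, then the on-disk cache, then the provider, inserting the chunk into each faster layer on the way back.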