Mirroring remote repositories to S3 storage, with atomic updates and periodic garbage collection.

Overview

rsync-sjtug

WIP: This project is still under development, and is not ready for production use.

rsync-sjtug is an open-source project designed to provide an efficient method of mirroring remote repositories to S3 storage, with atomic updates and periodic garbage collection.

This project implements the rsync wire protocol and is compatible with rsync protocol version 27, so rsyncd servers from version 2.6.0 onwards are supported.

Features

  • Atomic repository update: users never see a partially updated repository.
  • Periodic garbage collection: old versions of files can be removed from the storage.
  • Delta transfer: only the changed parts of files are transferred. Please see the Delta Transfer section below for details.

Commands

  • rsync-fetcher - fetches the repository from the remote server and uploads it to S3.
  • rsync-gateway - serves the mirrored repository from S3 over HTTP.
  • rsync-gc - periodically removes old versions of files from S3.

Example

  1. Sync rsync repository to S3.

    $ RUST_LOG=info RUST_BACKTRACE=1 AWS_ACCESS_KEY_ID=<ID> AWS_SECRET_ACCESS_KEY=<KEY> \
      rsync-fetcher \
        --src rsync://upstream/path \
        --s3-url https://s3_api_endpoint --s3-region region --s3-bucket bucket --s3-prefix repo_name \
        --redis redis://localhost --redis-namespace repo_name \
        --repository repo_name \
        --gateway-base http://localhost:8081/repo_name
  2. Serve the repository over HTTP.

    $ cat > config.toml <<-EOF
    bind = ["localhost:8081"]
    
    [endpoints."out"]
    redis = "redis://localhost"
    redis_namespace = "test"
    s3_website = "http://localhost:8080/test/test-prefix"
    
    EOF
    
    $ RUST_LOG=info RUST_BACKTRACE=1 rsync-gateway <optional config file>
  3. GC old versions of files periodically.

    $ RUST_LOG=info RUST_BACKTRACE=1 AWS_ACCESS_KEY_ID=<ID> AWS_SECRET_ACCESS_KEY=<KEY> \
      rsync-gc \
        --s3-url https://s3_api_endpoint --s3-region region --s3-bucket bucket --s3-prefix repo_name \
        --redis redis://localhost --redis-namespace repo_name \
        --keep 2

    It's recommended to keep at least 2 versions of files in case a gateway is still using an old revision.

Design

File data and their metadata are stored separately.

Data

Files are stored in S3 storage, named by their blake2b-160 hash (<namespace>/<hash>).

Listing HTML pages are stored in <namespace>/listing-<timestamp>/<path>/index.html.
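
As an illustration of this naming scheme, a content key could be derived as in the minimal sketch below. This is not the exact code used by rsync-fetcher; it assumes the blake2 crate, and the namespace stands in for the --s3-prefix value.

    use blake2::digest::{Update, VariableOutput};
    use blake2::Blake2bVar;

    /// Build the S3 key `<namespace>/<hash>` for a file's content.
    /// `namespace` stands in for the `--s3-prefix` value (an assumption of this sketch).
    fn object_key(namespace: &str, content: &[u8]) -> String {
        let mut hasher = Blake2bVar::new(20).expect("20 bytes = 160 bits is a valid size");
        hasher.update(content);
        let mut digest = [0u8; 20];
        hasher
            .finalize_variable(&mut digest)
            .expect("buffer length matches the requested output size");
        let hex: String = digest.iter().map(|b| format!("{b:02x}")).collect();
        format!("{namespace}/{hex}")
    }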

Metadata

Metadata is stored in Redis for fast access.

Note that there is more than one file index in Redis.

  • <namespace>:index:<timestamp> - an index of the repository synced at <timestamp>.
  • <namespace>:partial - a partial index that is still being updated and not committed yet.
  • <namespace>:partial-stale - a temporary index that is used to store outdated files when updating the partial index. This might happen if you interrupt a synchronization, restart it, and some files downloaded in the first run are already outdated. It's ready to be garbage collected.
  • <namespace>:stale:<timestamp> - an index that is taken out of production, and is ready to be garbage collected.
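
For example, a consumer could locate the newest committed index by scanning the <namespace>:index:* keys. The sketch below is only an illustration of this key layout, assuming the redis crate; it is not necessarily how rsync-gateway does it.

    use redis::Commands;

    /// Find the newest committed index for a namespace by scanning
    /// `<namespace>:index:<timestamp>` keys. Sketch only; uses KEYS for brevity,
    /// a production version would use SCAN.
    fn latest_index(
        con: &mut redis::Connection,
        namespace: &str,
    ) -> redis::RedisResult<Option<String>> {
        let keys: Vec<String> = con.keys(format!("{namespace}:index:*"))?;
        Ok(keys.into_iter().max_by_key(|key| {
            // The numeric <timestamp> suffix decides which index is newest.
            key.rsplit(':')
                .next()
                .and_then(|ts| ts.parse::<u64>().ok())
                .unwrap_or(0)
        }))
    }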

Not every file referenced by a stale (or partial-stale) index can actually be removed: if a file exists both in a stale index and a "live" index, it must be kept.
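
A simplified sketch of that liveness rule follows (hypothetical types, not the actual rsync-gc code): a hash is only deletable if no live index still references it.

    use std::collections::HashSet;

    /// Hashes that are safe to delete: everything referenced only by stale
    /// (or partial-stale) indices and by no live index. Hypothetical types.
    fn deletable(
        stale_indices: &[HashSet<[u8; 20]>],
        live_indices: &[HashSet<[u8; 20]>],
    ) -> HashSet<[u8; 20]> {
        let live: HashSet<[u8; 20]> = live_indices.iter().flatten().copied().collect();
        stale_indices
            .iter()
            .flatten()
            .filter(|hash| !live.contains(*hash))
            .copied()
            .collect()
    }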

Delta Transfer

rsync-sjtug implements the delta transfer algorithm described in the rsync protocol specification, which can reduce the amount of data transferred from the remote server.

However, because the basis file is not available locally, it must first be fetched from S3 before a delta can be computed. Moreover, S3 does not support random writes, so the patched file has to be uploaded in full.

Therefore, if your S3 storage is not close to the machine running rsync-fetcher (e.g. on the same network), you may want to disable delta transfer.
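
To make the trade-off concrete, here is a rough traffic comparison for a single changed file. The byte counts and the split between remote and S3 traffic are assumptions for illustration only.

    /// Rough byte counts moved for one changed file; all inputs are assumptions.
    /// Returns (without delta transfer, with delta transfer).
    fn traffic_bytes(file_size: u64, delta_size: u64) -> (u64, u64) {
        // Without delta transfer: whole file from the remote, whole file up to S3.
        let plain = file_size + file_size;
        // With delta transfer: basis down from S3, delta from the remote,
        // patched file up to S3 in full (no random writes on S3).
        let with_delta = file_size + delta_size + file_size;
        (plain, with_delta)
    }

    fn main() {
        // Hypothetical 100 MiB file of which about 1 MiB changed.
        let (plain, with_delta) = traffic_bytes(100 << 20, 1 << 20);
        // Only remote-side traffic shrinks; total bytes moved actually grow,
        // so delta transfer pays off only when the S3 link is cheap and fast.
        println!("plain: {plain} B, delta: {with_delta} B");
    }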

Comments
  • Idea: alternative file serving

    S3 backends often have a convenient HTTP endpoint for end users, so the current implementation reuses that and redirects requests.

    If we are to support more storage backends, we must consider two problems:

    1. If the backend does not provide an HTTP endpoint, should we support it and have our gateway serve the files directly? We may use OpenDAL to simplify the implementation if we choose to do so.

    2. Some backends do not have a stable URL for a file. For example, this is the case for S3 with public access disabled, which enforces pre-signed URLs with an expiration time. Another case is IPFS storage, which may have a stable key, but its URL cannot be chosen by us. This causes problems for: a) redirecting, because we currently generate URLs by simple string concatenation (prefix + hash); b) listing generation, because we must know the URL in advance.

      To save traffic, accessing files (and jumping to subdirectories) does not require access to the gateway now. This is implemented by relative URLs. But this approach breaks for dynamic keys. A possible way is to direct all links to the gateway, which might increase traffic but should be able to handle the above cases.

    opened by PhotonQuantum 1
  • Fix: symlink implementation

    Due to an early design flaw (directory entries were not saved in the metadata server), the current implementation of symlink resolution is expensive.

    New procedure:

    1. Try to look up the path directly in the hashtable.
    2. If that fails, fall back to standard POSIX logic: starting from the first component, follow the filesystem tree.

    This was not feasible before because, without directory entries stored in the metadata server, it is hard to tell whether a path is a directory or simply does not exist; backtracking is needed instead, and its time complexity is unacceptable.
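
    A sketch of the intended lookup order follows, using hypothetical in-memory types (the real metadata lives in Redis) and assuming symlink targets are stored as paths from the repository root.

        use std::collections::HashMap;

        enum Entry {
            File,
            Directory,
            // Link target, assumed here to be a path from the repository root.
            Symlink(String),
        }

        /// Resolve `path` against a flat hashtable of entries:
        /// fast path is a direct lookup, slow path walks component by component.
        fn resolve<'a>(index: &'a HashMap<String, Entry>, path: &str) -> Option<&'a Entry> {
            // 1. The path is stored verbatim: done.
            if let Some(entry) = index.get(path) {
                return Some(entry);
            }
            // 2. Fall back to POSIX-style resolution from the first component.
            let mut resolved = String::new();
            for component in path.split('/').filter(|c| !c.is_empty()) {
                if !resolved.is_empty() {
                    resolved.push('/');
                }
                resolved.push_str(component);
                // Follow one level of symlink per component for brevity;
                // a real resolver would loop and detect cycles.
                if let Some(Entry::Symlink(target)) = index.get(&resolved) {
                    resolved = target.clone();
                }
            }
            index.get(&resolved)
        }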

    Appendix: current implementation

    1. Try to trace the path recursively, following redirections. This works only if there is no symlinked directory among the ancestors of the given path; otherwise it gets stuck.
    2. For each ancestor, check whether it is a directory or a symlink. If it is a symlink, follow it once, then concatenate the target directory with the remaining components.
    3. Go to 1.
    opened by PhotonQuantum 0
  • Docs: update docs

    • [ ] update gateway design doc: a trailing slash is no longer needed to list a directory
    • [ ] clarify intended usage: separation of the data plane and the control plane (if this is not needed, JuiceFS is a better choice)
    opened by PhotonQuantum 0
  • Idea: hardlink support

    Currently, all hard links are resolved as regular files, so redundant bytes may be fetched from the remote server. This can be problematic for repositories that make heavy use of hard links, e.g., fedora.

    Rsync has hard link support and can accurately transfer hard links between servers. Dev and inode ids are transmitted through the wire on the file list transfer stage. The client may recognize duplicated dev and ino pairs and initiate file content transfer only for the first instance.

    Hard links are non-directional, so it is better to treat them as a cluster rather than as a link. One naive approach is to pick a source by some heuristic and treat the rest as symlinks, but this has problems: if the "virtual" source is removed later, a new source must be chosen, and all other files that initially shared the same inode must be rewritten to point to it. Furthermore, detecting this without changing the metadata format (to book-keep hard links) is expensive, because we must reverse-track all entries pointing to the source. Therefore it is not a good choice to reuse the existing symlink handling.

    Another possible implementation is to use hard link info only as an optimization. When the generator requests a file, first check whether another file with the same dev & ino has already been requested. If so, do not request this file and reuse its hash (remember, files are addressed by their content hash). The only extra cost (besides receiving and storing the dev & ino fields in each FileEntry) is a hash table from (dev, ino) to file index.
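
    A sketch of that bookkeeping (hypothetical names; the FileEntry fields are assumed): a single map from (dev, ino) to the index of the first file seen with that pair.

        use std::collections::hash_map::Entry;
        use std::collections::HashMap;

        /// Index into the transferred file list.
        type FileIdx = usize;

        /// Map from (dev, ino) to the first file seen with that pair, so later
        /// hard links can reuse its content hash instead of being fetched again.
        struct HardlinkDedup {
            seen: HashMap<(u64, u64), FileIdx>,
        }

        impl HardlinkDedup {
            fn new() -> Self {
                Self { seen: HashMap::new() }
            }

            /// Returns `Some(first_idx)` if the content can be reused,
            /// or `None` if this file must actually be requested.
            fn check(&mut self, dev: u64, ino: u64, idx: FileIdx) -> Option<FileIdx> {
                match self.seen.entry((dev, ino)) {
                    Entry::Occupied(e) => Some(*e.get()),
                    Entry::Vacant(v) => {
                        v.insert(idx);
                        None
                    }
                }
            }
        }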

    opened by PhotonQuantum 0
  • Idea: Integration with OpenDAL to extend the storage backend to all storage services.

    Nice project!

    Maybe we can work together to integrate opendal and extend the storage backend to all storage services? I think there are many people looking for solutions for rsync to different storage platforms.

    opened by Xuanwo 3