Cloud storage providers use erasure coding to achieve very high durability. Best practice is to use dual-region or multi-region locations.
We are working on a first erasure-coding implementation for rclone, spreading shards across different regions AND different storage providers. We see the raid3 virtual remote as the first member of a broader family of erasure-coding virtual remotes; Reed–Solomon, the de facto standard, should follow next.
Erasure coding changes the stored form of data, so attributes like size and hashes cannot be derived correctly from the per-shard metadata of the underlying remotes alone. We therefore need to append this information to each stored shard. This gives us:
- Hashes and size via a range read on a shard, without downloading the full object.
- Self-contained shards: shards that belong to the same logical object can be identified even when the original rclone config is not available.
We propose storing this metadata in a fixed-size footer at the end of each shard. The format is intended to work for a family of algorithms (e.g. raid3 and future Reed–Solomon), not only for raid3.
Proposed 86-byte footer
| Offset | Size | Field | Type / encoding |
|---|---|---|---|
| 0 | 9 | Magic | "RCLONE/EC" (9 bytes) |
| 9 | 2 | Version | uint16 (e.g. 1) |
| 11 | 8 | Content length | int64 (logical file length) |
| 19 | 16 | MD5 | 16 bytes raw |
| 35 | 32 | SHA-256 | 32 bytes raw |
| 67 | 8 | Mtime | int64 Unix seconds |
| 75 | 4 | Algorithm | ASCII, null-padded (e.g. "R3\0\0", "RS\0\0") |
| 79 | 1 | Data shards | uint8 |
| 80 | 1 | Parity shards | uint8 |
| 81 | 1 | Current shard | uint8 (0-based index) |
| 82 | 4 | Reserved | 4 bytes (zeros if unused) |
| 86 | — | (end of footer) | — |
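To make the layout concrete, here is a minimal Go sketch of the proposed footer as a fixed-size struct with encode/decode helpers. The field order and sizes follow the table above; the choice of little-endian for the integer fields is an assumption the final spec would need to pin down, and the `Footer`/`Marshal`/`Unmarshal` names are illustrative, not existing rclone APIs.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

const footerSize = 86

var magic = [9]byte{'R', 'C', 'L', 'O', 'N', 'E', '/', 'E', 'C'}

// Footer mirrors the proposed 86-byte layout. encoding/binary
// serializes the fields back to back with no padding, so
// binary.Size(Footer{}) == 86.
type Footer struct {
	Magic        [9]byte  // "RCLONE/EC"
	Version      uint16   // format version, e.g. 1
	ContentLen   int64    // logical file length
	MD5          [16]byte // raw MD5 of the logical object
	SHA256       [32]byte // raw SHA-256 of the logical object
	Mtime        int64    // Unix seconds
	Algorithm    [4]byte  // ASCII, null-padded, e.g. "R3\x00\x00"
	DataShards   uint8
	ParityShards uint8
	CurrentShard uint8 // 0-based shard index
	Reserved     [4]byte
}

// Marshal encodes the footer into its 86-byte wire form.
func (f *Footer) Marshal() ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, f); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// Unmarshal decodes and validates an 86-byte footer.
func Unmarshal(b []byte) (*Footer, error) {
	if len(b) != footerSize {
		return nil, fmt.Errorf("footer must be %d bytes, got %d", footerSize, len(b))
	}
	var f Footer
	if err := binary.Read(bytes.NewReader(b), binary.LittleEndian, &f); err != nil {
		return nil, err
	}
	if f.Magic != magic {
		return nil, fmt.Errorf("bad magic %q", f.Magic[:])
	}
	return &f, nil
}

func main() {
	f := Footer{
		Magic:        magic,
		Version:      1,
		ContentLen:   1048576,
		Mtime:        1700000000,
		Algorithm:    [4]byte{'R', '3'},
		DataShards:   2,
		ParityShards: 1,
	}
	b, err := f.Marshal()
	fmt.Println(len(b), err) // 86 <nil>
	g, err := Unmarshal(b)
	fmt.Println(err == nil, g.ContentLen) // true 1048576
}
```

Because the magic and version come first, a reader can cheaply reject foreign data before trusting the remaining fields.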
Example:
`rclone hashsum MD5 ec-based-remote:path` could satisfy the hash request with a single range read of the last 86 bytes of one shard to obtain the MD5 (and SHA-256), instead of streaming the full object.
We use a fixed-size footer at the tail of each shard so that (1) one range read is enough to read the whole footer, and (2) the format stays simple and predictable. The footer must be at the end of the shard because content length and hashes are only known after the full logical object has been processed.
Please share your thoughts, questions, or suggestions in the replies.
Resources
Cloud storage and erasure coding
- Understanding Cloud Storage's 11 9's durability target (Google Cloud Blog)
- Resiliency, Durability, and Availability (Backblaze B2): eleven nines of durability, 17 data + 3 parity shards, and Reed–Solomon erasure coding in the Vault architecture.
Current EC-based virtual backend