`rclone append` command, including S3 backends

Immutability of objects is a strong restriction on S3-style object storage. However, several providers offer ways to overcome this for specific backends:

  • Yandex Object Storage: PatchObject S3 extension that supports partial overwrite and append of existing objects.
  • AWS S3 Express One Zone (directory buckets): supports appending data to existing objects via PutObject with a write offset (put-object with --write-offset-bytes in the AWS CLI).
  • Alibaba OSS and Azure Blob Storage: provide appendable object/blob types.
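For illustration, here is a minimal sketch of the S3 Express One Zone case in Python. It assumes s3 is a boto3 S3 client (or anything with the same head_object and put_object methods) pointed at a directory bucket. The helper name append_express_one_zone is made up for this example and is not part of rclone or boto3.

```python
def append_express_one_zone(s3, bucket, key, data):
    """Append bytes to an existing object in a directory bucket.

    Hypothetical helper: `s3` is assumed to be a boto3 S3 client.
    """
    # The write offset has to match the current object size, so look it up first.
    size = s3.head_object(Bucket=bucket, Key=key)["ContentLength"]
    # WriteOffsetBytes is the PutObject parameter behind --write-offset-bytes.
    s3.put_object(Bucket=bucket, Key=key, Body=data, WriteOffsetBytes=size)
    return size + len(data)
```

The size lookup and the write are two separate requests, so a real implementation would have to handle another writer appending in between.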

It would be useful if rclone exposed an append / patch operation at the CLI/API level and mapped it to the best available primitive per backend.

For classic S3 (no native append), this can be emulated server-side using multipart upload + UploadPartCopy:

  1. Initiate multipart upload for the target key.
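  2. UploadPartCopy the existing object (or a prefix range) as part 1. On AWS S3 this part must be at least 5 MiB, because only the last part of a multipart upload may be smaller.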
  3. UploadPart with the data to append as part 2.
  4. CompleteMultipartUpload to create the new object.
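The four steps can be sketched roughly as follows. This is a minimal illustration rather than a proposed implementation: it assumes s3 is a boto3 S3 client, the function name append_via_multipart is hypothetical, and it only handles the simple case where the existing object fits into a single copied part (at least 5 MiB and at most 5 GiB on AWS S3).

```python
MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum for every part except the last


def append_via_multipart(s3, bucket, key, data):
    """Emulate append: existing object becomes part 1, new data becomes part 2.

    Hypothetical helper: `s3` is assumed to be a boto3 S3 client.
    """
    size = s3.head_object(Bucket=bucket, Key=key)["ContentLength"]
    if size < MIN_PART_SIZE:
        raise ValueError("existing object is below the 5 MiB minimum part size")

    # Step 1: initiate the multipart upload for the target key.
    upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    try:
        # Step 2: server-side copy of the existing object as part 1.
        copied = s3.upload_part_copy(
            Bucket=bucket, Key=key, UploadId=upload_id, PartNumber=1,
            CopySource={"Bucket": bucket, "Key": key},
        )
        # Step 3: upload the data to append as part 2.
        uploaded = s3.upload_part(
            Bucket=bucket, Key=key, UploadId=upload_id, PartNumber=2, Body=data,
        )
        # Step 4: complete the upload, which replaces the object.
        parts = [
            {"PartNumber": 1, "ETag": copied["CopyPartResult"]["ETag"]},
            {"PartNumber": 2, "ETag": uploaded["ETag"]},
        ]
        s3.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Do not leave a dangling multipart upload behind on failure.
        s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
    return size + len(data)
```

Objects smaller than 5 MiB would need a different path (for example download, concatenate, re-upload), and objects larger than 5 GiB would need several ranged copy parts. The sketch leaves both out.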

This pattern is described here (append via multipart + copy):

There is prior discussion on the rclone forum around incremental/append-only uploads to S3:

A high-level interface could look something like:

# append local data to remote object
rclone append local.dat remote:bucket/object

# or: append one remote object to another
rclone append remote:bucket/src remote:bucket/dst

For backends that support true append/patch (Yandex PatchObject, S3 Express One Zone, OSS AppendObject, Azure append blobs), append could call the native API. For classic S3 and pure-object backends, rclone could optionally fall back to the multipart+copy+upload strategy (with a flag to enable/disable emulation).
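A rough sketch of that selection logic, in Python for brevity. All names here (Strategy, NATIVE_APPEND, EMULATABLE, choose_strategy, and the backend labels) are hypothetical and only illustrate the idea. They are not existing rclone identifiers.

```python
from enum import Enum


class Strategy(Enum):
    NATIVE = "native"            # backend has a real append/patch API
    EMULATED = "emulated"        # multipart + copy + upload fallback
    UNSUPPORTED = "unsupported"  # no append available


# Hypothetical backend labels mapped to the native primitive named above.
NATIVE_APPEND = {
    "yandex": "PatchObject",
    "s3-express-one-zone": "PutObject with write offset",
    "oss": "AppendObject",
    "azureblob": "append blob",
}

# Backends where the multipart emulation could apply.
EMULATABLE = {"s3"}


def choose_strategy(backend, allow_emulation=False):
    """Pick the best available append strategy for a backend label."""
    if backend in NATIVE_APPEND:
        return Strategy.NATIVE
    if backend in EMULATABLE and allow_emulation:
        return Strategy.EMULATED
    return Strategy.UNSUPPORTED
```

With this shape, classic S3 stays unsupported unless the user opts in, which matches the idea of a flag that enables or disables emulation.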

Notes:

  • This would always produce a new ETag/version for the target object (both for native append and for multipart emulation).
  • It would be especially helpful for workloads that periodically update object headers/footers or maintain append-only logs in object storage.

Does such an append/patch operation fit into rclone’s model, and would it be acceptable to have backend-specific implementations plus a best-effort multipart emulation for S3?


This article may also be of interest:
https://www.heise.de/en/news/AWS-makes-S3-a-file-system-11249368.html

Just to restate the core idea: I’m proposing an rclone append/patch operation that maps to native append APIs where available (Yandex PatchObject, S3 Express One Zone, OSS AppendObject, Azure append blobs) and otherwise optionally emulates append on S3 via multipart upload + UploadPartCopy.

I’m mainly interested in whether this fits into rclone’s existing backend model and API surface, and if there are known design constraints that would make this hard to accept.

If anyone familiar with rclone’s backend interfaces or the S3 implementation has thoughts on the feasibility or better API shape for this, I’d appreciate comments.

This would definitely need a new backend interface.

It kind of interacts with resume (which I've been experimenting with recently).

For resume I was proposing to make OpenChunkWriter and OpenWriterAt restartable.

That isn't quite the same as append...

I guess the question is how many providers actually support append, or whether append would be a cut-down version of something more generic.

Before posting my original proposal, I did a quick survey focused on object storage and found these two core points:

  • A small number of providers expose native append/patch semantics.

  • Even when they don’t, append could probably be emulated using range-reads plus multipart / server-side copy primitives.

I had assumed at that time that drive-type backends would at least support appending to files. After checking more carefully, I found they don’t.

Given these constraints, an rclone append/patch command would probably only be implementable on a subset of object-storage backends and would not generalize to drive-type backends.

That seems to argue for modeling append/patch as an optional backend capability rather than a fully generic core operation. But if it only exists as a backend-specific subcommand (e.g. rclone backend append), it wouldn’t show up in the shared capability interface, and tools couldn’t rely on different remotes implementing it in a consistent way — is that true?
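To make the "optional shared capability" idea concrete, here is a small Python sketch. It is not rclone code (rclone is written in Go), and Appender, OSSBackend, DriveBackend and append_if_supported are invented for this example. The point is only that callers test for one common capability instead of knowing backend-specific subcommands.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Appender(Protocol):
    """Hypothetical optional capability: backends may or may not provide it."""

    def append(self, remote: str, data: bytes) -> int:
        ...


class OSSBackend:
    """Toy backend with native append, stored in memory for the example."""

    def __init__(self):
        self.objects = {}

    def append(self, remote, data):
        self.objects[remote] = self.objects.get(remote, b"") + data
        return len(self.objects[remote])  # new object size


class DriveBackend:
    """Toy backend without any append support."""


def append_if_supported(backend, remote, data):
    # isinstance() on a runtime_checkable Protocol checks that the method exists.
    if not isinstance(backend, Appender):
        raise NotImplementedError("backend does not support append")
    return backend.append(remote, data)
```

A tool can then call append_if_supported on any remote and get either a consistent result or a clear "not supported" error, which is the property a backend-specific subcommand would not give.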