Rclone copy without modification/deletion (immutable storage)

What is the problem you are having with rclone?

I'm working on using rclone for backups and wondering whether I can give rclone permission to read and write to the remote, but not to modify or delete. If this were possible it would give strong immutability guarantees, since data could never be deleted by the backup agent. Restricting permissions like this would achieve the same thing as using Google Cloud Storage's retention periods to guarantee that no data is modified or deleted for a certain amount of time (lifecycle rules also let me clean up older items, so rclone doesn't need to worry about that either). However, whether these guarantees are enforced by limiting rclone's permissions or by using a retention period, rclone copy always seems to attempt to delete/modify a file that has changed locally, and fails when it doesn't have permission to do so.

I'm using incremental backups, since the total size of the backup is very large and the daily increment is much smaller. rclone copy --no-traverse --max-age does the trick, as it only copies recent files and does so efficiently. It works great even if the rclone credentials don't have modify/delete permissions: new files are created on the remote as they appear in the source. However, if a file changes on the source, rclone fails with a permission error because it tries to delete and then re-upload it; this is also true when I use --suffix.
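
For reference, the shape of the daily command looks roughly like this (local path and remote name are placeholders, not my real setup):

rclone copy /path/to/data remote:backups --no-traverse --max-age 48h

With full permissions this does exactly what I want; the problem only shows up once a file changes on the source.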

Does rclone support taking backups in this situation, where it doesn't have permission to modify or delete on the remote, only to read and write? Maybe by always appending a timestamp to the filename on the remote and saving multiple copies when things change? Or something like --backup-dir, but allowing for incremental backups?

What is your rclone version (output from rclone version)

rclone --version

rclone v1.55.1

  • os/type: linux
  • os/arch: amd64
  • go/version: go1.16.3
  • go/linking: static
  • go/tags: none

Which OS you are using and how many bits (eg Windows 7, 64 bit)

# lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 20.04.2 LTS
Release:	20.04
Codename:	focal

Which cloud storage system are you using? (eg Google Drive)

Google Cloud Storage

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone copy --no-traverse --suffix=`date "+-%Y-%m-%d-%H:%M"` --suffix-keep-extension /tmp/test-backup incremental-encrypted: --max-age 48h

The rclone config contents with secrets removed.

[offsite-backups]
type = google cloud storage
project_number = <redacted>
service_account_file = /root/.rclone/backup-agent-credentials.json
anonymous = false
object_acl = private
bucket_acl = private
bucket_policy_only = true
location = europe-west2
storage_class = COLDLINE

[incremental-encrypted]
type = crypt
remote = offsite-backups:incremental
filename_encryption = off
directory_name_encryption = false
password = <redacted>
password2 = <redacted>

A log from the command with the -vv flag

# rclone copy --no-traverse --suffix=`date "+-%Y-%m-%d-%H:%M"` --suffix-keep-extension test-backup incremental-encrypted: -vv --max-age 48h
2021/04/27 14:50:04 DEBUG : Using config file from "/root/.config/rclone/rclone.conf"
2021/04/27 14:50:04 DEBUG : --max-age 2d to 2021-04-25 14:50:04.975345942 +0100 BST m=-172799.984532566
2021/04/27 14:50:04 DEBUG : rclone: Version "v1.55.1" starting with parameters ["rclone" "copy" "--no-traverse" "--suffix=-2021-04-27-14:50" "--suffix-keep-extension" "test-backup" "incremental-encrypted:" "-vv" "--max-age" "48h"]
2021/04/27 14:50:04 DEBUG : Creating backend with remote "test-backup"
2021/04/27 14:50:04 DEBUG : fs cache: renaming cache item "test-backup" to be canonical "/root/test-backup"
2021/04/27 14:50:04 DEBUG : Creating backend with remote "incremental-encrypted:"
2021/04/27 14:50:05 DEBUG : Creating backend with remote "offsite-backups:incremental"
2021/04/27 14:50:05 DEBUG : pacer: Reducing sleep to 6.522509ms
2021/04/27 14:50:05 DEBUG : hello.txt: Sizes differ (src 19 vs dst 15)
2021/04/27 14:50:05 DEBUG : hello2.txt: Size and modification time the same (differ by 0s, within tolerance 1ns)
2021/04/27 14:50:05 DEBUG : hello2.txt: Unchanged skipping
2021/04/27 14:50:05 DEBUG : Encrypted drive 'incremental-encrypted:': Waiting for checks to finish
2021/04/27 14:50:05 DEBUG : pacer: Reducing sleep to 0s
2021/04/27 14:50:05 INFO  : hello.txt: Copied (server-side copy) to: hello-2021-04-27-14:50.txt
2021/04/27 14:50:05 ERROR : hello.txt: Couldn't delete: googleapi: Error 403: offsite-backup-agent@<redacted>.iam.gserviceaccount.com does not have storage.objects.delete access to the Google Cloud Storage object., forbidden
2021/04/27 14:50:05 DEBUG : Encrypted drive 'incremental-encrypted:': Waiting for transfers to finish
2021/04/27 14:50:05 ERROR : Attempt 1/3 failed with 1 errors and: googleapi: Error 403: offsite-backup-agent@<redacted>.iam.gserviceaccount.com does not have storage.objects.delete access to the Google Cloud Storage object., forbidden
2021/04/27 14:50:05 DEBUG : hello.txt: Sizes differ (src 19 vs dst 15)
2021/04/27 14:50:05 DEBUG : hello2.txt: Size and modification time the same (differ by 0s, within tolerance 1ns)
2021/04/27 14:50:05 DEBUG : hello2.txt: Unchanged skipping
2021/04/27 14:50:05 DEBUG : Encrypted drive 'incremental-encrypted:': Waiting for checks to finish
2021/04/27 14:50:05 ERROR : hello.txt: Failed to copy: googleapi: Error 403: offsite-backup-agent@<redacted>.iam.gserviceaccount.com does not have storage.objects.delete access to incremental/hello-2021-04-27-14:50.txt.bin., forbidden
2021/04/27 14:50:05 ERROR : hello.txt: Not deleting source as copy failed: googleapi: Error 403: offsite-backup-agent@<redacted>.iam.gserviceaccount.com does not have storage.objects.delete access to incremental/hello-2021-04-27-14:50.txt.bin., forbidden
2021/04/27 14:50:05 DEBUG : Encrypted drive 'incremental-encrypted:': Waiting for transfers to finish
2021/04/27 14:50:05 ERROR : Attempt 2/3 failed with 1 errors and: googleapi: Error 403: offsite-backup-agent@<redacted>.iam.gserviceaccount.com does not have storage.objects.delete access to <redacted>-incremental/hello-2021-04-27-14:50.txt.bin., forbidden
2021/04/27 14:50:05 DEBUG : pacer: Reducing sleep to 9.051459ms
2021/04/27 14:50:05 DEBUG : Encrypted drive 'incremental-encrypted:': Waiting for checks to finish
2021/04/27 14:50:05 DEBUG : hello.txt: Sizes differ (src 19 vs dst 15)
2021/04/27 14:50:05 DEBUG : hello2.txt: Size and modification time the same (differ by 0s, within tolerance 1ns)
2021/04/27 14:50:05 DEBUG : hello2.txt: Unchanged skipping
2021/04/27 14:50:05 DEBUG : pacer: Reducing sleep to 0s
2021/04/27 14:50:05 ERROR : hello.txt: Failed to copy: googleapi: Error 403: offsite-backup-agent@<redacted>.iam.gserviceaccount.com does not have storage.objects.delete access to incremental/hello-2021-04-27-14:50.txt.bin., forbidden
2021/04/27 14:50:05 ERROR : hello.txt: Not deleting source as copy failed: googleapi: Error 403: offsite-backup-agent@<redacted>.iam.gserviceaccount.com does not have storage.objects.delete access to incremental/hello-2021-04-27-14:50.txt.bin., forbidden
2021/04/27 14:50:05 DEBUG : Encrypted drive 'incremental-encrypted:': Waiting for transfers to finish
2021/04/27 14:50:05 ERROR : Attempt 3/3 failed with 1 errors and: googleapi: Error 403: offsite-backup-agent@<redacted>.iam.gserviceaccount.com does not have storage.objects.delete access to incremental/hello-2021-04-27-14:50.txt.bin., forbidden
2021/04/27 14:50:05 INFO  : 
Transferred:   	        15 / 15 Bytes, 100%, 30 Bytes/s, ETA 0s
Errors:                 1 (retrying may help)
Checks:                10 / 10, 100%
Deleted:                1 (files), 0 (dirs)
Transferred:            1 / 1, 100%
Elapsed time:         0.9s

2021/04/27 14:50:05 DEBUG : 4 go routines active
2021/04/27 14:50:05 Failed to copy: googleapi: Error 403: offsite-backup-agent@<redacted>.iam.gserviceaccount.com does not have storage.objects.delete access to incremental/hello-2021-04-27-14:50.txt.bin., forbidden

hello and welcome to the forum,

you can create a user in the cloud provider, give it whatever permissions you need,
and create a service account file based on that user for rclone to use.

keep in mind that, for a file older than --max-age:
if you move that file from one local folder to another local folder, rclone will not copy it and your local and cloud will be out of sync.

Hi @asdffdsa,
Thanks for those pointers. I've already created a user with restricted permissions, but rclone fails when it comes across a file that has changed. I'm wondering whether there is a way to accomplish immutable storage with rclone even when files change on the source.

I understand the considerations around --max-age; maybe it's not needed in my example. I'm more interested in achieving immutable storage.

well,
the whole point of immutable is not being able to delete the dest file.

also, i noticed that you are using coldline storage.

i use aws s3 deep glacier, and that does not work well with --backup-dir.
once a file is uploaded, it is not expected to change for a period of time, 90 days with gcloud coldline.
if you delete a file early, you are charged a pro-rated storage fee, which can get very expensive.

Gotcha, let me clarify. I'm looking for a way to make the remote immutable, so that once data is written there are strong guarantees it won't be changed. The source might change: more files will be added, and it's possible that a file will be modified on the source even after being copied to the remote. I'd like to track these changes on the source without losing the previous revisions of that file (as modifying or deleting the old version would violate the concept of immutable storage). In practice I'd imagine this means rclone would need to save multiple versions of the file on the remote, but operations like rclone copy would only consider the latest version when deciding whether a file is already there. Can rclone do something like this?

I've found the --immutable flag, which does indeed force immutability on the remote, but it errors out if the source file changes. I'd prefer to keep enforcing immutability on the remote, but also copy the changed file and use it as the "latest" version for future comparisons.
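
For reference, this is roughly how I tested it (the remote name is just illustrative):

rclone copy /path/to/data remote:backups --immutable

With --immutable, rclone refuses to overwrite an existing destination file and errors out instead, which protects the remote copy but means the changed file never makes it into any backup.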

perhaps i am not understanding your use-case, but that is not immutability.
you want dest files not to be deleted or changed, and at the same time want the source to overwrite the dest?

have you tested --backup-dir? tho that could get very expensive with coldline and might not work at all.
i tried using aws s3 deep glacier and --backup-dir and it did not work.
if i remember correctly, there is no server-side move/copy for that storage class

Maybe --compare-dest would be useful here - you could give your immutable archive as the parameter to --compare-dest, then have a different directory for the changes to go in. That avoids moving files out of the immutable archive.

I've tried --backup-dir with the same restricted service account (can read and write but not modify or delete) and it looks close to what I'm after, but it still attempts a delete on the remote, which I'd like to avoid. @asdffdsa if I understand you correctly about it being expensive, it would also do quite a few moves between directories, which removes the gains from using colder storage.

@ncw I'm having a hard time getting my head around these --backup-dir and --compare-dest flags, so please don't be afraid to baby step me :stuck_out_tongue: Are you suggesting creating a separate backup dir per day, with each daily backup dir containing only the files that have changed or been created since the previous daily backup? If so, do you mean that I should provide all the existing daily backup dirs to the --compare-dest flag?

correct.
there are a lot of ways to get stuff done with rclone.
for my use-case, i keep recent backups in wasabi, an s3 clone known for hot storage, using --backup-dir, and then move data to aws s3 deep glacier, $1.00/TB/month, using --immutable.

as for --backup-dir, i use a command like this for forever-forward incremental backups.

rclone sync  /path/to/local/folder remote:current  --backup-dir=remote:archive/`date +%Y%m%d.%I%M%S`
  1. a local file /path/to/local/folder/file.txt has changed.
  2. rclone checks to see if remote:current/file.txt is present.
  3. if remote:current/file.txt is present, then rclone will move remote:current/file.txt to remote:archive/20210427.012429/file.txt
  4. rclone will copy /path/to/local/folder/file.txt to remote:current/file.txt

note: if rclone cannot perform a server-side-move, then rclone will do a server-side-copy and delete.

every time i looked at the docs, i saw --compare-dest and wondered what use it is.
and now, assuming i understand the OP's use-case, your suggestion is a workaround/solution:

rclone copy $source $dest --compare-dest=$immutable --suffix=.`date +%Y%m%d.%I%M%S` --suffix-keep-extension -vv

Thanks @asdffdsa and @ncw for those pointers. After a bit of digging, --compare-dest did what I needed, but I had to write a wrapper around rclone to get incremental backups working the way I wanted. The resulting wrapper is here:

It handles full backups and restores (which rclone does perfectly out of the box, even with no replace/delete permissions, so no big magic there) and also incremental backups and restores. The latter requires a bit of extra logic: gathering all previous backups and passing them to --compare-dest when backing up, and, when restoring, applying each incremental backup one by one until the desired point in time is reached.

This seems to work well, but if rclone can do something similar either now or in the future I'd love to hear about it.

Thanks again!

Can you do a high level description of how your script works?

At some point I'd like to do a dedicated rclone backup command which probably has some configurable strategies.

On the remote the dir structure is:

bucket/
  full/
    2021-04-29-121425/
    2021-04-30-181229/
  incremental/
    2021-04-29-194217/
    2021-04-30-212211/

Immutable full backups just forward to a plain rclone copy, as it's the basic use case of "copy this entire directory to the remote", nothing special. These go into full/$(date +"%Y-%m-%d-%H%M%S"). Restoring is also a simple copy: if you specify a date it will restore that exact backup, otherwise it will check what the latest date available is and restore that.
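
So the full backup path is essentially just (bucket and remote names are placeholders):

rclone copy /path/to/data remote:bucket/full/$(date +"%Y-%m-%d-%H%M%S")

and a full restore is an rclone copy in the other direction from the chosen (or latest) dated directory.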

Immutable incremental backups are a bit more involved, as you first need to list all the previous backup directories and pass them to --compare-dest. So if you had taken 100 previous incremental backups, you'd need to pass all 100 to --compare-dest. Note that this may become a problem when taking backups more frequently than daily and storing them for 7-10 years, as there are limits on the length of a command in bash and this command gets very long with all those --compare-dest flags. These backups go into incremental/$(date +"%Y-%m-%d-%H%M%S") and do not contain a full backup, just the files changed since the previous backup. Restoring them requires first listing all previous backups, then restoring them one by one into the same local directory (overwriting), starting from the oldest, until you reach the point in time you want to restore to, or continuing through all of them if no date is specified.
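
A rough bash sketch of that incremental logic (remote name, paths and flag handling are illustrative assumptions, not the actual wrapper; repeating --compare-dest requires an rclone version that accepts the flag more than once):

#!/usr/bin/env bash
# sketch: incremental backup comparing against every previous incremental dir
set -e

REMOTE="remote:bucket"             # placeholder remote/bucket
SRC="/path/to/data"                # placeholder source directory
NOW=$(date +"%Y-%m-%d-%H%M%S")

# one --compare-dest per existing incremental backup, so only files that are
# not already present in any earlier backup get uploaded
compare_flags=()
while read -r dir; do
  compare_flags+=(--compare-dest "$REMOTE/incremental/${dir%/}")
done < <(rclone lsf --dirs-only "$REMOTE/incremental")

rclone copy "$SRC" "$REMOTE/incremental/$NOW" "${compare_flags[@]}"

# sketch: restore by replaying backups oldest-first up to a target timestamp
DEST="/path/to/restore"            # placeholder restore directory
TARGET="${1:-9999-99-99-999999}"   # no argument means restore everything

while read -r dir; do
  name="${dir%/}"
  [[ "$name" > "$TARGET" ]] && break   # stop past the requested point in time
  rclone copy "$REMOTE/incremental/$name" "$DEST"
done < <(rclone lsf --dirs-only "$REMOTE/incremental" | sort)

Relying on plain string comparison works here because the %Y-%m-%d-%H%M%S timestamps sort lexicographically in chronological order.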

Nice scheme!

I wonder about the efficiency of that too - rclone will have to do at least one API call per incremental backup per object in the backup, which will add up quickly :frowning:

It would be more efficient to list the existing objects in each layer one by one, building up a map of the newest objects in memory, before doing the sync.

If rclone could save the state of a backend to disk, then I could imagine some commands which could do this; that would make your sync much more efficient.

Using --backup-dir is much more efficient since rclone only has to move an object when it finds one in the way.

Note you could make your rclone copy much more efficient by including a --max-age flag so it only looks at local objects of a certain age.

Note also that Google really hates --no-traverse - it will make rclone run very slowly.
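
Putting those two notes together, an incremental run might look roughly like this (placeholder paths, with one --compare-dest per earlier backup as before):

rclone copy /path/to/data remote:bucket/incremental/$(date +"%Y-%m-%d-%H%M%S") --compare-dest remote:bucket/incremental/2021-04-29-194217 --max-age 48h

i.e. keep --max-age to limit the local scan and drop --no-traverse so rclone lists the destination normally.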

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.