Cannot synchronize with Amazon S3 glacier storage

What is the problem you are having with rclone?

I want to use rclone with AWS S3 storage. I've configured it and managed to successfully upload files using rclone sync. Unfortunately, after I make a change in the local directory that I'm syncing with S3, rclone shows an error:

2022/02/11 17:54:50 ERROR : ddd.jpg: Failed to copy: InvalidObjectState: Operation is not valid for the source object's storage class
	status code: 403, request id: [request id], host id: [id here]
2022/02/11 17:54:50 ERROR : ddd.jpg: Not deleting source as copy failed: InvalidObjectState: Operation is not valid for the source object's storage class
	status code: 403, request id: [request id], host id: [id here]
2022/02/11 17:54:50 ERROR : ddd.jpg: Couldn't move into backup dir: InvalidObjectState: Operation is not valid for the source object's storage class
	status code: 403, request id: [request id], host id: [id here]
2022/02/11 17:54:50 ERROR : S3 bucket rpi-backup-test path recent: not deleting directories as there were IO errors
2022/02/11 17:54:50 ERROR : Attempt 1/3 failed with 2 errors and: failed to delete 1 files

What I want to accomplish:

  1. Synchronize local directory with Amazon S3 storage
  2. Prevent data loss by using --backup-dir to copy modified/deleted files into separate location

Run the command 'rclone version' and share the full output of the command.

$ rclone version
rclone v1.57.0
- os/version: ubuntu 21.10 (64 bit)
- os/kernel: 5.13.0-28-generic (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.17.2
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

AWS S3

The command you were trying to run (eg rclone copy /tmp remote:tmp)

$ rclone copy -v -i ./test_backup aws_glacier:backup-test/recent/ --backup-dir aws_glacier:backup-test/bak/

The rclone config contents with secrets removed.

[aws_glacier]
type = s3
provider = AWS
access_key_id = [key_id]
secret_access_key = [key]
region = eu-central-1
location_constraint = eu-central-1
acl = private
storage_class = DEEP_ARCHIVE

hi,
rclone sync --backup-dir is not going to work.
that requires server-side copy, which does not work for glacier.

for deep glacier, which i use daily, i use
rclone copy --immutable
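for example, adapting the command from your first post (just a sketch: drop --backup-dir, add --immutable):
rclone copy -v --immutable ./test_backup aws_glacier:backup-test/recent/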

and rclone sync is not a good fit.
for example:
--- yesterday you rclone sync a 100GiB local file to glacier.
--- today, that local file changes, and you run rclone sync, which overwrites yesterday's 100GiB file.
--- since the first 100GiB file was deleted from glacier after just one day, well within the 90 day retention period,
aws will charge you an early deletion fee for 100GiB X 89 days.


This seems to be saying you can't server side copy a glacier object... let me check the docs...

Yes, in the docs it says

If the source object's storage class is GLACIER, you must restore a copy of this object before you can use it as a source object for the copy operation. For more information, see RestoreObject.


(for the avoidance of doubt, a server side move on S3 is a copy followed by a delete as there is no move primitive)

Ok, so here is the thing: I've got about 200GB of files to back up. Most of them (and mostly the largest ones) won't ever change. Some small text files change daily. I used to zip them all once a week, but the archives became too large to keep doing it that way. I wanted to use rclone to detect what has changed and transfer only those changes.

Is there any functionality in rclone that would do that for me without using server-side copy, which Glacier doesn't support?

for deep glacier which i use daily,
rclone copy --immutable

The documentation says it will "fail if existing files have been modified". I'm not sure what that means. Does it mean my planned backup cron job will fail as soon as I change any file I've already sent to Glacier?

this is the sequence of steps.

  1. rclone copy --immutable /path/to/file remote: will copy that local file to remote
  2. change the local file
  3. rclone copy --immutable /path/to/file remote: will fail, as the local file no longer matches the remote.

in effect, once a file is uploaded, rclone will never touch the remote file.
for example, once i have uploaded a 100GiB veeam backup file, that local file should never change, so there is no reason for rclone to try to re-upload it.

in fact, if the local file did change, that would indicate a big problem: bit-rot, ransomware encryption, or some other unexpected issue.
and in that case, for sure, i do not want rclone to upload that damaged local file.

sure, rclone can do that, but glacier is not the correct remote for it.
given the small amount of data that changes each day, you might use another provider, for which rclone sync --backup-dir works.

about that zip, i think you want a forever forward incremental backup

  1. create a list of files that have changed or were created within the last seven days.
  2. feed the list to 7zip.
  3. rclone copy that 7zip file to glacier.
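a rough sketch of those steps as shell commands, assuming gnu find and 7-zip are installed; the list-file, archive name and destination path are just placeholders:

# 1. list files changed or created within the last seven days
find /path/to/source -type f -mtime -7 > /tmp/changed.txt
# 2. feed the list to 7zip
7z a /tmp/weekly-$(date +%F).7z @/tmp/changed.txt
# 3. copy the archive to glacier
rclone copy /tmp/weekly-$(date +%F).7z aws_glacier:backup-test/incremental/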

I believe (I've never set this up myself) you can set a bucket up so that the files get moved to glacier after a defined interval. That might work for you?

yes, it is called lifecycle rules. i use them to delete older backup files that i no longer need.
https://rclone.org/s3/#glacier-and-glacier-deep-archive

i have never used the lifecycle rules to transition between storage classes.
i find aws s3 standard storage to be very expensive.
so i keep the most recent backups in wasabi s3, which is much less expensive.
and then move files from wasabi to aws s3 deep glacier.
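if you do want to try a transition rule: as far as i know, rclone itself does not manage lifecycle rules, so you would set it up in the aws console or with the aws cli. a rough sketch (the 30 day delay and the rule name are just placeholders):

aws s3api put-bucket-lifecycle-configuration --bucket backup-test --lifecycle-configuration file://lifecycle.json

where lifecycle.json looks something like:

{
  "Rules": [
    {
      "ID": "to-deep-archive",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Transitions": [ { "Days": 30, "StorageClass": "DEEP_ARCHIVE" } ]
    }
  ]
}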

That's interesting, I'll check that out. What will happen if one of those "never changing files" gets moved to glacier and then one day, for some reason, I modify or move it? Rclone won't handle it, right?

BTW, if anyone is interested in what I've figured out: I temporarily solved my problem by using the Glacier Instant Retrieval storage class (storage_class = GLACIER_IR in the rclone config). Unfortunately it's much more costly than Deep Archive, as @asdffdsa already pointed out. Thank you for the recommendation, I'll check that out too.
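For reference, the change boils down to one line in the remote definition shown earlier (the rest stays the same), which can also be set non-interactively if I'm reading the docs right:

storage_class = GLACIER_IR

or

rclone config update aws_glacier storage_class GLACIER_IR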

--- once a file is uploaded to s3, the file is immutable and can never be modified; this has nothing to do with glacier.
you have to re-upload the source.
the dest file will get deleted, or if versioning is enabled, the dest file becomes an older version.
if the dest file is within the glacier 90/180 day retention period, then you will be charged pro-rated, as i described above.
--- by move, do you mean changing storage classes?
https://rclone.org/s3/#restore
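per that link, a deep archive object has to be restored before it can be used as the source of a server-side copy. a sketch using the remote and file from the first post (priority can be Standard or Bulk for deep archive; lifetime is in days):

rclone backend restore aws_glacier:backup-test/recent/ddd.jpg -o priority=Standard -o lifetime=1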

--- once a file is uploaded to s3, the file is immutable and can never be modified; this has nothing to do with glacier.
you have to re-upload the source.

I'm not sure I understand what you mean. The reason this thread was created is that I had S3 storage, placed my files in Deep Archive, and rclone copy or rclone sync commands were failing, because some operations are disabled on Deep Archive due to its non-zero retrieval time. So I changed the storage class to "Glacier Instant Retrieval" and then sync started to work properly.

Now we are talking about moving some of the files to Deep Archive. How is this different from storing all files in Deep Archive? When I run rclone on storage with some files in Deep Archive, I presume it will fail, like I described in the first post, as soon as I change anything in one of those files.

--- by move, you mean to change storage classes?

I "move" I meant "change location" or "change path of file" here.

let's agree on the terminology.

--- transition - to change a file's storage class.
the file's location does not change.

--- move - to change a file's location.
the file's storage class does not change.

  1. list the files
rclone lsf aws:zorktest --format=pT
file01.txt;STANDARD
file02.txt;STANDARD
  2. transition file02.txt from standard to deep glacier
  3. list the files; notice that the file's location did not change, just the storage class.
rclone lsf aws:zorktest --format=pT
file01.txt;STANDARD
file02.txt;DEEP_ARCHIVE
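for step 2, a lifecycle rule is the usual way, but it can also be done from the command line. a sketch (as i understand it, this does a server-side copy under the hood, so it only works going towards glacier; the other direction needs a restore first):

rclone settier DEEP_ARCHIVE aws:zorktest/file02.txt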
