Rclone support for skipping Glacier-stored files

timeisapear · June 29, 2023, 8:00pm

In the past I've used rclone to sync between a remote S3 source and destination. One error I ran into occurs when the source bucket contains a mix of Standard-class and Glacier-class files: when rclone attempts to copy the Glacier-class files to the destination bucket, an error is raised, as might be expected.

Today AWS announced that the S3 LIST API now reports the restore status of S3 Glacier objects: Amazon S3 provides restore status of S3 Glacier objects using the S3 LIST API

That would presumably allow for rclone to skip those Glacier files and proceed without error, correct?

asdffdsa · June 29, 2023, 8:07pm

hello and welcome to the forum,

https://forum.rclone.org/t/copy-from-s3-to-another-s3-filter-by-storage-class/37861

interesting....

ncw · June 29, 2023, 8:23pm

I think you should be able to use metadata filtering to filter out the objects with tier=GLACIER.

See Documentation for more info

asdffdsa · June 29, 2023, 8:56pm

# list all files, with path (p) and storage tier (T)
rclone lsf --format=pT remote:
file.20200409.145236.7z;DEEP_ARCHIVE
file.20200921.180322.7z;DEEP_ARCHIVE
file.20210224.192949.7z;DEEP_ARCHIVE
file.20220321.185251.7z;DEEP_ARCHIVE
file.7z;STANDARD

# list all files, excluding those whose tier (storage class) is DEEP_ARCHIVE
rclone lsf --metadata-exclude tier=DEEP_ARCHIVE --format=pT remote:
file.7z;STANDARD
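
For the copy itself, the same filter flag can be repeated, once per tier to skip. A minimal sketch, assuming a POSIX shell; the function name copy_skip_archived is made up for illustration, and src: and dest: stand in for your own remotes:

```shell
# Hypothetical helper: copy everything except objects in the archive tiers.
# Usage: copy_skip_archived SRC DEST [extra rclone flags...]
copy_skip_archived() {
    src=$1
    dest=$2
    shift 2
    # One --metadata-exclude per tier; any extra flags are passed through.
    rclone copy \
        --metadata-exclude "tier=GLACIER" \
        --metadata-exclude "tier=DEEP_ARCHIVE" \
        "$@" "$src" "$dest"
}
```

Called as copy_skip_archived src: dest: --dry-run -vv, it should only preview the transfer rather than copy anything.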

timeisapear · June 29, 2023, 9:02pm

Ok, thank you both. I had not seen or used the metadata flags yet. That looks like a great solution!

timeisapear · June 29, 2023, 9:17pm

Quick follow-up question: if I run an rclone copy --metadata --metadata-exclude tier=GLACIER src: dest: for S3, will including the metadata flag cause a new copy of every file from src: to dest:? Some of our S3 remote bucket pairs have been fully synced prior to using rclone, so I would not want to redownload terabytes of files just to run rclone the first time.

asdffdsa · June 29, 2023, 9:41pm

Run the command with either of these sets of flags to see what rclone would transfer without actually copying anything:
--dry-run -vv
or
--dry-run --log-level=DEBUG --log-file ~/rclone.log
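
Once a dry run has written a log file, the planned transfers can be counted from it. This is only a sketch: it assumes the log marks each would-be copy with the phrase "Skipped copy as --dry-run is set", which is how recent rclone versions word it as far as I know but may differ in yours, and count_planned_copies is a hypothetical helper name:

```shell
# Hypothetical helper: count the files a dry run says it would have copied.
# Usage: count_planned_copies LOGFILE
count_planned_copies() {
    # ASSUMPTION: each planned copy is logged with this exact phrase.
    # grep -c prints the number of matching lines; "|| true" keeps a
    # count of zero from being reported as a failure.
    grep -c 'Skipped copy as --dry-run is set' "$1" || true
}
```

A count of 0 on an already-synced bucket pair would suggest that nothing is going to be re-downloaded.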

ncw · June 30, 2023, 5:51pm

Metadata is not synced; it is only copied on the first upload, so it won't cause any changes on dest:.

This could change in the future, but the worst that would happen is that rclone would update the metadata on dest:, which is a quick operation.

ncw · June 30, 2023, 5:51pm

... If you are worried about that then you could use rclone lsf --metadata --metadata-exclude tier=GLACIER src: to make a list of files to feed to --files-from.
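
The two-step approach could look roughly like this. It is a sketch under assumptions: the function names and the list file name are made up, and -R and --files-only are added because lsf is not recursive by default and --files-from expects file paths, not directories:

```shell
# Hypothetical helper: print every non-Glacier file path under a remote.
# Usage: list_unarchived SRC
list_unarchived() {
    # -R recurses into subdirectories; --files-only omits directory entries.
    rclone lsf -R --files-only --metadata --metadata-exclude "tier=GLACIER" "$1"
}

# Hypothetical helper: copy only the paths named in a list file.
# Usage: copy_from_list SRC DEST LISTFILE
copy_from_list() {
    rclone copy --files-from "$3" "$1" "$2"
}
```

For example, list_unarchived src: > keep.txt followed by copy_from_list src: dest: keep.txt. The list file can be inspected or trimmed by hand before the copy runs.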
