My task is to check the status of the Glacier restore process using rclone.
A new header (x-amz-optional-object-attributes: RestoreStatus) recently appeared in the S3 list API (please check this). It would be helpful to add it to the rclone lsf/lsjson commands for the S3 backend (or to create a separate command like "rclone backend restore").
Another option is to use the X-Amz-Restore response header of a HEAD S3 object request, which contains information about the restore status (e.g. X-Amz-Restore: ongoing-request="false", expiry-date="Thu, 07 Sep 2023 00:00:00 GMT") [please check this].
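A minimal sketch of parsing that X-Amz-Restore header value (the helper name is mine, not rclone's; the two example value shapes are taken from the AWS docs):

```python
import re

def parse_restore_header(value):
    """Parse an X-Amz-Restore value into (ongoing, expiry_date).

    Values look like:
      ongoing-request="true"
      ongoing-request="false", expiry-date="Thu, 07 Sep 2023 00:00:00 GMT"
    expiry_date is None while the restore is still in progress.
    """
    ongoing_m = re.search(r'ongoing-request="(true|false)"', value)
    if ongoing_m is None:
        raise ValueError("not an X-Amz-Restore value: %r" % value)
    expiry_m = re.search(r'expiry-date="([^"]+)"', value)
    return (ongoing_m.group(1) == "true",
            expiry_m.group(1) if expiry_m else None)
```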
I have used the following command:
rclone lsjson -R --files-only config:bucket/some_dir --dump headers
I can see the X-Amz-Restore response header in the dump, but rclone doesn't use it.
I searched almost everywhere and didn't find anything similar. Maybe I'm a bad searcher?
But this already exists...
and then for example:
rclone backend restore s3:test-bucket-kptsky/test/ --include /glacier.test -o priority=Standard -o lifetime=1
"Status": "RestoreAlreadyInProgress: Object restore is already in progress\n\tstatus code: 409, request id: 5GY8RZQFMSK96S8K, host id: i/QZ6MJJ5lD82ihUhzuUgn0dW6994A2bXIFx02XH2sN6UiOA9OtmY6/CwaUr2mTtOb9Nb43wTXM=",
OK, but how can I recognize the case when the objects have already been restored?
- I run rclone backend restore. It gives me output with "OK" statuses for the files.
- I run rclone backend restore a second time. It gives me output with "AlreadyInProgress" statuses for the files.
(S3 internally completes the restore process.)
- I run rclone backend restore a third time. It gives me output with "OK" statuses for the files (it seems to request them again, although these files are already in standard storage).
It's unclear how to distinguish the cases:
- when files have been requested but are not yet in standard storage
- when files are already in standard storage
What I do: I check once whether the status is "AlreadyInProgress" or "OK", then after a time interval I try to download. If that fails, I wait and try again.
I agree this solves the issue, but a command to check the restore status seems like a more convenient and intelligent way.
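The poll-and-retry workaround above can be sketched as a small helper. The check callable is a placeholder; in practice it would shell out to rclone (e.g. attempt a copy, or HEAD the object) and report success:

```python
import time

def wait_until_restored(is_restored, interval=60.0, attempts=10, sleep=time.sleep):
    """Poll is_restored() until it returns True or attempts run out.

    is_restored is any zero-argument callable returning a bool; sleep is
    injectable so the loop can be tested without real waiting.
    """
    for _ in range(attempts):
        if is_restored():
            return True
        sleep(interval)
    return False
```

This is just the retry skeleton, not rclone's behaviour; the point of the feature request is to replace this guessing loop with a real status query.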
Agree - there might be better ways. Rclone absolutely is not perfect - nothing is.
Given that you've already investigated AWS spec maybe you could give it a go and implement it using information you discovered? There is always room for making things better.
I didn't see an API to check the restore status of the object with a quick look through the S3 docs but you can easily read the Tier of the object.
rclone lsf -R -F pT TestS3:bucket/path
rclone lsjson -R TestS3:bucket/path
Does that work?
That doesn't work, because once your objects have been moved to glacier, their tier will always be 'GLACIER' or 'DEEP_ARCHIVE', regardless of whether they have been restored.
I'm no pro with the AWS API, but there's a new 'RestoreStatus' option for the x-amz-optional-object-attributes request header of ListObjectsV2; with it, restore status data appears in the response:
Ref: ListObjectsV2 - Amazon Simple Storage Service
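A sketch of reading that RestoreStatus from a ListObjectsV2 page as boto3 returns it when called with OptionalObjectAttributes=["RestoreStatus"] (response shape follows the AWS docs; objects that were never archived carry no RestoreStatus key at all):

```python
def restored_keys(page):
    """Return keys whose restore has completed (temporary copy available)."""
    done = []
    for obj in page.get("Contents", []):
        status = obj.get("RestoreStatus")
        # IsRestoreInProgress False means the restore finished and the
        # temporary copy is downloadable until RestoreExpiryDate.
        if status is not None and status.get("IsRestoreInProgress") is False:
            done.append(obj["Key"])
    return done
```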
I'm not as familiar with the rclone architecture as the active members are, so it could take me a long time. That's why I proposed this as a new feature to be implemented by someone more knowledgeable about the rclone architecture.
OK, give this a try:
v1.64.0-beta.7331.7bcfbf8e1.fix-s3-restore-status on branch fix-s3-restore-status (uploaded in 15-30 mins)
Show the restore status for objects being restored from GLACIER to normal storage
rclone backend restore-status remote: [options] [<arguments>+]
This command can be used to show the status for objects being restored from GLACIER
to normal storage.
rclone backend restore-status s3:bucket/path/to/object
rclone backend restore-status s3:bucket/path/to/directory
rclone backend restore-status -o all s3:bucket/path/to/directory
This command does not obey the filters.
It returns a list of status dictionaries.
- "all": if set then show all objects, not just ones with restore status
Thanks for the very quick reply.
I've tested it and it's what was needed.
As for "all" option, it seems to me that this option must be default, but with some other option like "skip nil" restore statuses to display only "not nil" restore statuses.
The decision is yours.
By default rclone will skip nil restore statuses. -o all makes it show all restore statuses, nil or not. Why do you think -o all should be the default?
OK, I get your point; I agree with you.
By the way, I've given it some thought, and it seems it would be useful to add each object's current storage tier to the output when the "-o all" option is used. It would make it easier to distinguish between objects in glacier and in standard storage.
I hope I'm not puzzling you too much.
Thank you very much!
Good idea. Give this a go, which has a StorageClass field too.
v1.64.0-beta.7338.31fc382f6.fix-s3-restore-status on branch fix-s3-restore-status (uploaded in 15-30 mins)
I've merged this to master now which means it will be in the latest beta in 15-30 minutes and released in v1.64
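With -o all plus the new StorageClass field, each entry can be classified into the four cases discussed above. A sketch (field names assumed from the betas in this thread):

```python
def classify(entry):
    """Label one restore-status entry from `-o all` output."""
    glacier = entry.get("StorageClass") in ("GLACIER", "DEEP_ARCHIVE")
    status = entry.get("RestoreStatus")
    if not glacier:
        return "standard"      # already in normal storage
    if status is None:
        return "archived"      # in glacier, restore never requested
    if status.get("IsRestoreInProgress"):
        return "restoring"     # restore requested, not finished yet
    return "restored"          # temporary restored copy available
```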
This seems to work only for directories.
Restoring file1.txt and file2.txt, I get the expected output with:
rclone backend restore-status s3:bucket/path
But querying just the object gives an error:
rclone backend restore-status s3:bucket/path/file1.txt
2023/09/12 15:57:03 Failed to backend: is a file not a directory
- os/version: almalinux 8.6 (64 bit)
- os/kernel: 4.18.0-372.32.1.el8_6.x86_64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.21.1
- go/linking: static
- go/tags: none
Yes, due to the way it works, it will only work on directories. It doesn't use the higher-level listing routines; it has to use the low-level ones, which only list directories.