Rclone settier fails with scaleway: EntityTooLarge

I've set up a scaleway glacier backend which is working well with one exception: settier

# rclone -v --progress settier STANDARD scaleway:guybrush-scripts/0pdj43lcacui7p71d2cr7uumec 
2020-06-24 01:18:31 ERROR : S3 bucket guybrush-scripts: Failed to do SetTier, EntityTooLarge: Your proposed upload exceeds the maximum allowed object size.

I get the same error with move/copy, presumably because glacier backends do not support server-side copy. Why I should get EntityTooLarge for settier is more of a mystery though. I can't see anything about a maximum object size in the scaleway console.

Has anybody seen this before?

Changing an object's storage class in the scaleway console works fine.

To change the tier of an object you have to alter the metadata.

The only way of doing this on S3 is to copy the object to a new object then delete the old one - so it is doing a copy.
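As a rough illustration of what that copy looks like at the API level (a minimal sketch only, using the aws-sdk-go package with an assumed Scaleway endpoint and illustrative bucket/key names, not rclone's actual code), the tier change boils down to a server-side CopyObject that rewrites the object with a new storage class:

    package main

    import (
        "log"

        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/aws/session"
        "github.com/aws/aws-sdk-go/service/s3"
    )

    func main() {
        // Endpoint, region, bucket and key are illustrative assumptions;
        // credentials are expected to come from the environment.
        sess := session.Must(session.NewSession(&aws.Config{
            Endpoint: aws.String("https://s3.fr-par.scw.cloud"),
            Region:   aws.String("fr-par"),
        }))
        svc := s3.New(sess)

        // Server-side copy of the object onto itself with a new storage class.
        _, err := svc.CopyObject(&s3.CopyObjectInput{
            Bucket:            aws.String("my-bucket"),
            Key:               aws.String("path/to/object"),
            CopySource:        aws.String("my-bucket/path/to/object"),
            StorageClass:      aws.String("GLACIER"),
            MetadataDirective: aws.String("COPY"), // keep the existing metadata
        })
        if err != nil {
            log.Fatal(err)
        }
    }

This is also why settier fails on objects that are already GLACIER: the server-side copy itself is rejected.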

So the question is why doesn't server side copy or move work?

What version of rclone are you using? With a recent rclone 1.52 this should just work I think.

rclone 1.52.1

As far as I was aware, server side copy doesn't work with objects with storage class GLACIER. Well, it doesn't with Scaleway anyway. Guess I'll have to find some other workaround.

Scaleway have some examples of how to do it with the API/aws tool but I can't post the link to this forum. I was hoping rclone could do it for me.

You can't server side copy them, that is true, but if they aren't GLACIER you should be able to server side copy them to GLACIER, I think.

That would be interesting to see. I upgraded your account so you can post links :slight_smile:

Here's the corresponding man page from AWS:
https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3api/restore-object.html

Interesting thanks.

Rclone doesn't yet support the Restore Object API. There is a place it could go though in a backend command, so maybe it should?

What I think should work: if you have an object of normal tier, rclone should be able to set it to GLACIER for you.

Yes, this seems to work going from STANDARD to GLACIER.
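For example, with an illustrative path (same remote syntax as the failing command above):

rclone settier GLACIER scaleway:guybrush-scripts/path/to/object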

AFAICT the modification time is also in metadata. Can rclone read this on GLACIER objects, or is it syncing on size only? The Scaleway console won't let me read metadata on GLACIER objects, but it shows up for NORMAL objects.
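For what it's worth, one way to check is a plain HeadObject request, which should not need the object restored first. A minimal sketch, assuming aws-sdk-go and illustrative bucket/key names (rclone stores the modification time as X-Amz-Meta-Mtime user metadata):

    package main

    import (
        "fmt"
        "log"

        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/aws/session"
        "github.com/aws/aws-sdk-go/service/s3"
    )

    func main() {
        // Endpoint, region, bucket and key are illustrative assumptions.
        sess := session.Must(session.NewSession(&aws.Config{
            Endpoint: aws.String("https://s3.fr-par.scw.cloud"),
            Region:   aws.String("fr-par"),
        }))
        svc := s3.New(sess)

        // HEAD the object: storage class and user metadata come back in the
        // response headers without needing to read (or restore) the data.
        out, err := svc.HeadObject(&s3.HeadObjectInput{
            Bucket: aws.String("my-bucket"),
            Key:    aws.String("path/to/object"),
        })
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println("storage class:", aws.StringValue(out.StorageClass))
        for k, v := range out.Metadata {
            // rclone's modification time normally appears here as an "Mtime" key.
            fmt.Printf("meta %s = %s\n", k, aws.StringValue(v))
        }
    }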

Great. Do you think settier should know what to do for GLACIER -> NORMAL? I guess that would be most consistent with the rest of rclone?

It is quite a specialised call though, so maybe it should be a different command - what do you think?

Here is the API call in the Go SDK

I think we can ignore the select stuff; restoring the archive is the part we need.

Looks quite complicated!
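Stripped of the select/output-location options, the core of the call is fairly small though. A minimal sketch, assuming aws-sdk-go and illustrative bucket/key names (not rclone's actual implementation):

    package main

    import (
        "log"

        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/aws/session"
        "github.com/aws/aws-sdk-go/service/s3"
    )

    func main() {
        // Endpoint, region, bucket and key are illustrative assumptions.
        sess := session.Must(session.NewSession(&aws.Config{
            Endpoint: aws.String("https://s3.fr-par.scw.cloud"),
            Region:   aws.String("fr-par"),
        }))
        svc := s3.New(sess)

        // Ask for a temporary restore of a GLACIER object back to readable storage.
        _, err := svc.RestoreObject(&s3.RestoreObjectInput{
            Bucket: aws.String("my-bucket"),
            Key:    aws.String("path/to/object"),
            RestoreRequest: &s3.RestoreRequest{
                Days: aws.Int64(1), // how long the restored copy stays available
                GlacierJobParameters: &s3.GlacierJobParameters{
                    Tier: aws.String("Standard"), // Standard | Expedited | Bulk
                },
            },
        })
        if err != nil {
            log.Fatal(err)
        }
    }

The Days and Tier values here correspond to the lifetime and priority options in the help text further down.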

Well, my instinct was "settier NORMAL".

rclone is like a compatibility layer for object stores, so I wouldn't be too fussed about adding some magic to settier. You could always do both, but then you have the problem that a "restore" command might make no sense to some/most backends. It's also asymmetric: you have "restore" but no "freeze"...

s3cmd implementation (python):

I could make a backend specific command quite easily.

Do you think people will want to restore one object or many? I guess I could make it restore this key and any sub keys quite easily.

My use case is restoring backups, which would be single object, directory, or whole bucket. Single objects can already be done with most web interfaces, so it's the directory/recursive/bucket case that needs tooling.

I had a go at this here

https://beta.rclone.org/branch/v1.52.1-111-g58902710-fix-s3-restore-beta/ (uploaded in 15-30 mins)

You use it like this (use rclone backend help s3 to see this help)

I've tested it as far as checking that it makes the right API calls, but I haven't actually restored anything.

Would you be willing to test it some more for me?

Any comments much appreciated!

restore

Restore objects from GLACIER to normal storage

rclone backend restore remote: [options] [<arguments>+]

This command can be used to restore one or more objects from GLACIER
to normal storage.

Usage Examples:

rclone backend restore s3:bucket/path/to/object [-o priority=PRIORITY] [-o lifetime=DAYS]
rclone backend restore s3:bucket/path/to/directory [-o priority=PRIORITY] [-o lifetime=DAYS]
rclone backend restore s3:bucket [-o priority=PRIORITY] [-o lifetime=DAYS]

This command also obeys the filters. Test first with the -i/--interactive or --dry-run flags

rclone -i backend restore --include "*.txt" s3:bucket/path -o priority=Standard

All the objects shown will be marked for restore, then

rclone backend restore --include "*.txt" s3:bucket/path -o priority=Standard

It returns a list of status dictionaries with Remote and Status
keys. The Status will be OK if it was successful or an error message
if not.

[
    {
        "Status": "OK",
        "Path": "test.txt"
    },
    {
        "Status": "OK",
        "Path": "test/file4.txt"
    }
]

Options:

  • "description": The optional description for the job.
  • "lifetime": Lifetime of the active copy in days
  • "priority": Priority of restore: Standard|Expedited|Bulk

I'm getting timeouts with this, though I suspect it is a problem with Scaleway. Same thing happened using s3cmd. I've submitted a ticket to Scaleway. Will keep you posted.

Thank you - looking forward to seeing how it turns out!

I tried using a bucket in Scaleway's nl-ams region instead of fr-par, and it worked fine. Scaleway seem to have a few issues with their service. Unfortunately I can't give it a more thorough test until they fix their fr-par endpoint.


I've decided to merge this into the latest beta now given that it has had light testing! If any problems come up, please let me know and I'll fix them before the release.
