rclone hashsum md5 azure:crypt/file_crypted
# hash1 returned
rclone hashsum md5 azure:crypt/file_crypted --download
# hash1 returned
rclone hashsum md5 azure_crypt:file
2023/10/16 09:00:57 ERROR : file: hash unsupported: hash type not supported
2023/10/16 09:00:57 Failed to hashsum with 2 errors: last error was: hash unsupported: hash type not supported
rclone hashsum md5 azure_crypt:file --download
# hash2 returned
In this case the backend supports MD5, so rclone operations run directly on azure can use file hashes without downloading the files.
But the crypt wrapper doesn't support hashes, so as I understand it, before files can be compared (e.g. by the sync command) they have to be downloaded from the base backend and decrypted to calculate their MD5 (or am I wrong?).
So to optimize operations (eliminate the need to download file contents), I need to add an additional layer like this, right?
[_azure-crypt]
type = crypt
remote = azure:crypt
...
[azure-crypt]
type = hasher
remote = _azure-crypt:
As I understand it, to optimize remote operations even further (eliminate the need for hash queries to the remote), can I also add another layer over the base azure configuration, like this?
[_azure]
type = azureblob
...
[azure]
type = hasher
remote = _azure:
I'm not sure when hasher is recommended and when it may just make things worse, so I'm asking for some advice.
By default, rclone only looks at the modification time and size of files to decide whether they are equal. You could use the --checksum flag, and then it will check hashes, but only when they are available, so nothing will be downloaded from your crypt backend when running e.g. sync.
Using the hasher backend is fully optional - do it if you need hashes in a crypt backend. Another option (a sort of workaround) is to use chunker and specify a hash_type that covers all files - more details here. I use the latter successfully for my data - the drawback is that every file gets a sidecar file with metadata.
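As a rough illustration of the chunker workaround, the sketch below creates a chunker remote on top of the crypt remote with `hash_type=md5all`. The remote name `azure-chunker` is hypothetical, and layering chunker over `azure_crypt:` is an assumption about the setup rather than something stated in the thread. The `run` helper only prints each command unless `RUN=1` is set.

```shell
#!/bin/sh
# Sketch only: "azure-chunker" is a hypothetical remote name;
# azure_crypt: is the crypt remote from the thread.

# Print each command; execute it only when RUN=1 and rclone is installed.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ] && command -v rclone >/dev/null 2>&1; then
        "$@"
    fi
}

# Create a chunker remote over the crypt remote that keeps an MD5 for every file.
run rclone config create azure-chunker chunker remote=azure_crypt: hash_type=md5all

# Hashes are then requested through the chunker remote, not through crypt directly.
run rclone hashsum md5 azure-chunker:
```

As far as I understand, only files uploaded through the chunker remote get the sidecar metadata, so files already sitting in the crypt remote would not gain hashes this way.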
rclone also has a special cryptcheck command you can use to check a remote against an encrypted one - it will use the hashes provided by the remote underlying crypt.
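The two options mentioned in this reply could be tried out roughly as follows. This is a minimal sketch: `SRC` is a hypothetical local path, `azure_crypt:` is the crypt remote from the question, and the `run` helper only prints each command unless `RUN=1` is set.

```shell
#!/bin/sh
# Sketch only: SRC is a hypothetical local path; DST is the crypt remote from the thread.
SRC="${SRC:-/path/to/local/files}"
DST="${DST:-azure_crypt:}"

# Print each command; execute it only when RUN=1 and rclone is installed.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ] && command -v rclone >/dev/null 2>&1; then
        "$@"
    fi
}

# Compare by hash where one is available, instead of modification time;
# --dry-run reports what would be transferred without changing anything.
run rclone sync --checksum --dry-run "$SRC" "$DST"

# Check the unencrypted source against the encrypted remote.
run rclone cryptcheck "$SRC" "$DST"
```

Keeping `--dry-run` on the first attempt makes it easy to see whether `--checksum` changes which files rclone considers different.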
Thank you for your reply, I forgot about this flag.
But let's say I'm using it - is my scenario correct then?
To summarize:
1. adding a hasher layer over a crypt remote can optimize operations by avoiding downloads of files from the base remote
2. adding a hasher layer over a base remote (that itself supports hashes) can optimize operations by avoiding downloads of metadata (hashes)
And a third point, which I haven't mentioned yet:
3. adding a hasher layer over a base remote (that itself DOES NOT support hashes) can optimize operations by avoiding downloads of metadata (hashes) AND file contents
Does that sound generic (and true) enough to be taken as a kind of universal recommendation?
Yes - if you require file hashes on a remote that lacks them, then you can use hasher. But it is not required for any rclone command to work. E.g. sync will work with or without hash support.
What you list in your points 1, 2 and 3 is true. But I would not say that it is a universal recommendation. rclone can work with or without hashes. And hasher stores hashes locally on your client machine - so it is your responsibility to ensure that your remote is modified only through the hasher remote.
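Because the hasher cache lives on the client, it may need to be filled once and rebuilt if the remote was changed by other means. A minimal sketch, assuming the `azure-crypt` hasher remote from the question's config and that `hashsum --download` through hasher stores the computed hashes, as I understand the hasher documentation to say. The `run` helper only prints each command unless `RUN=1` is set.

```shell
#!/bin/sh
# Sketch only: azure-crypt: is the hasher-over-crypt remote from the question's config.
HASHER="${HASHER:-azure-crypt:}"

# Print each command; execute it only when RUN=1 and rclone is installed.
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ] && command -v rclone >/dev/null 2>&1; then
        "$@"
    fi
}

# Fill the cache once: files are downloaded and hashed through the hasher remote.
run rclone hashsum md5 --download "$HASHER"

# If the remote was modified without going through hasher, drop the stale cache...
run rclone backend drop "$HASHER"

# ...and repopulate it from the current file contents.
run rclone hashsum md5 --download "$HASHER"
```

Note that the rebuild downloads every file again, so it is worth doing only when the cache is known to be out of date.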