Checksum flag: where are the hashes checked?

alexshao1456 · August 30, 2019, 1:12pm

A quick question

We're aiming for a setup where a server (B) executes a script containing an rclone copy command that accesses Windows machines remotely (A) and copies files over to a different server (C).

When using the checksum flag to prevent copying over files that already exist, will the intermediate executing server (B) copy the files over to itself locally and check the hashes there, or will it check the hashes at the source (A)?

Thanks

thestigma · August 30, 2019, 2:50pm

Normally B would end up being the middle-man here, with all the traffic going through B. I think that will be the case regardless of whether you use --checksum.

There is an alternative in some cases: server-side copying, where you can basically get one cloud remote to send data directly to another. In those cases the cloud systems do the hash checking themselves (in fact they store the hashes as metadata, because they use more robust filesystems). Since hashes are always available there, that is the default method used for comparison, unless of course you involve two different cloud systems that use different hashing methods, in which case hash checks will not be possible and it has to fall back to checking size and modtime.
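To make the "different hashing methods" point concrete, here is a minimal sketch of how a shared hash type could be picked for two remotes. It is only an illustration, not rclone's actual code: the remote names and the `SUPPORTED_HASHES` table are made up for the example, and real backends advertise their own sets of hash types.

```python
from typing import Optional

# Hypothetical remotes and the hash types each one can report.
# These entries are assumptions for illustration, not real backend data.
SUPPORTED_HASHES = {
    "remote_md5_only": {"md5"},
    "remote_sha1_only": {"sha1"},
    "remote_md5_and_sha1": {"md5", "sha1"},
}


def common_hash(src_remote: str, dst_remote: str) -> Optional[str]:
    """Return a hash type both remotes support, or None if they share none."""
    shared = SUPPORTED_HASHES[src_remote] & SUPPORTED_HASHES[dst_remote]
    # sorted() keeps the choice deterministic when several types are shared.
    return sorted(shared)[0] if shared else None
```

When `common_hash` returns `None`, a hash comparison between the two sides is simply not possible, which is the situation where another comparison has to be used instead.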

Let me note that hash checks are not needed simply to prevent copying over existing files. A size/modtime check is usually more than sufficient, and by default rclone simply skips files whose size and modtime already match on a copy or sync. It is quite unusual for files to get corrupted in flight. After all, there are multiple layers of control mechanisms at work, from TCP upwards, to detect and correct transmission errors. Hash checking is nice when you have the option to use it, but it is probably only needed for mission-critical data.
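The difference between the two comparisons can be sketched roughly like this. It is a simplified model under stated assumptions, not rclone's implementation: `FileInfo`, `can_skip` and the one-second `modtime_tolerance` are names and values invented for the example. The only behaviour taken from rclone's documentation is that the default comparison uses size and modification time, while `--checksum` uses size and hash.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    """Hypothetical metadata record for one file on one side of the copy."""
    size: int        # bytes
    modtime: float   # seconds since the epoch
    md5: str         # hex digest; assumed to be available on both sides


def can_skip(src: FileInfo, dst: FileInfo, use_checksum: bool = False,
             modtime_tolerance: float = 1.0) -> bool:
    """Return True if dst already looks identical to src, so no copy is needed."""
    if src.size != dst.size:
        return False
    if use_checksum:
        # Checksum mode: size plus hash, modtime is ignored.
        return src.md5 == dst.md5
    # Default mode: size plus modtime, within a small tolerance.
    return abs(src.modtime - dst.modtime) <= modtime_tolerance
```

With this model, a file whose content changed without its size or modtime changing is skipped in the default mode but copied in checksum mode, which is the rare case the extra hash check protects against.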

To know whether server-side copying is a possibility for you, I'd have to know more about which cloud services are involved, if any (I guess you can use rclone just for syncing outside the cloud).

An alternative way to avoid the middle-man is of course to have rclone run on either A or C, if you control that environment. You could even set it up with the built-in remote control and thus do all of the administrative work from B, even though none of the data goes through B at that point.
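As a rough sketch of the remote-control idea: if rclone runs as a remote-control daemon (`rclone rcd`) on C, then B only needs to send it a small HTTP request to start the copy. The host name, port, and paths below are placeholders I made up, authentication is left out, and how A is reachable from C (here a hypothetical remote called `winA:`) depends on your setup. The `sync/copy` call with `srcFs` and `dstFs` is the rc command that corresponds to `rclone copy`.

```python
import json
import urllib.request


def build_copy_request(rc_url: str, src_fs: str, dst_fs: str) -> urllib.request.Request:
    """Build (but do not send) an rc request asking a remote rclone to copy src to dst."""
    payload = {
        "srcFs": src_fs,   # evaluated on the machine running rclone rcd, not on B
        "dstFs": dst_fs,
        "_async": True,    # start the job in the background and return right away
    }
    return urllib.request.Request(
        url=rc_url.rstrip("/") + "/sync/copy",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values: adjust the host, port, remote name, and paths to your setup.
req = build_copy_request("http://server-c.example:5572", "winA:share/data", "/srv/backup/data")
# Sending it from B would be: urllib.request.urlopen(req)
```

Only this small control request passes through B. The file data itself moves between A and C.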
