Rclone copy files and checksum

Yep - what asdffdsa says...

You can certainly force checksum comparisons by using --checksum, but if your goal is simply to be sure that the file arrived safely, rclone already verifies this automatically. It will not allow a half-finished or corrupted file to be stored - nor remove a source file (in the case of a MOVE command) before it has verified that the file was delivered successfully.

The point of --checksum is not to improve transfer safety, but to improve the accuracy of change detection (or to let --track-renames map data that only needs to be moved). rclone normally decides whether a file needs updating based on a combination of its size and modtime (i.e. if both are unchanged, it assumes the file is unchanged). Theoretically a file could change and still end up with exactly the same size and modtime - but this is extremely unlikely unless you tampered with the metadata to make it so... So in short, this is very rarely actually needed.
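To make the difference concrete, here is a rough Python sketch of the two comparison strategies. This is not rclone's actual code: the function names, the whole-second modtime comparison and the choice of MD5 are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def md5_of(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def same_by_size_modtime(a, b):
    """Cheap check: metadata only, no file data is read.

    Comparing whole seconds is an assumption of this sketch.
    """
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def same_by_size_checksum(a, b):
    """Expensive check: every byte of both files is read and hashed."""
    return os.path.getsize(a) == os.path.getsize(b) and md5_of(a) == md5_of(b)


def demo():
    """Two same-size files with different content and forced-equal modtimes."""
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "a.bin")
        b = os.path.join(tmp, "b.bin")
        with open(a, "wb") as f:
            f.write(b"hello world")
        with open(b, "wb") as f:
            f.write(b"hello w0rld")
        stamp = 1_600_000_000  # arbitrary timestamp: the "tampered metadata" case
        os.utime(a, (stamp, stamp))
        os.utime(b, (stamp, stamp))
        return same_by_size_modtime(a, b), same_by_size_checksum(a, b)
```

Calling `demo()` builds exactly the unlikely case described above: the size+modtime check reports the two files as identical, and only the checksum comparison notices the difference, at the price of reading both files in full.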

It also has the cost of needing to hash every local file it compares (reading all of its data), regardless of whether the file ends up being transferred or not. This is slow, inefficient and potentially puts a lot of extra wear on the hard drives.

I will be happy to answer followup questions, but TLDR it sounds like you are fretting about data-integrity issues that rclone already covers for you :slight_smile:

For the ultra-paranoid, you can go download "hashtab" (if using Windows), manually make your own hash file, and then compare the source to the destination - but assuming you let rclone finish without errors, you will find that the hashes match :slight_smile:
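If you would rather script that manual comparison than click through a GUI tool, something along these lines would do it. This is only a sketch under assumptions: both sides must be reachable as local paths (for a cloud remote that means a mounted or downloaded copy), MD5 is picked just because it is what Gdrive uses, and the helper names `hash_tree` and `compare_trees` are made up for this example.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def hash_tree(root):
    """Map each file's path (relative to root, '/'-separated) to its MD5."""
    hashes = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            hashes[rel] = md5_of(full)
    return hashes


def compare_trees(source, destination):
    """Return source files that are missing or hash differently at the destination."""
    src = hash_tree(source)
    dst = hash_tree(destination)
    return sorted(path for path in src if dst.get(path) != src[path])
```

An empty list from `compare_trees(source, destination)` means every source file has a hash-identical copy at the destination. Files that exist only at the destination are ignored on purpose, since the question here is whether the source arrived intact.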

By default, the check is made by comparing size+modtime (unless you use --checksum, as I said).
But let me reiterate that when the transfer was initially made it did get hashed on both ends to ensure a successful transfer. If you try to re-transfer it will just do this much faster check to see if there is any reason to re-upload it or not.

When hash-checks are made in rclone, it uses whatever hash type the cloud server uses - since it needs to match that to be useful.
Gdrive uses MD5. You can see a list of what types are used by other providers here:
https://rclone.org/overview/#features
(rclone supports all of them)

If you really, really needed to see this info, it should be displayed in the debug-level log:
--log-level DEBUG
or simply:
-vv
