Rclone copy post-copy check logic with SMB

What is the problem you are having with rclone?

This is not necessarily a problem with rclone. I’d like to understand how the post-copy checksum verification works with rclone copy when the destination is a mounted SMB folder.

I know that a checksum is generated while the source files are read, but I’m unsure how the checksum is calculated after the write to the destination completes. Since the destination is a mounted SMB folder, does that mean the files are effectively downloaded, or read back, from the destination through the SMB protocol just to compute the written checksum?

If so, I’d guess the effective bandwidth is halved, since we’re doing both a write and a read of the files at the destination.

As a bonus, I’d appreciate it if someone with knowledge of the code could point me to where this logic lives.

Thanks!

Run the command 'rclone version' and share the full output of the command.

rclone v1.69.1
- os/version: oracle 8.10 (64 bit)
- os/kernel: 4.18.0-553.16.1.el8_10.x86_64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.24.0
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Local storage, but transferring from a local folder to an SMB-mounted folder.

The command you were trying to run (eg rclone copy /tmp remote:tmp)

sudo rclone copy /usr/local/myfolder/tmppp0/ /mnt/fire-smb/pete/tmppp0/ --no-check-dest --progress -vv --inplace

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

File not found - using defaults

A log from the command that you were trying to run with the -vv flag

2025/09/25 15:10:57 DEBUG : rclone: Version "v1.69.1" starting with parameters ["/bin/rclone" "copy" "/usr/local/myfolder/tmppp0/" "/mnt/fire-smb/pete/tmppp0/" "--no-check-dest" "--progress" "-vv" "--inplace"]
2025/09/25 15:10:57 DEBUG : Creating backend with remote "/usr/local/myfolder/tmppp0/"
2025/09/25 15:10:57 NOTICE: Config file "/root/.config/rclone/rclone.conf" not found - using defaults
2025/09/25 15:10:57 DEBUG : fs cache: renaming cache item "/usr/local/myfolder/tmppp0/" to be canonical "/usr/local/myfolder/tmppp0"
2025/09/25 15:10:57 DEBUG : Creating backend with remote "/mnt/fire-smb/pete/tmppp0/"
2025/09/25 15:10:57 DEBUG : fs cache: renaming cache item "/mnt/fire-smb/pete/tmppp0/" to be canonical "/mnt/fire-smb/pete/tmppp0"
2025/09/25 15:10:57 DEBUG : 100355-0.bin: Need to transfer - File not found at Destination
2025/09/25 15:10:57 DEBUG : 100355-1.bin: Need to transfer - File not found at Destination
2025/09/25 15:10:57 DEBUG : 100355-2.bin: Need to transfer - File not found at Destination
[redacted due to size]
2025/09/25 15:12:02 DEBUG : 968093868-0.bin: md5 = eb0c8066235e35332fb720b4ca8768b8 OK
2025/09/25 15:12:02 INFO  : 968093868-0.bin: Copied (new)
Transferred:       46.851 GiB / 46.851 GiB, 100%, 740.329 MiB/s, ETA 0s
Transferred:          635 / 635, 100%
Elapsed time:       1m5.1s
2025/09/25 15:12:02 INFO  :
Transferred:       46.851 GiB / 46.851 GiB, 100%, 740.329 MiB/s, ETA 0s
Transferred:          635 / 635, 100%
Elapsed time:       1m5.1s

2025/09/25 15:12:02 DEBUG : 4 go routines active

Yes. Rclone still sees your SMB destination as a local path, so it calculates hashes the same way it would for any other local path (i.e. manually). The SMB protocol part is handled by your OS, not by rclone per se.

Note that this would be different if you were using the rclone smb backend, but it looks like you are not.

Not necessarily, because rclone reads and hashes the file while it is copying it, and then caches that hash internally if successful.

Which leads to here:

Which (for the local backend) leads to here:

As discussed earlier, if the internal object has cached hashes, those are returned:

The dst hash we read while copying is then compared to the src hash, and if they are different, it will error:
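To make the shape concrete, here is a rough stdlib-only sketch (made-up names and error message, not the actual rclone code): the src hash is fed as bytes are read, the dst hash as they are written, and the post-copy check just compares the two sums:

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// copyAndHash copies srcPath to dstPath while hashing both sides of the
// transfer in a single pass, roughly as described above.
func copyAndHash(srcPath, dstPath string) (srcSum, dstSum string, err error) {
	in, err := os.Open(srcPath)
	if err != nil {
		return "", "", err
	}
	defer in.Close()

	out, err := os.Create(dstPath)
	if err != nil {
		return "", "", err
	}
	defer out.Close()

	srcHash := md5.New() // fed as bytes are read from the source
	dstHash := md5.New() // fed as bytes are written to the destination

	// TeeReader forwards everything read from `in` into srcHash;
	// MultiWriter sends every write to both the dst file and dstHash.
	// One read of the source drives all of it.
	if _, err := io.Copy(io.MultiWriter(out, dstHash), io.TeeReader(in, srcHash)); err != nil {
		return "", "", err
	}
	return hex.EncodeToString(srcHash.Sum(nil)), hex.EncodeToString(dstHash.Sum(nil)), nil
}

func main() {
	srcSum, dstSum, err := copyAndHash("src.bin", "dst.bin")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// the post-copy check is just a comparison of the two cached sums
	if srcSum != dstSum {
		fmt.Fprintf(os.Stderr, "corrupted on transfer: md5 differ %q vs %q\n", srcSum, dstSum)
		os.Exit(1)
	}
	fmt.Println("md5 =", dstSum, "OK")
}
```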

Thanks for the details!

So while a file is being read from src to copy into the destination, the read bytes are hashed, and that hash is then stored in the dst object to be used later in the checkHashes function?

I didn’t see where the src object gets its hash from. Does it get computed from scratch by reading it again during this call?

Yes, that's basically it.

The src should also be hashed while we're reading it (but separately from the dst) and then cached on the object:

Note that the src.Hash method there ends up at the same Hash function as the dst I linked to earlier, just with a different object:

But that's essentially a coincidence of doing a local-to-local transfer (it could just as easily have been Dropbox-to-local or something), so the src and dst are not really directly aware of each other.
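As a rough illustration of that cached-hash behaviour (hypothetical object type, stdlib only, not rclone's real API): the sum cached during the transfer is returned as-is, and a cache miss means reading the whole file again from scratch:

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"io"
	"os"
)

// object is a stand-in for a backend object that may carry a hash
// cached during a transfer.
type object struct {
	path      string
	cachedMD5 string // set while the file was being read/written, else empty
}

// Hash returns the cached sum when one exists; on a cache miss it
// recomputes from scratch by reading the whole file again.
func (o *object) Hash() (string, error) {
	if o.cachedMD5 != "" {
		return o.cachedMD5, nil // no extra I/O needed
	}
	f, err := os.Open(o.path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	h := md5.New()
	if _, err := io.Copy(h, f); err != nil {
		return "", err
	}
	o.cachedMD5 = hex.EncodeToString(h.Sum(nil))
	return o.cachedMD5, nil
}

func main() {
	o := &object{path: "src.bin"} // nothing cached yet
	sum, err := o.Hash()          // first call reads and hashes the file
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("md5 =", sum)
	// a second call would return the cached value without touching disk
}
```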

For completeness' sake, I suppose I should mention that there's a slight exception to this on macOS, where there's an equivalent of server-side copying:

I think I get it now. Please correct me if I’m wrong.

The src file is opened with its own instance of a hasher. Whenever any bytes are read from it, they get passed to this hasher as well. When the src file is closed, the final hash is computed and cached in the src object to be used later in the Hash function.

The Update function is used to copy the src data to the dst. In this function, a new hasher instance is created. As the src bytes are read, they are also passed to this hasher instance. After the copy is complete, the final hash is computed and cached in the dst object to be used in the Hash function.

The checkHashes function then calls the Hash function on both src and dst, which returns the cached values, and compares them.

In this case of a local-to-local copy, hashing is done twice on the exact same data that gets read from the src file. Would it be accurate to say, then, that for a local-to-local copy the hashing provides no value, since it gets computed from the same byte buffers for both src and dst?

Yes, that's right.

I wouldn't say it's not providing any value. It is potentially valuable in the rare case when a file gets corrupted during transfer, or the source file is modified while it's being read. There's a test here that illustrates this (although we disabled it because it's racy; it's actually quite hard to cause corruption on purpose!)

Maybe I’m missing something, but how is it possible to detect this? I thought the src gets read once, and the bytes then get passed to both instances of the hasher.

Doesn’t the p byte array here also get passed through the io.TeeReader to the destination’s hasher instance?

I figure the io.Copy on line 1465 is the one and only time the src file gets read, resulting in the same byte array being passed to both hasher instances in addition to being copied to the dst file.
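A tiny self-contained demo of what I mean (stdlib only, nothing rclone-specific): a single pass through io.TeeReader necessarily feeds both hashers the exact same bytes, so their sums cannot differ:

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

func main() {
	src := strings.NewReader("the same bytes feed both hashers")

	srcHash := md5.New()
	dstHash := md5.New()

	// One read pass: TeeReader copies every byte read from src into
	// srcHash, while dstHash stands in for "hashed as written to dst".
	if _, err := io.Copy(dstHash, io.TeeReader(src, srcHash)); err != nil {
		panic(err)
	}

	fmt.Println("src md5:", hex.EncodeToString(srcHash.Sum(nil)))
	fmt.Println("dst md5:", hex.EncodeToString(dstHash.Sum(nil)))
	// both lines print the same sum: they hash identical byte slices
	// produced by a single read of the source
}
```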

I see that this gets detected due to the logic in the Hash function that checks whether the file changed. In that case it looks like both the src and the dst file may get read again for the sake of hashing.

However, when that is not the case, I don’t think the cached hash values will ever differ, as I wrote in my previous post.

Is this “changed” case the one you were thinking of that would still provide value?

I see what you mean. I think the key is here:

So, on the src side, if we didn't read and hash exactly the expected number of bytes, we don't cache the hashes. That will cause o.Hash() to recalculate fresh ones, instead of using the cached ones (in the part of the code you cited).
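A rough sketch of that guard (hypothetical names, stdlib only, not the actual rclone code): hash while reading, count the bytes, and only publish the sum when exactly the expected size went through:

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"hash"
	"io"
	"os"
)

// hashingReader hashes everything read through it and counts the bytes.
type hashingReader struct {
	in        io.Reader
	hasher    hash.Hash
	bytesRead int64
}

func (hr *hashingReader) Read(p []byte) (int, error) {
	n, err := hr.in.Read(p)
	hr.hasher.Write(p[:n]) // the same bytes the caller receives
	hr.bytesRead += int64(n)
	return n, err
}

// sum publishes the hash only if exactly the expected size was read;
// otherwise it signals that nothing should be cached, which forces a
// from-scratch recalculation in a later Hash() call.
func (hr *hashingReader) sum(expected int64) (string, bool) {
	if hr.bytesRead != expected {
		return "", false // short/long read: don't trust this hash
	}
	return hex.EncodeToString(hr.hasher.Sum(nil)), true
}

func main() {
	f, err := os.Open("src.bin")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()
	fi, err := f.Stat()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	hr := &hashingReader{in: f, hasher: md5.New()}
	if _, err := io.Copy(io.Discard, hr); err != nil { // stand-in for the real copy
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	if sum, ok := hr.sum(fi.Size()); ok {
		fmt.Println("caching md5 =", sum)
	} else {
		fmt.Println("size mismatch: leaving hash uncached")
	}
}
```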

I suppose that leaves open a small possibility that we read the right number of bytes but they are the wrong bytes somehow, and then we still cache the hash and don't recheck it... but maybe that's by design. Maybe @ncw can chime in if I'm missing something (I didn't write this part of the code!)

I think that would be true for the src but not the dst. I think the dst would always have a cached hash at this point, so hashFound should always be true (but you are right that changed could be true, in which case it would recheck).

Interesting, I hadn’t noticed that if statement before. I think I have a solid grasp of how these pieces fit together now. Thanks for your time and patience!
