I don't know how to force a checksum comparison when using the crypt and Dropbox backends.
If the modified date and file size are equal, rclone sync apparently does not compare checksums. Rclone does know how to compare checksums for Dropbox backends even through crypt, which I can verify by running rclone cryptcheck. So I thought I could run rclone sync --checksum, but then I get the message "--checksum is in use but the source and destination have no hashes in common; falling back to --size-only".
Is it possible to compare all of modified date, file size and checksum?
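For reference, a rough sketch of the two commands I am comparing (the local and remote paths are just placeholders; the crypt remote name is the one from the log below):

```
# cryptcheck can compare checksums between a plain directory and the crypt remote
rclone cryptcheck D:\data dropbox-rclone-crypt:data

# sync --checksum falls back to --size-only here, because the crypt remote
# itself exposes no hashes to compare against
rclone sync --checksum -vv D:\data dropbox-rclone-crypt:data
```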
Run the command 'rclone version' and share the full output of the command.
rclone v1.65.1
- os/version: Microsoft Windows 10 Pro 22H2 (64 bit)
- os/kernel: 10.0.19045.3930 (x86_64)
- os/type: windows
- os/arch: amd64
- go/version: go1.21.5
- go/linking: static
- go/tags: cmount
Which cloud storage system are you using? (eg Google Drive)
Dropbox
The command you were trying to run (eg rclone copy /tmp remote:tmp)
A log from the command that you were trying to run with the -vv flag
...
2024/01/21 21:07:20 NOTICE: Encrypted drive 'dropbox-rclone-crypt:': --checksum is in use but the source and destination have no hashes in common; falling back to --size-only
...
I had never thought about doing it that way. Also, I've never used hasher, but I presume it stores the hashes by creating additional files on the base remote, like chunker does, right? Which one would be more efficient?
Hasher stores hashes in a local database, so it only works when you always use it from the same computer. Chunker stores them in sidecar files, so it doubles the number of stored files. You have to decide which trade-off is more acceptable for your use case.
I prefer the chunker approach, but I do not work with datasets containing a lot of small files, so I do not mind the extra objects in my remote.
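If it helps, here is a rough sketch of what the two setups could look like in rclone.conf, wrapping your existing crypt remote (the remote names and values below are just examples, not a recommendation):

```
# Option 1: hasher wrapping the existing crypt remote.
# Hashes are kept in a local cache database on the computer running rclone.
[dropbox-hasher]
type = hasher
remote = dropbox-rclone-crypt:
hashes = md5
max_age = off

# Option 2: chunker wrapping the existing crypt remote.
# hash_type = md5all stores an MD5 for every file in a sidecar metadata
# object, which is what doubles the number of stored objects.
[dropbox-chunker]
type = chunker
remote = dropbox-rclone-crypt:
chunk_size = 2G
hash_type = md5all
```

You would then sync to dropbox-hasher: or dropbox-chunker: instead of dropbox-rclone-crypt: directly, so that --checksum has something to compare.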
> Chunker stores them in sidecar files, so it doubles the number of stored files.
And it doubles the number of operations to read or rewrite a stored file, right (since the sidecar file also has to be accessed)?
> You have to decide which trade-off is more acceptable for your use case.
One more question, if you don't mind: what happens if/when I try to access the remote on another computer? Does the local database need to be recreated, or do I just lose the hash functionality?
> I prefer the chunker approach, but I do not work with datasets containing a lot of small files, so I do not mind the extra objects in my remote.
Some datasets that I work with do have a ton of small files: I have ~35TB distributed among ~13M files, so ~2.7MB per file on average. Not sure whether that counts as small or not.