How to use checksums when syncing crypt over dropbox?

What is the problem you are having with rclone?

I don't know how to force a checksum comparison when using crypt and dropbox backends.

If the modified date and file size are equal, rclone sync apparently does not compare checksums. Rclone knows how to compare checksums for Dropbox backends, which I can verify by running rclone cryptcheck. So I thought I could run rclone sync --checksum, but then I get the message "--checksum is in use but the source and destination have no hashes in common; falling back to --size-only".
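Concretely, a check along these lines (paths as in my setup below) runs fine and compares hashes:

rclone cryptcheck "C:\my-folder" dropbox-rclone-crypt:

so rclone is evidently able to compute the underlying Dropbox hashes through the crypt layer.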

Is it possible to compare all of modified date, file size and checksum?

Run the command 'rclone version' and share the full output of the command.

rclone v1.65.1
- os/version: Microsoft Windows 10 Pro 22H2 (64 bit)
- os/kernel: 10.0.19045.3930 (x86_64)
- os/type: windows
- os/arch: amd64
- go/version: go1.21.5
- go/linking: static
- go/tags: cmount

Which cloud storage system are you using? (eg Google Drive)

Dropbox

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone sync --checksum "C:\my-folder\" dropbox-rclone-crypt:

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

[dropbox-rclone]
type = dropbox
client_id = XXX
client_secret = XXX
token = XXX

[dropbox-rclone-crypt]
type = crypt
remote = dropbox-rclone:crypt
password = XXX
password2 = XXX
filename_encoding = base32768

A log from the command that you were trying to run with the -vv flag

...
2024/01/21 21:07:20 NOTICE: Encrypted drive 'dropbox-rclone-crypt:': --checksum is in use but the source and destination have no hashes in common; falling back to --size-only
...

The crypt remote does not support hashes.

This error message could indeed be clearer. The local filesystem supports all hashes but crypt supports none - so there is no hash in common.

Do not use --checksum with a crypt remote.

If you really need it, you can add hashes to crypt by layering a hasher or chunker remote on top of it.
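As a rough sketch (untested, remote names are just illustrative), a hasher overlay on top of your crypt remote could look like this:

[dropbox-rclone-crypt-hasher]
type = hasher
remote = dropbox-rclone-crypt:
hashes = md5
max_age = off

or a chunker overlay that stores an MD5 for every file:

[dropbox-rclone-crypt-chunker]
type = chunker
remote = dropbox-rclone-crypt:
hash_type = md5all

You then point your sync at the overlay remote instead of at dropbox-rclone-crypt: directly.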

I had never thought about doing it that way. Also, I've never used hasher, but I presume it stores the hashes by creating additional files on the base remote, like chunker does, right? Which one would be more efficient?

Hasher stores hashes in a local database, so it only works when you use it from the same computer all the time. Chunker stores them in sidecar files, so it doubles the number of stored files. You have to decide which trade-off is more acceptable for your use case.
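With either overlay configured, your original command should then work with checksums, e.g. (using the illustrative hasher remote name from the sketch above):

rclone sync --checksum "C:\my-folder" dropbox-rclone-crypt-hasher: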

I prefer the chunker approach, but I do not work with datasets containing a lot of small files, so I do not mind the extra objects in my remote.


Perfect. Many thanks for the explanation!

Chunker stores them in sidecar files, so it doubles the number of stored files.

And it doubles the number of operations needed to read or rewrite a stored file, right? (since the sidecar file also has to be accessed)

You have to decide which trade-off is more acceptable for your use case.

One more question, if you don't mind: what happens if/when I try to access the remote from another computer? Does the local database need to be recreated, or do I just lose the hash functionality?

I prefer the chunker approach, but I do not work with datasets containing a lot of small files, so I do not mind the extra objects in my remote.

Some datasets that I work with do have a ton of small files -- I have ~35TB distributed among ~13M files, so on average ~2.7MB/file. Not sure whether that counts as small 🙂

You can always copy it to another machine - check the docs. It would be enough to copy the whole database. Just another thing to remember and worry about, IMO :)
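For what it's worth - as far as I can tell, hasher keeps its database under rclone's cache directory (run rclone config paths to see where that is on your machine; the exact layout is an implementation detail, so check the hasher docs). There are also backend commands to inspect the cached sums, e.g.:

rclone backend dump dropbox-rclone-crypt-hasher:

and import/stickyimport commands to load checksums from a SUM file - again, see the hasher docs for the exact syntax.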

Yeah - it's all relative :)


This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.