Is Hasher usable to cache hashes for the local filesystem to reduce reads?

What is the problem you are having with rclone?

I'm using rclone to keep a local backup of my Google Drive account, which is nearing 2 TiB of data. Currently, each time rclone runs it ends up reading and hashing every local file, which is a lot of I/O overhead.

I'm exploring whether using the Hasher overlay on my local path would allow for significantly more efficient syncs.

Run the command 'rclone version' and share the full output of the command.

$ rclone version
rclone v1.66.0
- os/version: ubuntu 20.04 (64 bit)
- os/kernel: 6.8.4-3-pve (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.22.1
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Google Drive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

/usr/bin/rclone sync BACKUP-GDrive-eric_dalquist: HASHER-GDrive-eric_dalquist: \
  --progress --fast-list --max-backlog=1000000 --checksum --checkers=16 \
  --tpslimit=5 --transfers=10 --bwlimit=200M --timeout=30s --low-level-retries=2 \
  --retries=1 --delete-during --delete-excluded --use-server-modtime \
  --exclude=Takeout/** -v \
  --log-file=/shared/backups/rclone/logs/2024-06-01_BACKUP-GDrive-eric_dalquist_hasher.log

Please run 'rclone config redacted' and share the full output.

$ rclone config redacted
[BACKUP-GDrive-eric_dalquist]
type = drive
client_id = XXX
client_secret = XXX
scope = drive
use_trash = false
export_formats = ods,odt,odp,svg
acknowledge_abuse = true
token = XXX
root_folder_id = XXX

[HASHER-GDrive-eric_dalquist]
type = hasher
remote = /shared/backups/rclone/GDrive/eric.dalquist

The local filesystem /shared/backups is a NFS mount to a NAS.
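For reference, hasher also accepts a couple of optional parameters beyond `remote`. The sketch below shows where they would go in the config above; the values shown are the documented defaults (check the hasher docs for your rclone version), not something I have tuned:

```ini
[HASHER-GDrive-eric_dalquist]
type = hasher
remote = /shared/backups/rclone/GDrive/eric.dalquist
# Comma-separated list of hash types to cache (md5 is the default)
hashes = md5
# How long cached sums stay valid; "off" keeps them until the
# underlying file's size or modtime changes
max_age = off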

The rclone sync run appears to work correctly. I'm mostly just looking for validation that this is a reasonable use of Hasher. I fully realize it is experimental and I'm using it at my own risk.

Well, the result seems to be working and is making a huge difference.

Execution when doing a sync against the raw filesystem:

Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Errors:                 1 (retrying may help)
Checks:             23740 / 23740, 100%
Elapsed time:     1h1m6.4s

Execution when doing a sync against the hasher overlay:

Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Errors:                 1 (retrying may help)
Checks:             23740 / 23740, 100%
Elapsed time:      3m56.7s

Yes, hasher can wrap the local backend the same as any other backend.

However, if what you are ultimately trying to do is sync gdrive to a NAS, you might be better off installing and running rclone directly on the NAS (i.e. cut out the NFS mount middleman).
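One more thing that may be useful: per the hasher docs, the cache can be pre-seeded or cleared from the command line. This is a sketch based on the documented hasher backend commands; verify against the docs for your version before relying on it:

```shell
# Pre-fill the hash cache by listing sums through the overlay;
# once computed, the sums are cached and reused on later syncs.
rclone hashsum MD5 HASHER-GDrive-eric_dalquist: > /dev/null

# Drop all cached sums if you suspect they have gone stale.
rclone backend drop HASHER-GDrive-eric_dalquist:
```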
