What is the problem you are having with rclone?
I've been experimenting with using rclone on automated test "worker" machines to pull down the test "workspace" (basically just a folder with a bunch of subdirs / files in it) from the main test server via sftp. It works fine, but I'm trying to tune it to run as fast as possible: this sync step is one of the main bottlenecks of our automated test system, and I'm hoping to improve on the previous syncing method (rsync).
Details:
- The workspace folder is about 7.5 GB in size, with a total of 96k files and 9k folders
- There are 35 worker machines, and they all run the rclone sync command roughly simultaneously
- 8 of the 35 workers are running Windows
- Many of the files in the workspace are tiny Java .class files. The sync is actually significantly slower without the --sftp-disable-hashcheck flag, presumably because computing the hash ends up taking more time than just pushing the small file across the fast LAN
- I rolled out rclone gradually to the worker machines, and noticed that syncing on machines still using rsync instead of rclone became slower and slower as I migrated other workers to rclone, presumably because rclone is able to hog more of the network resources and/or server CPU resources?
- In the "slow" case where the previous sync was from a different branch (and thus a large number of files actually need to be synced), the fastest node finishes in about 5.5 minutes, and the slowest finishes in 6.5 minutes
- Under the previous solution (rsync), non-Windows workers would finish the sync in 2-3 minutes, while Windows workers would finish in around 4-7 minutes.
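
For a rough sense of scale, the figures above can be turned into cumulative worker-minutes. This is only a back-of-the-envelope sketch: it assumes every worker lands somewhere inside the quoted fastest/slowest range, and that the rsync timings describe a comparable "slow" case. The helper name `total_range` is made up for illustration.

```python
# Back-of-the-envelope totals in worker-minutes, using only the figures quoted above.
# Assumption: every worker finishes somewhere inside its quoted min/max range.
WORKERS = 35
WINDOWS_WORKERS = 8
OTHER_WORKERS = WORKERS - WINDOWS_WORKERS  # 27 non-Windows workers


def total_range(groups):
    """Sum (count, fastest_minutes, slowest_minutes) groups into a (low, high) total."""
    low = sum(count * fastest for count, fastest, _ in groups)
    high = sum(count * slowest for count, _, slowest in groups)
    return low, high


# rclone: all 35 workers finish between 5.5 and 6.5 minutes.
rclone_total = total_range([(WORKERS, 5.5, 6.5)])

# rsync: non-Windows workers take 2-3 minutes, Windows workers 4-7 minutes.
rsync_total = total_range([(OTHER_WORKERS, 2, 3), (WINDOWS_WORKERS, 4, 7)])

print("rclone worker-minutes (low, high):", rclone_total)
print("rsync worker-minutes (low, high): ", rsync_total)
```

Under those assumptions the rclone total comes out at roughly 190 to 230 worker-minutes, against roughly 85 to 140 for rsync.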
So in terms of total cumulative time spent syncing across all nodes, rclone is currently a bit slower than rsync, but I'm not really sure why. Any tips / thoughts on other flags / approaches I could use to improve performance? Perhaps some way to compute / reuse a manifest of the directory contents up-front so all the worker nodes don't have to run a bunch of redundant dir listings?
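
To make the manifest idea concrete, here is a minimal sketch of what I have in mind, not something rclone provides out of the box as far as I know. The server would walk the workspace once and publish a manifest; each worker would build the same manifest for its local copy and diff the two. The function names `build_manifest` and `diff_manifests` are hypothetical, and comparing on size plus whole-second modification time is an assumption chosen to mirror a size/modtime check without any hashing.

```python
import os


def build_manifest(root, exclude_top_level=(".git",)):
    """Walk root once and map each relative file path to (size, mtime in whole seconds)."""
    manifest = {}
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            # Prune excluded directories at the top level only (like /.git/**),
            # editing dirnames in place so os.walk does not descend into them.
            dirnames[:] = [d for d in dirnames if d not in exclude_top_level]
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            # Use forward slashes so manifests from Windows and Linux are comparable.
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            manifest[rel] = (st.st_size, int(st.st_mtime))
    return manifest


def diff_manifests(server, local):
    """Return (paths to copy from the server, local paths to delete)."""
    # Copy anything missing locally or whose size/mtime differs from the server's.
    to_copy = sorted(path for path, meta in server.items() if local.get(path) != meta)
    # Delete anything that exists locally but is no longer on the server.
    to_delete = sorted(path for path in local if path not in server)
    return to_copy, to_delete
```

The `to_copy` list could then be written one path per line to a file and handed to rclone copy via --files-from, so that each worker only asks the server for the files it actually needs. Whether that ends up faster than letting every worker list the remote itself is untested.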
Run the command 'rclone version' and share the full output of the command.
Windows-based workers:
rclone v1.68.1
- os/version: Microsoft Windows Server 2016 Datacenter 1607 (64 bit)
- os/kernel: 10.0.14393.7428 (x86_64)
- os/type: windows
- os/arch: amd64
- go/version: go1.23.1
- go/linking: static
- go/tags: cmount
Ubuntu-based workers:
rclone v1.68.1
- os/version: ubuntu 20.04 (64 bit)
- os/kernel: 5.4.0-193-generic (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.23.1
- go/linking: static
- go/tags: none
Which cloud storage system are you using? (eg Google Drive)
ssh/sftp
The command you were trying to run (eg rclone copy /tmp remote:tmp)
rclone sync --transfers 2 --sftp-disable-hashcheck --inplace --fast-list --delete-excluded --ignore-checksum --exclude /.git/** jenkins-vm:${REMOTE_REPO_PATH} .
Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.
[jenkins-vm]
type = sftp
host = XXX
user = XXX
key_file = ~/.ssh/id_rsa
shell_type = unix
md5sum_command = md5sum
sha1sum_command = sha1sum
A log from the command that you were trying to run with the -vv flag
(Log output is massive and contains sensitive info, but looks pretty normal / expected to me; it's copying over files that were modified and there's no evidence that it's attempting to compute any hashes)