Multiple instances of rclone sync working together as a pool/cluster?

What is the problem you are having with rclone?

Currently I'm using rclone to sync data from a Veeam repository to a storj.io bucket.
The syncing process is currently working fine, but it is pretty CPU hungry, since all data has to be encrypted locally on the server before being transferred to the storj bucket.

I'd like to speed up the process of syncing the data from the repository to the storj bucket, so I've been exploring what options could do that.

  1. Raise the number of concurrent file transfers with the --transfers option.
    I'm already maxing out the server CPU using --transfers 3.

  2. Raise the number of parts of a single file uploaded in parallel using the --parallelism option.
    Unfortunately, the --parallelism option is (afaik) not yet ported into rclone from uplink.
    This would likely also max out the server CPU, and not transfer more data than option #1.

  3. Add more Ubuntu servers that run rclone.
    Now this would be doable, since it scales perfectly with how fast I want to upload data from the repository to the storj bucket. The only "problem" here is that each instance of rclone works independently and has no clue which files have already been synced to the storj bucket. I could end up with two or more rclone instances syncing the same 3TB file, even at the same time, and that would not be so ideal :slight_smile:
    I could get around this by putting together some sort of script that decides which server syncs which files in the "rclone server pool/cluster". But if rclone already supports this kind of "multi-sync", I'd be more than happy to know about it.
    If not, I'll look into adding it to the feature request.

Run the command 'rclone version' and share the full output of the command.

rclone v1.64.2
- os/version: ubuntu 22.04 (64 bit)
- os/kernel: 5.15.0-89-generic (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.21.3
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

storj.io

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone sync --fast-list -vv -P --stats-file-name-length 0 --transfers 3 --min-age 15s  /disk001/ storj:backup/customer001/

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

[storj]
type = storj
access_grant = XXX

A log from the command that you were trying to run with the -vv flag

2023/12/11 00:08:47 DEBUG : --min-age 15s to 2023-12-11 00:08:32.618100335 +0000 UTC m=-14.975551376
2023/12/11 00:08:47 DEBUG : rclone: Version "v1.64.2" starting with parameters ["rclone" "sync" "--fast-list" "-vv" "-P" "--stats-file-name-length" "0" "--transfers" "3" "--min-age" "15s" "/disk001/" "storj:backup/customer001/"]
2023/12/11 00:08:47 DEBUG : Creating backend with remote "/disk001/"
2023/12/11 00:08:47 DEBUG : Using config file from "/root/.config/rclone/rclone.conf"
2023/12/11 00:08:47 DEBUG : fs cache: renaming cache item "/disk001/" to be canonical "/disk001"
2023/12/11 00:08:47 DEBUG : Creating backend with remote "storj:backup/customer001/"
2023/12/11 00:08:47 DEBUG : FS sj://backup/customer001: connecting...
2023/12/11 00:08:47 DEBUG : FS sj://backup/customer001: connected: <nil>
2023/12/11 00:08:47 DEBUG : fs cache: renaming cache item "storj:backup/customer001/" to be canonical "storj:backup/customer001"
2023/12/11 00:08:47 DEBUG : FS sj://backup/customer001: ls -R ./
2023/12/11 00:08:47 DEBUG : FS sj://backup/customer001: OBJ ls -R ./ ("backup", "customer001")
2023/12/11 00:08:48 DEBUG : filename-removed1.vib: Size and modification time the same (differ by 0s, within tolerance 1ns)
2023/12/11 00:08:48 DEBUG : filename-removed1.vib: Unchanged skipping
2023/12/11 00:08:48 DEBUG : filename-removed2.vib: Size and modification time the same (differ by 0s, within tolerance 1ns)
2023/12/11 00:08:48 DEBUG : filename-removed2.vib: Unchanged skipping
2023/12/11 00:08:48 DEBUG : filename-removed3.vib: Size and modification time the same (differ by 0s, within tolerance 1ns)
2023/12/11 00:08:48 DEBUG : filename-removed3.vib: Unchanged skipping
2023/12/11 00:08:48 DEBUG : filename-removed4.vib: Size and modification time the same (differ by 0s, within tolerance 1ns)
2023/12/11 00:08:48 DEBUG : filename-removed4.vib: Unchanged skipping
2023/12/11 00:08:48 DEBUG : filename-removed5.vib: Size and modification time the same (differ by 0s, within tolerance 1ns)
2023/12/11 00:08:48 DEBUG : filename-removed5.vib: Unchanged skipping
2023/12/11 00:08:48 DEBUG : filename-removed6.vib: Size and modification time the same (differ by 0s, within tolerance 1ns)
2023/12/11 00:08:48 DEBUG : filename-removed6.vib: Unchanged skipping
...snip...
2023/12/11 00:09:04 DEBUG : filename-removed4857.vbm: Size and modification time the same (differ by 0s, within tolerance 1ns)
2023/12/11 00:09:04 DEBUG : filename-removed4857.vbm: Unchanged skipping
2023/12/11 00:09:04 DEBUG : filename-removed4858.vbm: Size and modification time the same (differ by 0s, within tolerance 1ns)
2023/12/11 00:09:04 DEBUG : filename-removed4858.vbm: Unchanged skipping
2023/12/11 00:09:04 DEBUG : FS sj://backup/customer001: Waiting for transfers to finish
2023/12/11 00:09:04 DEBUG : Waiting for deletions to finish
2023/12/11 00:09:04 INFO  : There was nothing to transfer
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Checks:              4858 / 4858, 100%
Elapsed time:         5.7s
2023/12/11 00:09:04 INFO  : 
Transferred:              0 B / 0 B, -, 0 B/s, ETA -
Checks:              4858 / 4858, 100%
Elapsed time:         5.7s
2023/12/11 00:09:04 DEBUG : 8 go routines active

Th3Van

You could use filtering to make sure that every rclone instance does its own job, e.g. all files starting with A are processed by rclone instance 1 and all files starting with B by instance 2.

If you need more sophisticated orchestration, then you will have to script it yourself.
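To illustrate the scripting idea, here is a minimal sketch of deterministic work partitioning: every server hashes each filename and only syncs the files that map to its own index, so no two instances ever pick the same file. The variables NODES and NODE_ID and the function node_for are illustrative inventions, not rclone options; --include and --files-from are real rclone flags.

```shell
#!/bin/sh
# Sketch: deterministic filename partitioning so N independent rclone
# instances never sync the same file. A simpler variant is static
# prefix filtering, e.g. --include '[A-Ma-m]*' on one server and
# --include '[N-Zn-z]*' on the other.

NODES=2        # total number of rclone servers in the "pool"
NODE_ID=0      # this server's index (0..NODES-1)

# Map a filename to a node index via a stable POSIX cksum CRC,
# so every server computes the same answer for the same file.
node_for() {
    sum=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
    echo $((sum % NODES))
}

# Each server could build the list of files it owns and feed it to
# rclone via --files-from, roughly like:
#
#   find /disk001 -type f -printf '%P\n' \
#     | while read -r f; do
#         [ "$(node_for "$f")" = "$NODE_ID" ] && echo "$f"
#       done > /tmp/files-node-$NODE_ID.txt
#   rclone copy --files-from /tmp/files-node-$NODE_ID.txt /disk001/ storj:backup/customer001/

node_for "filename-removed1.vib"   # prints 0 or 1, same on every server
```

Note that with --files-from or --include, rclone only considers the listed/matched files, so each instance leaves the other instances' files on the destination alone.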

In your particular case I would first try the Storj S3 gateway instead of Storj native, which is indeed very heavy on CPU and network. Multiple Storj-native rclone instances can bring your network to its knees as well... only a test will tell.
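For reference, an rclone remote using the hosted Storj S3 gateway is configured as an s3 remote rather than a storj one, roughly like this (the remote name storj-s3 is a placeholder, and the XXX values stand for S3 credentials generated in the Storj console; the encryption then happens gateway-side instead of on your server):

```ini
[storj-s3]
type = s3
provider = Storj
access_key_id = XXX
secret_access_key = XXX
endpoint = gateway.storjshare.io
```

You would then sync to storj-s3:backup/customer001/ instead of storj:backup/customer001/, trading the end-to-end client-side encryption of the native backend for much lower local CPU usage.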

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.