Rclone sync: very high memory usage

What is the problem you are having with rclone?

We are running rclone to sync data from our OpenShift MinIO cluster to an external S3 Ceph RGW. The source bucket has ~91 million objects. We run rclone via cron and use Kueue to queue the jobs so that we stay within request rate limits. We are noticing that rclone uses an enormous amount of memory, which keeps steadily increasing and does not seem to be released back to the OS. We have set the resource limit to 200Gi, but it tries to use even more. At some point we simply stopped the job to prevent the node it runs on from crashing.

❯ kubectl top pods
NAME                                                      CPU(cores)   MEMORY(bytes)   
prod-bucket-2505011141-gp9pk                              6930m        204204Mi 

Just to note, most of the objects are stored in the root of the bucket. I am mentioning this because I read this.

Is moving the objects into subdirectories really the only way to prevent the excessive memory usage, or has there been any development in this area since then?
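
For a rough sense of scale, here is a back-of-the-envelope estimate; the ~1 KiB of in-memory metadata per listed object is only my assumption, not a measured figure:

# 91 million objects in one "directory", assuming ~1 KiB held in memory per listed object
echo "$(( 91000000 * 1024 / 1024 / 1024 / 1024 )) GiB"   # prints "86 GiB"

Even under that assumption the listing alone would be close to 90 GiB, so with --metadata and allocator overhead on top it seems plausible (though that is speculation on my part) that usage grows past our 200Gi limit.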

Run the command 'rclone version' and share the full output of the command.

rclone v1.69.2-beta.8581.84f11ae44.v1.69-stable
- os/version: alpine 3.21.3 (64 bit)
- os/kernel: 5.14.0-427.62.1.el9_4.x86_64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.24.2
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Source: MinIO
Target: S3 Ceph RGW

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone sync   source:"prod-bucket"/   target:"prod-bucket"/   --retries=3   --low-level-retries 10   --log-level=INFO   --fast-list   --metadata   --transfers=50   --checkers=8   --checksum   --s3-use-multipart-etag=true   --multi-thread-cutoff=256Mi   --s3-chunk-size=5Mi

The rclone config contents with secrets removed.

2025/05/07 07:08:18 NOTICE: Config file "/.rclone.conf" not found - using defaults

We are using environment variables to set the config, but it should basically look like this:

[minio]
type = s3
provider = minio
access_key_id = xxx
secret_access_key = xxx
endpoint = xxx
region = ""

[ceph]
type = s3
provider = Ceph
access_key_id = xxx
secret_access_key = xxx
endpoint = xxx
sse_customer_algorithm = xxx
sse_customer_key_base64 = xxx
sse_customer_key_md5 = xxx
region = ""
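
For completeness, the environment-variable form follows rclone's RCLONE_CONFIG_<REMOTE>_<OPTION> convention. The remote names below (SOURCE/TARGET, matching the remote names used in the sync command) and the xxx values are placeholders rather than our real settings:

export RCLONE_CONFIG_SOURCE_TYPE=s3
export RCLONE_CONFIG_SOURCE_PROVIDER=Minio
export RCLONE_CONFIG_SOURCE_ACCESS_KEY_ID=xxx
export RCLONE_CONFIG_SOURCE_SECRET_ACCESS_KEY=xxx
export RCLONE_CONFIG_SOURCE_ENDPOINT=xxx

export RCLONE_CONFIG_TARGET_TYPE=s3
export RCLONE_CONFIG_TARGET_PROVIDER=Ceph
export RCLONE_CONFIG_TARGET_ACCESS_KEY_ID=xxx
export RCLONE_CONFIG_TARGET_SECRET_ACCESS_KEY=xxx
export RCLONE_CONFIG_TARGET_ENDPOINT=xxx
export RCLONE_CONFIG_TARGET_SSE_CUSTOMER_ALGORITHM=xxx
export RCLONE_CONFIG_TARGET_SSE_CUSTOMER_KEY_BASE64=xxx
export RCLONE_CONFIG_TARGET_SSE_CUSTOMER_KEY_MD5=xxx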

A log from the command with the -vv flag

A full listing will take several days, so this might not be very useful at the moment:

[2025-05-07 14:57:00 UTC] INFO: START rclone sync from https://s3.xxx.xxx.net/prod-bucket to https://objectstore.xxx.xxx/prod-bucket
[2025-05-07 14:57:00 UTC] INFO: Executing command: rclone sync   source:"prod-bucket"/   target:"prod-bucket"/   --retries=3   --low-level-retries 10   --log-level=DEBUG   --use-mmap   --metadata   --transfers=50   --checkers=8   --checksum   --s3-use-multipart-etag=true   --multi-thread-cutoff=256Mi   --s3-chunk-size=5Mi
2025/05/07 14:57:00 DEBUG : Configuration directory could not be created and will not be used: mkdir /config: permission denied
2025/05/07 14:57:00 DEBUG : rclone: Version "v1.69.2" starting with parameters ["rclone" "sync" "source:prod-bucket/" "target:prod-bucket/" "--retries=3" "--low-level-retries" "10" "--log-level=DEBUG" "--use-mmap" "--metadata" "--transfers=50" "--checkers=8" "--checksum" "--s3-use-multipart-etag=true" "--multi-thread-cutoff=256Mi" "--s3-chunk-size=5Mi"]
2025/05/07 14:57:00 DEBUG : Creating backend with remote "source:prod-bucket/"
2025/05/07 14:57:00 NOTICE: Config file "/.rclone.conf" not found - using defaults
...
[setting defaults with env]
...
2025/05/07 14:57:00 DEBUG : fs cache: renaming cache item "target:prod-bucket/" to be canonical "target{DaWQt}:prod-bucket"
2025/05/07 14:58:00 INFO  : 
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       1m0.0s

2025/05/07 14:59:00 INFO  : 
Transferred:   	          0 B / 0 B, -, 0 B/s, ETA -
Elapsed time:       2m0.0s

It is a known problem, addressed in the latest beta (v1.70).

Here are more details:

It has already been merged into the main branch, so feel free to download the latest beta and try it.
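
For example, a couple of ways to pick up the beta, depending on how your job image is built (just a sketch, adapt to your setup):

# In a Kubernetes/cron job, pointing the container at the rolling beta image is usually simplest:
#   image: rclone/rclone:beta
# On a plain host install, rclone can update itself to the latest beta build:
rclone selfupdate --beta
rclone version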


Thanks, that seemed to work! Memory usage is much more stable now (tested with a test bucket of ~300K objects, though; I set --list-cutoff to 100K and the listing was written to disk instead of being kept in memory). Any indication of when this feature will make it out of beta into the 1.70 release?
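
For reference, the test command was roughly the following sketch (the test bucket name is a placeholder, and I have not re-checked whether --fast-list is still worth keeping alongside the new flag, so it is left out here):

rclone sync source:"test-bucket"/ target:"test-bucket"/ --retries=3 --low-level-retries 10 --log-level=INFO --metadata --transfers=50 --checkers=8 --checksum --s3-use-multipart-etag=true --multi-thread-cutoff=256Mi --s3-chunk-size=5Mi --list-cutoff=100000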

You can see such indications here:

It is already overdue, so it should be soon.

