Memory usage when syncing a large number of files in a GCS bucket

What is the problem you are having with rclone?

I’m seeing memory usage exceeding my available memory when using rclone, leading to OOM issues and my system killing the rclone process after 30 minutes to 2 hours.

The memory usage just creeps upward, plateaus around 90%, and at some point the system kills the process.

I’ve used:

  • --use-mmap
  • --list-cutoff 10000
  • --max-buffer-memory=1G
  • export GOGC=20

to try to limit memory consumption, but nothing seems to help much. The bucket in question contains millions of files in a single directory, which I believe might be the problem, but I was under the impression that --list-cutoff in particular should help with that.

Run the command 'rclone version' and share the full output of the command.

rclone v1.70.3

  • os/version: unknown
  • os/kernel: 4.4.302+ (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.24.4
  • go/linking: static
  • go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Google Cloud Storage (GCS buckets)

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone copy GCS:data /volume1/gcs/data/ \
  --config /volume1/data/rclone.conf \
  --max-buffer-memory=1G \
  --use-mmap \
  --checkers 16 \
  --transfers 8 \
  --list-cutoff 10000 \
  --gcs-decompress -P \
  --log-file /volume1/data/log.txt

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

[GCS]
type = google cloud storage
project_number = XXX
service_account_file = /volume1/data/service-acc.json
bucket_policy_only = true
location = XXX

What’s the actual memory on the system?

I have 4GB of memory in total, of which about 3GB is free; rclone uses up most of it during its run (only about 100-200 MB stays free).

Can you test on the latest beta as well? I think this might be related too: Abnormal Memory Consumption After Writing and Deleting 400K Files on Ceph RGW · Issue #8683 · rclone/rclone

If it is millions of files in a directory then this should fix it… however, GCS doesn't have the underlying ListP primitive needed to enable it yet.
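For context, the idea behind ListP is to stream directory entries to a callback one page at a time instead of materialising the entire listing in memory, so peak usage is bounded by the page size rather than the directory size. A minimal, self-contained Go sketch of the pattern — the names here are illustrative, not rclone's actual fs.ListP API:

```go
package main

import "fmt"

// pageCallback receives one page of entries at a time, so the caller never
// holds more than one page in memory. (Hypothetical name; rclone's real
// callback type lives in its fs package.)
type pageCallback func(entries []string) error

// listPaged simulates a backend streaming a huge directory in fixed-size
// pages, invoking the callback per page rather than accumulating everything.
func listPaged(total, pageSize int, cb pageCallback) error {
	page := make([]string, 0, pageSize)
	for i := 0; i < total; i++ {
		page = append(page, fmt.Sprintf("object-%07d", i))
		if len(page) == pageSize {
			if err := cb(page); err != nil {
				return err
			}
			page = page[:0] // reuse the backing array; nothing is retained
		}
	}
	if len(page) > 0 {
		return cb(page) // final partial page
	}
	return nil
}

func main() {
	pages, objects := 0, 0
	listPaged(25000, 10000, func(entries []string) error {
		pages++
		objects += len(entries)
		return nil
	})
	fmt.Println(pages, objects) // 3 25000
}
```

Without this primitive, a backend has to return the whole directory as one slice, which with millions of objects is exactly the multi-gigabyte accumulation described above.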

It isn't too hard to add though, if you make an issue in GitHub we can discuss.

So that parameter doesn't actually work for GCS? I do get log output about batching, though. I'll open a ticket for this.

Edit:
I've opened an issue on GitHub. I'd also be open to sponsoring this feature if it isn't a huge amount of work, and if that's something you are open to.


Thanks for making the issue @Elysium. I don't think it is a great deal of work to implement, and if you'd like to sponsor the feature please drop us an email at sales@rclone.com and we can discuss - thank you.
