Optimizing performance on large buckets

What is the problem you are having with rclone?

I’m trying to copy a large bucket from AWS S3 to Cloudflare R2 and struggling to get acceptable performance. My bucket is 4TB and has about 3.5m objects, but almost none of them have changed since the previous copy, and this is the case I want to optimize.

Currently this takes around 2 hours with rclone. I can sync the same bucket into Google using STS in a few minutes (yes, I appreciate this isn’t quite the same thing).

I am already using --fast-list --size-only

I’m running the command on an AWS EC2 VM, and I have upgraded it to t3.large so it can run without paging. It is not hammering the RAM or CPU while running.

As far as I can see, the ‘list’ phase is very quick on the source bucket but much slower on the destination.

I am trying to investigate the max-age and no-traverse options and I have a smaller bucket of 100k objects that I have been doing some tests with:

  • without either, it syncs in less than a minute
  • with max-age, it takes over an hour; listing both the source and the destination is very slow
  • with no-traverse, it takes around 30 minutes; listing the source is quick but the check phase is slow
  • with both max-age and no-traverse, it takes around 40 minutes; listing the source is slower than the check phase would be without max-age

I haven’t tried these on my main bucket, but on the basis of these tests either flag is going to make it an order of magnitude slower, when I was expecting them to make it faster.
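
For anyone wanting to reproduce the comparison, here is a rough sketch of how the four runs could be set up against a small test bucket. The remote paths `aws:test-bucket` and `cloudflare:test-bucket` and the `7d` value for max-age are placeholders chosen for illustration, not values from the tests above, and `--dry-run` is added on the assumption that only the listing and checking phases are of interest. By default the script only prints the commands; setting `RUN=1` would execute and time them.

```shell
#!/bin/sh
# Sketch: print (or run) the four flag combinations from the list above.
# SRC, DST and MAX_AGE are placeholders, not values from the real setup.
SRC="${SRC:-aws:test-bucket}"
DST="${DST:-cloudflare:test-bucket}"
MAX_AGE="${MAX_AGE:-7d}"

# Echo the extra flags for a named variant.
variant_flags() {
    case "$1" in
        baseline)    echo "" ;;
        max-age)     echo "--max-age $MAX_AGE" ;;
        no-traverse) echo "--no-traverse" ;;
        both)        echo "--max-age $MAX_AGE --no-traverse" ;;
        *)           echo "unknown variant: $1" >&2; return 1 ;;
    esac
}

# Build the full command line for a variant; --dry-run avoids copying anything.
build_cmd() {
    flags=$(variant_flags "$1") || return 1
    echo "rclone copy $SRC $DST --fast-list --size-only --checkers 16 --dry-run $flags" | sed 's/ *$//'
}

# Print each command; with RUN=1 also execute it and report the elapsed time.
for v in baseline max-age no-traverse both; do
    cmd=$(build_cmd "$v")
    echo "$v: $cmd"
    if [ "${RUN:-0}" = "1" ]; then
        start=$(date +%s)
        sh -c "$cmd"
        end=$(date +%s)
        echo "$v took $((end - start))s"
    fi
done
```

Running each variant a couple of times would help separate the effect of the flags from ordinary variation in API latency.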

Run the command 'rclone version' and share the full output of the command.

rclone v1.73.5

  • os/version: amazon 2023.11.20260413 (64 bit)
  • os/kernel: 6.1.166-197.305.amzn2023.x86_64 (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.25.9
  • go/linking: static
  • go/tags: none

Which cloud storage system are you using? (eg Google Drive)

From AWS S3 to Cloudflare R2

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone copy aws:source-bucket cloudflare:dest-bucket --transfers 16 --checkers 16 --multi-thread-streams 8 --s3-chunk-size 64M --fast-list --size-only --log-level=INFO --stats 1m

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

[aws]
type = s3
provider = AWS
env_auth = false
region = eu-west-1
location_constraint = eu-west-1
access_key_id = XXX
secret_access_key = XXX

[cloudflare]
type = s3
provider = Cloudflare
access_key_id = XXX
secret_access_key = XXX
region = auto
endpoint = https://xxxx.r2.cloudflarestorage.com

A log from the command that you were trying to run with the -vv flag

I can attach logs later if required

Welcome to the forum.

Check out:
https://forum.rclone.org/t/recommendations-for-using-rclone-with-a-minio-10m-files/14472
https://forum.rclone.org/t/how-to-sync-s3-with-millions-of-files-at-root/36703

Thanks for the fast response, it’s an interesting approach.

However, I see that just running rclone lsf on the destination bucket takes almost as long as the current sync, which matches exactly what I see in the progress stats of my normal sync (with fast-list and size-only).
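
One way to compare the two sides in isolation is to time a recursive listing of each bucket on its own. The sketch below is only an illustration: the helper name `time_listing` and the `RCLONE` override variable are made up here, while `lsf -R --files-only --fast-list` are standard rclone options.

```shell
#!/bin/sh
# RCLONE can be overridden, e.g. to point at a specific binary (hypothetical knob).
RCLONE="${RCLONE:-rclone}"

# Count the objects under a remote path and report how long the listing took.
time_listing() {
    start=$(date +%s)
    count=$("$RCLONE" lsf -R --files-only --fast-list "$1" | wc -l | tr -d ' ')
    end=$(date +%s)
    echo "$1: $count objects listed in $((end - start))s"
}

# Usage (remote names taken from the command earlier in the thread):
#   time_listing aws:source-bucket
#   time_listing cloudflare:dest-bucket
```

If the destination listing alone accounts for most of the sync time, that would support the idea that avoiding the destination traversal is where the savings are.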

The only way to avoid listing the whole destination bucket would seem to be no-traverse, which in turn means I have to use max-age. However, max-age hugely slows down the listing of the source bucket (from 10,000 files per sec to 40 per sec), so that’s a non-starter.
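
To put those two rates in perspective, here is a quick back-of-the-envelope calculation. It assumes the rough figures quoted above (10,000 and 40 files per second) would hold for the whole run, which was not measured on the main bucket; `listing_seconds` is just a helper name for this sketch.

```shell
#!/bin/sh
# Seconds needed to process a number of objects at a given rate (integer math).
listing_seconds() {
    echo $(( $1 / $2 ))
}

echo "100k objects at 40/s:     $(listing_seconds 100000 40)s"      # 2500s, about 42 minutes
echo "3.5m objects at 40/s:     $(listing_seconds 3500000 40)s"     # 87500s, about 24 hours
echo "3.5m objects at 10,000/s: $(listing_seconds 3500000 10000)s"  # 350s, about 6 minutes
```

The first figure is in the same range as the 40 minutes to an hour seen on the 100k test bucket, and the second shows why the slow rate rules this approach out for 3.5m objects.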

OK, I think I have figured it out. The max-age parameter seems to cause rclone to send a HEAD request for each file in order to get its age, but I can prevent that by using --use-server-modtime, which uses the last-modified time already returned in the bucket listing.

That makes it fast to check, but unfortunately if there are any changes to copy, the copy fails with a strange error:

Failed to copy: failed to prepare upload: operation error S3: CreateBucket, https response error StatusCode: 403, RequestID: , HostID: , api error AccessDenied: Access Denied

I’m not sure why it’s trying to create the bucket when the bucket definitely exists. The error only happens when both max-age and no-traverse are specified. I guess that might be a bug?

Either way, there is yet another flag, --s3-no-check-bucket, which seems to fix it.
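
The 403 on CreateBucket suggests the R2 credentials are not allowed to create buckets, which is one of the situations this option is documented for. As a sketch, the same setting could be applied to just the `cloudflare` remote through rclone's per-remote environment variable convention instead of the command line. Whether this behaves identically to the flag in this exact setup is an assumption and has not been tested here.

```shell
#!/bin/sh
# Per-remote override for the remote named "cloudflare" in the config above.
# Intended to have the same effect as passing --s3-no-check-bucket.
export RCLONE_CONFIG_CLOUDFLARE_NO_CHECK_BUCKET=true

# Alternative: persist the option in the config file (not run here).
#   rclone config update cloudflare no_check_bucket true
```

Scoping the option to the destination remote keeps the source remote's behaviour unchanged.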

So now I have the listing and checking phase of the full 3.5m files down to about 5 minutes.

My final command is

rclone copy aws:source-bucket cloudflare:dest-bucket --max-age 7d --use-server-modtime --s3-no-check-bucket --no-traverse --transfers 16 --checkers 16 --multi-thread-streams 8 --s3-chunk-size 64M --fast-list --size-only --log-level=INFO --stats 1m
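
For readability, the same command can be sketched as a small function that builds the argument list one group at a time, with a comment on what each flag is understood to do. The function name `final_args` is made up for this example, and the flag order differs from the one-liner above.

```shell
#!/bin/sh
# Build the argument list for the final command, one group of flags at a time.
final_args() {
    set -- copy aws:source-bucket cloudflare:dest-bucket
    # Only consider source objects modified within the last 7 days.
    set -- "$@" --max-age 7d
    # Use the server's last-modified time instead of per-object metadata,
    # which avoids a HEAD request per object.
    set -- "$@" --use-server-modtime
    # Do not list the whole destination; check candidate objects individually.
    set -- "$@" --no-traverse
    # Do not check for, or try to create, the destination bucket.
    set -- "$@" --s3-no-check-bucket
    # Listing and comparison behaviour.
    set -- "$@" --fast-list --size-only
    # Parallelism and multipart upload tuning.
    set -- "$@" --transfers 16 --checkers 16 --multi-thread-streams 8 --s3-chunk-size 64M
    # Logging.
    set -- "$@" --log-level=INFO --stats 1m
    printf '%s\n' "$*"
}

# Usage (relies on word splitting, which is fine since no argument contains spaces):
#   rclone $(final_args)
```

One consequence of this approach is that an object changed more than 7 days before a run is never considered, so runs presumably need to happen more often than the max-age window, with an occasional full copy without max-age as a safety net.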