Difference between rclone sync for s3 and aws s3 sync

What is the problem you are having with rclone?

I am syncing six 10 GB files to S3 with rclone sync, but rclone is about 120 seconds slower than aws s3 sync for the same objects. I'm trying to understand the difference between rclone sync for S3 and aws s3 sync: where is this delay coming from? I suspected the default checksum done by rclone, but it seems aws s3 sync runs a checksum by default as well, so why is there such a difference? I've posted the commands I ran for aws s3 sync and rclone sync, with benchmark results, to show the performance difference.

On top of that, I was wondering if rclone mount runs default checksums as well.

Run the command 'rclone version' and share the full output of the command.

rclone v1.62.2
• os/version: debian 10.13 (64 bit)
• os/kernel: 4.19.0-22-cloud-amd64 (x86_64)
• os/type: linux
• os/arch: amd64
• go/version: go1.20.2
• go/linking: static
• go/tags: none

Which cloud storage system are you using? (eg Google Drive)

S3

The command you were trying to run (eg rclone copy /tmp remote:tmp)

Here are some benchmarks for syncing with different parameters:

aws configure set default.s3.max_concurrent_requests 10
aws s3 sync src s3://dst_bucket
612.961856s

rclone sync --transfers=10 src s3:dst_bucket
733.26336s

rclone sync --ignore-checksum --transfers=10 src s3:dst_bucket
672.10352s

S3 upload performance is always a trade-off between speed and resource usage (mainly RAM). I guess aws s3 sync has more aggressive defaults than rclone.
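For comparison, the aws CLI exposes similar knobs via aws configure; these are standard s3 settings, and the values below are only illustrative, not a recommendation:

aws configure set default.s3.max_concurrent_requests 10
aws configure set default.s3.multipart_chunksize 16MB
aws configure set default.s3.multipart_threshold 8MB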

On the rclone side, you can control upload performance using the --s3-upload-concurrency and --s3-chunk-size flags.

As per the docs:

Increasing --s3-upload-concurrency will increase throughput (8 would be a sensible value) and increasing --s3-chunk-size also increases throughput (16M would be sensible). Increasing either of these will use more memory. The default values are high enough to gain most of the possible performance without using too much memory.
Multipart uploads will use --transfers * --s3-upload-concurrency * --s3-chunk-size extra memory.
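As a worked sketch of that formula, using the numbers from the benchmark above (--transfers=10 with the default --s3-upload-concurrency 4 and --s3-chunk-size 5M):

10 transfers * 4 concurrency * 5 MiB chunks ≈ 200 MiB of upload buffers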

For a single 10 GB file you can try the "high performance" commands below, and you should see throughput increase:

2.5 GB of RAM used to upload:
rclone copy --progress --s3-upload-concurrency 40 --s3-chunk-size 64M 10gb.zip remote:bucket

5 GB of RAM used to upload:
rclone copy --progress --s3-upload-concurrency 80 --s3-chunk-size 64M 10gb.zip remote:bucket

10 GB of RAM used to upload:
rclone copy --progress --s3-upload-concurrency 160 --s3-chunk-size 64M 10gb.zip remote:bucket
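Those RAM figures fall straight out of the memory formula above: a single file means a single transfer, so memory ≈ 1 * concurrency * chunk size:

1 * 40 * 64 MiB = 2560 MiB ≈ 2.5 GB
1 * 80 * 64 MiB = 5120 MiB ≈ 5 GB
1 * 160 * 64 MiB = 10240 MiB ≈ 10 GB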

If you used the above settings for a sync with 10 transfers, it would use 10x more RAM.

Rclone's default chunk size is 5 MiB and upload concurrency is 4, so it uses about 20 MiB of RAM per transfer. The defaults have to account for people sometimes running rclone on very limited systems like a Raspberry Pi, but you can easily change these settings using the flags mentioned above, e.g. for your test:

rclone sync --transfers=10 src s3:dst_bucket --s3-upload-concurrency 8 --s3-chunk-size 16M

It will use about 1.2 GB of RAM during the transfer (10 * 8 * 16 MiB ≈ 1.25 GiB).
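On your checksum suspicion: per the S3 backend docs, rclone normally calculates an MD5 of each file before uploading so it can store it in the object metadata, which can delay the start of uploads for large files. If you want to rule that out (at the cost of the stored integrity metadata), there is a dedicated flag, separate from --ignore-checksum, e.g.:

rclone sync --transfers=10 --s3-upload-concurrency 8 --s3-chunk-size 16M --s3-disable-checksum src s3:dst_bucket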

