S3 -> S3 Large File copy

What is the problem you are having with rclone?

I'm currently trying to copy a largish file (17GB) from one s3 bucket to another which both reside in the same region. I'm running this in a Kubernetes job running with a pod that has up to 10GB of RAM and 2 vCPUs. It appears that rclone only ever uses up 8MB of memory and practically no CPU. I wasn't sure if this was because it was a remote to remote transfer or something involved with my settings.

Ultimately, it doesn't appear to affect the copy speed at all no matter how much I play with transfers, s3-upload-concurrency, or s3-chunk-size. The copy always seems to take around 10-12 minutes.

What is your rclone version (output from rclone version)

rclone v1.52.3
- os/arch: linux/amd64
- go version: go1.14.7 

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone copy s3:bucket1/dir/file.txt s3:bucket2/dir/ --transfers 16 --s3-chunk-size 64M --s3-upload-concurrency 16 --log-level DEBUG

The rclone config contents with secrets removed.

- name: RCLONE_CONFIG_S3_TYPE
  value: s3

A log from the command with the -vv flag

https://pastebin.com/BYTgFZin

hello and welcome to the forum,

  • if you are transferring a single file, then flags like --transfers have no effect.

  • the transfer is limited by the internet speed of the computer running rclone,
    so no amount of tweaking can change that hard limit.
    what is the internet speed of the pod?

  • flags like --s3-upload-concurrency have a limited effect; there is often not much difference between the default of 4 and 16.

Rclone will be doing a server side copy I think. If you use -v then you can see for sure in the log.

Rclone doesn't use concurrency for server side copies - it could but I didn't implement it yet.

So you are limited by how fast S3 copies data internally.

17G in 10 mins is about 28MB/s, which isn't brilliant.
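
For reference, that figure can be checked with a quick shell sketch. It assumes decimal units (1 GB = 1000 MB) and uses the 10 and 12 minute timings from the original post. The variable names are only for illustration.

```shell
# Rough throughput for a 17 GB copy, using integer arithmetic (results are rounded down).
size_mb=$((17 * 1000))   # 17 GB in MB, assuming decimal units
fast_s=$((10 * 60))      # 10 minutes in seconds
slow_s=$((12 * 60))      # 12 minutes in seconds
rate_fast=$((size_mb / fast_s))
rate_slow=$((size_mb / slow_s))
echo "10 min: ~${rate_fast} MB/s, 12 min: ~${rate_slow} MB/s"
```

So the reported 10-12 minute range works out to somewhere in the low-to-high 20s of MB/s.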

You can disable server side copies with --disable Copy if you want to download and upload. That will cost you transfer fees though.
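
As a sketch, the command from the first post with server side copies disabled might look like the following. The remote name, paths and tuning flags are the ones from the original command. The leading echo makes this a dry print only, so remove it to actually run the transfer.

```shell
# Flags from the original post, plus --disable Copy to force download + upload.
flags="--disable Copy --s3-chunk-size 64M --s3-upload-concurrency 16 -v"
# Dry print only; drop the leading "echo" to run it for real.
echo rclone copy s3:bucket1/dir/file.txt s3:bucket2/dir/ $flags
```

With the data flowing through the pod, --s3-chunk-size and --s3-upload-concurrency should start to matter for the upload side, and the pod's network speed becomes the limit.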

@ncw rclone supports the --s3-copy-cutoff flag for multipart server-side copies. For multipart copies, will rclone copy the chunks in parallel?

I don't think it does. It could, it just hasn't been done yet.
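
To give a rough idea of what that means for this file: as far as I know, a multipart server side copy uses parts the size of --s3-copy-cutoff, and S3 caps a copied part at about 5 GB. Assuming a 5 GB cutoff (an assumption, so check the default for your rclone version), the part count can be sketched as:

```shell
# Number of parts for a multipart server side copy, rounding up.
size_gb=17     # file size from the original post
cutoff_gb=5    # assumed --s3-copy-cutoff, not read from any config
parts=$(( (size_gb + cutoff_gb - 1) / cutoff_gb ))
echo "${parts} parts, copied one after another"
```

A smaller cutoff gives more parts, but if they are not copied in parallel that is unlikely to make the copy faster.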
