Seeking Ideal Rclone Settings for Near Real-time Upload to AWS

What is the problem you are having with rclone?

When I run my rclone transfer from an HTTP source, I am getting just 1.5 MB/s and I need this to run faster. My upload is taking 3-5 minutes per 400 MB file, and I really need to get that all the way down to 3-5 seconds.

My baseline burst bandwidth on the EC2 instance is 5 GB/s, and I'm hoping to find the optimal settings to get this copy close to that speed. I cannot easily clone the whole directory, as it contains a bunch of files with irregular strings in their names that I don't want in AWS.

I have looked into using a chunker overlay and have one configured in my config file (called overlay; see below) to run on my source, but this doesn't seem to be boosting my upload either. The chunker is configured to chunk anything 1M or larger in size.

What is your rclone version (output from rclone version)

1.53.2

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Amazon Linux 2 x64

Which cloud storage system are you using? (eg Google Drive)

AWS

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone copy --ignore-existing -v prs_idx aws_prs --buffer-size=512M --tpslimit=40 --transfers=20 --tpslimit-burst=10

The rclone config contents with secrets removed.

[horelS3]
type = s3
env_auth = false
access_key_id = XXX
secret_access_key = XXX
#region = other-v2-signature
endpoint = https://pando-rgw01.chpc.utah.edu
location_constraint =

[AWS test]
type = s3
provider = AWS
env_auth = false
access_key_id = XXX
secret_access_key = XXX
region = us-west-1

[NOMADs]
type = http
url = https://nomads.ncep.noaa.gov/pub/data/nccf/com/hrrr/prod/

[NCEP]
type = http
url = https://ftpprd.ncep.noaa.gov/data/nccf/com/hrrr/prod/

[AWS Grib]
type = s3
provider = AWS
env_auth = true
region = us-east-1

[overlay]
type = chunker
remote = NCEP
chunk_size = 1M
hash_type = md5


A log from the command with the -vv flag

2020/11/11 23:09:53 INFO  :
Transferred:      265.996M / 385.768 MBytes, 69%, 1.479 MBytes/s, ETA 1m21s
Transferred:            0 / 1, 0%
Elapsed time:       3m0.5s
Transferring:
 *                     hrrr.t21z.wrfprsf15.grib2: 68% /385.768M, 1.528M/s, 1m18s

2020/11/11 23:10:53 INFO  :
Transferred:      352.996M / 385.768 MBytes, 92%, 1.471 MBytes/s, ETA 22s
Transferred:            0 / 1, 0%
Elapsed time:       4m0.5s
Transferring:
 *                     hrrr.t21z.wrfprsf15.grib2: 91% /385.768M, 1.448M/s, 22s

2020/11/11 23:11:15 INFO  : hrrr.t21z.wrfprsf15.grib2: Copied (new)
2020/11/11 23:11:15 INFO  :
Transferred:      385.768M / 385.768 MBytes, 100%, 1.473 MBytes/s, ETA 0s
Transferred:            1 / 1, 100%
Elapsed time:      4m22.5s


2020/11/11 23:11:15 INFO  : Starting HTTP transaction limiter: max 40 transactions/s with burst 10
2020/11/11 23:11:15 INFO  :
Transferred:             0 / 0 Bytes, -, 0 Bytes/s, ETA -
Checks:                 1 / 1, 100%
Elapsed time:         0.5s

Is it the source limiting the speed of the transfer? rclone to S3 on EC2 is blazing fast!

Note that --tpslimit applies to both the source (HTTP) and the destination (S3), so you might want to raise it a bit. In fact, that is probably the source of the speed limit.
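As a rough sketch of that advice, the command below raises the transaction limit and also increases the S3 multipart upload concurrency and chunk size, which rclone supports via --s3-upload-concurrency and --s3-chunk-size. The remote names and paths here are placeholders, and the specific values are untested starting points, not tuned recommendations:

```shell
# Sketch only: placeholder remotes (src: and dest:) and example values.
# --tpslimit raised so HTTP reads and S3 writes are not throttled together;
# --s3-upload-concurrency / --s3-chunk-size parallelize each multipart upload.
rclone copy src:path dest:bucket/path \
  --ignore-existing -v \
  --transfers=20 \
  --tpslimit=200 \
  --s3-upload-concurrency=8 \
  --s3-chunk-size=64M
```

With a 400 MB file, a 64M chunk size gives about 6-7 parts, so a concurrency of 8 lets the whole upload run in parallel.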

Chunker won't help you speed things up, I don't think. Only use it if you actually want chunks.

There seems to be a : missing from your config file - I assume that is a copy/paste problem? It should be remote = NCEP: - otherwise rclone treats NCEP as a local directory.
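For reference, the overlay section with the colon added would read:

```ini
[overlay]
type = chunker
remote = NCEP:
chunk_size = 1M
hash_type = md5
```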
