Getting rate limited before advertised limit on s3 compatible object storage

What is the problem you are having with rclone?

Let me preface by saying that I am fairly new to using rclone.

I am getting rate limited before the advertised rate limit on Linode (S3-compatible) object storage. They advertise a rate limit of 750 requests / second, but the maximum I was able to reach was 125 requests / second. I reached out to support. Among other possible reasons, one of the things they mentioned was this:

Your client may be making more requests than log entries, abstracting multiple requests into a single log entry.

I am getting rate limited as soon as I try for 126 requests / second. I have configured the rclone parameters (to the best of my knowledge) to make sure that I am only sending 126 transactions / second (and each transaction is just a single PUT request). The --dump headers logs also seem to show only one request and one response for each file.

This is probably due to some other limit on Linode's side, but just wanted to confirm that there are no additional requests being sent other than the one that I am seeing in the logs. It would make sense if there were 6 requests being made for each file upload (125 * 6 = 750).

Run the command 'rclone version' and share the full output of the command.

rclone v1.58.1
- os/version: Microsoft Windows 11 Home Single Language 21H2 (64 bit)
- os/kernel: 10.0.22000.675 (x86_64)
- os/type: windows
- os/arch: amd64
- go/version: go1.17.9
- go/linking: dynamic
- go/tags: cmount

Which cloud storage system are you using? (eg Google Drive)

Linode Object Storage (s3 compatible).

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone copy <local_dir> <remote>:<bucket> --s3-no-head --s3-no-head-object --s3-no-check-bucket --no-check-dest --retries 1 --low-level-retries 3  --local-no-check-updated --header-upload "Content-Encoding: gzip" --timeout 10s --contimeout 10s --transfers 126 --tpslimit 126 --tpslimit-burst 0 --max-backlog 126 --dump headers --log-file=log.txt

The rclone config contents with secrets removed.

[redacted]
type = s3
provider = Other
access_key_id = <redacted>
secret_access_key = <redacted>
endpoint = us-southeast-1.linodeobjects.com
acl = public-read

A log from the command with the -vv flag

2022/05/29 11:28:50 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2022/05/29 11:28:50 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2022/05/29 11:28:50 DEBUG : HTTP REQUEST (req 0xc000d05b00)
2022/05/29 11:28:50 DEBUG : PUT <redacted>
Host: us-southeast-1.linodeobjects.com
User-Agent: rclone/v1.58.1
Content-Length: 4
content-encoding: gzip
content-md5: <redacted>
content-type: text/plain; charset=utf-8
x-amz-acl: public-read
x-amz-meta-mtime: 1653798981.0739441
Accept-Encoding: gzip

2022/05/29 11:28:50 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2022/05/29 11:28:50 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2022/05/29 11:28:50 DEBUG : HTTP RESPONSE (req 0xc0000cd600)
2022/05/29 11:28:50 DEBUG : HTTP/1.1 200 OK
Content-Length: 0
Accept-Ranges: bytes
Connection: keep-alive
Date: Sun, 29 May 2022 05:58:51 GMT
Etag: <redacted>
X-Amz-Request-Id: <redacted>

Update (reply from Linode support):

This may be due to how your client is connecting to our Object Storage system. Our Object Storage clusters do have multiple endpoints as you can see here:

# dig +short us-southeast-1.linodeobjects.com

139.177.206.120
194.195.208.174
194.195.213.250
139.177.204.133
194.195.215.215
194.195.215.57

If rclone's behavior as a client is to connect to one IP address of the six in the cluster instead of balancing the load across all endpoints, it is likely limited to 1/6th of the total possible rate limit for a bucket in that cluster (750/6=125).
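The arithmetic in that reply can be written out as a small Python sketch. It assumes the bucket limit is split evenly across the endpoints, which is the support team's hypothesis rather than documented behaviour, and `effective_limit` is an illustrative name, not part of any tool.

```python
def effective_limit(bucket_limit, endpoints_total, endpoints_used):
    """Ceiling a client can reach if the limit is split evenly per endpoint."""
    if not 1 <= endpoints_used <= endpoints_total:
        raise ValueError("endpoints_used must be between 1 and endpoints_total")
    return bucket_limit * endpoints_used / endpoints_total

# 750 requests / second advertised, six IPs returned by dig.
for used in (1, 3, 6):
    print(used, effective_limit(750, 6, used))
```

Under that assumption a client pinned to one IP tops out at 125 requests / second, which matches the ceiling observed with rclone.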

I'm not sure what rclone's behaviour will be here. It will probably use just one of the IPs for the TTL of the entry but I'm not sure. I think that is how most programs will work.

Can you check with netstat -tnp after rclone has been running for a bit? That will show which IPs rclone is connecting to.
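If the netstat output is long, a short Python sketch like the following could tally the remote endpoints. It assumes each connection line contains the remote address as `IP:443` and that the local side is a client (not itself listening on 443); the sample lines are hypothetical, and `remote_ip_counts` is an illustrative name. On Windows, `netstat -n` prints comparable columns.

```python
import re
from collections import Counter

# Matches an IPv4 address followed by :443 (the remote HTTPS endpoint).
REMOTE_HTTPS = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):443\b")

def remote_ip_counts(netstat_output):
    """Count connections per remote IP on port 443 in netstat-style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        match = REMOTE_HTTPS.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Hypothetical output lines, not taken from the original report.
sample = """\
tcp  0  0 192.168.1.10:50001  194.195.215.215:443  ESTABLISHED 4242/rclone
tcp  0  0 192.168.1.10:50002  194.195.215.215:443  ESTABLISHED 4242/rclone
tcp  0  0 192.168.1.10:50003  10.0.0.5:22          ESTABLISHED 900/ssh
"""
print(remote_ip_counts(sample))
```

A single key in the result means every connection is going to one endpoint.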

Rclone is connecting to a single IP address (194.195.215.215).

Rclone doesn't really pick an IP to connect to; that is done by the local resolver on the system.

Generally, it'll get one IP and stick with it unless you do something to adjust the behavior. Windows will keep one IP cached for a period of time to reduce lookups.

This is because rclone is using persistent connections (which speeds up HTTP transactions by needing fewer round trips). Rclone should pick a new IP address every 15 minutes (which is how long the IP takes to expire).

This is mentioned in a Go issue here: net/http: Client round-robin across persistent connections · Issue #34511 · golang/go · GitHub

That also gives a hint about fixing it.

Try v1.59.0-beta.6166.ce988e6a1.fix-keepalives on branch fix-keepalives (uploaded in 15-30 mins)

With this flag I just added

  --disable-http-keep-alives   Disable HTTP keep-alives and use each connection once.

I will test this out and update as soon as possible.

I was able to create a rudimentary Python script that cycles through the 6 IP addresses. Using asyncio and aiohttp, I built an upload script that achieved more than 325 files / second. I still could not hit the advertised limit of 750 requests / second, due to my AWS4-HMAC-SHA256 implementation, the limit on concurrent TCP connections, etc. But it did confirm that a round-robin approach gives upload rates greater than 125 requests / second.

import socket
from itertools import cycle

# Resolve every address published for the cluster hostname (TCP only,
# so each IP is not returned once per socket type).
ais = socket.getaddrinfo("us-southeast-1.linodeobjects.com", 443, type=socket.SOCK_STREAM)
# result[-1] is the sockaddr tuple; its first element is the IP string.
# dict.fromkeys drops duplicates while keeping the resolver's order.
linode_cluster_ips = list(dict.fromkeys(result[-1][0] for result in ais))
ip_addr = cycle(linode_cluster_ips)

# next(ip_addr) returns one of linode_cluster_ips in a round-robin manner
...
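For anyone trying the same idea with only the standard library, here is a hedged sketch of how a round-robin picker could be paired with a TLS connection to a specific IP. It is not the asyncio/aiohttp script described above; `make_picker` and `open_tls` are illustrative names, and the IPs are copied from the dig output earlier in the thread.

```python
import socket
import ssl
from itertools import cycle

HOSTNAME = "us-southeast-1.linodeobjects.com"

def make_picker(ips):
    """Return a function that yields the given IPs in round-robin order."""
    rotation = cycle(ips)
    return lambda: next(rotation)

def open_tls(ip, hostname=HOSTNAME, port=443, timeout=10):
    """Connect to a specific IP while still presenting the real hostname.

    server_hostname sets SNI and is what the certificate is checked against,
    so verification still works although the TCP connection targets an IP.
    """
    context = ssl.create_default_context()
    raw = socket.create_connection((ip, port), timeout=timeout)
    try:
        return context.wrap_socket(raw, server_hostname=hostname)
    except Exception:
        raw.close()
        raise

# No network traffic here: this only shows the rotation order.
pick = make_picker(["139.177.206.120", "194.195.208.174", "194.195.213.250"])
print([pick() for _ in range(4)])
```

Requests sent over such a connection would still need `Host: us-southeast-1.linodeobjects.com` so that routing and request signing match what the service expects.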

Let me know! I'm not 100% sure it will work but it might do!

Sorry for the delay. I was able to get 164-170 requests / second (not consistent) with --disable-http-keep-alives. Above that, there are a lot of errors similar to:

2022/06/01 20:57:46 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2022/06/01 20:57:46 DEBUG : 934.txt: Received error: EOF - low level retry 1/3
2022/06/01 20:57:46 DEBUG : 955.txt: Received error: EOF - low level retry 1/3
2022/06/01 20:57:46 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2022/06/01 20:57:46 DEBUG : HTTP REQUEST (req 0xc002758800)
2022/06/01 20:57:46 DEBUG : PUT /<redacted>/934.txt HTTP/1.1
Host: us-southeast-1.linodeobjects.com
User-Agent: rclone/v1.59.0-beta.6166.ce988e6a1.fix-keepalives
Content-Length: 3
Authorization: XXXX
Content-Encoding: gzip
Content-Md5: <redacted>
Content-Type: text/plain; charset=utf-8
X-Amz-Acl: public-read
X-Amz-Content-Sha256: UNSIGNED-PAYLOAD
X-Amz-Date: 20220601T152746Z
X-Amz-Meta-Mtime: 1653948024.799911
Accept-Encoding: gzip

Raising the transfer rate to 250 will also give a lot of 503 errors:

2022/06/01 21:05:36 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2022/06/01 21:05:36 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2022/06/01 21:05:36 DEBUG : HTTP RESPONSE (req 0xc00124de00)
2022/06/01 21:05:36 DEBUG : HTTP/1.1 503 Service Temporarily Unavailable
Connection: close
Content-Length: 148
Content-Type: text/plain
Date: Wed, 01 Jun 2022 15:35:37 GMT
X-Amz-Request-Id: <redacted>
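To see how many responses were throttled in a run, the status lines in log.txt can be tallied. The Python sketch below assumes the `--dump headers` format shown in these excerpts, where each response status appears as `DEBUG : HTTP/1.1 <code> ...`; `status_counts` is an illustrative name.

```python
import re
from collections import Counter

# Matches the status line rclone prints under "HTTP RESPONSE" with --dump headers.
STATUS_LINE = re.compile(r"DEBUG : HTTP/\d(?:\.\d)? (\d{3})\b")

def status_counts(log_text):
    """Count HTTP response status codes found in an rclone debug log."""
    return Counter(STATUS_LINE.findall(log_text))

# Lines taken from the log excerpts in this thread.
sample = """\
2022/06/01 21:05:36 DEBUG : HTTP RESPONSE (req 0xc00124de00)
2022/06/01 21:05:36 DEBUG : HTTP/1.1 503 Service Temporarily Unavailable
2022/05/29 11:28:50 DEBUG : HTTP RESPONSE (req 0xc0000cd600)
2022/05/29 11:28:50 DEBUG : HTTP/1.1 200 OK
"""
print(status_counts(sample))
```

For a real run, the file contents can be passed in directly, for example `status_counts(open("log.txt", encoding="utf-8", errors="replace").read())`.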

This is the command I used:

rclone copy local_folder remote:bucket --s3-no-head --s3-no-head-object --s3-no-check-bucket --no-check-dest --retries 1 --low-level-retries 3  --local-no-check-updated --header-upload "Content-Encoding: gzip" --timeout 10s --contimeout 10s --transfers 250 --tpslimit 250 --tpslimit-burst 0 --max-backlog 250 --dump headers --disable-http-keep-alives --log-file=log.txt 

What hosts is rclone connecting to? Can you check with netstat?

All of the requests are being sent to 194.195.215.57.

So it is still ignoring the multiple IPs...

Try one of these (set only one at a time, since the second overrides the first):

export GODEBUG=netdns=go # force pure Go resolver
export GODEBUG=netdns=cgo # force native resolver (cgo, win32)

You may need to build rclone yourself to make the cgo resolver work.
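Since `export` is a Unix shell command, one shell-independent way to set the variable for a single run is to launch rclone from a small Python wrapper. This is only a sketch: `env_with_resolver` and `run_rclone` are illustrative names, it assumes rclone is on PATH, and it replaces any GODEBUG value that is already set.

```python
import os
import subprocess

def env_with_resolver(resolver, base_env=None):
    """Return a copy of the environment with GODEBUG=netdns=<resolver>."""
    if resolver not in ("go", "cgo"):
        raise ValueError("resolver must be 'go' or 'cgo'")
    env = dict(os.environ if base_env is None else base_env)
    env["GODEBUG"] = f"netdns={resolver}"
    return env

def run_rclone(args, resolver):
    """Run rclone with the chosen resolver; assumes rclone is on PATH."""
    return subprocess.run(["rclone", *args], env=env_with_resolver(resolver), check=False)

# Only builds the environment; nothing is executed here.
print(env_with_resolver("go", {"PATH": "/usr/bin"}))
```

Calling `run_rclone(["version"], "cgo")` would then start rclone with the cgo resolver requested, whether or not the binary was built in a way that honours it.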

Tried this from my Windows system.

> SET GODEBUG=netdns=go
> SET
...
DriverData=C:\Windows\System32\Drivers\DriverData
GODEBUG=netdns=go
HOMEDRIVE=C:
...

Still connecting to only one IP address (verified from Wireshark).

Command I used:

rclone copy <local_folder> <remote>:<bucket> --s3-no-head --s3-no-head-object --s3-no-check-bucket --no-check-dest --retries 1 --low-level-retries 3  --local-no-check-updated --header-upload "Content-Encoding: gzip" --timeout 10s --contimeout 10s --transfers 250 --tpslimit 250 --tpslimit-burst 0 --max-backlog 250 --dump headers --disable-http-keep-alives --log-file=log.txt 

From log file:

2022/06/05 22:11:17 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2022/06/05 22:11:17 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2022/06/05 22:11:17 DEBUG : HTTP RESPONSE (req 0xc002dfe000)
2022/06/05 22:11:17 DEBUG : HTTP/1.1 503 Service Temporarily Unavailable
Connection: close
Content-Length: 148
Content-Type: text/plain
Date: Sun, 05 Jun 2022 16:41:17 GMT
X-Amz-Request-Id: <redacted>

I don't know if this has anything to do with this issue (support seems to be present, but the latest comment is just 8 days old): net: make Resolver.PreferGo and GODEBUG=netdns=go use Go code on Windows · Issue #33097 · golang/go · GitHub

Did you try export GODEBUG=netdns=cgo? That will do something different. Whether it fixes the problem, I don't know!

That is tangentially related. I'm not sure it is a solution to your problem though.

I am using a Windows system.