Suddenly seeing rateLimitExceeded constantly


#1

I have two machines making use of a Google Drive backend.

Machine A lives in a data center and prepares files locally and then uses rclone to copy them to Drive.
Machine B mounts the drive locally at home.

These two machines have been running cron jobs happily for at least 18 months with no hiccups.

Over the weekend I noticed that an rclone job that syncs the Drive to another Drive (a backup mirror) had been hung since November.

I killed that job and tried restarting the sync, but I get 403 Rate Limit Exceeded errors constantly and the transfer rate sits at 0. I stopped the cron job to give it a rest (e.g., in case there is some kind of 24-hour ban going on).

But now Machine A simply cannot upload anything to Drive.
The instant the transfer starts, the log fills up with:

2019/02/11 11:16:13 DEBUG : rclone: Version "v1.46.0-006-gef5e1909-beta" starting with parameters ["rclone" "copy" "-c" "--stats" "3m" "--tpslimit" "3" "--transfers" "3" "--files-from=/tmp/files" "-vv" "/mnt/media/Drive" "GoogleDrive:"]
2019/02/11 11:16:13 DEBUG : Using config file from "/home/chezmojo/.config/rclone/rclone.conf"
2019/02/11 11:16:13 INFO  : Starting HTTP transaction limiter: max 3 transactions/s with burst 1
2019/02/11 11:16:13 DEBUG : pacer: Rate limited, sleeping for 1.499745568s (1 consecutive low level retries)
2019/02/11 11:16:13 DEBUG : pacer: low level retry 1/10 (error googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded)
2019/02/11 11:16:14 DEBUG : pacer: Rate limited, sleeping for 2.234574878s (2 consecutive low level retries)
2019/02/11 11:16:14 DEBUG : pacer: low level retry 2/10 (error googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded)
2019/02/11 11:16:15 DEBUG : pacer: Rate limited, sleeping for 4.7689464s (3 consecutive low level retries)
2019/02/11 11:16:15 DEBUG : pacer: low level retry 3/10 (error googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded)
2019/02/11 11:16:17 DEBUG : pacer: Resetting sleep to minimum 100ms on success
2019/02/11 11:16:22 DEBUG : pacer: Rate limited, sleeping for 1.230433586s (1 consecutive low level retries)
2019/02/11 11:16:22 DEBUG : pacer: low level retry 1/10 (error googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded)
2019/02/11 11:16:22 DEBUG : pacer: Rate limited, sleeping for 2.932044098s (2 consecutive low level retries)
2019/02/11 11:16:22 DEBUG : pacer: low level retry 2/10 (error googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded)
2019/02/11 11:16:23 DEBUG : pacer: Rate limited, sleeping for 4.806679986s (3 consecutive low level retries)
2019/02/11 11:16:23 DEBUG : pacer: low level retry 3/10 (error googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded)
2019/02/11 11:16:26 DEBUG : pacer: Resetting sleep to minimum 100ms on success

The log just goes on and on like this for hours with a transfer rate of 0:

Transferred:             0 / 0 Bytes, -, 0 Bytes/s, ETA -
Errors:                 0
Checks:                 0 / 0, -
Transferred:            0 / 0, -
Elapsed time:      3h6m1s

If I run the same command against a different Drive the log fills up with the same errors, but then actually manages to transfer files after a while.

I created a new API key and refreshed the tokens, to no effect.
I updated to the latest rclone beta, also to no effect.

Machine B is doing fine — no errors in the log (for the mount).
And the sync job is now running fine on another machine.

Just to be clear, I changed absolutely nothing about my setup. These 403 errors cropped up suddenly and without any obvious proximate cause.

Is anyone else running into anything similar?


Using --files-from with Drive hammers the API
#2

It seems like adding --no-traverse makes a difference. Very slow transfer rates, but at least files are actually uploading now.
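
For reference, this is the copy command from the debug log in the first post with --no-traverse added (paths and remote name are taken from that log):

```shell
# Same copy as in the debug log above, with --no-traverse added.
# --no-traverse stops rclone from listing the whole destination
# before uploading, which is what hammers the API with --files-from.
rclone copy -c --stats 3m --tpslimit 3 --transfers 3 \
    --files-from=/tmp/files --no-traverse -vv \
    /mnt/media/Drive GoogleDrive:
```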


#3

What’s your actual sync command?


#4

Sorry, it was buried in the debug log.

The exact same command with --no-traverse results in files uploading, but at terrible (though non-zero) speeds.


#5

If you check your developer console, are you seeing any huge jumps in the queries per 100 seconds if you look back over the last day or so?


#6

I regenerated the API keys last night so I do not have a lot of statistics, but the peak is > 600.


#7

I’m not sure why it would creep up all of a sudden, but I try to limit my transfers and checkers, since the goal is to stay under 10 transactions per second. I do something like:

/usr/bin/rclone move /data/local/ gcrypt: -P --checkers 3 --log-file /home/felix/logs/upload.log -v --tpslimit 3 --transfers 3 --drive-chunk-size 32M --exclude-from /home/felix/scripts/excludes --delete-empty-src-dirs

Depending on how many things you run at the same time, you might want to reduce those values a little and see how it works out.

Did the number of files start to ramp up, perhaps? It sounds like you crossed some limit by generating more hits.


#8

There are certainly a ton more files in the queue, since it’s been about three days since I’ve been able to upload anything.

I haven’t tried limiting the checkers… will give that a shot.
Right now there should be only one instance of rclone anywhere trying to write to Drive.
Seeing the numbers from your developer console, there is clearly something wrong with mine!
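
One way to double-check that nothing else is hitting the API (a sketch, assuming all the jobs run a binary named rclone):

```shell
# List any running rclone processes with their full command lines;
# prints a message instead if none are found.
pgrep -a rclone || echo "no rclone processes running"
```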


#9

I don’t sync, and just copy a few files each night at most, so I’m probably not the best use case for checking the console; definitely don’t compare against mine. I was just showing an example.

I’m just a Plex user running a few streams each night and moving usually 5-10 files overnight.


#10

Same here, except that I mirror the Drive so that I can keep a mirror mounted on the server and the “real” one mounted at home.

That is what is so confusing… everything just kind of went sideways for no reason.


#11

Weirder still, it suddenly started throwing “User Rate Limit Exceeded” errors, suggesting that I rethink my API quotas, even though there is no evidence of anything crossing the per-user limit in the developer console.

I killed rclone and tried again and it said “unauthorized”… I had to regenerate the token.

Whatever is going wrong, I think it is happening on Google’s side and not mine.


#12

Hmm. Trying to think of other things.

Is the time right on the server? Clock skew can keep the tokens from refreshing properly.

Also check write access on rclone.conf, in case rclone can’t update the stored token.

I had the time off on my test VM and it wouldn’t regenerate the token.
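
A minimal sketch of both checks, assuming the default rclone.conf location on Linux:

```shell
# 1. Clock skew can break OAuth token refresh: eyeball the system
#    clock (UTC) against a known-good time source.
date -u

# 2. rclone rewrites its config file to store refreshed tokens, so the
#    file must be writable by the user running rclone.
CONF="${HOME}/.config/rclone/rclone.conf"
if [ -w "$CONF" ]; then
    echo "rclone.conf is writable"
else
    echo "rclone.conf is missing or not writable - token refresh will fail"
fi
```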


#13

Time is ok.
It’s actually a dedicated server (uptime just hit 400 days) so settings tend to stay put.

I went through all of my servers and killed all instances of rclone and then started the upload again.
It is currently working with --no-traverse, but it choked on all the little .nfo files, so I just deleted them.
At least this way I can clear the backlog… I hope.
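
Instead of deleting the .nfo files, a filter would also keep them out of the transfer. A sketch using a plain copy rather than --files-from (mixing --files-from with other filter flags may not behave as expected); --exclude is a standard rclone flag, and the pattern here is just an example:

```shell
# Skip .nfo sidecar files instead of deleting them from the source.
rclone copy -c --stats 3m --tpslimit 3 --transfers 3 \
    --exclude "*.nfo" --no-traverse -vv \
    /mnt/media/Drive GoogleDrive:
```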

The developer console is still showing spikes of more than 400 queries per 100 seconds, though.