What settings for rclone v1.55.1 ensure that no quotas are exceeded?
I understand that the question may seem incorrect and unprofessional, and that it would be better to proceed from the actual tasks and fine-tune the settings for the specific conditions the application is used in. In addition, firstly, the online manual page describes all the keys used to connect to Google Drive with rclone, and secondly, similar questions have already been asked many times. (For example, here, and in a dozen other topics.) To this I would answer that the description of rclone's launch keys does not indicate which keys allow a user to exceed the existing Google Drive quotas (by exceeding the traffic or the number of requests), and which keys make exceeding the quotas guaranteed to be impossible. As for the existing topics, firstly, the application keeps developing and gains new capabilities and keys that did not exist at the time of the discussions in those topics. And secondly, they all discuss special cases and optimization for specific tasks, not a "safe mode" that is slow but universal and guarantees operation without throttling.
PS. So that we are not discussing an entirely spherical horse in a vacuum, I will clarify the conditions under which we will try to use the application: after deleting hundreds of thousands of duplicate files that at some point appeared in the Google Drive root of one of our users. Perhaps there was some kind of error related to the files being processed incorrectly, not by the owner of those files but by an editor. Unfortunately, we did not notice this incident right away, and the rclone logs were not saved.
We don't merge features which go against Google's Terms of Service, so there shouldn't be any features which will get you into trouble.
Google tends to throttle first and rclone will obey that throttling. There are hard limits on the amount of data you can upload per day (750GB) and download (I forget the figure), so if you exceed those you'll get a temporary ban until your quota refreshes.
Ah another physics student (like me!)
If you want to avoid being throttled then setting --tpslimit quite low will do that. 5 should do it. However, throttling is no big deal: rclone will slow down for a bit then speed up again, so I wouldn't worry too much about it.
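For example, something like this (the remote name and paths here are just placeholders):

rclone copy /path/to/local Google_Drive:backup --tpslimit 5

The same global flag works with mount, sync, and the other commands too.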
I expressed myself incorrectly. By a ban, I meant a temporary ban: not being able to do anything in the storage for 24 hours from the moment the limit was exceeded.
@NCW, thank you for the detailed answer! That is, there is no need, for example, to set the minimum time between successive requests to 50 ms?
Ok, I'll try using rclone.exe mount Google_Drive: h: --tpslimit. Does it make sense to completely cache all the online data locally and only sync changes to it? If so, what is the best way to do it? I did not find anything suitable when searching for "cache", "save" and "local" in the Google Drive article. Apparently, these are application settings common to all types of cloud storage.
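Something like this is what I have in mind (the value 5 is taken from your suggestion above, and --vfs-cache-mode full is only my guess at the relevant caching flag):

rclone.exe mount Google_Drive: h: --tpslimit 5 --vfs-cache-mode full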
You don't get banned or temporarily banned for going over the API quota, so it's best to just use the defaults. There's no reason to tune for the API as you get 1 billion requests per day.
You are probably referring to the download quota. As far as I can tell, this isn't set in stone. The initial amount is something like 15TB/day, but if you do that over consecutive days the amount lowers. It's also pertinent to note this doesn't appear to be cumulative, but object based. E.g. if you look at your Google audit log and you only grabbed a 64MB chunk of a 5GB file, it counts as a 5GB download, or so the audit log would make it appear.
My guess is Google employs some advanced heuristics and usage-pattern analysis, making it difficult to pinpoint exactly how much data you can pull on any given day if doing random reads across a large data set.
I've seen many people complaining about a 750 GB/day limit, so I'm not sure.
But you definitely don't get banned for that; the download speed just gets very slow after that.
That's not correct, as you can download partial files and they do not count as full files. Many years ago rclone didn't support this and you downloaded full files every time, but that hasn't been the case for some time now.
Right, I'm not saying that rclone will grab the full file, I'm saying that Google will count the full file size against your daily quota. E.g. if the daily limit is 10TB and you have 15TB of files, and you touch 1 byte of each file with rclone, Google will trigger a daily quota limit ban once you've touched 10TB worth of files, even though you've only actually downloaded a fraction of that, at least based on my experience.
I'm not so sure this is entirely true. I got a 403 downloadQuotaExceeded error on July 3rd, while running rclone v1.55.1 with --vfs-cache-mode=full. It was even a relatively light day, as I was busy getting the house ready for the 4th of July. I've been reaching out to Google to try to figure out what exactly this "secret" undocumented quota is, but I've just been passed from CSR to CSR so far.
What isn't true? The statement was that a range request counts as a full file, and that is not correct, as it counts as a partial download.
It's very easy to test, as I've shared in other threads: you can simply run a loop on a large file:
for i in {1..100}
do
  # read only the first 10 bytes of the large file on the mount and discard them
  head -c 10 <somelargefile> > /dev/null
  echo $i
done
Grab the first 10 bytes each time. I just did this against a 100GB file 1000 times, which would be 100TB downloaded if each read counted as the full file, and there were no issues. You can see entries in the GSuite admin log for each one of those, with a 'download' recorded for every item; you can export that to Excel and it matches up with the 1000 runs I just did.
Without knowing your config/setup/scheduled tasks/team drive/edu drive/GSuite/etc., it's tough to make any guesses on why you got that. It also depends on many factors, such as whether it's a shared/team drive or a shared file, as none of it is documented and the limits are just wacky at times.
In all my time using rclone, I've yet to ever get a download quota issue, so if you get a response back from the CSR, I'd love to hear what it is, though I think they'll just bounce you around and say to check back later, as the formula seems to be secret.
I've heard of many folks on EDU and team drives hitting quota issues, but it's all just word of mouth and hard to validate/prove. There are a few big users here who also don't seem to hit the quota, but that is all word of mouth as well.
I think it mostly comes down to that. The only reason I'm suspicious is because of the lack of documentation on their end. While it may be true that rclone and drive won't download or "count" a full file as downloaded when only a chunk is grabbed, I'm not positive that size has anything/everything to do with the download quota exceeded error. I didn't mean to imply you weren't being honest, just that it's still entirely possible to get a temp ban.
My drive has 0 shared files (it's entirely an encrypted encfs mount) on a single-user GSuite drive. When the error hits (rarely), it seems entirely random. I first opened a ticket at issuetracker.google.com and was directed to Workspace support (although they said they would raise the lack-of-documentation issue internally), and Workspace support passed me along to a specialist, although their boilerplate claims they can't help with Drive API issues, so we'll see what happens.
Anyways, not to hijack a thread but to satisfy curiosity, these are the mount settings I use.
Your setup would do quite a few more API calls, since the range requests start out at 1M and, assuming the reads are sequential, it would take some time for the chunk size to build up.
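Roughly the difference I mean (I don't know your exact mount line, so the flags below are only illustrative, reusing the remote name from earlier in the thread):

# small starting chunk: lots of small range requests while the chunk size ramps up
rclone mount Google_Drive: /path/to/mountpoint --vfs-cache-mode full --vfs-read-chunk-size 1M

# defaults: if I remember right, rclone starts around 128M and doubles the chunk size as a sequential read continues
rclone mount Google_Drive: /path/to/mountpoint --vfs-cache-mode full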
My quotas are still way under the limits so I didn't see the harm in leaving it. You're suggesting I remove --vfs-read-chunk-size and let it run with defaults?
The quotas you are seeing for the API are not the same as the download/upload quotas.
I agree you have plenty of API quota per the console as that's visible and generates errors if you 'hit' the API too hard.
The download/upload quotas are not published and there's no way to track them, and they seem to follow some inane rolling formula with more factors than I care to guess, so having fewer downloads of a file might be a good thing. But that's all just guessing, as we all only have anecdotal evidence on what the reason is and no actual clear policy from Google (which is dumb).