Download Quota Exceeded & VFS

Animosity022 · August 26, 2019, 12:46am

And your setup is Plex/Emby/Jellyfin all running at the same time? I can speak from experience that Plex and Emby should be very light on API hits once your library is scanned.

If you really want to reduce API hits (which I really don't think is your issue), you can increase --vfs-read-chunk-size from 128M to 256M or 512M. That value is basically the size of an HTTP range request for a file, so it doesn't mean you'd download 256M at a time. It basically wastes a little bandwidth to reduce the number of API hits.
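To make the trade-off concrete, here is a rough Python sketch that counts how many range requests a sequential read would need for a given chunk size. It assumes the behaviour described in the rclone docs (the chunk size doubles after each chunk until --vfs-read-chunk-size-limit is reached, and does not grow when the limit is not larger than the chunk size); the function name and the 10 GiB file are made up for illustration.

```python
MIB = 1024 * 1024

def count_range_requests(bytes_read, chunk_size, chunk_limit=None):
    """Count the range requests needed to read bytes_read sequentially.

    chunk_limit=None models an unlimited limit ("off"): the chunk doubles
    after every request. A limit that is not larger than the current chunk
    size (for example 0) keeps the chunk size fixed.
    """
    requests = 0
    offset = 0
    size = chunk_size
    while offset < bytes_read:
        requests += 1
        offset += size
        if chunk_limit is None:
            size *= 2
        elif chunk_limit > size:
            size = min(size * 2, chunk_limit)
    return requests

ten_gib = 10 * 1024 * MIB  # hypothetical file size

# Fixed chunk size (limit 0): doubling the chunk halves the requests.
print(count_range_requests(ten_gib, 128 * MIB, chunk_limit=0))  # 80
print(count_range_requests(ten_gib, 256 * MIB, chunk_limit=0))  # 40

# Unlimited doubling: the difference almost disappears.
print(count_range_requests(ten_gib, 128 * MIB))  # 7
print(count_range_requests(ten_gib, 256 * MIB))  # 6
```

A short probe that only reads a few MB needs a single request either way, which is why a bigger chunk mostly matters for long sequential reads.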

The download quota exceeded though has my gut telling me something else is going on but we'd need some logs in debug to figure that out.

What's your 1-day count for drive.files.get? I can run some mediainfo loops and get those numbers pretty quickly if you feel that's a possible issue, and rule it out.

Animosity022 · August 26, 2019, 2:47am

There isn't a need for this. If a file is already analyzed, it just checks the timestamp, resulting in no analysis/mediainfo/ffprobe on the file. Same for both Plex and Emby (I don't use Jellyfin so I can't comment on that).

With analysis off, the same as above, nothing but a directory list happens.

Animosity022 · August 26, 2019, 3:04am

That's exactly what 'off' in vfs-cache-mode does as it implements partial reads on the file. The buffer-size is the spot that the sequential data is stored.

Animosity022 · August 26, 2019, 3:05am

Is this a team drive or just a google drive?

ncw · August 26, 2019, 9:18am

The system rclone uses for downloading files in chunks was put in by @B4dM4n to address this problem! What rclone used to do is just open the entire file for download. This would then count as the entire file downloaded, even if the next thing your client did was seek, which would cause the stream to be closed after reading only a few MB and a new one opened.

The system of chunks works well in the presence of seeks - what it means is that rclone only downloads a small part of the file at first so when a seek happens you have only been charged for that small part of the file in the quota rather than the whole thing.
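As a rough illustration of the difference, this Python sketch adds up how many bytes are asked for when a client opens a file and then seeks a couple of times. It only models what is requested, not how Google actually accounts for quota (which is undocumented); the function name, file size, and seek offsets are hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

def bytes_requested(file_size, open_offsets, chunk_size=None):
    """Sum the bytes asked for across a series of opens.

    Each entry in open_offsets is a position where a new stream is opened
    (the initial open plus one per seek). With chunk_size=None or 0 the
    request runs from the offset to the end of the file (old behaviour);
    otherwise only one chunk is requested at that offset.
    """
    total = 0
    for offset in open_offsets:
        remaining = file_size - offset
        if not chunk_size:
            total += remaining
        else:
            total += min(chunk_size, remaining)
    return total

file_size = 10 * GIB
opens = [0, 5 * GIB, 9 * GIB]  # initial open plus two seeks

old = bytes_requested(file_size, opens)                      # open to EOF
new = bytes_requested(file_size, opens, chunk_size=128 * MIB)

print(old // GIB, "GiB requested with whole-file opens")  # 16
print(new // MIB, "MiB requested with 128M chunks")       # 384
```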

Now it may be that Google has changed the quota system... You can set rclone to use its old behaviour with --vfs-read-chunk-size 0. However, rclone will still need to re-open the file on every seek, so I don't think it will change things at Google. You could give it a try and see what you think?

Animosity022 · August 26, 2019, 10:21am

Further above, I tested by opening the same 90GB file about 1500 times and saw no issues with doing small reads on a large file.

I really think, from reading the post, that the OP was actually hitting the download quota due to his setup. With Sonarr/Radarr and the other tools analyzing files while running a config similar to mine, I do believe he actually ran out of download quota for the day, given those settings and 180TB of data being analyzed/opened.

A few key points with Google: Shared/Team drives act differently than a personal GD and definitely have different quotas for files being shared. All my testing/setup is on a stock Google Drive with nothing shared or shared with me.

The quotas for Shared items are much much lower.

Animosity022 · August 26, 2019, 11:05am

Confirmation from Google that there was a bug back in 2016 that chunked downloads counted against the quota and it was fixed.

https://issuetracker.google.com/issues/36760092

Rclone's cache backend and the VFS both use chunked downloading, so opening a file does not count as downloading the whole file; only the chunk that was requested counts.

@ncw and @B4dM4n can confirm rclone properly does the range request to get a chunk as that was the code that was added back in mid 2018.

thestigma · August 26, 2019, 11:06am

What do you believe the differences between personal and teamdrives are in more detail? So far I've pretty much assumed they are the same in most respects due to a lack of solid indications otherwise, but I wouldn't be surprised if there were quota differences.

Animosity022 · August 26, 2019, 12:09pm

Sorry as I realize I worded that a bit poorly to what I was trying to communicate. Since Google does not post/share/document their download quotas, it makes it a bit tough as you can only go off other people's information/stories/accounts and try to get to a standard belief.

Google seems to take a harder stance on 'shared' things and sharing a large video file and downloading that a number of times fully (analyzing it/thumbnails/etc) hits a download quota much faster than a personal drive (not shared) doing the same thing.

Since the timing/quotas aren't known, it makes quantifying this much more difficult other than taking in anecdotal evidence from people posting. People whose setups are working tend to never post/look as they do not have issues.

I personally have never hit a download quota exceeded but that does not mean the scenario does not exist. There are definitely ways to trigger it and that seems to generally be triggered with sharing a file and/or having a large pipe to download and run through.

Take for example the OP, who mentioned he was running a similar config to mine and do the math on the usage.

He's got quite the number of file gets that start at 128M and, if read sequentially (like analysis does), the data supports there being a huge amount of download done on the files the OP has, which would lead to using up the daily download quota. Since there isn't any way to check that number other than asking Google why you got the error, one can only conclude from the data supplied that this is the issue, unless there is a bug in the rclone code for chunked reading, which is not apparent.

thestigma · August 26, 2019, 1:27pm

If you mean shared as in "shared with others" then that is absolutely the same impression that I have gotten. It seems quotas on that may be much stricter, but also possibly on a much shorter interval. At least that is what people tell me who have tried to use drives to directly serve files to external users (via shared files, that is, as opposed to having your own server that pulls from a remote).

I read it as you thought teamdrives and personal drives acted differently, but I see now I maybe just misinterpreted that. Now that teamdrives are also technically renamed to "shared drives" it is even easier to be confused (thanks google! lol)

ncw · August 26, 2019, 6:58pm

Interesting - not seen that before - thanks.

Yes, range requests are ultimately how rclone gets the chunks from the backend.

Use -vv --dump headers if anyone wants to see them!
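For anyone unsure what to look for in the dumped headers, this small Python sketch builds the kind of HTTP Range header that a chunked read corresponds to. The `bytes=first-last` form (inclusive on both ends) is standard HTTP; the helper name and the offsets are only illustrative and this is not rclone's actual code.

```python
MIB = 1024 * 1024

def range_header(offset, chunk_size, file_size=None):
    """Return the Range header for one chunk starting at offset.

    HTTP byte ranges are inclusive, so the last byte is offset + size - 1.
    If file_size is given, the range is clamped to the end of the file.
    """
    last = offset + chunk_size - 1
    if file_size is not None:
        last = min(last, file_size - 1)
    return {"Range": f"bytes={offset}-{last}"}

# First 128M chunk of a file.
print(range_header(0, 128 * MIB))
# {'Range': 'bytes=0-134217727'}

# A chunk requested after a seek to 1000, near the end of a 1500-byte file.
print(range_header(1000, 128 * MIB, file_size=1500))
# {'Range': 'bytes=1000-1499'}
```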

wavlinky · August 27, 2019, 1:26am

Google seems to take a harder stance on 'shared' things and sharing a large video file and downloading that a number of times fully (analyzing it/thumbnails/etc) hits a download quota much faster than a personal drive (not shared) doing the same thing.

That doesn't make a lot of sense to me, because you would need a higher quota to accommodate more users that would come with a shared drive. I'm the only user on the shared drive, although I do use a few service accounts for uploading, I slightly modified your upload script.

Unfortunately I did not have vnstat installed during most of the download quota week. Honestly, I suspect emby and jellyfin kept rescanning the entire library. I had to scan both at least twice during that week, with about a day in between the 403s.

I used to have buffer-size 64M, but I can't remember if I had it during the heavy scanning week. Let's assume I did.

Let's do the math:

66,000 x 64M = 4,224,000M = 4.22T x 2 (jellyfin and emby) = 8.44T

If it was 16M, the default buffer size:
66,000 x 16M = 1,056,000M = 1.05T x 2 (jellyfin and emby) = 2.10T

This is not even considering whether more than 1 chunk is used per ffprobe call. So either I was within the 10TB daily limit, assuming all 66,000 files can be scanned in 24 hours, or I was over it. It takes about 1.5-2 days to scan my library from scratch, so I don't think I did all 66,000 in 24 hours.

Let's assume 33,000 files with the 64M buffer and 2x more chunks.
33,000 x 64M = 2,112,000M = 2.11T x 2 (jellyfin and emby) = 4.22T x 2x = 8.44TB
4 chunks per ffprobe:
33,000 x 64M = 2,112,000M = 2.11T x 2 (jellyfin and emby) = 4.22T x 4x = 16.88TB

Let's assume 33,000 files with the 16M buffer and 2x more chunks.
33,000 x 16M = 528,000M = 0.52T x 2 (jellyfin and emby) = 1.04T x 2x = 2.08TB
4 chunks per ffprobe:
33,000 x 16M = 528,000M = 0.52T x 2 (jellyfin and emby) = 1.04T x 4x = 4.16TB

So either I barely went over, or I was fine.
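For anyone who wants to replay these scenarios with their own numbers, here is a small Python sketch of the same arithmetic. It assumes decimal units (1T = 1,000,000M), two media servers scanning every file, and a fixed amount read per chunk; the function and parameter names are invented for the example. The post truncates at each step, so its figures differ slightly in the last digit from what this prints.

```python
MB_PER_TB = 1_000_000  # decimal units, as used in the post

def scan_estimate_tb(files, mb_per_chunk, servers=2, chunks_per_probe=1):
    """Estimate the data pulled by a library scan, in TB.

    files:            number of files probed in the period
    mb_per_chunk:     MB fetched per chunk (64 or 16 in the post)
    servers:          media servers scanning the same library
    chunks_per_probe: chunks fetched per ffprobe call (a guess)
    """
    return files * mb_per_chunk * servers * chunks_per_probe / MB_PER_TB

scenarios = [
    (66_000, 64, 1),
    (66_000, 16, 1),
    (33_000, 64, 2),
    (33_000, 64, 4),
    (33_000, 16, 2),
    (33_000, 16, 4),
]

for files, mb, chunks in scenarios:
    total = scan_estimate_tb(files, mb, chunks_per_probe=chunks)
    print(f"{files:>6} files x {mb}M x 2 servers x {chunks} chunk(s) = {total:.2f} TB")
```

Only the 33,000 files x 64M x 4 chunks case (about 16.9 TB) lands above the 10TB figure mentioned earlier in this post.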

It's been 4 days without any api errors. If it happens again, I'll turn on debug mode.

Right now I have 128M read-chunk, no limit, buffer-size 0. I have a 4 hour scheduled task window. I decided to try my luck and turn on chapter and thumbnail extraction (24 hours with it on, so far it's fine). I delay local files for 8 hours, giving plenty of time for plex to do it while the file is local. I've stopped using emby and jellyfin. Maybe one day I'll do jellyfin again and run into this, but now I know to turn the read chunk down to 8M and turn off the buffer.

thestigma · August 27, 2019, 1:47am

I don't think the buffer size would matter as long as it's less than the cache chunk size. You'd be requesting 128MB regardless with those settings. If you are indeed hitting the download limit then I would imagine that a lower chunk size would help you reduce the impact of all of the smaller data grabs.

Very possible my understanding here is flawed though, and if it is then by all means correct me

Animosity022 · August 27, 2019, 2:23am

Here is the post from Google talking about sharing links.

Examples:
https://support.google.com/drive/thread/2035857?hl=en

Since nothing is documented on the usage, I'd actually say it's more likely the range request that is sent over, which is the vfs-read-chunk-size. That's the only thing that google logs on the request and as ncw mentioned earlier, you can see that by dumping the headers if you wanted to.

The way buffer-size works is it fills up the buffer but with the ffprobe, the file just closes and the buffer drops/stops reading ahead.

My guess would be to do the math as vfs-read-chunk-size * the drive.files.get count, perhaps? That's the only thing I can think of.
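That guess can be written down as a quick Python sketch. It treats every drive.files.get as if one full chunk were charged, which is an assumption and not documented Google behaviour, so read the result as a rough upper-bound style estimate. The request count below is a made-up example, and the 10TB figure is the undocumented daily limit mentioned earlier in the thread.

```python
def estimated_download_tb(files_get_count, chunk_mb=128):
    """Estimate daily download as (files.get count) x (chunk size), in TB.

    Assumes each request is charged one whole chunk, using decimal units
    (1 TB = 1,000,000 MB).
    """
    return files_get_count * chunk_mb / 1_000_000

ASSUMED_DAILY_LIMIT_TB = 10   # figure quoted in the thread, not documented

gets_last_24h = 50_000        # hypothetical count from the API console
estimate = estimated_download_tb(gets_last_24h)

print(f"{estimate:.1f} TB estimated")  # 6.4 TB estimated
print("over limit" if estimate > ASSUMED_DAILY_LIMIT_TB else "under limit")
```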

EDIT: I'm now thinking range requests are not the sole factor either, as I ran a test last night with my default 128M range request and I did:

In 24 hours equating to (in theory)

I tend to get the feeling it's a bit more complex and more likely related to gets / download per file that happens and perhaps some overall formula as well.

wavlinky · August 27, 2019, 5:20pm

I do not use any sharing links. It's simply my user, and I added service accounts, of the same rclone project, with contributor access to the drive. It's just another member of the drive, but google does see them as outside the org users.

You must turn on "People outside mydomain.com can be given access to the files in this shared drive" to allow service accounts to access. This is perhaps

Which is different from "Only members of this shared drive can access files in this shared drive". which I have selected.

I doubt allowing outside org access itself would lower the limits, if anything it would increase them. Those service accounts only upload.

I think it's a calculated value with lots of variables and not one we will easily guess.

Animosity022 · August 27, 2019, 5:26pm

Your best bet is if you get the error, open a support case and ask. Based on your numbers, you seem to be downloading quite a lot based on your API hits / settings you've shared.

ashlar · August 28, 2019, 6:46pm

Some months ago I had a lengthy ticket open for the download events impact on limits.
It escalated to second tier, with the person taking time to talk with Drive team to come back with answers.

I was told, repeatedly, that a download event does not count for the whole size of the file, if the whole file is not downloaded.

random404 · August 30, 2019, 4:24am

I run my mount with:

  --vfs-read-chunk-size=10M \
  --vfs-read-chunk-size-limit=0 \
  --buffer-size=0K \
  --max-read-ahead=0K \

I always have at least 40 open files at any time, and never have any issues

Just my input here. Also, it's easy to test stuff: just make a mount with another client id and secret and call mediainfo on a file, then you can test many settings and their effects. I've been wanting to do that but haven't had time yet.

Animosity022 · August 30, 2019, 4:43am

Just an FYI, --max-read-ahead does nothing unless you have a custom compiled kernel.

Issue here: ACD FUSE change default --max-read-ahead · Issue #877 · rclone/rclone · GitHub
