@ncw Really Quick fix GDrive 24hr ban w/ PLEX scans on R-CLONE mount

SlurpeeMachineSalesm · April 14, 2017, 2:50am

TL;DR: Google Drive hands out 24-hour bans during a plex scan not because of API requests but because plex tries to analyze every file, which makes every scanned file download through rclone. Google treats this mass download as abuse and blocks it with a 24-hour ban. This could be stopped by a flag on the rclone mount command, along with some logic to tell a user-requested stream from a library scan.

=====================================================================================

So from reading here and around the internet, it seems that google is locking drive accounts NOT for the number of API requests but for file downloading on a large automated scale.

So when PLEX scans your library it analyzes your video files too. On a local disk this is no big deal, but on a cloud storage mount it means downloading all your media to scan it; that’s 20+ TB for me. Some other users here have mentioned that after they got a 24hr ban PLEX continued to scan in media. This is because PLEX could still pull file and directory names off the mount.

I can support my claim:
PLEX cloud fixed the google drive bans by only indexing your files by file name. When plex cloud was getting banned it would generate thumbnails for my home movies, but now that it does not get the 24hr ban it cannot generate thumbnails for media; it will only scrape metadata for known media from sites like themoviedb.org. This suggests that plex was previously downloading each file to analyze it. This mass download is what triggers google drive to lock the account to prevent abuse. I could offer more supporting evidence but I am busy and don’t have more time to write up the info. It would be simplest if plex implemented this feature on self-installed/non-cloud servers, but for the time being I think we can work around the issue.

MY SIMPLE FIX:
Have a flag for the mount that will block automated mass file downloads, while still allowing individual users to stream.
e.g. rclone mount --ban-protect gdrive:/plex /mnt/plex
(I have some other ideas for mount flags that will allow more control for the individual user)

There could be some basic logic to detect if it is a plex scan vs a user streaming

number of requests per second (automated requests would happen quickly and could trigger a lock)
directory request timing (an automated scan would crawl between multiple directories quickly; an individual request would not)
max number of downloads/file snags/streams off of mount per second
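
A minimal sketch of the rate-based heuristics above, assuming a hypothetical log of file opens on the mount with one `epoch_seconds<TAB>path` line per open. rclone does not emit this format; the function name `classify_access`, the log format and the thresholds are all illustrative assumptions. It groups opens into fixed time windows and reports a scan if any window touches more than a set number of distinct files.

```shell
#!/usr/bin/env bash
# classify_access WINDOW_SECONDS MAX_FILES
# Reads "epoch_seconds<TAB>path" lines on stdin (HYPOTHETICAL log format).
# Prints "scan" if any fixed window of WINDOW_SECONDS saw more than
# MAX_FILES distinct files opened, otherwise prints "stream".
classify_access() {
  local window="$1" max_files="$2"
  awk -F '\t' -v w="$window" -v max="$max_files" '
    {
      bucket = int($1 / w)          # which fixed window this open falls in
      key = bucket SUBSEP $2        # (window, path) pair
      if (!(key in seen)) {         # count each file once per window
        seen[key] = 1
        count[bucket]++
        if (count[bucket] > max) flagged = 1
      }
    }
    END { print (flagged ? "scan" : "stream") }
  '
}
```

For example, `classify_access 10 3 < opens.log` would flag any 10-second window in which more than 3 different files were opened. Repeated opens of the same file, as a single stream would produce, do not count towards the limit. Fixed windows are used here only for simplicity; a sliding window would catch bursts that straddle a window boundary.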

If there is a way I can see what rclone is doing in real time, I could watch what a plex scan looks like to rclone vs a user-requested stream, and come up with a way for rclone to differentiate the two, so streaming is kept from getting locked out while rclone still blocks excessive file download requests by plex.

The most rudimentary fix for end users, one that people could use almost right away, would be a flag that blocks all downloads from google drive plus a bash script that would:

unmount gdrive: umount /mnt/plex
remount gdrive in ban protected mode: rclone mount --gdrive-ban-protect gdrive:/plex /mnt/plex
execute plex scan
upon plex scan completion unmount gdrive: umount /mnt/plex
remount gdrive normally
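
The steps above could be sketched roughly as follows. This is only an illustration under loud assumptions: `--gdrive-ban-protect` is the flag proposed in this post and does NOT exist in rclone, `SCAN_CMD` is a placeholder for whatever starts a scan on your Plex install, and the `run` and `scan_cycle` helper names are made up. It uses `fusermount -u` to unmount and `rclone mount --daemon` to mount in the background, and by default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch of the unmount / protected remount / scan / normal remount cycle.

REMOTE="gdrive:/plex"
MOUNTPOINT="/mnt/plex"
# HYPOTHETICAL flag: proposed in this thread, NOT an existing rclone option.
PROTECT_FLAG="--gdrive-ban-protect"
# ASSUMPTION: placeholder for the command that starts your Plex scan.
SCAN_CMD=(echo "replace with your Plex scan command")

# Print each command; only execute it when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

scan_cycle() {
  run fusermount -u "$MOUNTPOINT" || return 1
  run rclone mount --daemon "$PROTECT_FLAG" "$REMOTE" "$MOUNTPOINT" || return 1
  run "${SCAN_CMD[@]}"
  local scan_status=$?
  # Always go back to the normal mount, even if the scan failed.
  run fusermount -u "$MOUNTPOINT"
  run rclone mount --daemon "$REMOTE" "$MOUNTPOINT" || return 1
  return "$scan_status"
}
```

Calling `scan_cycle` with `DRY_RUN` unset prints the five commands without touching the mount, which is a safe way to check the order before the flag exists.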

Danial_Hanafian · April 14, 2017, 3:45am

Well, most of us know that, but it’s not a Plex problem; it’s the way rclone handles files and its cache system.

Because if you look at other mounting options, like node-gdrive and ocamlfuse, they cache the file list at the beginning, so Plex reads from that list and does not access each file directly for every scan.

That way Plex does not keep downloading the same file over and over.

There is a bounty on this, but still no word from @ncw, maybe he’s busy.

Lol340 · April 14, 2017, 9:40am

Yeah I think I’ll have to move from rclone because of that.

Do you know if you can use node-gdrive with the encryption I already have using crypt?

calisro · April 14, 2017, 1:11pm

I’m not a plex user, but if plex continues to scan after the ban, couldn’t you just protect yourself by using your own client ID and scaling down / limiting the API requests per second on the developer console as a test? For that massive ‘initial’ scan, perhaps you set that real low to prevent a ban but allow the scan. The mount would then see ‘over the quota’ limits when you hit your lowered quota.

btw, I’m not saying this is a good ‘long term’ fix, just mentioning it may be a way to self-limit.

Danial_Hanafian · April 14, 2017, 1:17pm

It’s not the API calls. Even if you use your own Client ID, you won’t see much API usage; it’s the number of downloads per file that causes the ban.

For example, Plex tries to access a file to get its data, but rclone doesn’t cache that file, so Plex accesses it over and over. That causes multiple downloads per file, and if you have a huge library it adds up to too many downloads; Google doesn’t like that and bans you.

calisro · April 14, 2017, 1:30pm

ahhhhh. I didn’t realize that DOWNLOADS do not count as API calls. I just verified that by setting my ‘queries per day’ to ‘1’, and I can still download files. I just can’t list them. So if I have a cached mount, I can still download everything and browse. Interesting. I would have thought that each download cost 1 API call, but I was wrong.

Danial_Hanafian · April 14, 2017, 1:34pm

Because I used my own Client ID, and when I checked the GSuite Admin console, in the Google Drive section, I saw 1 file downloaded more than 15-20 times in just 1 second, and this kept repeating for the rest of the files too, but my API usage wouldn’t even reach 100-200 calls (over maybe around 1 hour).

zenjabba · April 14, 2017, 10:32pm

Correct, if you have a cache of the google drive mapping and take the change requests from drive.changes.get, you can keep the full filesystem up to date without needing to make API calls. Basically google wants you to cache the filesystem, use drive.changes.get to follow the updates, and serve everything locally, only pulling a file when you need to download it.

What is even cooler: you are given a token and an API location field, so you can start from the last place in drive.changes.get and continue processing, meaning you can quickly catch up without a major issue.

SlurpeeMachineSalesm · April 22, 2017, 8:53pm

So I have not tried node-gdrive-fuse b/c I had difficulty installing it, but google-drive-ocamlfuse still downloads the files during a scan. I think the difference between rclone and google-drive-ocamlfuse is that a google-drive-ocamlfuse mount downloads one file at a time during the scan, where I'm guessing rclone opens a ton of download threads to grab multiple files at once. You can see google-drive-ocamlfuse is only grabbing one file at a time by monitoring its cache directory size with the du (disk usage) command.

A great way to test this is to:

Create a new tv show library in plex
Add a directory for a show (one season)
Use the du command to monitor the cache directory size
You will notice the cache directory file size increases to the size of the episode being scanned in and then drops to 0 before the file size starts to climb again for the next file/episode.
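
A small sketch of that monitoring step, assuming the cache lives somewhere like `~/.gdfuse/default/cache` (an assumption about the google-drive-ocamlfuse default; check your own setup). The function name `monitor_dir_size` is made up for illustration. It prints a timestamp and the directory size in KB at a fixed interval.

```shell
#!/usr/bin/env bash
# monitor_dir_size DIR [INTERVAL_SECONDS] [SAMPLES]
# Prints "HH:MM:SS<TAB>N KB" for DIR every INTERVAL_SECONDS (default 1).
# SAMPLES=0 (the default) keeps going until interrupted with Ctrl-C.
monitor_dir_size() {
  local dir="$1" interval="${2:-1}" samples="${3:-0}" n=0 kb
  while [ "$samples" -eq 0 ] || [ "$n" -lt "$samples" ]; do
    # du -sk prints "<size in KB><TAB><path>"; keep only the size.
    kb=$(du -sk "$dir" 2>/dev/null | cut -f1)
    printf '%s\t%s KB\n' "$(date +%T)" "${kb:-0}"
    n=$((n + 1))
    sleep "$interval"
  done
}
```

Running `monitor_dir_size ~/.gdfuse/default/cache 1` in a second terminal while the scan runs should show the size climb to roughly one episode and then fall back, if files really are fetched one at a time.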

I'll try to post a follow-up video detailing how I tested this and showing my findings.

Danial_Hanafian · April 23, 2017, 6:04am

But from what I saw in my Drive, rclone downloads one file over and over, while the other one usually downloads each file only 1-2 times per scan, and that won’t cause any ban.

Lol340 · April 23, 2017, 6:25am

I’m about to try ocamlfuse with a big library load up for first time.

I also noticed that scanning ~1300 files didn’t get me the ban but ~140 files did. The only difference between the two was that the first was in a tree structure (e.g. x/xx/xxx) and the second had all files in a single folder.

Does anyone know the optimal settings for ocamlfuse for the cache?

Danial_Hanafian · April 23, 2017, 4:04pm

Well, if we can test that more (having a tree structure) and figure it out, maybe we can use it as a temporary fix for now.

But testing this costs you 24 hours each time, and it’s a long wait.

Animosity022 · April 23, 2017, 4:30pm

The only thing I really did was to increase a few things in the config:

max_memory_cache_size=8737418240 # bumped to ~8GB
memory_buffer_size=8388608 # bumped to 8MB
read_ahead_buffers=5 # bumped from 3 to 5
stream_large_files=true

I had major problems with stuff just not working well if I tried any union fuse type mounts.

My mounts with rclone crypt look like:
> # Mount GD Fuse
> /usr/bin/google-drive-ocamlfuse /GD -o allow_other

> # Mount the 3 Directories via rclone for the encrypt
> /usr/bin/rclone mount \
> --allow-other \
> --read-only \
> --default-permissions \
> --uid 1000 \
> --gid 1000 \
> --umask 002 \
> --acd-templink-threshold 0 \
> --buffer-size 100M \
> --timeout 5s \
> --contimeout 5s \
> --syslog \
> --stats 1m \
> -v \
> media: /media &

I recovered from doing quite a number of scans last night trying to not use the unionfs mounts and ended up doing like 150k API hits over the last 6 hours with no bans, with roughly ~1100 movies and 8-9k TV shows. I use a folder for every movie, and TV shows are dropped in folders following the normal plex naming conventions.

My cache size barely ever goes above ~200MB, but the memory use maxes out as it does some work the first time.

If I do a full “Scan Library” now after it’s all scanned, it takes maybe 10-12 minutes, but I don’t need to do that anymore so that’ll save some API calls too.