Help setting up rclone gdrive gcatche gcrypt with seedbox

thestigma · December 1, 2019, 11:36pm

Yes, Plex isn't directly cloud aware (like pretty much all programs aren't) and a mount will be the only way it can interact with files on the cloud.

I can't say I know of any good Plex-guides for the cache spesifically here that I know of (there probably is, but I am not omniscient). The cache-setup is pretty straight-forward however. I think the main benefit to running a cache with Plex is that it would help alleviate some of the problems associated with aggressive scanning for metadata - but I would suggest that this is better dealt with by turning those scans off and rather doing them manually once in a while if needed. Animosity is a guru here on Plex/Linux and he ultimately decided to not use the cache-backend at all.

He has a thread here detailing his setup:

While not all of the info there applies to you (his setup is a bit more advanced than most) this thread slo contains a lot of good info on Plex-use on rclone in general and best-practices around Plex-settings and how to make Plex behave well on rclone.

I would definitely use a cache to keep some of the hot data local. Either type would help with that.
I think I would personally go with the VFS cache here to avoid over-fetching data when you get a request for just a small bit of data (as is not uncommon on a torrent). If you just make sure the VFS cache is properly set up to retain data up to a certain size I think that should do the trick.
But yea, the cache-backend would also work here. I just don't think the forced chunk-prefetch approach is ideal for this situation.

It's not quite as bad as you imagine.
When a torrent-peice request comes in, rclone will open a read-segment of the file (by default this is 128M but can be configured otherwise). It can then read any file in that segment and seek within it without re-opening the file (which is the biggest limitation).
I have never come near to stressing the API while seeding, although my seeding is probably very modest compared to your needs. You have 1000calls pr 100 seconds to use, so this is fairly substancial, especially as peers will often request a range of segments that fall within the same segment and can be fetched as one operation.

Let me be clear that there is no such thing as an "API ban". The worst that can happen is that you max out your quota for the 100 seconds. Rclone will automatically keep track of this and throttle down a little bit if it needs to in order to keep you within the API limit. You will never get locked out of the API entirely - or at least I have not seen this ever happen. You can run at the 10calls/second average 24/7 without hitting the absolute max daily calls (you would need multiple concurrent users on the same key for that to happen). That's close to a million calls a day.

But sure, at some point you will probably end up API limited if you are seeding hundreds of torrents to hundreds of peers at the same time. It is hard for me to gauge how at what point this would happen because I've just never tried serving that kind of volume, but as long as you keep the hottest files on cache (ie. the "fresh this week" stuff) then I think the API can probably handle a pretty decent volume of the older stuff that is just requested occasionally. Ultimately I think this is something I think you just have to test and get a feel for. I would be very interested to know your results though.

You can keep an eye on API-use here and know exactly what kind of load the API is seeing:

Sure, the larger the cache the better as you will need to fetch less remotely.
If you wanted to use the VFS-cache instead you'd use this to achieve much the same result:
--vfs-cache-max-age 332880h
--vfs-cache-max-size 1500G

I think I've mentioned this before though.
The reason I don't think the cache-backend is ideal for torrents is that I believe it will fetch a minimum of 1 whole chunk, and if you have more than 1 worker thread it will fetch that many chunks. That would be a lot of inefficiency if it's just a small request. It will also be harder on the API as I think those chunks will need separate calls.

This is true. I think it would actually be harsher on the API though as I said.
Using the VFS aproach you'd keep the hot files in cache too. It would just not re-cache something once it has been evicted. However, for torrents especially - once a file is no longer "hot" and recent, it rarely has a resurgence of popularity, so I don't see this as a problem.

With chunking I don't think you can download several at once. They are going to be fetched by cache-backend one-by-one I'm pretty sure. You could use very large chunks, but that would add a lot of delay before responding to any request - and you'd get massive overshoot if only a small bit of data was requested.

Correct. With cache backend you download 1 full chunk minimum. Note that it will actually fetch several chunks to start with if you have more than 1 --workers . If you had 8 you'd get the data requested + the next 7 for example. The cache bcakend was designed for media streaming I think, so that kind of sequential prefetching makes sense there, but not so much for torrents perhaps.
Normally rclone would just fetch as much data as whatever the application requested (which in a torrent client I guess would probably be a set of pieces that may or may not be contiguous)

I hope we can get some more robust and efficient general caching later on in the VFS eventually - but these things take time

No, not as far as I know. If you requested data from the middle of the file it would grab the chunk that data fell under. At least that is my best educated guess.
EDIT: reading your example below then yes it would be "aligned from the start" as you say.

Pretty sure the answer is B
The only chunk that will have an odd size will be the last one at the end of the file. This is made this way so that you can download pieces here and there - and later on these can match together without overlap.
As mentioned, with multiple worker threads (say 8) you may also in addition to chunk#2 get chunk# 3-9
And if you set it to use only 1 worker then it would only be fetching chunks 1 at a time which also isn't ideal here. So this is one of a couple of reasons I don't think the cache-backend is ideal for this use-case.

Chunks should be large enough to have time to TCP-ramp up to decent speed, so I would hesitate to use less than 32M (or even 64M), but of course that come with the cost over over-fetching more data.
I don't think you need to worry about the database efficiency. It's local so it will easily handle it.

TLDR on this whole thing, I think I would reconsider the cache-backend of the torrent-use.

Heavier on API calls
Overfetching of data
Problem with multiple worker threads fetching multiple chunks we may not want (not something that was not left as a setting unfortunately).
Only big benefit is that it can re-cache old and less popular data (which seems like ti wouldn't be that important, or even inefficient when you look at how torrent popularity typically trends over time).
It was intended for media streaming (a large buffer for fairly predictable sequential reads and you are really asking it to do a polar opposite job here).

While the VFS-cache might be slightly "simpler" in some aspects, I think it will do a better job here honestly - and with the benefit that it will improve and gain more flexibility as development proceeds in later versions.

But the choice is up to you - and I will do my best to assist either way you want to proceed