Google Drive. How to have video playback start immediately while caching on disk?

Problem is that every chunk creates a download event, which is what I’m trying to minimize. Or are you saying it doesn’t matter if these are generated through rclone?

In any case, I tried playing back a different movie with Kodi and, again, playback was not starting. I stopped it after 2.69 GB had already been saved in the cache (12 files, each named by its byte offset, such as 0, 262144000, etc.).

I wonder if there’s anything “special” that needs to be configured for this to work normally in Windows.
I’m using this to mount:
rclone mount --allow-other --timeout 1h --cache-db-path E:\rCloneCache cache-gdfs: Q:

Edit: I tried opening the “0” file with MPC-HC and it played perfectly, even though in reality it’s just a chunk of the .mkv file. There must be some sort of “disconnect” at play here… :-/

The issue before was that each time a file was read, it would count as a complete download of that file.

So if you had a 10GB file and it got read 10 times, it counted as 100GB for the download quota for that file. The ban/issue/error was that files were exceeding download quota.

If you can manage to hit the daily API quota, which is 1 billion requests, good luck :slight_smile:

I push barely 20k API hits per day with 60TB of data and 4-5 people streaming.

If you want to use cache, you need a small chunk size or it’s going to be tough.

You also need to make and use your own client ID/API key:

https://rclone.org/drive/#making-your-own-client-id

That being said, I wouldn’t use the cache backend unless you have a reason to store chunks as it’s just overhead.
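
If you do go the cache backend route, here is a rough sketch of a small-chunk mount, reusing the cache-gdfs: remote and E:\rCloneCache path from the mount above (the 10M/20G values are purely illustrative, not recommendations):

rclone mount cache-gdfs: Q: --allow-other --cache-db-path E:\rCloneCache --cache-chunk-size 10M --cache-chunk-total-size 20G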

Thanks once more, I cannot tell you how much I appreciate the time and patience. I’m sadly aware of my ignorance here so… really thanks!

So you’re telling me that I could safely ignore the number of download events and mount without cache or vfs configured? Like… you’re 100% sure that the number of download events displayed at https://admin.google.com/AdminHome?pli=1&fral=1#Reports:subtab=drive-audit doesn’t matter and the only thing that matters is the total file size transferred?

I’m not afraid of the number of daily API hits; for the reasons you give I’ll never, ever risk reaching 1 billion hits. But the download quota for a single file… a 20GB file, hit hundreds of times… it’s easy to reach 10TB downloaded watching a couple of movies, if each download event is counted as the full file size.

Edit: I say without vfs because the only reason I thought I needed vfs was to minimize the number of download events. I could use a memory buffer if that’s not important (I have 16GB after all). Or is vfs useful in my scenario?
Also, yes, I created my client ID/API key. I read the instructions as carefully as I could before beginning.

rclone only does chunked downloading, so you don’t hit the download quota issue if, for example, you run mediainfo on a 50GB movie 1,000,000 times.
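
As an illustration of what “chunked downloading” means here (this is not an actual rclone log, just the general shape of a ranged request): roughly, each read asks Google for a byte range rather than the whole object, so only the bytes actually requested are transferred. Fetching the first 128M of a file would look something like:

GET /drive/v3/files/<fileId>?alt=media HTTP/1.1
Range: bytes=0-134217727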

I use the mount command I shared above on an encrypted GD without any issues. I am a Plex user though, so that’s my use case: a Linux server running Plex with Sonarr/Radarr/etc.

So “chunked downloading” means that the API call specifies the amount of data being transferred for every download event? Is that what is happening, and is that why one doesn’t risk surpassing the download quota?
Without the cache backend, using just vfs, playback starts. I’ll see if any Windows user has any more suggestions and, in the meantime, thank you Animosity022. :slight_smile:

There is a good write-up here by the guy who wrote the chunked reading part:

He explains what the problem was before and what he did to solve it.

I mean, you can still hit the quota assuming you have a nice fat pipe and download the same file many, many times :slight_smile:

That looks like a very informative read. Thanks for sharing. I will study it.

If you hit play, does it start right away or do you see it download the entire file? Regardless of the logs, it should be pretty easy to figure that out.

You can see some key words in those logs like ‘offset’ and “actual_length”.
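
If you want to see those lines yourself, one way (a sketch; the drive: remote name and the log path are just placeholders for your own) is to run the mount with debug logging enabled:

rclone mount drive: Q: -vv --log-file E:\rclone.log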

Yes, playback is almost instant.
And yeah, I noticed those key words. Those are what prompted me to ask.

You are trying to solve something that simply just isn’t a problem.

There are plenty of API calls available in a single day; you can’t really go over the limit, since Google only allows ~10 requests per second with their default quotas.

The term download ‘event’ does not really apply here and just confuses other folks, as it makes it sound like there is an issue that doesn’t exist.

You have a few options in rclone.

  • Standard - you can stream a file using just memory, which is basically the default setting
  • Two cache mode options:
    • vfs cache mode - keeps a file on disk for a period; this requires downloading the whole file (see the sketch after this list)
    • ‘cache backend’ - this does chunked downloading and retains parts of a file; a bigger chunk storage size keeps files around longer
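
A rough sketch of the vfs cache mode option, reusing the Q: drive letter and E:\rCloneCache path from earlier in the thread (the drive: remote name and the 24h max-age are illustrative; a cache backend example was sketched further up):

rclone mount drive: Q: --vfs-cache-mode full --cache-dir E:\rCloneCache --vfs-cache-max-age 24h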

24 hour bans don’t happen for ‘no good reason’; there is always something that causes them.

I can’t speak to Drive File Stream because I don’t use it.

If you played a file using a version of rclone prior to June 2018 and didn’t use the cache backend, that would cause the issue with rclone in particular.

There are many ways to end up with an old package if you don’t download the latest version from the site. For example, Ubuntu has versions in their PPA that are ancient:

If you have Drive File Stream questions, I’m sure they can help out with those particular things. We can happily answer any rclone questions here, especially me with Plex, as that’s my use case.

I’ve got ~60TB of encrypted data and switched over to rclone back in June 2018 once the new release hit. So I’m approaching almost 10 months now and have never seen a 24 hour ban nor any major API usage.

Thanks. No more GDFS talking, I agree. I was providing context but it became overly long.
Let me bring back a couple of questions from my post in the old discussion that I (wrongly) resurrected.

Would it make sense to set --vfs-read-chunk-size very low for the initial library scan and then raise it once the library has been scanned (since subsequent additions, from day to day, would be a fraction of the initial scan)? Or is there a shared “optimum” value for scanning purposes?
I use Kodi, not Plex, but the scanning process is quite similar. I think Kodi scans for media info through the ffmpeg code it uses internally, so in that respect it should be similar to Plex, I guess.

Also, as my experimentation has shown, using vfs read chunks leaves very “little” video available, with only the 64MB buffer to counter “network hiccups”, especially in high-bitrate scenarios, so I might want to consider increasing the memory buffer to something like 256 or 512MB. Are there any cons I should bear in mind, were I to do this?

Plex/Emby/Kodi all do something similar to an ffprobe/mediainfo command to get the codecs and such for the file. I’m not as familiar with Kodi as I’ve never really used it for more than a few minutes.

Chunked or partial reading means that it requests a piece of the file at a time. If the chunk size is set too big, you can get some waste, but Plex closes the files so fast that it really is insignificant in terms of the initial library scan. File size semi matters, but usually what’s in the container dictates how long a file takes to scan.

I usually see anywhere from 2-10 seconds per file depending on the file.

buffer-size would help if it was bigger, for direct playing or any process reading the file in a sequential fashion. For Plex, this means Direct Play. I personally just leave it at the default value for the buffer and I have never seen an issue with it.

Having a large buffer size means that if a bunch of files are open, you can potentially run out of memory on the system.
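
As rough arithmetic: --buffer-size 512M with ten files open at once could tie up on the order of 10 × 512MB ≈ 5GB of RAM just for read buffers, whereas the default (16M, if memory serves) stays around 160MB for the same ten files.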

My general approach is to keep it simple and leave everything at defaults unless I have a very specific reason to change it.


Ok, did some experiments using:

rclone mount drive: Q: --allow-other --allow-root --tpslimit 10 --fast-list --dir-cache-time 96h --vfs-read-chunk-size 128M --buffer-size 64M --timeout 1h

I have a couple of questions:

  1. Considering the API limit for Drive, is it correct to set tpslimit at 10?
  2. Is there anything I can do to speed up directory listing? I’m using fast-list but I’m not sure if it actually helps or not.
    rclone caches directory structures and keeps them for 96h, per --dir-cache-time. Is there a way to set it up so that the directory structure is cached on disk and updated only when needed (so it never expires unless something changes)?
    Also, since my main use will be through an HTPC that is put into standby after use (thus keeping memory state intact for when it’s woken up), would there be any detrimental effect in raising dir-cache-time to… I don’t know, 960h? The machine, as mentioned previously, has 16GB of RAM, so I don’t think that would be a problem. But I’m not sure whether the cache would survive the standby/wakeup process.
  3. Does rclone use chunked downloading by default? Meaning: if one does not specify --vfs-read-chunk-size, is rclone using chunked downloading or not? This is more a personal curiosity than anything else, really.

Thanks!

No reason to set the TPS limit for a mount.
fast-list does nothing on a mount so you can remove it.
You can keep the dir-cache-time as high as you want; it’s only kept in memory and not on disk. Polling will pick up changes and expire what’s needed, and that normally happens every minute.

Yes, the defaults are chunked downloading. You can remove all that and just use the defaults.
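
As a minimal sketch, the stripped-down mount (reusing the drive: remote and Q: drive letter from above) reduces to something like:

rclone mount drive: Q: --allow-other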


One extra question on this.
By reading here https://rclone.org/commands/rclone_mount/ I see that “Changes made locally in the mount may appear immediately or invalidate the cache. However, changes done on the remote will only be picked up once the cache expires.”

My use case sees me adding stuff to Drive from a different machine than the one I use to access the content loaded there (accessing it through rclone mount). Does the above mean that new content wouldn’t be “seen” by the rclone mount until the cache expires? If that is the case, I would need to keep dir-cache as low as possible, unfortunately.

No, changes are picked up via polling on Google Drive so the dir-cache time doesn’t matter.

 --poll-interval duration                 Time to wait between polling for changes. Must be smaller than dir-cache-time. Only on supported remotes. Set to 0 to disable. (default 1m0s)
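
So, as a sketch, a long directory cache combined with the default polling would look like this (the values are illustrative; the only constraint is that the poll interval stays below dir-cache-time):

rclone mount drive: Q: --allow-other --dir-cache-time 96h --poll-interval 1m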

Ok. This is how I understood it before. But then what is the documentation referring to when it talks about “changes done on the remote will only be picked up once the cache expires”?

I’m not questioning what you explained, just wondering if, maybe, the wording in the docs could be clearer in that passage.

And man, thank you so much. You provide an incredible job here, you reply so fast I’m… humbled, really.

It does depend on the backend, as not every backend supports polling. The docs could be a bit clearer about that, though.

I’ll see if I can get some time on a pull request to add that as it makes sense.


I don’t think what I’m about to ask exists but still, it’s worth asking, in case I missed it somehow.

After a reboot (as opposed to a standby/wake cycle), the internal (in-memory) dir listing cache is obviously lost.
Is there a way to keep only that on disk, with no actual file caching?

When you scan a library for changes after a reboot, there’s a significant delay because the machine needs to fire off a series of drive.files.list calls. I’m not worried about the number of calls, which is far lower than any limit there might be; I’m simply annoyed by the time needed for the first scan after a reboot.

Maybe this could be a valid feature request if it’s currently impossible?