VFS Cache behaviour

Currently I am experimenting with allowing a much larger vfs cache and reading ahead the entire media file, to avoid fluctuations in the remote's transfer rate.

I am currently using the options below in my mount command:

--vfs-cache-mode full
--vfs-cache-max-age 8h
--vfs-cache-max-size 88G
--vfs-read-ahead 80G
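For context, the full mount command looks roughly like this (remote name and mount point are placeholders for my actual setup):

rclone mount remote:media /mnt/media \
  --vfs-cache-mode full \
  --vfs-cache-max-age 8h \
  --vfs-cache-max-size 88G \
  --vfs-read-ahead 80G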

This is working well. The issue I would like to discuss is that when the cache becomes full, the media file currently playing stops reading ahead any further. I can see this just by tracking the disk usage of each vfscache subfolder with du -h, which clearly shows media files reading ahead all the way to the full file size (as the ones I test with are way below 88GB) and the current one stopping when it reaches vfs-cache-max-size.
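For anyone who wants to reproduce the observation, this is roughly how I track it (assuming the default cache location under ~/.cache/rclone; adjust if you've set --cache-dir):

watch -n 10 'du -sh ~/.cache/rclone/vfs/remote/*'

Each media file's subfolder grows as read-ahead fills it in, and the currently playing one stops growing once the total hits the limit.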

What happens when the current media file reaches the last cached point? I am guessing it will continue playing directly from the remote without caching any further, am I right?

Is there a way to make rclone start removing the older files from cache as soon as the vfs-cache-max-size is hit, even though it's not yet been vfs-cache-max-age time?

In other words, is there a way to always prioritise the currently playing file, getting it to be cached (incl. read-ahead) in full, to the detriment of older files without reducing the vfs-cache-max-age?

Your best bet would be to buy a bigger cache disk.

Having large files dumped from the cache and re-created is a big performance hit, as rclone has to throw the file away and then rebuild what was there.

The method to evict is documented here:

rclone mount

Ok, but it's better than relying on flaky object storage remotes that can and do fluctuate below the media bitrate for several minutes at a time. My goal is to mitigate that by keeping the entire file in local storage.

Getting a larger disk will not solve the problem unless I reduce the vfs-cache-max-age, which I can do with my current cache disk too.

I actually read the link before I posted, and I re-skimmed it just in case I missed something.

I focused on this part:

If using --vfs-cache-max-size or --vfs-cache-min-free-size note that the cache may exceed these quotas for two reasons. Firstly because it is only checked every --vfs-cache-poll-interval. Secondly because open files cannot be evicted from the cache. When --vfs-cache-max-size or --vfs-cache-min-free-size is exceeded, rclone will attempt to evict the least accessed files from the cache first. rclone will start with files that haven't been accessed for the longest. This cache flushing strategy is efficient and more relevant files are likely to remain cached.

The --vfs-cache-max-age will evict files from the cache after the set time since last access has passed. The default value of 1 hour will start evicting files from cache that haven't been accessed for 1 hour. When a cached file is accessed the 1 hour timer is reset to 0 and will wait for 1 more hour before evicting. Specify the time with standard notation, s, m, h, d, w .

This documentation however still doesn't answer my question (assuming an answer exists and is known): if the vfs-cache-max-age hasn't expired, and the vfs-cache-max-size has been reached, is there a way to get rclone to start removing the older, not-in-use and not-locked files to fit the currently playing file? Or is there no way to do that (other than reducing the max-age)?
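One aside from the quoted docs: the size check only runs every --vfs-cache-poll-interval (1m by default, if I read them correctly), so it can be tightened, e.g.:

--vfs-cache-poll-interval 30s

But that only changes how often the quota is checked, not the eviction rules themselves, which is what my question is about.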

I think it is a valid question, and throwing a bigger cache disk at it does not really solve anything; it only delays the issue.

In an ideal world the vfs cache would have two size limits, a soft one and a hard one, allowing it to lazily maintain a free-space buffer of (hard - soft) in the cache.

It does not have that today; it would require development.

That's the line right here:

A large disk does solve it pretty well, as I did exactly that for a few years. You set an insane max age and a reasonable max size, say 2TB or something. When you hit max size, it removes the oldest files in the cache. Worked great for me for years.

I'm sorry, maybe I am being really thick here, but this only tells us how max-age works. I know that it sets a timer from last access time and starts evicting after timer expires.

What I am asking is, when there is a max-age AND a max-size, AND max-size has been reached while max-age has not, is there any way to make rclone evict old files or not?

I don't see how the documentation answers this.

I am not criticising - maybe it doesn't because no one thought about it yet or the norm is to use huge cache drives and it becomes unnecessary to even ask this. But for small cache drives it is a pertinent question.

If the max-age has not been reached, will it remove the oldest files when you hit max-size though?

Files are only removed periodically when the max cache size is exceeded. Whatever the cache size and max age, the OP's problem is always reproducible today, IMO.

What is happening is that read-ahead appears to check the current cache size before kicking in. If there is not enough space, it does not trigger. It does not matter that a few moments later the cache polling may delete some old items; the currently streamed item will continue without read-ahead.

The only solution would be to maintain a free-space buffer in the cache, big enough to always leave room for a new file to be streamed with read-ahead.


Perhaps I'm not explaining it well.

Your goal: evict the oldest files first
Solution: set max age high (99999h), set size to what you want based on your disk limitations (see the example below)
Outcome: once size is reached, the oldest files leave the cache.
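For example (sizes are placeholders; pick whatever fits your disk):

--vfs-cache-max-age 99999h
--vfs-cache-max-size 2T

The idea is that max-age effectively never triggers, so the size limit becomes the only thing driving eviction.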

A small cache disk that constantly runs out of space is a huge performance hit and drain, and really not a great setup, as it has to empty and re-get files. You'd be better off with no cache at all, imo.

I don't think so as all this happens concurrently.

If a file is open and read-ahead is kicking in, say it's reading ahead on a 100GB file as an example, and cache expiration happens, it has to evict the file since there is no more space, then put down a new sparse file and start filling it out again.

Setting the cache size to smaller than the biggest file 'works' but man, it's a hit on performance.

Prior to the cache changes, people got around this with a large buffer size, IF the file wasn't closed, as closing it drops the buffer.
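For reference, that workaround was simply a large in-memory read buffer on the mount, something like (size is just an example):

--buffer-size 1G

The buffer lives only as long as the file handle stays open, which is why it only helped IF the file wasn't closed.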

If we're talking media streaming and you are using Plex and transcoding, your transcoding buffer size takes care of this, so other settings are semi-pointless. For Plex, cache really comes into play with direct playing, and that's about it.

(scanning a library is a whole different topic)

This is a true statement. But again, the problem here is different...

Because of a not great config. Solving for the wrong thing.

So even though the max-age has not been reached (99,999h is 11 years), once the size is reached the oldest files will be deleted?
That's not what I am seeing here, though: with a max-age of 8h, the files stay there and read-ahead stops reading ahead (as max-size has been reached).
Would it really work differently if instead of 8h I set it to 99999h?

But that's really not my problem, nor is it causing any performance problems for me. Instead, it solves the problem that the remote bitrate is erratic. My files play smoother without freezing etc. with this setup. Disk performance hit is not a consideration here.

No, because with no cache at all, if the remote goes below the required bitrate for the file (which it does at times), playback freezes.
With the cache, this does not happen and playback is smooth for the entire duration of the file.

Yes, exactly this is what is happening.

Which basically means increase the cache disk size. As that costs a lot of money when dealing with cloud servers, I am looking to see if there is another way, so that older files are evicted without setting the max-age to a low value.

I am guessing there is no other way.

EDIT:

I was under the impression the documentation says that files currently open are not evicted from cache even if vfs-cache-max-size has been reached.
So something I am currently streaming won't be evicted from cache until I stop it, no?

No. Increasing the cache size does not change anything. It will only take longer before you experience your issue.

As it is today, it cannot be changed by any extra flags etc. The proper solution is to change the existing algorithm. I think your case highlights an area where it can be improved.

So... you know how it rolls here:) DIY and send a PR, sponsor development, or find somebody to do this for you:)


I already donate monthly and will increase it as I am using rclone more and more, but the DIY method might get me into gear to brush up on my skills. Maybe in a year or two I'll be in a position to... understand the code :sweat_smile:


You'd have to explain to me the player, whether you are direct playing or transcoding, and what media player you are using.

So maybe for some brave soul ready to have a go on it what we need is:

New flag to define a soft cache size limit. By default it would be 0/off:

--vfs-cache-max-size-soft

The existing flag would become the hard limit definition:

--vfs-cache-max-size

Read-ahead would still check that --vfs-cache-max-size has not been reached before proceeding, which is how it works today.

The new behaviour would be that the cache maintenance invoked every --vfs-cache-poll-interval removes the oldest files if the cache size is greater than --vfs-cache-max-size-soft (if set).

Maybe others have better ideas. Let's see.

EDIT - or maybe read-ahead should skip checking whether the cache size is exceeded. It would temporarily allow the cache to grow above the set limits (until the next cache polling). This could be configurable with a flag.
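To make the proposal concrete, usage could look like this (note that --vfs-cache-max-size-soft is hypothetical; it does not exist in rclone today):

--vfs-cache-max-size 88G
--vfs-cache-max-size-soft 80G

The cleaner would then keep roughly 8G of headroom free, which read-ahead for the currently streamed file could grow into between polls.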

Jellyfin server, Jellyfin app on Shield TV Pro, always direct play, never transcoding.

I've never touched Jellyfin. It would be wise to run through an example and share a debug log. I spent a lot of time testing Plex and I'm intimately familiar with how Plex works, but not Jellyfin/Emby.

The goal here is that, with the changes you'd want (regardless of whether I think they make sense :slight_smile: ), you'd confirm all those things in the debug log, like whether the file is open or closing, or anything odd, to ensure what you are asking for would actually work.
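Something like this would capture it (remote name and paths are placeholders):

rclone mount remote:media /mnt/media \
  --vfs-cache-mode full \
  --vfs-cache-max-age 8h \
  --vfs-cache-max-size 88G \
  --vfs-read-ahead 80G \
  -vv --log-file /tmp/rclone-debug.log

Then reproduce the playback and look at the open/close and cache cleaner entries around the time the cache fills up.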


Do you know if, when old items eventually get evicted and there is free space again, read-ahead will start again during the current stream?
Or does read-ahead stay stopped for good once it has stopped due to max-size being reached?

Not sure. You would have to look at the source code.