vfs-cache-mode full: most needed feature

From my brief testing of --vfs-cache-mode full, I'm missing a feature that would make it more flexible: a file count limit for stale objects.

--vfs-cache-poll-interval 1s works OK, but if you have a bunch of small files, they will not get cleared because --vfs-cache-max-size has not been reached.

e.g. set
--vfs-cache-max-size 1000M
--vfs-cache-poll-interval 1s

Small files do not get cleared, even if they are stale, until 1000M is reached. Maybe it's intended, or maybe it's a bug, but I'd prefer it if you could add a file count limit instead:
--vfs-cache-file-count-stale 5
This would limit stale files to 5. Preferably delete the oldest first, so that only the 5 most recent stale files are kept.
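To make the request concrete, here is a minimal Python sketch of the policy I have in mind, reading it as "delete the oldest stale files first and keep the newest few". The function name `prune_stale` and its parameters are made up for illustration; --vfs-cache-file-count-stale is only a proposed flag, and this is not how rclone cleans its cache. It illustrates the policy only and is not meant to be run against a live rclone cache directory.

```python
import time
from pathlib import Path


def prune_stale(cache_dir, max_age_s, keep, now=None):
    """Delete stale files, keeping only the `keep` most recently modified.

    A file counts as stale here when its mtime is older than `max_age_s`
    seconds (an assumption made for this sketch). Returns the deleted paths.
    """
    now = time.time() if now is None else now

    # Collect (mtime, path) for every regular file that is over age.
    stale = []
    for path in Path(cache_dir).rglob("*"):
        if path.is_file():
            mtime = path.stat().st_mtime
            if now - mtime > max_age_s:
                stale.append((mtime, path))

    # Newest stale files first, so the slice below drops the oldest ones.
    stale.sort(key=lambda item: item[0], reverse=True)

    deleted = []
    for _mtime, path in stale[keep:]:
        path.unlink()  # WARNING: this really deletes the file
        deleted.append(path)
    return deleted
```

With `keep=5`, fresh files are never touched and at most 5 stale files remain after each pass, so a later clean never has thousands of small files to remove at once.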

This would be highly beneficial, as it mostly affects responsiveness. Whenever it clears a bunch of files because --vfs-cache-max-size is hit, the thousands of small files take some time to clear, sometimes minutes.

Maybe it's a bug that max-size has higher priority than the poll interval?
Hope I explained it clearly.

You probably need the --vfs-cache-max-age flag instead of --vfs-cache-poll-interval.

Docs: https://rclone.org/commands/rclone_mount/#vfs-file-caching

--vfs-cache-max-age duration         Max age of objects in the cache. (default 1h0m0s)
--vfs-cache-max-size SizeSuffix      Max total size of objects in the cache. (default off)
--vfs-cache-poll-interval duration   Interval to poll the cache for stale objects. (default 1m0s)

Edit: Ok shall try max-age
Edit2: Just tested it,

--vfs-cache-max-age 1s \
--vfs-cache-max-size 500M \
--vfs-cache-poll-interval 1s \

After adding --vfs-cache-max-age 1s, it no longer respects the 500M limit; even small files get deleted. What I need is for 500M to be respected while only stale files are deleted. I think leaving max-age at the default of 1 hour and giving max-size the highest priority is the way to go, but adding file-count-stale would be best.

The cache can go over the limit, but should not by a huge amount, as it is being cleaned up. A poll interval of 1s seems excessive, as it would just be trying to clear all the time.
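For anyone trying to picture how the two limits interact, here is a simplified Python model of a single clean pass. It is only a sketch of the behaviour reported in this thread, not rclone's actual code: the names `Entry` and `clean_pass` are invented, and it ignores the fact that open files cannot be evicted.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str
    size: int         # bytes
    last_used: float  # seconds since some epoch


def clean_pass(entries, now, max_age, max_size):
    """Return the entries that survive one clean pass (simplified model)."""
    # Step 1: anything older than max_age goes, regardless of cache size.
    kept = [e for e in entries if now - e.last_used <= max_age]

    # Step 2: only if the cache is still over max_size, evict the
    # least recently used entries until it fits.
    kept.sort(key=lambda e: e.last_used)  # oldest first
    total = sum(e.size for e in kept)
    while kept and total > max_size:
        total -= kept.pop(0).size
    return kept
```

In this model a tiny max-age empties the cache on every pass no matter what max-size is, while a long max-age with a large max-size leaves small files alone until one of the two limits is finally reached. That matches both observations above.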

Can you share a debug log with the issue?

I found a bug when upgrading beta.rclone.org to use the new rclone. It seems like the count of objects

2020/09/16 13:24:15 INFO  : vfs cache: cleaned: objects 19943 (was 19943) in use 0, to upload 0, uploading 0, total size 4.742G (was 4.742G)

is not accurate:

# find  /home/www-data/.cache/rclone/vfsMeta/ -type f | wc -l
1296
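If you want the size as well as the count for comparison with the log line, a small Python sketch along these lines does the same walk as the find command. The function name `count_cache_files` is made up, and the directory to pass in is whatever your own cache path is.

```python
import os
import stat


def count_cache_files(root):
    """Count regular files under `root` and sum their sizes in bytes."""
    count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished while we were walking
            if stat.S_ISREG(st.st_mode):
                count += 1
                total_bytes += st.st_size
    return count, total_bytes


# Example (path is an assumption, use your own cache directory):
# count, size = count_cache_files("/home/www-data/.cache/rclone/vfsMeta")
```

Like `find -type f`, this counts regular files only and skips symlinks and directories.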

Are you looking at the count message to make your decisions?

I ran chmod 0777 on the --cache-dir to watch the files appear and disappear, and noticed that small files linger until they reach the max-age.

But yeah, the main point is to have a limited number of stale objects, if that's possible.

The 1s poll interval is mainly because of a small tmpfs in RAM, and to get rid of stale files as quickly as possible.

Right, but you'd be constantly creating overhead with such a small interval. If you are using a super tiny tmpfs, it's probably better to use something slightly bigger.

Having a flag to limit the total number of files in the cache doesn't sound unreasonable to me...

We also need a flag to limit the number of VFS entries cached in RAM as these use up RAM and there is no way to clear them at the moment...

Not all files. Just stale files. Limit stale files. Perhaps you're right, I'll just use mergerfs with my SSD, I guess.

Do you define stale to be over the max age?

I think what you are asking for is a cache clean to be kicked off sooner so instead of waiting for the interval, kick it off after there are a set number of files that are over age.

Is that right?

That's the gist of it yeah!
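The trigger described here could look roughly like the following Python sketch. The function `should_clean_early` and the `stale_limit` parameter are hypothetical, and "stale" is taken to mean "last used more than max-age ago", as asked above.

```python
def should_clean_early(last_used_times, now, max_age, stale_limit):
    """Return True once at least `stale_limit` entries are over max_age.

    `last_used_times` is an iterable of timestamps in seconds, one per
    cached file. The caller would start a clean pass right away when this
    returns True instead of waiting for the next poll interval.
    """
    stale = sum(1 for t in last_used_times if now - t > max_age)
    return stale >= stale_limit
```

The check itself is cheap, so it could run far more often than a full clean without adding much overhead.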

Max-age works as intended. When you have a long max-age with a huge max-size, small files are the issue for me.

The issue is mainly responsiveness: when the small files start to clear because of max-age or max-size, it bogs down the whole system or sometimes makes it stutter. Keep in mind that a lot of small files will accumulate.

To define what's stale or not, I used a 1s poll interval and max-size 1M. The cache will overshoot that limit until the file goes stale.

I'm mimicking vfs-cache-mode off.

Give it a try: use this method and you will see your videos start much faster, for some odd reason. Of course, this is mainly for small video files like mine, so it might not work for you.

But yes, it's much faster doing it this way.

Why would creating more overhead for rclone make something faster?

If you have sufficient space, spreading out the cleaning would speed things up. If you are limited on space (which I think you mentioned you are), that would be a reason.

In 60 seconds, you are running a clean 60 times, and in my case I'm running 1, which is much less overhead, as I have the disk space to handle keeping some data local.
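The arithmetic behind that comparison, as a tiny Python sketch (the helper name `clean_passes` is made up, and it assumes one clean pass per poll interval):

```python
def clean_passes(window_s, poll_interval_s):
    """Number of clean passes that fit in a time window, one per interval."""
    return window_s // poll_interval_s


print(clean_passes(60, 1))       # 60 passes per minute with a 1s interval
print(clean_passes(60, 60))      # 1 pass per minute with the default 1m0s
print(clean_passes(86400, 1))    # 86400 passes per day with a 1s interval
print(clean_passes(86400, 60))   # 1440 passes per day with the default
```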

My start times before and after are pretty much the same as the new cache handles internet blips and starting/stopping in the middle much smoother now.

No idea, but it's definitely faster. I tested playback with direct play: 3-4 seconds for the method above.

7-8 seconds for vfs off.

I was referring to those settings as cache-mode full and off would have much different results.

Direct play tends to open a file 3 times before it plays, so having that part cached makes things faster to start. The times are very broad ranges, as many factors are involved: your internet speed, peering, etc.

For example, on my setup, I start to direct play things in ~1-2 seconds with off and normally 1 second with it on. Anything on disk (cache) is instant.

Wow, does this really happen? This explains a lot tbh.

I used max-size rather than max-age. You have to choose between speed and the amount of RAM you have. Doing this on your SSD might eat into its TBW, unless it's one of those enterprise ones.

Yep, you can put rclone in debug and see the 3 opens (I haven't checked that in the last few releases, but it was consistent before).

With streaming from Plex, you could be fine with a slow drive anyway: unless you have many, many streams, it's unlikely to be hit much, as most streams are only a few MB at a time, with 4K being a bit heavier at 10-30ish MB depending on the bitrate.

In my case, I'd rather spend $75 or so, and I should get a few years out of it. I've had my Linux box on an SSD for many years now and it still runs like a champ in my basement.

I'd prefer not to cache anything, to be honest. But having tested full, it's much faster, with the benefits of off but without its cons.

Yep, you could even do a $50/$60 500GB SSD and let that go for a few years without much issue. I did a 1TB that I use for my cache for like $90 so I figure that even if it only lasts 2-3 years, I'd have upgraded something by then anyway.