Advanced caching method

Hi,

I have a very specific use case in mind, but I think it would be great for websites using rclone, and maybe for torrenting as well.

Since there is a limit on concurrent file transfers, it would be awesome to have caching. If something tries to access a file that is not cached, rclone should make the requester block (sit in iowait) and only start streaming once the file has completely arrived on disk. It would also be awesome to have a REST API or something similar where I can invalidate cached files, for example the ones that are not frequently accessed.

Can rclone do this now, or is it not possible?

If you are using rclone mount then --vfs-cache-mode full does exactly this.

Rclone has an API for this kind of thing.

However, I think what you need is probably built in already:

  --vfs-cache-max-size SizeSuffix          Max total size of objects in the cache. (default off)

I serve beta.rclone.org with these flags and the caddy web server.

Awesome for the first part! :slight_smile:

As for the second part, I want to invalidate the least-opened files. Will that flag respect that, or will it always delete the oldest cached files when needed?

Currently I'm 99% sure it's by age, so it's kind of basic.
I think the original idea behind the cache was more as a necessity for OS compatibility on writes rather than as a performance enhancer - but there is no reason it can't be both, obviously. It's just not very optimized towards that goal yet.

I totally agree that this could be greatly improved, so if you are saying you may be willing to adjust the code for that I have lots of ideas in this regard :wink:

For example - here is an outline of a size-weighting factor when considering which files to eject from the cache:


(Disregard that it was originally written for the cache backend - I think that backend is fast getting phased out, and everything there applies equally to the VFS cache, which it makes more sense to focus on.)

Size of course matters because if you have good bandwidth your large files actually transfer fast, but it's the collection of small files that will be problematic and incredibly inefficient. Therefore - having most of your tiny files in cache is GREAT for overall performance, while still not taking up a lot of actual space.

You can of course combine the two and make a weighting factor that considers both the last-access time and the size, with a good balance between the two (preferably adjustable). That would do wonders for the efficiency of the current VFS cache.
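To make the combined weighting concrete, here is a rough sketch in Python (not rclone's Go, and not rclone code). Everything in it is an assumption for illustration: the names `CachedFile`, `eviction_score`, `pick_evictions`, the `size_weight` knob, and the log-scaled formula are all hypothetical, just one plausible way to blend the two factors.

```python
import math
from dataclasses import dataclass


@dataclass
class CachedFile:
    name: str
    size: int           # bytes
    last_access: float  # seconds since epoch


def eviction_score(f: CachedFile, now: float, size_weight: float = 0.5) -> float:
    """Higher score = better candidate for eviction.

    Blends idle time and size on a log scale. size_weight in [0, 1]
    sets the balance: 0 = pure last-access, 1 = pure size.
    (Hypothetical formula, not taken from rclone.)
    """
    idle = max(now - f.last_access, 0.0)
    age_term = math.log1p(idle)
    size_term = math.log1p(f.size)
    return (1.0 - size_weight) * age_term + size_weight * size_term


def pick_evictions(files, bytes_to_free, now, size_weight=0.5):
    """Return files to evict, highest score first, until enough bytes are freed."""
    ranked = sorted(
        files,
        key=lambda f: eviction_score(f, now, size_weight),
        reverse=True,
    )
    chosen, freed = [], 0
    for f in ranked:
        if freed >= bytes_to_free:
            break
        chosen.append(f)
        freed += f.size
    return chosen
```

With `size_weight=0` this behaves like plain last-access ordering. Raising it makes large files go first, so the many tiny files tend to stay cached.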

It will remove the least recently used files, which is probably what you want.
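For anyone unsure what "least recently used" means in practice, here is a toy Python sketch of LRU eviction under a size cap, similar in spirit to the `--vfs-cache-max-size` behaviour described above. It is only an illustration and not rclone's implementation. The `LRUCache` class and its `access` method are hypothetical names.

```python
from collections import OrderedDict


class LRUCache:
    """Toy size-capped cache that evicts the least recently used entry first."""

    def __init__(self, max_bytes: int):
        self.max_bytes = max_bytes
        self.used = 0
        self.entries = OrderedDict()  # name -> size, oldest access first

    def access(self, name: str, size: int) -> list:
        """Record an access and return the names evicted to stay under the cap."""
        if name in self.entries:
            self.entries.move_to_end(name)  # mark as most recently used
        else:
            self.entries[name] = size
            self.used += size
        evicted = []
        # Never evict the entry just accessed, even if it alone exceeds the cap.
        while self.used > self.max_bytes and len(self.entries) > 1:
            old_name, old_size = self.entries.popitem(last=False)
            self.used -= old_size
            evicted.append(old_name)
        return evicted
```

The point is that re-opening a file moves it to the back of the queue, so a file cached long ago but opened recently outlives one that was cached later and never touched again.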

Oh, so by last-access time in other words?
I learned something new then :slight_smile:

Amazing, thank you!
