A better semantics for the --vfs-cache-max-size parameter?

The current semantics of the --vfs-cache-max-size parameter are probably confusing to many users. The cache cleaner checks the cache size once every --vfs-cache-poll-interval (default 60 seconds) and removes cache items in the following order.

(1) cache items that are not in use and with age > vfs-cache-max-age
(2) if the cache space used at this point is still larger than vfs-cache-max-size, the cleaner continues to remove cache items that are not in use

The cache cleaning process does not remove cache items that are currently in use. If the total space consumed by in-use cache items exceeds vfs-cache-max-size, the periodic cache cleaner thread does nothing further, so the in-use cache items stay in place and the cache remains larger than vfs-cache-max-size.
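To make the behaviour described above concrete, here is a minimal Go sketch of one cleaner poll. It is only a model under stated assumptions, not rclone's actual code: the `item` type, the `clean` and `totalSize` functions, and the oldest-first removal order in step (2) are all hypothetical.

```go
package main

import "sort"

// item is a hypothetical stand-in for one cached file, not rclone's real type.
type item struct {
	name     string
	size     int64 // bytes used in the cache
	idleSecs int64 // seconds since the item was last used
	inUse    bool  // true while a file handle is open
}

// clean models one poll of the cache cleaner and returns the surviving items.
func clean(items []item, maxAgeSecs, maxSize int64) []item {
	// Step (1): drop items that are not in use and older than maxAgeSecs.
	var kept []item
	var total int64
	for _, it := range items {
		if !it.inUse && it.idleSecs > maxAgeSecs {
			continue
		}
		kept = append(kept, it)
		total += it.size
	}
	// Step (2): while still over quota, drop items that are not in use.
	// Oldest-first order is an assumption of this sketch.
	sort.SliceStable(kept, func(i, j int) bool {
		return kept[i].idleSecs > kept[j].idleSecs
	})
	var out []item
	for _, it := range kept {
		if total > maxSize && !it.inUse {
			total -= it.size
			continue
		}
		out = append(out, it) // in-use items always survive
	}
	return out
}

// totalSize sums the cache space used by items.
func totalSize(items []item) int64 {
	var total int64
	for _, it := range items {
		total += it.size
	}
	return total
}
```

In this model, a limit of 50 with two in-use items of sizes 40 and 30 leaves 70 in the cache after the poll, which is the over-quota situation described above.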

A cache reset feature was introduced in 1.53. It resets in-use cache items that are not dirty (i.e., not being updated) when writing additional cache data fails with an ENOSPC error. However, this code is not invoked from the periodic cache cleaning thread.

Should we invoke the cache reset code in the cache cleaner thread to reset cache items until the total size of the remaining cache items is below vfs-cache-max-size? This way users specifying the vfs-cache-max-size would be more likely to see the cache space usage consistent with their expectations. [An exception is that large write cache items can still cause the cache space to grow larger than the vfs-cache-max-size parameter because the cache reset code does not reset dirty cache items.]
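The proposal can be sketched roughly as follows in Go. This is an illustration only: `cacheItem` and `resetOverQuota` are hypothetical names, "reset" is modelled simply as dropping an item's cached bytes to zero, and the input is assumed to be what remains after the cleaner has already removed items that are not in use.

```go
package main

// cacheItem is a hypothetical model of an in-use cached file.
type cacheItem struct {
	name  string
	size  int64 // bytes currently held in the cache
	dirty bool  // true while the item has data not yet uploaded
}

// resetOverQuota resets non-dirty items, in slice order, until the total
// size is at or below maxSize. It modifies items in place and returns the
// names of the items it reset plus the remaining total.
func resetOverQuota(items []cacheItem, maxSize int64) (reset []string, total int64) {
	for _, it := range items {
		total += it.size
	}
	for i := range items {
		if total <= maxSize {
			break // back under the limit, nothing more to do
		}
		it := &items[i]
		if it.dirty {
			continue // dirty items cannot be reset without losing data
		}
		total -= it.size
		it.size = 0 // model of a reset: cached data is discarded
		reset = append(reset, it.name)
	}
	return reset, total
}
```

If every remaining item is dirty, the sketch resets nothing and the total stays above the limit, which matches the exception noted above.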

The code change for this fix would be simple. @ncw What do you think?

I think it sounds like a good idea :slight_smile:

The exception for dirty cache items is still a problem, but I can't see a way around that.

There is also the problem that we only check the cache every 60 seconds. We could potentially do a better job there, perhaps by keeping a running total. However, we don't really know how much space sparse files take up on disk, and we don't know when we are unsparsing blocks in a file, so it is all a bit approximate!
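For illustration, a running total could look something like the Go sketch below. The `usage` type and its `add` method are hypothetical and not part of rclone. The counter only tracks the byte counts it is told about, so it does nothing to solve the sparse-file uncertainty mentioned above.

```go
package main

import "sync/atomic"

// usage is a hypothetical running total of cache bytes.
type usage struct {
	used  int64 // updated atomically; approximate by design
	limit int64 // the configured maximum cache size
}

// add applies a size change (positive on write, negative on removal)
// and reports whether the cache is now over its limit.
func (u *usage) add(delta int64) bool {
	return atomic.AddInt64(&u.used, delta) > u.limit
}
```

A caller could invoke `add` on every cache write and kick the cleaner as soon as it returns true, instead of waiting for the next poll.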

Yep, --vfs-cache-max-size is an approximate guideline for the cache cleaner thread to keep the cache space usage under control. Resetting cache items when necessary during the cache poll is low-hanging fruit that brings the approximation closer to the user's expectation. :slight_smile: A running total seems to make more sense when/if cache eviction is changed to work at the sub-file (block or region) level, because we would then also need a running queue of eviction candidates.

@ncw Assuming there is no objection, I will do a PR to include cache item reset logic during cache poll.

That sounds great - thank you :slight_smile:

I think 60 seconds is a reasonable default. In fact, I set it to 10 minutes, and anybody can change the default.

I vouch for this. Honestly, the VFS parameters sometimes confuse me :slight_smile: