I am experimenting with a solution for when an rclone mount runs out of cache space: remove cached files that are not dirty (i.e., no writes have been done to them). At this time the code doing the removal (and a new item.open()) runs synchronously as part of a retry upon an ENOSPC error in item.ReadAt(). It seems to work, and I was able to get performance similar to what I had with enough cache space to avoid the ENOSPC error.
@ncw Can we include a solution like this as a stop-gap before chunk-level cache replacement is added? This would enable us to use full cache mode for files that are cached only for reads without requiring the cache space in a ramdisk tmpfs to be larger than file size, which can be many TBs.
While working on this solution I found the item.MetaDirty flag being set, and it's not clear to me how it is used or what action needs to be taken before removing/resetting the cache file when MetaDirty is true. The MetaDirty flag is set to true even when the file is only read (i.e., not changed).
Great! Are you going to send a pull request for this?
I think it sounds like a great idea. Chucking stuff which hasn't been modified is fine with me.
When MetaDirty is set the meta file will be written back to the disk. This is set when we update the Atime which is how we know which files to delete first. This happens even when files are opened read only.
Is the write-back of Atime needed for the backend (in our case, S3)? Which function does this metadata write back? I guess it's a meta file inside the cache dir?
I don't seem to find the code that sends the Atime info to S3. You mean writing back the Atime to the vfsMeta hierarchy on the local disk in a per-file meta file, right?
Although the experimental solution, which retries in item.ReadAt after removing and recreating the cache file, seems to work now, I hope to improve it so that the fix leverages the cleaner thread to examine all clean items for removal. The current fix only removes the item that got the ENOSPC. The timer-driven cleaner thread would be turned on synchronously (assuming it's feasible) upon ENOSPC if it is not already running, and it would remove clean open files once the out-of-space flag is turned on. What do you think about that approach?
Yes, I got the kick channel for resetting cache of clean open files implemented. It seems to work well with my use case now. I will go through the instructions on pull request and tests now.
@ncw I pulled from the master today in preparation for my vfs-clean-cache-reset branch. It looks like the recent removal of the vendor directory caused the following errors. What step(s) might I be missing?
Thanks! This step helps the go build of rclone on Ubuntu. But on MacOS, I got the following errors during rclone build after cleaning the cache. Might it be related to the recent osxfuse change?
I was building the master. But the master was changed and go.mod was different. I guess it's probably the result of my trying to run go mod tidy command when I ran into the cache cleaning need. After a git reset, it's building now. Thanks!
@ncw Are the changes supposed to pass the go1.11 and go1.12 tests? I am using github.com/pkg/errors in vfs/read_write.go (errors.Is()) to detect the ENOSPC error, and it fails the 1.11 and 1.12 tests, which complain about the inclusion of the errors package. But this package seems to be used elsewhere in rclone too. Also, syscall.ENOSPC is accepted on all platforms except other_os. Any suggestions?
errors.Is is part of the standard library and was only introduced in go1.13 so you can't use that for the moment unless we conditionally compile the code for go1.13 and above.
pkg/errors just calls that and it looks like it isn't compiled in for < go1.13
I think we will have to write our own IsErrNoSpace() function...
See fserrors.Cause for an error unwrapper which should work for all go versions.
I suggest you define an IsErrNoSpace() function in fs/fserrors which uses fserrors.Cause and compares the result with syscall.ENOSPC
I would suspect (can you link the build results?) that it is only plan9 which doesn't define this, so you could conditionally compile the plan9 IsErrNoSpace to always return false.