Cache dir without mount?

Having used rclone for ages I feel a bit dumb asking this question. But I searched through prior posts and couldn't find this particular issue addressed. If it has already been answered, a link to the post would be much appreciated.

rclone v1.51.0

If I run the following commands, is there any way to get rclone to cache the directory between the first and second command without a mount being present?

rclone ls remote: --include aaa*
rclone ls remote: --include bbb*

Most of the documentation related to persistent caches seems to be for vfs/cache mounts.

Any suggestions are very welcome :slight_smile:

You'd use --fast-list to see if that helps, or you can wrap the remote in the cache backend if that was the goal.
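
For example (remote name as in your post; nothing is cached between the runs, but each run is faster on remotes that support ListR, which Gdrive does):

rclone ls remote: --fast-list --include "aaa*"
rclone ls remote: --fast-list --include "bbb*"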

To give a technical answer here - currently it is not possible to run the VFS layer (which handles the VFS cache) without the mount. They are considered part of the same system.

However... I have discussed this with Nick before, and in principle it should not be a problem to disconnect the VFS from the mount so that you can use the VFS layer on its own (the mount will still require the VFS layer by necessity). This would have a lot of benefits for advanced users, so I understand this is the eventual goal.

However, it will take some work-intensive restructuring, so some patience is likely needed :slight_smile:

In your case specifically, the practical solutions available to you are basically what Animosity said. --fast-list (if available on your remote type) will probably fix the issue for the most part, as it can be as much as 15x faster - while the cache backend would actually cache the listings like you are asking for. I'm not sure I really recommend the cache backend at this point because it has issues and will be phased out eventually - but the caching of listings, I think, works well enough.
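
If you do want to try the cache backend route, a rough sketch (the section and remote names here are just examples) is to wrap your existing remote in rclone.conf:

[cached]
type = cache
remote = remote:
info_age = 1d

rclone ls cached: --include "aaa*"
rclone ls cached: --include "bbb*"

The second command should then be answered from the cached directory listings instead of hitting the backend again, for as long as info_age allows.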

One last solution - if you have a remote capable of "ChangeNotify", as Gdrive is - is to use a scripted OS pre-cache like I use. This basically builds a RAM cache of the whole file hierarchy in the OS cache at mount startup, which is then kept up to date via ChangeNotify. That allows for some neat tricks, like searching and filtering files on a cloud drive at SSD speeds (well, RAM speeds really). I won't put a whole tutorial here because it is a bit more involved than just adding --fast-list to your commands :wink: but if you are interested then throw me a PM and I can share my scripts and explain in more detail how it works.
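
Just to give the flavour of it, a bare-bones sketch on Linux (not my actual scripts - the mount point, remote name and interval are made-up examples):

rclone mount gdrive: /mnt/gdrive --rc --poll-interval 15s &

# once mounted, ask the VFS to walk the whole tree into its directory cache
rclone rc vfs/refresh recursive=true

# a recursive walk then pulls the hierarchy into the OS cache too
find /mnt/gdrive > /dev/null

--poll-interval is what keeps the cache current via ChangeNotify afterwards.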

Thank you @thestigma! The first part of your reply guesses exactly what I was getting at (non-mount, temporary query-related caches for repetitive queries that might be looking at the same directories). I have seen and use a variant of your excellent pre-cache scripts! Thank you for sharing those.

^^ Completely understood. Happy to wait patiently. :wink:

Also thank you @Animosity022 for your reply. I wrote the question in shorthand, hoping that someone would intuit that I was asking about non-backend --flags. I should have been a bit more specific (sorry, it was late at night when I popped off the question).

I'm not sure what you mean. If you use the cache backend, it solves your use case. Are you not trying to keep a cache of the directory/files?

It does solve the specific example I cited, thank you.

I do use --fast-list, --checkers and other options to speed up queries when needed. But in this case I was looking for a --flag-based option for keeping a directory listing in cache for some period of time, without creating a cache remote for each remote being queried. It was my fault for not being more specific - apologies.

A separate-but-related question, which I can move to a new post if that is better:

When rclone ls remote: --include aaa* is run, is the aaa* filter applied in the backend itself (that is, does Google filter the results before returning the answer to rclone), or is the full ls reply returned to rclone and the --include filter applied locally?

You can use the cache backend - is that what you mean?

@ncw Thank you for the reply :slight_smile: No, this is a separate question about how rclone operates.

When a filter is applied to a query, is the filter executed locally (where rclone is running) or remotely on the back end?

Example:
Files aaa.txt and bbb.txt exist in Google Drive, and a remote named google: points to them.

If you run rclone ls google: --include aaa*, does rclone

  • pull the names of the two files aaa.txt and bbb.txt to the local disk, then apply the filter and return only aaa.txt?

OR

  • execute the filter remotely on Google Drive and return only the filtered result, aaa.txt?

It is executed locally; however, rclone makes an effort not to list directories which aren't included in the filter.
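
As an illustration (the directory name is hypothetical): an unanchored pattern like aaa* can match at any level, so every directory still has to be listed, whereas a rooted pattern lets rclone prune the traversal:

rclone ls google: --include "aaa*"
rclone ls google: --include "/reports/**"

The first walks the whole remote; the second only lists the reports/ subtree.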

Great! Mostly curiosity. But a good excuse to have a fast NVMe drive for large queries!

@ncw I think a lot of things like this could be fixed by separating the VFS layer from the mount. We've discussed this in the past, but I don't know how fresh it remains on the current agenda. It's an idea to keep in mind next time you decide to overhaul the VFS systems. I'm also looking forward to that async upload queue function. I think it will not only be convenient but also solve a lot of current issues.

The conclusion I've come to is that we need a unified vfs-cache which can be used for normal rclone operations (like rclone sync) and also for rclone mount.

The async upload should eventually be part of the new vfs-cache. Async upload implies data persistence (we need to keep track of files we are uploading).

I've been putting a lot of background cycles into thinking about both these issues - I don't want to start coding before I've got a decent plan!
