VFS Cache file-names/folders/metadata

Hello. I have a question about rclone with VFS. I want to cache file names / folders / metadata without caching the files' contents themselves. Is there an easy way to do this? I spent the last few days trying to figure out how, but I haven't had much luck.

I have tried the full, minimal, and writes cache modes, but they don't seem to do what I'd like them to do.

rclone v1.62.2

  • os/version: alpine 3.17.2 (64 bit)
  • os/kernel: 5.15.46-Unraid (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.20.2
  • go/linking: static
  • go/tags: none
  • Docker version 20.10.14, build a224086

hello and welcome to the forum,

that should be rclone default behavior.

the default is --vfs-cache-mode off
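with the default mode nothing is written to disk; file contents are streamed straight from the remote, while directory and file metadata is still held in memory by the VFS layer. a rough sketch (remote name and mount point are placeholders):

```shell
# Illustrative only -- "dropbox:" and /mnt/dropbox are placeholder names.
# With --vfs-cache-mode off (the default), file contents are streamed
# directly from the remote; only directory/file metadata is cached in memory.
rclone mount dropbox: /mnt/dropbox --vfs-cache-mode off
```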

quick comparison of rclone caches

No, you cannot.

There's a feature request out there for it, but it hasn't been done yet.


Not exactly sure what you're trying to accomplish, but there's always this:

I am trying to access a few thousand files across a few directories, and the files are too large to cache, being upwards of 200GB each. I am doing 3-4 million API calls on Dropbox a day across multiple accounts and need low latency for my application to stay responsive. I have a 2-4ms connection to Dropbox's API. When I cache using VFS it is okay at first, but it eventually starts struggling under high load, with something as simple as listing files in a directory taking 10-40 seconds. That means the files in those directories can't be read until the listing responds. When there isn't much load, listing files is instant and reading those files is pretty quick.

When I turn caching off under the same high workload, file access times are lower than with caching on, but directory listing times are terrible, which is to be expected.

Having some in-between setting would be nice so that hopefully VFS stays responsive.

My VFS cache is stored on a ZFS pool with 8 drives in a RAID 0 configuration and 2 SSDs as a cache for ZFS. ZFS has always performed well. I tried a few different setups and this one seems to work fine.

I do monitor the rclone logs and I am not being API rate limited. I have --tpslimit 8 set for each Dropbox account.

Cache mode has nothing to do with the file / directory structure.

The directory / file cache is not persistent and if you stop the mount, it's gone.

--dir-cache-time 9999h

I use that to keep the cache for long periods.
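As a sketch, the flag just goes on the mount command (remote name and mount point below are placeholders):

```shell
# Keep the in-memory directory/file listing cache for a very long time,
# so listings stay fast between accesses. Placeholder remote and path.
rclone mount dropbox: /mnt/dropbox --dir-cache-time 9999h
```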

The vfs-cache-mode is purely for caching files to disk.

As I noted above, there isn't a persistent cache for the file/directory structure.

Oh this clears a lot of things up for me, I didn't know that.
Is this just for directories or does it cache the files as well? Or is that a separate setting?

If you are doing a listing, it caches the directory and file information.

Thank you for all the help, going to give this a try.

I use Dropbox. My only gripe is the first mount listing, as there's no recursive listing built for it, so it takes a long time.

I use the rc and run a post command to prime the directory cache:

ExecStartPost=/usr/bin/rclone rc vfs/refresh recursive=true --url 127.0.0.1:5575 _async=true
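For context, that ExecStartPost only helps if the mount itself has the remote control server enabled. A rough sketch of the pair of commands (remote name, path, and port are placeholders, not your exact setup):

```shell
# Mount with the rc server listening so the cache can be primed afterwards.
rclone mount dropbox: /mnt/dropbox \
  --rc --rc-addr 127.0.0.1:5575 \
  --dir-cache-time 9999h \
  --daemon

# Walk the whole tree asynchronously to warm the directory cache.
rclone rc vfs/refresh recursive=true --url 127.0.0.1:5575 _async=true
```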

I run this after rclone mounts, correct? I also have a user/password set for rc. Do I need anything for that?

rclone rc vfs/refresh recursive=true --user rclone --pass rclone --url 127.0.0.1:5572 _async=true

It returns

{
"jobid": 1
}

Does this look correct?

Yep. Depending on the number of directories and such, it'll take some time.

Mine takes a good 10-15 minutes to prime.
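Since the refresh runs async, you can also check whether the returned job has finished via the rc job interface (using the jobid from the earlier output; port and credentials as you configured them):

```shell
# Poll the async job started by vfs/refresh; the "finished" field in the
# JSON output flips to true once the priming is done.
rclone rc job/status jobid=1 --user rclone --pass rclone --url 127.0.0.1:5572
```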

I have 6k directories and like 6k files.

Ahem, since nobody clicked my link, that solution is exactly what I suggested :wink:

@Animosity022 - how come the priming takes so long for you? Mine takes 5 minutes at most (it varies, even though the content doesn't change much). That's with about 55k folders and 286k files.

Are you still Google Drive or on Dropbox?

Still on Google Drive and good old Windows.

Drive has a recursive list and Dropbox has it in the API but it is not implemented yet in rclone so that’s the speed difference.

Ah, I had no idea Dropbox was different in that regard. Makes sense.

So, if I had moved my stuff over to DB, priming would take forever...

I just didn't understand what that did exactly from the way you explained it, is all. When Animosity022 broke it down, it made a lot more sense to me. Thank you though ^-^


This may be a dumb question, but I noticed latency goes up when loading fresh data from the remote, even if part of the file is in the VFS cache. As long as files are being constantly read/refreshed, latency stays low. Is there a setting I can use to optimize this? For example, keeping a file open in memory for longer after it's been closed?