'vfs/refresh' targeting a new directory created outside rclone will error with "file does not exist"

What is the problem you are having with rclone?

Running vfs/refresh with a dir parameter will throw a "file does not exist" error if the directory isn't yet in the mount's VFS directory cache, even though it exists on the remote. Instead of erroring, I would expect rclone to walk the remote looking for the directory. If it finds it, it should load it into the VFS, and only throw that error if it doesn't exist on the remote.

If I refresh the new directory's parent directory, then the new directory will appear in the mount, but if the parent directory is large this is a very intensive approach that I believe should be unnecessary.

I know --dir-cache-time and --poll-interval could help to work around this issue but I would still expect vfs/refresh to discover the new directory.

What is your rclone version (output from rclone version)

rclone v1.57.0-beta.5629.e45c23ab7

  • os/version: Microsoft Windows 10 Pro 2009 (64 bit)
  • os/kernel: 10.0.19042.1165 (x86_64)
  • os/type: windows
  • os/arch: amd64
  • go/version: go1.16.7
  • go/linking: dynamic
  • go/tags: cmount

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Windows 10 (64 bit)

Which cloud storage system are you using? (eg Google Drive)

Google Drive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone rc vfs/refresh dir="rclone_test/new_dir_on_gdrive/" recursive=true --user=xxx --pass=xxx

The rclone config contents with secrets removed.

[gdrive]
type = drive
scope = drive
token = {"access_token":"xxx","token_type":"Bearer","refresh_token":"xxx","expiry":"2021-08-16T23:18:10.6857809+12:00"}
root_folder_id = xxx

A log from the command with the -vv flag

2021/08/16 22:31:25 DEBUG : rclone: Version "v1.56.0" starting with parameters ["rclone" "rc" "vfs/refresh" "dir=rclone_test/new_dir_on_gdrive/" "recursive=true" "--user=xxx" "--pass=xxx" "-vv"]
{
        "result": {
                "rclone_test/new_dir_on_gdrive/": "file does not exist"
        }
}
2021/08/16 22:31:25 DEBUG : 4 go routines active

Here is a more complete example using a series of commands:

  1. A new directory is created outside the mount
  2. The mount is told to refresh the new directory, which fails with "file does not exist"
  3. The mount is told to refresh the parent directory and succeeds
  4. The mount is told to refresh the new directory and succeeds
rclone mkdir gdrive:rclone_test\new_dir_on_gdrive\ -vv
2021/08/16 22:42:39 DEBUG : rclone: Version "v1.56.0" starting with parameters ["rclone" "mkdir" "gdrive:rclone_test\\new_dir_on_gdrive\\" "-vv"]
2021/08/16 22:42:39 DEBUG : Creating backend with remote "gdrive:rclone_test\\new_dir_on_gdrive\\"
2021/08/16 22:42:39 DEBUG : Using config file from "C:\\xxx\\rclone.conf"
2021/08/16 22:42:41 DEBUG : fs cache: renaming cache item "gdrive:rclone_test\\new_dir_on_gdrive\\" to be canonical "gdrive:rclone_test/new_dir_on_gdrive"
2021/08/16 22:42:41 DEBUG : Google drive root 'rclone_test/new_dir_on_gdrive': Making directory
2021/08/16 22:42:42 DEBUG : 4 go routines active

rclone rc vfs/refresh dir="rclone_test/new_dir_on_gdrive/" recursive=true --user=xxx --pass=xxx -vv
2021/08/16 22:42:59 DEBUG : rclone: Version "v1.56.0" starting with parameters ["rclone" "rc" "vfs/refresh" "dir=rclone_test/new_dir_on_gdrive/" "recursive=true" "--user=xxx" "--pass=xxx" "-vv"]
{
        "result": {
                "rclone_test/new_dir_on_gdrive/": "file does not exist"
        }
}
2021/08/16 22:43:00 DEBUG : 4 go routines active

rclone rc vfs/refresh dir="rclone_test/" recursive=true --user=xxx --pass=xxx
{
        "result": {
                "rclone_test/": "OK"
        }
}

rclone rc vfs/refresh dir="rclone_test/new_dir_on_gdrive/" recursive=true --user=xxx --pass=xxx -vv
2021/08/16 22:46:49 DEBUG : rclone: Version "v1.56.0" starting with parameters ["rclone" "rc" "vfs/refresh" "dir=rclone_test/new_dir_on_gdrive/" "recursive=true" "--user=xxx" "--pass=xxx" "-vv"]
{
        "result": {
                "rclone_test/new_dir_on_gdrive/": "OK"
        }
}
2021/08/16 22:46:50 DEBUG : 4 go routines active

I'm not sure what you are expecting.

If you create something outside the mount and the directory cache hasn't expired and polling hasn't occurred, rclone won't know it exists, and it produces the error you described.

There's no magic here to 'know' it's there; that's how polling and the dir cache work, as you stated.

Hi Animosity022, while I have you here I just want to thank you for all the help you've provided in the forums. I've learnt a lot from you. Thank you! :slight_smile:

You're right that I wouldn't expect a new directory/file created outside the mount to automatically appear without use of --dir-cache-time or --poll-interval, but I was under the impression that using the rc command vfs/refresh would allow me to update the directory cache manually. That's definitely how it appears to work.

vfs/refresh without a dir= rebuilds the directory cache from the root directory, and this works for me as expected. When vfs/refresh is passed a dir=, rclone will update only that directory (as well as all of its children if recursive=true is used). If the directory that was passed already exists in the directory cache, then this command also works fine. The issue I'm seeing is that for vfs/refresh to update a specific directory, the directory must already exist in the directory cache. Instead of this behaviour, since rclone is attempting to update the directory provided, I would have thought rclone should walk that path on the remote (Google Drive in my case) and, if the path is valid on the remote, update the directory cache by caching any new folders it needs in order to get there.
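To put my understanding in code, here is a toy model I wrote to describe the behaviour I'm seeing (this is just my mental model, not rclone's actual implementation):

```python
class DirCache:
    """Toy model of the VFS directory cache: refresh can only start from a
    node that is already cached, even if the path exists on the remote."""

    def __init__(self, remote_dirs, cached_dirs):
        self.remote = set(remote_dirs)          # what exists on the remote
        self.cached = {""} | set(cached_dirs)   # "" is the always-cached root

    def refresh(self, path, recursive=False):
        if path not in self.cached:
            return "file does not exist"        # even if path is in self.remote
        if recursive:
            # re-listing a cached dir pulls its remote descendants into the cache
            self.cached |= {d for d in self.remote if d.startswith(path)}
        return "OK"

# The sequence from my logs above:
cache = DirCache(remote_dirs={"rclone_test/", "rclone_test/new_dir_on_gdrive/"},
                 cached_dirs={"rclone_test/"})
print(cache.refresh("rclone_test/new_dir_on_gdrive/"))  # file does not exist
print(cache.refresh("rclone_test/", recursive=True))    # OK
print(cache.refresh("rclone_test/new_dir_on_gdrive/"))  # OK
```

What I'd expect instead is for the first call to consult the remote when the path isn't cached, rather than failing outright.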

Maybe it wasn't implemented this way due to some technical limitation or perhaps I have some kind of misunderstanding of how vfs/refresh is supposed to work.

Also, for context, the reason I'm not using the high --dir-cache-time and low --poll-interval approach is that in my actual implementation I have a drive > crypt > union > mount setup, in order to combine Google Drive with local disks. It appears that --poll-interval isn't supported through the union, as the directory cache isn't being updated automatically, and running rclone rc vfs/poll-interval returns the error "poll-interval is not supported by this remote".

It can't refresh something it doesn't know exists, as you specifically asked to refresh a non-existent path.

Walking remotes might be fine for Google, but rclone supports many remotes that incur costs per API hit.

If your goal is to refresh, you'd want to do the whole thing or pass a path above it that you know exists.

Are you saying that the behavior I'm seeing is implemented that way by design? Or is it a limitation that there's no current plan to address? To me, this seems well within the scope of what vfs/refresh should be able to do.

I don't think walking would actually be necessary. I'm not familiar with every remote, but surely they would all support being queried to check whether the dir= path exists.

Having to query a parent directory that already exists can be a very intensive approach, especially when using recursive=true to get the desired directory's children. For example, in my setup:

  1. Radarr (running on my NAS) puts a new movie in my union:/movies/ directory.
  2. Radarr tells rclone (running on my media server) to do a vfs/refresh dir=/movies/new_movie/ recursive=true
  3. Rclone on the media server throws a "file does not exist" error

Instead, what you're saying is that rclone would need to do vfs/refresh dir=/movies/ recursive=true and thus refresh the entire movies directory recursively. This will update thousands of directories and take many minutes, instead of the couple of seconds it would take if rclone could simply attempt to discover the directory it was given when it isn't yet in the directory cache.
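As a stopgap, I could have the caller walk up the path and refresh the nearest ancestor that the cache already knows about. A rough sketch of that fallback (the rc URL, helper names, and auth handling here are mine, not part of rclone):

```python
import json
import urllib.request


def ancestors(path):
    """Yield the path and each of its parents, deepest first, ending at
    the root, e.g. "a/b/c/" -> "a/b/c/", "a/b/", "a/", ""."""
    parts = [p for p in path.strip("/").split("/") if p]
    while parts:
        yield "/".join(parts) + "/"
        parts.pop()
    yield ""  # the VFS root, which is always in the cache


def refresh_with_fallback(rc_url, path, auth_header):
    """Try vfs/refresh on the path; on "file does not exist", retry on
    successively shallower ancestors until one refreshes successfully.
    Returns the directory that was actually refreshed, or None.
    (An empty dir may need to be sent as no dir parameter at all.)"""
    for candidate in ancestors(path):
        body = json.dumps({"dir": candidate, "recursive": True}).encode()
        req = urllib.request.Request(
            rc_url + "/vfs/refresh", data=body,
            headers={"Content-Type": "application/json",
                     "Authorization": auth_header})
        result = json.load(urllib.request.urlopen(req))["result"]
        if result.get(candidate) == "OK":
            return candidate
    return None
```

The downside is exactly the cost discussed above: if /movies/new_movie/ isn't cached, this ends up refreshing all of /movies/ anyway, so it only papers over the problem.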

If you have:

TOP
     ONE
     TWO

and you add a directory called:

TOP\THREE

TOP gets invalidated and walked.

Example debug log:

2021/08/16 15:52:50 DEBUG : top/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxrwxr-x, err=<nil>
2021/08/16 15:53:17 DEBUG : Google drive root '': Checking for changes on remote
2021/08/16 15:53:17 DEBUG : : changeNotify: relativePath="top", type=0
2021/08/16 15:53:17 DEBUG : : invalidating directory cache
2021/08/16 15:53:17 DEBUG : : >changeNotify:
2021/08/16 15:53:17 DEBUG : : changeNotify: relativePath="top/three", type=0
2021/08/16 15:53:17 DEBUG : : >changeNotify:
2021/08/16 15:53:36 DEBUG : /: Attr:
2021/08/16 15:53:36 DEBUG : /: >Attr: attr=valid=1s ino=0 size=0 mode=drwxrwxr-x, err=<nil>
2021/08/16 15:53:36 DEBUG : /: Lookup: name="top"
2021/08/16 15:53:36 DEBUG : /: >Lookup: node=top/, err=<nil>
2021/08/16 15:53:36 DEBUG : top/: Attr:
2021/08/16 15:53:36 DEBUG : top/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxrwxr-x, err=<nil>
2021/08/16 15:53:38 DEBUG : top/: Attr:
2021/08/16 15:53:38 DEBUG : top/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxrwxr-x, err=<nil>
2021/08/16 15:53:38 DEBUG : top/: ReadDirAll:
2021/08/16 15:53:38 DEBUG : top/: >ReadDirAll: item=5, err=<nil>
2021/08/16 15:53:38 DEBUG : top/: Lookup: name="one"
2021/08/16 15:53:38 DEBUG : top/: >Lookup: node=top/one/, err=<nil>
2021/08/16 15:53:38 DEBUG : top/one/: Attr:
2021/08/16 15:53:38 DEBUG : top/one/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxrwxr-x, err=<nil>
2021/08/16 15:53:38 DEBUG : top/: Lookup: name="three"
2021/08/16 15:53:38 DEBUG : top/: >Lookup: node=top/three/, err=<nil>
2021/08/16 15:53:38 DEBUG : top/three/: Attr:
2021/08/16 15:53:38 DEBUG : top/three/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxrwxr-x, err=<nil>
2021/08/16 15:53:38 DEBUG : top/: Lookup: name="two"
2021/08/16 15:53:38 DEBUG : top/: >Lookup: node=top/two/, err=<nil>
2021/08/16 15:53:38 DEBUG : top/two/: Attr:
2021/08/16 15:53:38 DEBUG : top/two/: >Attr: attr=valid=1s ino=0 size=0 mode=drwxrwxr-x, err=<nil>
2021/08/16 15:53:47 DEBUG : Google drive root '': Checking for changes on remote

Not ensuring consistency would lead to funkiness in the cache, as it doesn't know what changed. It's no different from when your dir-cache-time runs out, as it has to repoll everything anyway.

I find Windows/NAS to be very lacking for that setup, and you hit problems like these where Linux is a much better fit for the use case.

Hi Heath,

I have no experience with a setup like yours, but perhaps the new Google Drive for Windows could somehow be useful.

It supports streaming: "Only use hard drive space when you open files or make files available offline".

I guess it is based on file change notifications instead of polling for changes.

More info here:
https://support.google.com/googleone/answer/10838124

I think it's the union and the no polling that causes the issue, which is one reason I use mergerfs on Linux.

Ah, I get it, thanks - I overlooked the union on the media server

Thanks for discussing this with me, guys. If this doesn't qualify as a bug with the refresh command, then it looks like I'll have to find a workaround. :slight_smile: