Rclone poll-interval mount support

Hi,
I am trying to find a new location for new data after my Google Drive went read-only, and I'm currently looking at a Hetzner Storage Box. I am connecting via SFTP, but I have noticed the mount point does not update unless I restart the service I created for this. Can you confirm whether --poll-interval is supported on SFTP remotes? If not, do any of the following remotes support it: Samba/CIFS or WebDAV?

Thanks

welcome to the forum,

sftp, webdav, local and smb do not support polling, otherwise known as ChangeNotify

and you can see that for yourself in the log:
INFO : sftp://redacted.your-storagebox.de:23/: poll-interval is not supported by this remote

the mount will update after a period of time as per
--dir-cache-time duration Time to cache directory entries for (default 5m0s)

or force a refresh using a command such as
rclone rc vfs/refresh recursive=true
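
note that rclone rc talks to the mount over the remote control interface, so this only works if the mount was started with --rc. a rough sketch, where the mount point is just a placeholder:

rclone mount hetznersbox01_sftp: /mnt/storagebox --rc --dir-cache-time 5m

# then, from another shell
rclone rc vfs/refresh recursive=true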


here we can see that sftp does not support polling, i.e. does not support ChangeNotify

rclone backend features hetznersbox01_sftp:
{
        "Name": "hetznersbox01_sftp",
        "Root": "",
        "String": "sftp://redacted.your-storagebox.de:23/",
        "Precision": 1000000000,
        "Hashes": [
                "md5",
                "sha1"
        ],
        "Features": {
                "About": true,
                "BucketBased": false,
                "BucketBasedRootOK": false,
                "CanHaveEmptyDirectories": true,
                "CaseInsensitive": false,
                "ChangeNotify": false,

here we can see webdav does not support polling

rclone backend features hetznersbox01_webdav:
{
        "Name": "hetznersbox01_webdav",
        "Root": "",
        "String": "webdav root ''",
        "Precision": 1000000000,
        "Hashes": [
                "sha1"
        ],
        "Features": {
                "About": true,
                "BucketBased": false,
                "BucketBasedRootOK": false,
                "CanHaveEmptyDirectories": true,
                "CaseInsensitive": false,
                "ChangeNotify": false,

Looking at the source code, I think that only Google Drive, Amazon Drive, Dropbox and OneDrive support polling. And I do not think it is clearly documented anywhere.

Thanks both, I will try adding --dir-cache-time to my service file and see if it makes a difference. Currently I am moving files and then deleting them, but unionfs is not seeing the changes until I restart the service. With Google Drive and polling it worked fine.

This is the correct way for non-polling remotes. --dir-cache-time controls how often content is refreshed from the backend. The default is 5 min.

changes made directly on the cloud storage by the web interface or a different copy of rclone will only be picked up once the directory cache expires if the backend configured does not support polling for changes. If the backend supports polling, changes will be picked up within the polling interval.
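
For example, the ExecStart line in your systemd service could look something like this (the binary path, remote name and mount point are just placeholders, adjust to your setup):

ExecStart=/usr/bin/rclone mount hetznersbox01_sftp: /mnt/storagebox --dir-cache-time 5m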

Thanks kapitainsky, happy to confirm my test folder appeared after 5 mins, so this is now all working. Any issues if I reduce it down to, say, 2 or 3 mins?

You can even reduce it to 10s :slight_smile: But it means you will be constantly listing this remote.

I guess it all depends on your workload. Myself, I set it to a higher value than the default: 15 min.

thanks :slight_smile: will play around with it. Interested to see if it has an effect when playing media

TWIMI: I just checked with rclone backend features, and Box supports neither polling (ChangeNotify) nor recursive listing (ListR), so any directory optimizations are going to have to rely on --dir-cache-time.
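
For the record, I checked it with something like the below, where "mybox:" is just the name of my Box remote; both features came back false:

rclone backend features mybox: | grep -E 'ChangeNotify|ListR'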

Yes - it is only Google Drive, Amazon Drive (dead by now, I think), Dropbox and OneDrive that support polling.

Thanks for the confirmation.

I have a running rclone mount with a Box backend and it's taking a LONG time (over a minute!) to create a single small file in a large directory (ie, a directory already containing hundreds to thousands of other files), so I'm trying to speed it up by using a larger --dir-cache-time.

I don't want to stop the mount and restart it with a different --dir-cache-time, because I have a days-long command running on top of it, so I tried changing it on the running rclone mount via

rclone --rc-addr 127.0.0.1:5572 rc options/set --json '{"vfs": {"DirCacheTime": 6000000000000} }'

And in fact rclone --rc-addr 127.0.0.1:5572 rc options/get | grep DirCacheTime shows "DirCacheTime": 6000000000000, as expected, but the file creation speed on that directory is unaffected, still only about 1 per minute.

Does rclone mount indeed use the new value set by rclone rc, or does it stick with the one that was set when it started up?

(It may sound like a crazy question, but I've seen other parameters, e.g. LogLevel, that could be set via rclone rc but the change was just ignored.)

TIA!

I do not think mount options can be changed when the mount is already running.

But you can always start a second mount (just do not use the same cache location) with different parameters.

So you can have multiple mounts to the same remote for different purposes :) - but make sure you explicitly set a different cache location for each.

Yeah, that's what I feared :frowning:

Sure. But it would not speed up the currently running command -- which by my estimate will take 28 more days to run :-/ -- and this command (a tar x on a very large file) is not restartable, meaning I will have to re-run it from the beginning if I interrupt it. Restarting the tar with the -k option would at least skip already-present files, but it would still have to check all files to see whether they already exist.

Thanks for trying to help.

Again TWIMI(*): I used a second mount as recommended by @kapitainsky to try and determine which options would help with Box file-creation performance on a directory that already contains many (hundreds of) files. Here's what I found:

  1. --dir-cache-time does not help at all.

  2. The only thing that helped (and it helped quite a lot: file creation times went from 1 per minute to under 1 second) was setting --vfs-cache-mode to writes or full; but this help is illusory, as it results from rclone mount caching the whole file-creation operation -- so in my scenario, where a large number of small files are being migrated from Google Drive to Box, it would just fill the local cache directory and I would have to wait the same amount of time -- or longer -- for these files to actually be copied to Box.

  3. The determining factor in the slowness seems to be the number of files already in the directory; so, if you are moving a large number of files to Box, it may help to restructure your large directories into a tree of smaller subdirectories (e.g., by creating subdirectories called "A", "B", etc. and then moving all files starting with "A" into subdirectory "A", ditto for "B", and so on); see the sketch after this list.
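
Here is a rough sketch of what I mean in point 3, using rclone moveto in a shell loop (the remote name "mybox:" and the directory "bigdir" are made up for the example, and I haven't timed this exact script):

# move each file into a subdirectory named after its first letter (uppercased)
rclone lsf --files-only "mybox:bigdir" | while read -r f; do
    first=$(printf '%s' "$f" | cut -c1 | tr '[:lower:]' '[:upper:]')
    rclone moveto "mybox:bigdir/$f" "mybox:bigdir/$first/$f"
done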

I hope this is of help to someone.

(*) To Whom It May Interest


Are you using the --transfers flag with your mount? Increasing it should help with small files.

Whaat!? --transfers works with rclone mount?! This is totally unexpected! :+1:

On second thought, I see it could work as long as it caches each file's creation and then uploads many of them simultaneously up to Box. But then it would require --vfs-cache-mode >= writes, no? Please confirm.

As per docs...

When using VFS write caching (--vfs-cache-mode with value writes or full), the global flag --transfers can be set to adjust the number of parallel uploads of modified files from the cache (the related global flag --checkers has no effect on the VFS).

It looks like for Box it might be beneficial to set it high.
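
Something along these lines (the mount point and the value 16 are just examples to experiment with, not a recommendation):

rclone mount mybox: /mnt/box --vfs-cache-mode writes --transfers 16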


Thanks for the confirmation including the quote from the docs. I'd just re-read the rclone mount doc page for the Nth time but failed to see that.

It looks like for Box it might be beneficial to set it high.

I will test it right away and post the result here. Thanks again!

Also, increasing the default --vfs-write-wait duration can be helpful, depending on how your files are created. A pathological example is an actively downloaded torrent file: sometimes the download stalls, rclone starts the upload to the cloud immediately, then the torrent is updated and it starts the upload from the start again...


Humrmrmrmrmr... I just checked the docs again, and they say:

--vfs-write-wait duration Time to wait for in-sequence write before giving error (default 1s)

I think you actually meant --vfs-write-back, no?

--vfs-write-back duration Time to writeback files after last use when using cache (default 5s)

Presuming the latter, files in my use case are written sequentially from start to finish and then closed, one file after the other -- so I guess --vfs-write-back=0 would be best in my case as then they would be written back immediately after being closed, right?