SFTP mount: increase stat() rate, or alternatives

I recently switched from GDrive to an SFTP remote. What I did not remember is that GDrive is a polling remote and SFTP is not, so the very large --dir-cache-time I was carrying over from GDrive was causing new files to not appear in the mount.

I decreased --dir-cache-time to 5min, which solved that issue but introduced another as a side effect: when the cache expires, it takes so long to refresh that another piece of software, which scans the mount every minute, completely freezes while waiting on I/O.

An strace shows that the rate of stat() calls is just too low.
For example, a tree on the SFTP rclone mount using --max-age, with nothing else accessing the mount, takes:

2573 directories, 4984 files

real    18m44.325s
user    0m0.191s
sys     0m0.396s

In comparison, a tree on sshfs with the full file set (no --max-age filter) takes:

2579 directories, 42060 files

real    15m15.555s
user    0m0.619s
sys     0m0.847s

Not that much better, so perhaps the problem is caused by FUSE?

Are there any flags that would help with the SFTP remote or with populating the dir cache?
Alternatively, is there some other remote I could use instead that supports polling?

I wonder: since the first level of dirs is returned in 1-2 sec, would it help to split those dirs across 2, 3, ..., n connections, each listing its respective part, when the cache is to be rebuilt?

Run the command 'rclone version' and share the full output of the command.

rclone v1.64.0
- os/version: debian 10.13 (64 bit)
- os/kernel: 4.19.0-24-amd64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.21.1
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

SFTP

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone mount --allow-other --read-only --filter '- /#*/' \
  --dir-cache-time 2h --poll-interval 25s --buffer-size 16M \
  --cache-dir /opt/rclone --vfs-cache-mode full --vfs-cache-max-age 1440h \
  --vfs-cache-max-size 224G --vfs-cache-poll-interval 1m \
  --vfs-read-chunk-size 128M --vfs-read-chunk-size-limit 256M \
  --transfers 4 --umask 022 --max-age 2022-01-01 sftp_remote:/srv

--poll-interval has no effect on SFTP, and some of the other options are just the default values; I keep them there for ease, out of laziness (hope that is not part of the issue :slight_smile:).

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

[sftp_remote]
type = sftp
host = <host>
user = <user>
port = <port>
key_file = ~/.ssh/id_ed25519
shell_type = unix
md5sum_command = md5sum
sha1sum_command = sha1sum

SFTP is slow. I'm not sure what else you can tweak other than waiting for the metadata cache to be a thing.

You can try using rclone rc vfs/refresh recursive=true _async=true (you'll need --rc on your rclone mount) to fill up the directory cache first. Rclone will hopefully do this more efficiently than traversing the directory listing via the FUSE file system.
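For example, something along these lines (the mount point and jobid here are illustrative; the rc listens on localhost:5572 by default):

# Start the mount with the remote control enabled:
rclone mount --rc sftp_remote:/srv /mnt/sftp

# From another shell, kick off a background crawl of the whole tree:
rclone rc vfs/refresh recursive=true _async=true

# _async=true returns a jobid straight away; poll it with:
rclone rc job/status jobid=1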

I have no experience with the --rc. I have (quickly) read the documentation and a few threads, so if the following questions have been answered before, sorry.

  1. The machine running rclone is publicly accessible. From a security standpoint: by just adding --rc it listens only on localhost by default, and since I haven't added any authorization it will reject any commands that require it, correct?
  2. I ran rclone rc vfs/refresh recursive=true _async=true and it was quite a bit faster (results below).
    2.1 Is there a way to see which command caused jobid X? The JSON response below does not state whether it was a vfs/refresh or some other command.
  3. When --dir-cache-time expires, how exactly will it behave? Will it do a FUSE listing as normal?
  4. If you have some time, could you explain in a few words how vfs/refresh works?

vfs/refresh time:

rclone rc job/status jobid=2
{
        "duration": 175.594298834,
        "endTime": "2023-10-04T23:06:15.475275247+02:00",
        "error": "",
        "finished": true,
        "group": "job/2",
        "id": 2,
        "output": {
                "result": {
                        "": "OK"
                }
        },
        "startTime": "2023-10-04T23:03:19.880976522+02:00",
        "success": true
}

Durations from other runs: [196, 189, 188] seconds.
Still much better than 18 min, and running it again does not cause any hiccups.

Misc question:

I saw a sftp_remote{1Isjm} dir in the vfs cache dir. It must have been created when I was running some tests. Any idea what the {} is about? A temp dir that did not get removed? It was ~7 GB in size.

Thank you @ncw!

I run mine with no auth, as it's only listening locally on my local network on a home machine.

Pretty sure if you don't supply that, it generates a random username / password.

All that does is manually crawl the directory structure to prime the cache. It isn't really any faster per se, but it saves you the work of priming your cache by hand. I do that with all my mounts.

There's no change in any of that; it all works the same, as the rc command just runs the crawl in the background. Once the dir cache expires, it has to recheck the remote if you try to fully crawl it.

There's really no magic there; it just crawls the file system. Depending on the remote, it may do that recursively. On Dropbox, which is what I use, it does not, but Dropbox is a polling remote.

You probably created the remote on the fly with a custom parameter (a connection string), which gets its own cache dir. The debug log shows this on startup. I'd imagine you'd want to check it for any data you may want and/or delete it after.
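Something like this is what creates it (hypothetical override; the hash suffix will differ):

# Overriding a backend option on the command line makes rclone treat it
# as a distinct remote...
rclone mount sftp_remote,chunk_size=255k:/srv /mnt/test --vfs-cache-mode full --cache-dir /opt/rclone

# ...so its VFS cache lands in its own hash-suffixed dir, e.g.:
# /opt/rclone/vfs/sftp_remote{1Isjm}/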

I understood that it does not. I tried rclone rc operations/list and got:

2023/10/05 10:58:29 Failed to rc: failed to read rc response: 403 Forbidden: {
        "error": "authentication must be set up on the rc server to use \"operations/list\" or the --rc-no-auth flag must be in use",
        "input": {},
        "path": "operations/list",
        "status": 403
}

It appears to me that it is faster. Keep in mind that in my use case, until the mount is fully listed, the software on top keeps panicking as it blocks waiting for stat().

  • Doing the priming (if I use the term correctly) via FUSE:
    Start the mount; a tree on the whole mount takes 15min+.
  • Doing the priming via vfs/refresh:
    Start the mount and run vfs/refresh; once it completes (~3.5 min, including me checking whether it had completed), a tree takes 1.5 sec to list everything.

So as far as the software on top is concerned it panics for ~3min instead of 15m+ :stuck_out_tongue:

Once the dir cache expires, as you already said, rclone crawls the directories as they are accessed by the software on top, which then starts to freeze. A solution would be to increase the dir cache time to a huge number and periodically call vfs/refresh to update the mount with changes (there are only additions). Since vfs/refresh swaps in the new listing only once it completes, the update appears instantaneous to the mount and the software on top. Basically a poor man's rough polling substitute...
So a cron or systemd timer to call vfs/refresh? Is there a better way to approach this?

It would be nice if I could tell rclone to run vfs/refresh once it starts and/or every X amount of time (basically what I will most likely try to do with a systemd ExecStartPost and a timer).
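Rough sketch of what I mean (unit names, port, and interval are mine, not anything rclone ships):

# rclone-mount.service (excerpt): rclone mount supports Type=notify, so
# ExecStartPost only fires once the mount (and its rc server) is up.
[Service]
Type=notify
ExecStart=/usr/bin/rclone mount --rc --rc-addr 127.0.0.1:5572 sftp_remote:/srv /mnt/sftp
ExecStartPost=/usr/bin/rclone rc vfs/refresh recursive=true _async=true --url 127.0.0.1:5572

# rclone-refresh.service: one-shot refresh against the running mount's rc
[Service]
Type=oneshot
ExecStart=/usr/bin/rclone rc vfs/refresh recursive=true --url 127.0.0.1:5572

# rclone-refresh.timer: re-run it every so often
[Timer]
OnUnitActiveSec=15min
[Install]
WantedBy=timers.target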

Most likely, as I was trying options to improve the performance with SFTP.
I have already removed it; I never add files via the mount, so it could only have contained files from the remote.

Would WebDAV or some other self-hosted protocol/software that can run on a Linux VM / container be a better choice?

I can't see your screen, so not knowing what you typed makes it hard.

[felix@gemini ~]$ rclone mount DB: /home/felix/test --rc -vv --rc-addr :7758
2023/10/05 07:34:03 DEBUG : Setting --config "/opt/rclone/rclone.conf" from environment variable RCLONE_CONFIG="/opt/rclone/rclone.conf"
2023/10/05 07:34:03 DEBUG : rclone: Version "v1.64.0" starting with parameters ["rclone" "mount" "DB:" "/home/felix/test" "--rc" "-vv" "--rc-addr" ":7758"]
2023/10/05 07:34:03 NOTICE: Serving remote control on http://[::]:7758/
2023/10/05 07:34:03 DEBUG : Creating backend with remote "DB:"
2023/10/05 07:34:03 DEBUG : Using config file from "/opt/rclone/rclone.conf"
2023/10/05 07:34:03 DEBUG : Dropbox root '': Mounting on "/home/felix/test"
2023/10/05 07:34:03 DEBUG : : Root:
2023/10/05 07:34:03 DEBUG : : >Root: node=/, err=<nil>
2023/10/05 07:35:03 DEBUG : Dropbox root '': Checking for changes on remote
2023/10/05 07:35:18 DEBUG : rc: "vfs/refresh": with parameters map[recursive:true]
2023/10/05 07:35:18 DEBUG : : Reading directory tree

and my rc command:

[felix@gemini ~]$ rclone rc vfs/refresh recursive=true --url 127.0.0.1:7758

So you have to set a user name and password if you want one.

FUSE is just what makes the file system available in user space, and rclone sits on top of it.

If you crawl the mount, it does that directory by directory.

If the remote supports a recursive listing and you use the vfs refresh, it will be faster to build the directory cache.
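If you want to check whether a backend advertises recursive listing (the ListR feature), something like this should show it:

rclone backend features sftp_remote: | grep ListR
#   "ListR": false
# SFTP has no recursive listing, so vfs/refresh still walks it dir by dir,
# just without the FUSE round-trips.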

Either works fine. I'd probably use cron, but it really doesn't matter; it's whatever works for you.
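e.g. a crontab line like this (interval and --url are illustrative; match them to your mount):

*/15 * * * * /usr/bin/rclone rc vfs/refresh recursive=true _async=true --url 127.0.0.1:5572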

Maybe? You'd want to test and see if it works better for you. Your use case is unique to your setup/flow/etc. Best to just test and figure out what works best for you, as what works for me might not be great for you.

The command I typed was literally rclone rc operations/list, which I got from the rc documentation, and the response was the JSON I pasted below it.

I'll give it a go and see how well it works, unless ncw chimes in with another good idea!

Right, as that's telling you:

[felix@gemini ~]$ rclone rc operations/list --url 127.0.0.1:7758
2023/10/05 08:42:05 Failed to rc: failed to read rc response: 403 Forbidden: {
	"error": "authentication must be set up on the rc server to use \"operations/list\" or the --rc-no-auth flag must be in use",
	"input": {},
	"path": "operations/list",
	"status": 403
}

That you need to either set up

--rc-no-auth

or set up a user name and password.
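For example (the user/pass and the operations/list parameters are illustrative):

# Option 1: allow every rc command without auth (keep it on localhost):
rclone mount --rc --rc-no-auth sftp_remote:/srv /mnt/sftp

# Option 2: set credentials on the rc server...
rclone mount --rc --rc-user me --rc-pass secret sftp_remote:/srv /mnt/sftp

# ...and supply them from the client:
rclone rc operations/list fs=sftp_remote:/srv remote="" --user me --pass secret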

Exactly, that was an example to show that by default there is no user/pass, as you mentioned it might auto-generate one.
For my use case there is no need to create a user/pass or use --rc-no-auth, as the vfs commands do not require authorization.

Not exactly; as you can see from mine, you don't need it for the refresh command, hence why I shared the command.

^^^^^^^^^^^^^^^^^^^^^

I am confused. Aren't we saying the same thing all this time?
By defining just --rc, rc will allow you to use only the commands that do not require authorization.
Any command that does require authorization will get rejected unless you specify a user/pass or set --rc-no-auth. For example, the operations/list command.

Unless you are referring to something in the DEBUG output?

Do you need anything else? I'm not sure what you are asking at this point. Is something not working?

No, it is fine. We are going in circles saying the same thing.

Thanks for the help btw.
