Sftp backend opens less connection than expected

What is the problem you are having with rclone?

I am using SFTP backend to syncronize folders and as restic backend for my backups. Everything works fine (this setup is running for years) but seems that rclone is opening a lot less connections than expected, and this is limiting the bandwidth usage.

Run the command 'rclone version' and share the full output of the command.

rclone v1.63.1
- os/version: freebsd 13.2-release-p1 (64 bit)
- os/kernel: 13.2-release-p1 (amd64)
- os/type: freebsd
- os/arch: amd64
- go/version: go1.20.6
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

SFTP toward a dedicated remote machine

The command you were trying to run (eg rclone copy /tmp remote:tmp)

Two different commands are used to reproduce the issue:

rclone sync -P --transfers 10 -vv /store/video/anime/Gordian remote:/store/tmp/Gordian
restic backup --files-from filelist.txt -o rclone.connections=10
that in turn executes
rclone serve restic --stdio --b2-hard-delete remote:/store/tmp/restic

The rclone config contents with secrets removed.

[remote]
type = sftp
host = remote.host
key_file = /root/.ssh/id_ed25519
shell_type = unix
md5sum_command = md5sum
sha1sum_command = sha1sum

A log from the command with the -vv flag

When using rclone sync:

2023/07/24 12:34:15 DEBUG : rclone: Version "v1.63.1" starting with parameters ["rclone" "sync" "-P" "--transfers" "10" "-vv" "/store/video/anime/Gordian" "remote:/store/tmp/Gordian"]
2023/07/24 12:34:15 DEBUG : Creating backend with remote "/store/video/anime/Gordian"
2023/07/24 12:34:15 DEBUG : Using config file from "/root/.config/rclone/rclone.conf"
2023/07/24 12:34:15 DEBUG : Creating backend with remote "remote:/store/tmp/Gordian"
2023/07/24 12:34:16 DEBUG : sftp://root@remote:22//store/tmp/Gordian: New connection 10.0.0.2:51943->x.x.x.x:22 to "SSH-2.0-OpenSSH_9.3 FreeBSD-20230316"
2023/07/24 12:34:16 DEBUG : sftp://root@remote:22//store/tmp/Gordian: Shell type "unix" from config
2023/07/24 12:34:16 DEBUG : sftp://root@remote:22//store/tmp/Gordian: Using root directory "/store/tmp/Gordian"
[chekers lines removed]
2023-07-24 12:34:16 DEBUG : sftp://root@remote:22//store/tmp/Gordian: Waiting for checks to finish
2023-07-24 12:34:16 DEBUG : sftp://root@remote:22//store/tmp/Gordian: Waiting for transfers to finish
2023-07-24 12:34:16 DEBUG : sftp://root@remote:22//store/tmp/Gordian: New connection 10.0.0.2:46459->x.x.x.x:22 to "SSH-2.0-OpenSSH_9.3 FreeBSD-20230316"
2023-07-24 12:34:17 DEBUG : sftp://root@remote:22//store/tmp/Gordian: New connection 10.0.0.2:31240->x.x.x.x:22 to "SSH-2.0-OpenSSH_9.3 FreeBSD-20230316"
2023-07-24 12:34:17 DEBUG : sftp://root@remote:22//store/tmp/Gordian: New connection 10.0.0.2:18421->x.x.x.x:22 to "SSH-2.0-OpenSSH_9.3 FreeBSD-20230316"
Transferred:        1.255 GiB / 21.058 GiB, 6%, 25.073 MiB/s, ETA 13m28s
Errors:                 2 (retrying may help)
Transferred:            2 / 74, 3%
Elapsed time:        49.1s
Transferring:
 * Gordian - s01e02 - L'i…o con Saori [576p].mkv: 26% /293.903Mi, 1.321Mi/s, 2m43s
 * Gordian - s01e03 - Il …serto rosso [576p].mkv: 37% /296.045Mi, 1.964Mi/s, 1m34s
 *           Gordian - s01e01 - Daigo [576p].mkv: 37% /295.907Mi, 1.954Mi/s, 1m35s
 * Gordian - s01e04 - La …gente sacra [576p].mkv: 36% /296.622Mi, 1.943Mi/s, 1m36s
 * Gordian - s01e05 - La …della morte [576p].mkv: 58% /295.846Mi, 3.553Mi/s, 34s
 * Gordian - s01e06 - Una…Victor City [576p].mkv: 25% /295.298Mi, 1.279Mi/s, 2m53s
 * Gordian - s01e07 - Il …reto di Gen [576p].mkv: 25% /295.154Mi, 1.298Mi/s, 2m49s
 * Gordian - s01e08 - L'a…di Mukadeck [576p].mkv: 26% /294.991Mi, 1.315Mi/s, 2m45s
 * Gordian - s01e10 - Capo Naso Rosso [576p].mkv: 51% /294.916Mi, 3.590Mi/s, 40s
 * Gordian - s01e11 - Gli… reggimento [576p].mkv:  0% /294.641Mi, 0/s, -

A similar behaviour when using restic:

rclone: 2023/07/24 12:22:07 DEBUG : rclone: Version "v1.63.1" starting with parameters ["rclone" "serve" "restic" "--stdio" "--b2-hard-delete" "remote:/store/tmp/restic"]
rclone: 2023/07/24 12:22:07 DEBUG : Creating backend with remote "remote:/store/tmp/restic"
rclone: 2023/07/24 12:22:07 DEBUG : Using config file from "/root/.config/rclone/rclone.conf"
rclone: 2023/07/24 12:22:07 DEBUG : sftp://root@remote:22//store/tmp/restic: New connection 10.0.0.2:29974->x.x.x.x:22 to "SSH-2.0-OpenSSH_9.3 FreeBSD-20230316"
rclone: 2023/07/24 12:22:07 DEBUG : sftp://root@remote:22//store/tmp/restic: Shell type "unix" from config
rclone: 2023/07/24 12:22:07 DEBUG : sftp://root@remote:22//store/tmp/restic: Using root directory "/store/tmp/restic"
rclone: 2023/07/24 12:22:07 DEBUG : file-11047155567684542504: GET request error: object not found
rclone: 2023/07/24 12:22:07 DEBUG : keys: list request
rclone: 2023/07/24 12:22:08 DEBUG : locks: list request
rclone: 2023/07/24 12:22:09 DEBUG : locks: list request
rclone: 2023/07/24 12:22:09 DEBUG : snapshots: list request
rclone: 2023/07/24 12:22:09 DEBUG : index: list request
rclone: 2023/07/24 12:22:16 DEBUG : sftp://root@remote:22//store/tmp/restic: New connection 10.0.0.2:51118->x.x.x.x:22 to "SSH-2.0-OpenSSH_9.3 FreeBSD-20230316"
rclone: 2023/07/24 12:23:07 INFO  : total 7368 files 38.080 GiB, 0 errors ETA 22:25
rclone: Transferred:   o    1.010 GiB / 1.255 GiB, 81%, 18.761 MiB/s, ETA 13s
rclone: Transferred:            3 / 13, 23%c_012662.png
rclone: Elapsed time:       1m0.5s
rclone: Transferring:
rclone:  * data/a9/a9d821815ccb80…7ae4241fdc44433e769121: 89% /128.550Mi, 1.982Mi/s, 6s
rclone:  * data/a8/a8d1abfbc68104…e4096a901174195eb976ab: 88% /128.562Mi, 2.024Mi/s, 7s
rclone:  * data/d7/d7750dc361bf23…332150ef34fd53429b0e8e: 80% /130.428Mi, 1.829Mi/s, 13s
rclone:  * data/65/65cb8ec7e3847c…89a3e86cbf46614067cf08: 82% /128.239Mi, 1.971Mi/s, 11s
rclone:  * data/cc/cc4153176cf2a6…2af1793dab96d7600c6a49: 79% /128.056Mi, 1.931Mi/s, 13s
rclone:  * data/cc/ccbb314d9fffdd…ebe87eab2a0297c1bc10c8: 80% /128.668Mi, 2.030Mi/s, 12s
rclone:  * data/5c/5cdc55ba53e842…92b7e3c26a3aeb5acc17a4: 78% /128.253Mi, 2.051Mi/s, 13s
rclone:  * data/ce/ce445bdb8780c9…2d3d8dd2820bf79200586d: 75% /128.054Mi, 1.925Mi/s, 16s
rclone:  * data/6d/6d550c57bfc030…37c9ada3004e8a5928b0ef: 75% /128.112Mi, 2.063Mi/s, 15s
rclone:  * data/b3/b3921dce88769d…42c2608f20c886c0374b0b: 74% /128.197Mi, 2.005Mi/s, 16s
rclone:
rclone: 2023/07/24 12:29:39 DEBUG : sftp://root@remote:22//store/tmp/restic: Not closing 2 unused connections as 10 sessions active

As you can see, in both scenarios rclone report 10 concurrent uploads, which is correct, but there are much less concurrent connections as shown in the logs, and I can confirm running top on the remote machine.
In restic scenario, I tried injecting RCLONE_TRANSFERS=10 into the environment, but nothing changes, probably this parameter is not even used in serve command.
The sentence "Not closing 2 unused connections as 10 sessions active" may give some hint of what is happening.

Thank you for any help you can provide.

I think you can multiplex several SFTP connections over one SSH connection, but rclone shouldn't be doing that. It uses one SSH connection per SFTP connection.

That means rclone thinks there are 10 sftp sessions active and 2 inactive which it isn't closing for the moment.

That means that there should be 12 ssh connections open I think.

Can you confirm with netstat -tuanp that there are less than 12 ssh connections open?

When I try this I see the number of ssh connections == the number of sftp connections which is what I expect.

SFTP isn't very good at high latency links - what is the ping time to the remote?

Not OP but for GilGalaad but I ocassionally need to use SFTP from a bad connection (my in-laws refuse to upgrade their internet...) I've had better luck running running rclone serve webdav on the server, using an SSH tunnel, then using WebDAV to the SSH tunnel than to rely on SFTP connections. And since 1.63 support setting ModTime over WebDAV, you can still proceed as usual (well, not hashes I guess).

The above certainly adds some steps but when I can't rely on SFTP, it is a major speed up.

thank you for your quick answer :slight_smile:

I am aware of that multiplexing capability, but I agree that it's pointless unless you are doing some black-ops, and should be not used by rclone.

This is not what is happening. I managed to reproduce it.

  • In sync mode, this log line seems to never show up
  • In serve mode, this log is shown
  • In both modes, rclone always starts with one connection, and then... over time, it opens some more, but i never seen it reaching the full 10. I have seen 4 at max.

For example, in this precise moment

rclone: Transferring:
rclone:  * data/85/85052b09e01bdb…1fc5726f2e560c7cb7e74f: 73% /128.451Mi, 1004.867Ki/s, 34s
rclone:  * data/32/3248715b84543c…a6da2e29b452e20066a510: 73% /129.019Mi, 1.017Mi/s, 33s
rclone:  * data/b6/b6b0ffc08cb476…791086f9b45820c8a97a13: 65% /128.286Mi, 950.070Ki/s, 47s
rclone:  * data/ec/ec620ea41da1d2…6d4a9f849267aeb06be2ca: 68% /128.573Mi, 1016.803Ki/s, 40s
rclone:  * data/6b/6bc2756c888175…12d0f8d82a778e18300c1e: 67% /128.533Mi, 986.941Ki/s, 43s
rclone:  * data/5d/5dd358c70e0291…8ac0f04abe199085e8134c: 64% /128.803Mi, 959.772Ki/s, 48s
rclone:  * data/0f/0f76822d560dcf…b823ffca8bddfeb8ae243a: 56% /128.338Mi, 926.440Ki/s, 1m1s
rclone:  * data/b7/b7b24f90525255…84aa3d8a195dd3117eae0f: 57% /128.318Mi, 998.434Ki/s, 55s
rclone:  * data/ac/ac7f5432a32cc8…900489649847ffab667761: 52% /128.616Mi, 892.920Ki/s, 1m10s
rclone:  * data/24/24685f411b8f72…c98657bc757c0ccc04c84f: 51% /128.705Mi, 912.232Ki/s, 1m10s
rclone:
rclone: 2023/07/24 22:43:30 DEBUG : sftp://root@remote:22//store/tmp/restic: Not closing 1 unused connections as 10 sessions active

The syntax of netstat is a bit different on BSD, but I see two different connection on port 22, there are 2 different pids (just as I saw using top), and only one is an sftp connection, the other one is the shell I used to run this command.

[root@Dragonlance:~]# netstat -4 | grep .ssh | grep ESTABLISHED | wc -l ; pgrep sftp-server | wc -l
       2
       1

I am aware that SFTP is not the greatest protocol to handle big file tranfers, but the performance themselves aren't so bad.
The latency between my home and this server is about 25ms, I can still do 350 mbps with few connections, which is not bad actually. But on the opposite side, when I sync in download, rclone is capable of making a parallel download for each file (of course it can't be done when uploading), and I can easily saturate the 1gbps egress bandwidth of my server.
That's why I started investigating at a lower level about why I could not do the same when uploading, and found out this behaviour.
Considering that my home connection is consumer grade, I don't expect to reach 1gbps in upload always, but I think there can be a margin of improvement if I tweak the parallelism. Always with rclone, on the same target server, I can easily do 800mbps using minio backend.

I dind't emphasize enough on the first post, because it was in the logs, but I run FreeBSD 13.2 on both home and remote servers. There can be a different implementation in the ssh library used in rclone from the linux one? I don't know, I'm just brainstorming.

Thank you again for your kind support, I am at your service to provide more information :slight_smile:

Thank you for your suggestion, but i would not call my 10gbps/2.5gpbs a bad connection :slight_smile:
It's a consumer grade one, so I would prefer a 1gbps with guaranteed bandwidth and better peering all day long, but still is great to have this at home.
Anyway, I have tried with webdav, but it's a bit tricky to integrate letsencypt certificates, since rclone does not support hot reloading of them via HUP, and anyway when I have to do things "properly" I use minio which works like a charm with rclone.
SFTP is damn easy, since you always have an ssh connection ready, and this post was meant to investigate a possible bug and give support to this nice piece of software, more than finding an altenate way to move my data :slight_smile:

I looked through the code and I see there is an opportunity for rclone to share ssh connections while uploading files. This suprised me - this has been in the code since day 1 I think! Anyway have a go with this which attempts to fix the problem.

v1.64.0-beta.7159.4fe82d4ef.fix-sftp-connections on branch fix-sftp-connections (uploaded in 15-30 mins)

If you could check the number of ssh connections is as expected (not too many nor too few) that would be great - thank you.

PS the logs above when it is talking about 10 sessions, it is talking about 10 uploads, not necessarily about 10 sftp sessions so that makes sense too now I have refreshed myself on the code.

I can confirm that now the number connections is exactly as expected.
In sync mode, I almost doubled the speed with default settings, and I can even saturate the gigabit link with 15+ connections.
In serve mode, even if the connection are right, i got an improvement but I still struggle to get much more bandwidth usage, but this depends on restic for sure, a lot of local I/O, a lot of relatively small files, not much we can do here.
And still restic->rclone serve->minio is a lot slower than restic->minio, but is a whole different topic and maybe i will make more testing about that in the future.

Thank you, amazing job sir.
I am glad if I helped to show something that has been unnoticed for years :slight_smile:

Super :slight_smile:

The code I changed was only the sftp backend code so won't have affected the rclone serve sftp code - that may yet have gremlins - I haven't checked!

It shouldn't be a lot slower so that would be an interesting topic to investigate.

Though restic does make a lot of quite small files so I guess there is overhead there.

(I use restic + rclone to back up my laptop!)

You are welcome! I love finding these long hidden problems :slight_smile:

I've merged this to master now which means it will be in the latest beta in 15-30 minutes and released in v1.64

1 Like

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.