Rclone's sftp is substantially slower (~1.5x the transfer time) than an sshfs mount or the OS's sftp implementation

What is the problem you are having with rclone?

I am trying to use the rclone sftp backend on a local network to transfer files. Rclone's sftp implementation is substantially slower than a simple sshfs mount or the OS's sftp client.

Run the command 'rclone version' and share the full output of the command.

rclone v1.59.1

Which cloud storage system are you using? (eg Google Drive)

sftp

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone mount remote: /local/mount

The rclone config contents with secrets removed.

[remote]
user = username
type = sftp
host = localhost
shell_type = unix
md5sum_command = md5sum
sha1sum_command = sha1sum

A log from the command with the -vv flag

2023/05/08 16:11:56 DEBUG : testfile.bin: Attr: 
2023/05/08 16:11:56 DEBUG : testfile.bin: >Attr: a=valid=1s ino=0 size=965738496 mode=-rw-rw-r--, err=<nil>
2023/05/08 16:11:56 DEBUG : testfile.bin: Open: flags=OpenReadOnly
2023/05/08 16:11:56 DEBUG : testfile.bin: Open: flags=O_RDONLY
2023/05/08 16:11:56 DEBUG : testfile.bin: >Open: fd=testfile.bin (r), err=<nil>
2023/05/08 16:11:56 DEBUG : testfile.bin: >Open: fh=&{testfile.bin (r)}, err=<nil>
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: Read: len=131072, offset=0
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: Read: len=131072, offset=131072
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.openRange at 0 length 134217728
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 0 length 4096 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 4096 length 8192 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 12288 length 16384 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 28672 length 32768 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 61440 length 65536 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 126976 length 131072 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 258048 length 262144 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: >Read: read=131072, err=<nil>
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 520192 length 524288 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: >Read: read=131072, err=<nil>
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: Read: len=131072, offset=262144
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: Read: len=131072, offset=393216
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: >Read: read=131072, err=<nil>
2023/05/08 16:11:56 DEBUG : testfile.bin: ChunkedReader.Read at 1044480 length 1048576 chunkOffset 0 chunkSize 134217728
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: >Read: read=131072, err=<nil>
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: Read: len=131072, offset=655360
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: Read: len=131072, offset=524288
2023/05/08 16:11:56 DEBUG : testfile.bin: waiting for in-sequence read to 655360 for 20ms
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: >Read: read=131072, err=<nil>
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: >Read: read=131072, err=<nil>
2023/05/08 16:11:56 DEBUG : &{testfile.bin (r)}: Read: len=131072, offset=917504
2023/05/08 16:11:56 DEBUG : testfile.bin: waiting for in-sequence read to 917504 for 20ms
...
2023/05/08 16:12:41 DEBUG : testfile.bin: waiting for in-sequence read to 965607424 for 20ms
2023/05/08 16:12:41 DEBUG : &{testfile.bin (r)}: >Read: read=131072, err=<nil>
2023/05/08 16:12:41 DEBUG : &{testfile.bin (r)}: >Read: read=131072, err=<nil>
2023/05/08 16:12:41 DEBUG : &{testfile.bin (r)}: Flush: 
2023/05/08 16:12:41 DEBUG : &{testfile.bin (r)}: >Flush: err=<nil>
2023/05/08 16:12:41 DEBUG : &{testfile.bin (r)}: Release: 
2023/05/08 16:12:41 DEBUG : testfile.bin: ReadFileHandle.Release closing
2023/05/08 16:12:41 DEBUG : &{testfile.bin (r)}: >Release: err=<nil>
2023/05/08 16:12:41 DEBUG : /: Lookup: name="testfile.bin"
2023/05/08 16:12:41 DEBUG : /: >Lookup: node=testfile.bin, err=<nil>
2023/05/08 16:12:41 DEBUG : testfile.bin: Attr: 
2023/05/08 16:12:41 DEBUG : testfile.bin: >Attr: a=valid=1s ino=0 size=965738496 mode=-rw-rw-r--, err=<nil>
2023/05/08 16:13:38 DEBUG : sftp://user@x.x.x.x:port/: Closing 2 unused connections

Test setup

I mounted the same remote directory (on the local network) with both rclone sftp and sshfs:

  1. Mount the rclone sftp directory with the above settings.
  2. Mount an sshfs directory with the following settings:
sshfs -o rw,follow_symlinks,reconnect,ServerAliveInterval=5,compression=no user@remote: /mount/sshfs
  3. Use Dolphin to copy a 1 GB file from the rclone sftp mount and record the time.
  4. Repeat the same file copy using the simple sshfs mount and record the time.
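A stopwatch-free way to time the same copies would be something like the following (paths here are placeholders, not my actual mount points):

$ time cp /mount/rclone/testfile.bin /tmp/testfile.bin
$ time cp /mount/sshfs/testfile.bin /tmp/testfile.bin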

Test Results

  • rclone sftp copies the file in 44 seconds (around 185 Mb/s)
  • sshfs copies the file in 30 seconds (around 270 Mb/s).

So sshfs is substantially faster (rclone takes about 1.5x as long) and I can't figure out why. I have tried adjusting many rclone settings, adding flags with varying values, and this has only made a marginal difference.

Different trial settings:

Here is a list of the settings I have tried to adjust:

  1. Add larger transfers to the rclone mount config (--transfers=50)
  2. Add large transfers (--transfers=200)
  3. Add very large transfers (--transfers=2000)
  4. Add the --no-checksum flag to disable checksum comparison
  5. Remove the md5 and sha1 commands from the remote config (md5sum_command = md5sum, sha1sum_command = sha1sum)
  6. Add the chunk size flag and try various values from large to small (--sftp-chunk-size)
  7. Add the per-file bandwidth limit flag and try various values from large to small (--bwlimit-file)
  8. Try all four values for the VFS cache (--vfs-cache-mode off/minimal/writes/full); a combined example follows this list
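For reference, a representative mount invocation combining several of these flags would look something like this (the values are illustrative, not the exact commands from each trial):

rclone mount remote: /local/mount \
  --transfers=50 \
  --no-checksum \
  --sftp-chunk-size 64Ki \
  --bwlimit-file 100M \
  --vfs-cache-mode off \
  -vv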

I based these trials on discussions in several related forum threads.

The results of adjusting all of these settings are disappointing. I see only minor variation, from 43 to 49 seconds (170-190 Mb/s average), to transfer the 1 GB file, but nothing approaching the simple sshfs transfer of 30 seconds (around 270 Mb/s).

This doesn't really make sense from a conceptual point of view. Isn't sshfs the more complex, heavier layer, since it adds a FUSE filesystem on top of the same SFTP protocol? Nothing I have tried gets the rclone implementation of sftp to approach the speed of the simple sshfs mount.

Final test

As a final test I set up a remote using Dolphin's sftp remote template. This also uses sftp and it eliminates the FUSE sshfs mount component of the test. The result was a transfer of 32 seconds, similar to sshfs. This indicates that the issue is not sftp versus sshfs but either rclone's implementation of sftp or a configuration problem with my remote or mount setup.

Versions of other software

Here is the sshfs version I am using:

Local:
SSHFS version 3.7.1
FUSE library version 3.10.5
using FUSE kernel interface version 7.31
fusermount3 version: 3.10.5

Local/Remote:
OpenSSH_8.9p1 Ubuntu-3, OpenSSL 3.0.2 15 Mar 2022

Analysis

I tried asking ChatGPT as well but it did not produce anything of value. Any suggestions on what I can try or what I have overlooked? I would have expected the two solutions to be in the same ballpark. I really want to use rclone's sftp implementation but I can't justify it being so much slower.

hi,
have you tested using rclone copy, not rclone mount?

can you copy a single file, testfile.bin and post the full debug output.

Hi, I just did a 4th test as you were writing your comment. You read my mind! Here are the results of doing an rclone copy:

<7>DEBUG : Using config file from "/path/rclone.conf"
<7>DEBUG : sftp://user@x.x.x.x:port/file.bin: New connection x.x.x.x:port->x.x.x.x:port to "SSH-2.0-OpenSSH_8.9p1 Ubuntu-3"
<7>DEBUG : sftp://user@x.x.x.x:port/file.bin: Shell type "unix" from config
<7>DEBUG : sftp://user@x.x.x.x:port/file.bin: Relative path resolved to "/path/file.bin"
<7>DEBUG : sftp://user@x.x.x.x:port/file.bin: Using root directory "/path"
<7>DEBUG : fs cache: adding new entry for parent of "remote:file.bin", "remote:"
<7>DEBUG : Creating backend with remote "/path"
2023-05-08 17:09:41 DEBUG : file.bin: Need to transfer - File not found at Destination
2023-05-08 17:09:41 DEBUG : file.bin: Starting multi-thread copy with 3 parts of size 307Mi
2023-05-08 17:09:41 DEBUG : file.bin: multi-thread copy: stream 2/3 (321912832-643825664) size 307Mi starting
2023-05-08 17:09:41 DEBUG : file.bin: multi-thread copy: stream 3/3 (643825664-965738496) size 307Mi starting
2023-05-08 17:09:41 DEBUG : file.bin: multi-thread copy: stream 1/3 (0-321912832) size 307Mi starting
2023-05-08 17:09:41 DEBUG : sftp://user@x.x.x.x:port/file.bin: New connection x.x.x.x:port->x.x.x.x:port to "SSH-2.0-OpenSSH_8.9p1 Ubuntu-3"
2023-05-08 17:09:41 DEBUG : sftp://user@x.x.x.x:port/file.bin: New connection x.x.x.x:port->x.x.x.x:port to "SSH-2.0-OpenSSH_8.9p1 Ubuntu-3"
2023-05-08 17:10:02 DEBUG : file.bin: multi-thread copy: stream 3/3 (643825664-965738496) size 307Mi finished
2023-05-08 17:10:06 DEBUG : file.bin: multi-thread copy: stream 1/3 (0-321912832) size 307Mi finished
2023-05-08 17:10:09 DEBUG : file.bin: multi-thread copy: stream 2/3 (321912832-643825664) size 307Mi finished
2023-05-08 17:10:09 DEBUG : file.bin: Finished multi-thread copy with 3 parts of size 307Mi
2023-05-08 17:10:09 DEBUG : sftp://user@x.x.x.x:port/file.bin: Shell path "/path/file.bin"
2023-05-08 17:10:09 DEBUG : sftp://user@x.x.x.x:port/file.bin: Running remote command: md5sum /path/file.bin
2023-05-08 17:10:14 DEBUG : sftp://user@x.x.x.x:port/file.bin: Remote command result: md5hashmd5hashmd5hashmd5hash  /path/file.bin
2023-05-08 17:10:14 DEBUG : file.bin: Parsed hash: md5hashmd5hashmd5hashmd5hash
2023-05-08 17:10:14 DEBUG : file.bin: md5 = md5hashmd5hashmd5hashmd5hash OK
2023-05-08 17:10:14 INFO  : file.bin: Multi-thread Copied (new)
Transferred:          921 MiB / 921 MiB, 100%, 23.571 MiB/s, ETA 0s
Transferred:            1 / 1, 100%
Elapsed time:        44.3s
<6>INFO  :
Transferred:          921 MiB / 921 MiB, 100%, 23.571 MiB/s, ETA 0s
Transferred:            1 / 1, 100%
Elapsed time:        44.3s

So almost exactly the same results as the above 8 tests that used the mount: 44 seconds. This eliminates the mount and various configuration options as the cause.

lol, even crapgpt cannot do that.

yes, it does, the simpler the test the better.

--- up above the testing was done with testfile.bin and now it is file.bin, are the two files the same size?
--- maybe i am confused but 17:10:14 - 17:09:41 = 0:00:33?
--- did you intend to force the md5 hash, which takes additional time?
--- for the rclone copy, can you post the full output from the terminal, including the exact command, without that other log stuff - or post the full output of an rclone debug log file.
--- just curious, did you try rclone serve sftp as the sftp server?

Here is the result of fetching the same file using the OS sftp implementation:

$ sftp -v -p remote:/path/file.bin /path

Fetching /path/file.bin to /path/file.bin
file.bin

100%  921MB  31.5MB/s   00:29

So in this case the OS sftp took 29s versus rclone's 45s, meaning rclone takes about 1.55x as long as the OS sftp. This result is consistent with all the results from the above tests.

Since both of these tests use sftp, this suggests the bottleneck isn't the sftp protocol itself.

What can account for this substantial decrease in performance?

It's the same file yes, I just redacted the name differently with a find and replace.

Interesting! I did a straight copy of the debug output from the terminal. I only redacted the IP addresses, credentials etc. using a universal find and replace.

I see the hash starts at 17:10:09 and ends at 17:10:14, taking 5 seconds. The hash ran because I have the md5sum command as part of the remote config.

Here is the command:
rclone copy remote:file.bin /path --progress -vv
I copied and pasted the output from that command from the terminal and used a find and replace to change path names, login credentials and IP addresses. No other editing by hand.

No I haven't tried this. I am using the remote settings from the top of the post:

type = sftp
shell_type = unix

I will run another test and look at the times reported by the log versus the times reported by the progress output at the end. However, in the previous tests I was timing manually: pressing start when I copied the file and stop as soon as Dolphin told me the file copy had completed.

just copy/paste the full rclone debug output, not snippets. the full debug output and the exact command.
or better yet, post the full output of a rclone debug log.

might be best to disable checksums unless whatever you are comparing against also calculates the hash.
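for example, something like this should skip the post-copy hash check (illustrative - double-check the flag docs for your version):

rclone copy remote:file.bin /path --ignore-checksum --progress -vv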

17:10:14 - 17:09:41 = 0:00:33
0:00:33 - 0:00:05 = 0:00:28
do you agree?

Yes, I am already on it, because the OS's sftp is not doing a post-transfer checksum.

Yes this is what I did. I started at the top of the terminal with the mouse and copied the entire operation to kate, then I used a find and replace function to redact three pieces of information: IP addresses, file paths and login credentials.

I will run another test without the checksum and post the full results including the command to ensure we are comparing apples to apples.

I do, but I cannot account for the discrepancy between the file transfer times in the log and the time rclone reports the file transfer is complete.

I think this is fixed in v1.62.2 - I suggest you give that a go.

See

Hi @ncw thanks for the reply. The issue you linked mentions uploads and crypted remotes, would it affect remotes that are not wrapped in a crypt? In this case I am transferring (downloading) files from a sftp remote only.

This brings up a question however, I assume that an rclone sftp transfer would already be encrypted over the ssh tunnel which is configured in the rclone remote config file. In this case would using a crypt wrapper be redundant?

I have downloaded 1.62.2 as mentioned and it did not improve the performance, same results as the test conducted with the previous version.

Since my last post I have setup and run many tests and have a few conceptual questions. Here is what I found so far.

  1. rclone's sftp performance (doing an rclone copy from an sftp remote, no crypt) is close to the OS sftp performance. Although a few seconds slower, it is roughly 30 seconds to transfer a 1 GB file in this environment. So the major issue is not the sftp implementation. Thanks to asdffdsa for helping to peel away the layers and clarify this!

  2. The checksum at the end of the operation does add 3-5 seconds to the operation. It also seems to affect the accuracy of the --progress indicator. I assume this hash is an expensive operation so this can be accounted for.

Question 1:
Is it safe to turn off the checksum and realize the improved performance when using an sftp remote?

According to a user in this thread about sftp and checksums:

With the SFTP, running over an encrypted SSH session, there's negligible chance the file contents could get corrupted while transferring. The SSH itself does data integrity verification.

but then he goes on to say:

While there's the check-file extension to the SFTP protocol to calculate a remote file checksum, it's not widely supported. Particularly it's not supported by the most widespread SFTP server implementation, the OpenSSH.

It appears I am using OpenSSH, so I am likely not using the mentioned sftp file-integrity extension.

Question 2:
I see that rclone sftp is splitting the file into 3 parts and doing a "multi-thread copy". Is this configurable? I tried adding --transfers=10 but it did not change it. Does the multi-thread copy increase the chances of corruption, thus necessitating the end checksum that would otherwise not be needed with a straight sftp transfer?

Question 3:
It appears from the results of all the testing that the mount is the root cause of the increased transfer time. If I mount the same sftp remote and do a cp from the mounted remote to the local disk (rather than an rclone copy), the same file takes roughly 45 seconds (versus 31 seconds). When I look at the debug log, the chunked reader is involved now.

So the question: is it expected that using an sftp remote with rclone mount adds an additional 13-15 seconds (about 1.4-1.5x the time) to a simple copy operation, in this case a single 1 GB file across a LAN? If so, what could account for this? Is it the additional overhead of transferring the file via sftp and then reading it again with the chunked reader?

I noticed there were 1729 instances of 'file.bin: waiting for in-sequence read to xxxxxxxx for 20ms' in the debug log.

1729 x 20ms = 34.6 seconds in the worst case. Many of the reads presumably arrive before the timeout, but even a fraction of these waits hitting the full 20ms could account for a large portion of the difference.
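(I counted these with a quick grep over the saved debug output; the log file name here is just what I saved it as:)

$ grep -c 'waiting for in-sequence read' rclone-mount.log
1729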

I have tried many different settings and flags but nothing makes a substantial difference.

Whatever the cause, mounting the same directory with sshfs does not incur the same penalty, even though both use FUSE.

Summary
So in summary, after another 50 test transfers, to transfer a 1 GB file over a LAN (with no end checksums and no caching):

  • sftp (Ubuntu jammy OS implementation): ~30 seconds
  • sftp (rclone implementation: rclone copy from an sftp remote): ~32 seconds
  • sshfs (FUSE mount using all sshd defaults and -o compression=no): ~31 seconds
  • rclone mount (sftp remote, copying with cp): ~44 seconds

Since a mounted sshfs is a similar solution to an rclone mount, unless there is a way to increase the performance of the rclone mount, it makes sense to use the sshfs solution. I would strongly prefer to use the rclone solution if there is any chance it can be tuned to be as performant as an sshfs mount.
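If there is room for tuning, these are the kinds of read-side mount flags I would experiment with next; a sketch with illustrative values:

rclone mount remote: /local/mount \
  --buffer-size 64Mi \
  --vfs-read-chunk-size 256Mi \
  --vfs-read-chunk-size-limit off \
  -vv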

It affects remotes which send chunks that are shorter than the SFTP block size. So it could do but probably not.

The data is encrypted in transit by ssh. crypt will ensure your data is encrypted at rest on the remote server which may/may not be what you want.
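If you did want encryption at rest, a minimal crypt wrapper in the config looks something like this (the names and wrapped path are placeholders):

[remote-crypt]
type = crypt
remote = remote:encrypted
password = *** ENCRYPTED ***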

:frowning:

Great!

Yes the checksum does take some time.

It is an end to end verification. The file is unlikely to get corrupted when transferring but it could get corrupted in a multitude of other places (eg bit flip in ram, disk drive corrupts it, bug in the transfer code, etc, etc).

It depends on your tolerance for corrupted files. Rclone always defaults to maximum protection for your data!

I've not seen that extension before. Looks very useful. I think you'd find it was no faster than shelling out to md5sum which is what rclone does though.

  --multi-thread-cutoff SizeSuffix   Use multi-thread downloads for files above this size (default 250Mi)
  --multi-thread-streams int         Max number of streams to use for multi-thread downloads (default 4)
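So for example, to change the split (values illustrative):

rclone copy remote:file.bin /path --multi-thread-streams 8 --multi-thread-cutoff 100Mi --progress -vv

Your ~921 MiB file came out as 3 parts with the defaults because rclone appears to pick roughly size/cutoff streams, capped at --multi-thread-streams.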

The overhead here is the fact that your data has to loop in and out of the kernel. I think it is unlikely to be the chunkreader.

That's a consequence of the kernel sending multiple read requests to rclone and them arriving out of order. With --vfs-cache-mode off (the default) the requests can only be serviced in order so rclone waits for up to 20ms for it to arrive.

The fact the messages arrive out of order is very annoying and I tried to convince the author of the upstream library that it was a bug with no luck!

You can try disabling this with --async-read=false which might help. Though it will slow other stuff down!
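e.g. something like:

rclone mount remote: /local/mount --async-read=false -vv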

Note that if the read arrives sooner it won't wait 20ms, and if it does wait the full 20ms then it will seek the file. This really does slow down the performance a lot.

Are you seeing logs which say failed to wait for in-sequence or aborting in-sequence?
