SFTP backend against Synology NAS results in chaos

Synology support assisted in getting to the bottom of this.

If the pathname component is missing a leading "/" (as in sftpnas:home rather than sftpnas:/home), a bug somewhere in the SSHD programme corrupts memory. Eventually the malloc/free code raises an abort and SSHD segfaults. This causes rclone to report spurious failures (EOF for various operations), and the retry count keeps going up. That suggests there are also problems in the rclone retry code, as it often eventually hangs, even when asked for just one retry.

The simple workaround is to always use a leading "/". Then it works more or less flawlessly, other than the spurious "cannot have control chars in filenames" bug. Oh, and I still haven't figured out whether it should be able to do a checksum or not.

Anyway, Synology are unlikely to be able to find the malloc issue themselves (though I think there are some obvious places to look, given that it is sensitive to whether the path starts with "/" or not), and even if they do, they will have to report it to whoever maintains the library or sshd programme. So I think it would be good if rclone had a warning somewhere about using a leading "/" with Synology. Perhaps add an "is it Synology?" question, or warn whenever there is no leading "/" and the backend is SFTP?
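To make the suggested warning concrete, here is a minimal sketch in Python. It is not rclone code: the function names and the is_synology flag are hypothetical, and it only illustrates the check "SFTP remote flagged as Synology, path without a leading /".

```python
from typing import Optional, Tuple


def split_remote(spec: str) -> Tuple[str, str]:
    """Split an rclone-style 'remote:path' spec into (remote, path)."""
    remote, sep, path = spec.partition(":")
    if not sep:
        raise ValueError(f"not a remote:path spec: {spec!r}")
    return remote, path


def leading_slash_warning(spec: str, is_synology: bool) -> Optional[str]:
    """Return a warning string when a Synology path lacks a leading '/'.

    is_synology is a hypothetical flag standing in for the proposed
    "is it Synology?" config question; it does not exist in rclone.
    """
    remote, path = split_remote(spec)
    if is_synology and not path.startswith("/"):
        return (f"{remote}: path {path!r} has no leading '/'; "
                "Synology's sshd may crash on relative paths")
    return None


if __name__ == "__main__":
    for spec in ("sftpnas:home", "sftpnas:/home", "sftpnas:"):
        print(spec, "->", leading_slash_warning(spec, is_synology=True))
```

The check is deliberately limited to remotes flagged as Synology, since other servers may behave differently with respect to the leading /.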

Not sure, but I went back and forth with Synology for several weeks until I finally got them to run rclone. They came back with "works for me", and I was able to note that they had a leading "/" whereas I did not.

Also, given that this is most likely a common Linux-based SSHD issue, I would expect others to have issues with SFTP.

Note that an initial / selects the root of the file system, whereas leaving it out selects the user's home directory.

The sftp backend gets quite heavily used, so I think this is probably a Synology-specific problem.

Note that it does say this in the docs:

Paths are specified as remote:path. If the path does not begin with a / it is relative to the home directory of the user. An empty path remote: refers to the user’s home directory.

Note that some SFTP servers will need the leading / - Synology is a good example of this. rsync.net, on the other hand, requires users to OMIT the leading /.
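As a rough illustration of the rule quoted from the docs, the sketch below resolves a path either from the root or relative to a home directory. The function name and the example home directories are assumptions made for illustration; real servers do this resolution themselves, and what the home directory actually is depends on the server's configuration.

```python
import posixpath


def resolve_sftp_path(path: str, home: str) -> str:
    """Resolve an SFTP path: a leading '/' means the file system root,
    anything else (including the empty path) is relative to home."""
    if path.startswith("/"):
        return posixpath.normpath(path)
    return posixpath.normpath(posixpath.join(home, path))


if __name__ == "__main__":
    # Hypothetical home directories, chosen only for illustration.
    for home in ("/", "/srv/users/alice"):
        for path in ("", "/", "home", "/home"):
            print(f"home={home!r} path={path!r} -> "
                  f"{resolve_sftp_path(path, home)!r}")
```

When the session's home directory is itself /, the relative and absolute forms resolve to the same directory, so both spellings of a remote would list the same contents.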

My experience is that both refer to the same SFTP target dir, but one is rife with SSHD segfaults. Note that it seems to take a few command/response cycles for the SSHD instance to run into the malloc arena corruption and crash, so I do not see it with these simple lsd examples.

lust% rclone lsd nas:/ -vv
2019/08/29 08:56:05 DEBUG : rclone: Version "v1.48.0" starting with parameters ["rclone" "lsd" "nas:/" "-vv"]
2019/08/29 08:56:06 DEBUG : sftp://nimda@192.168.3.33:22//: New connection 192.168.3.2:62143->192.168.3.33:22 to "SSH-2.0-OpenSSH_7.4"
          -1 2019-08-21 12:13:12        -1 NetBackup
          -1 2019-08-28 23:13:22        -1 TimeMachine
          -1 2019-08-28 17:36:46        -1 home
          -1 2019-08-20 23:28:42        -1 homes
          -1 2019-02-27 11:14:57        -1 shared
2019/08/29 08:56:06 DEBUG : 13 go routines active
2019/08/29 08:56:06 DEBUG : rclone: Version "v1.48.0" finishing with parameters ["rclone" "lsd" "nas:/" "-vv"]
lust% rclone lsd nas: -vv
2019/08/29 08:56:53 DEBUG : rclone: Version "v1.48.0" starting with parameters ["rclone" "lsd" "nas:" "-vv"]
2019/08/29 08:56:53 DEBUG : sftp://nimda@192.168.3.33:22/: New connection 192.168.3.2:62154->192.168.3.33:22 to "SSH-2.0-OpenSSH_7.4"
          -1 2019-08-21 12:13:12        -1 NetBackup
          -1 2019-08-28 23:13:22        -1 TimeMachine
          -1 2019-08-28 17:36:46        -1 home
          -1 2019-08-20 23:28:42        -1 homes
          -1 2019-02-27 11:14:57        -1 shared
2019/08/29 08:56:54 DEBUG : 13 go routines active
2019/08/29 08:56:54 DEBUG : rclone: Version "v1.48.0" finishing with parameters ["rclone" "lsd" "nas:" "-vv"]
lust% 

This is what I see on the Synology:

[  200.010078] traps: sshd[12215] general protection ip:7f1cdd0b4d8d sp:7ffd485f5610 error:0 in libc-2.20-2014.11.so[7f1cdd03d000+19b000]

A core dump of sshd shows up somewhere, and this formatted error string can be found within:

*** Error in `sshd: nimda@internal-sftp': free(): invalid next size (fast): 0x000055e09d90de80 ***

Well, that is a definite bug in the Synology sshd implementation! It is likely exploitable too, so it really needs fixing.

I think the moral of the story is always use a leading / like it says in the docs.

Synology has agreed that it is a bug and is allegedly working on a fix.

In the interim, they (also) suggest using the leading "/".
