Do we have SFTP/SSH keepalive capability?

I'm trying to transfer files to an SFTP target and the remote keeps dropping the connection after about an hour, even with --timeout=0, so I'm pretty sure it's the remote side that is dropping it (I manage the server for them, but it's their network). It's a little odd that they're dropping an SSH session with an active file transfer going on, but they are. So do we have anything like this capability?

if that sftp server kills an active file transfer, not sure what rclone can do about that.
tho i could be wrong.

have you tried to tweak --sftp-idle-timeout?

and would need to see a rclone debug log.

Yeah, it's not idle but I did set sftp-idle-timeout=0. I'm trying to rig up openssh itself to send keepalive packets, there's a setting for it. Not sure if it'll help.

Here's some output when it dropped:

2022-10-14 16:33:08 ERROR : sftp://root@10.5.17.74:22//NL10/: Discarding closed SSH connection: read tcp 10.200.7.117:46182->10.5.17.74:22: i/o timeout
2022-10-14 16:33:08 ERROR : sftp://root@10.5.17.74:22//NL10/: Discarding closed SSH connection: read tcp 10.200.7.117:46178->10.5.17.74:22: i/o timeout
2022-10-14 16:33:08 ERROR : sftp://root@10.5.17.74:22//NL10/: Discarding closed SSH connection: read tcp 10.200.7.117:46180->10.5.17.74:22: i/o timeout
2022-10-14 16:33:08 ERROR : 9R2/NL10_ACES_V26E09R2.db: Failed to copy: Update ReadFrom failed: sftp: "Failure" (SSH_FX_FAILURE)
2022-10-14 16:33:08 ERROR : 9R1/NL10_ACES_V26E09R1.db: Failed to copy: Update Create failed: sftp: "Failure" (SSH_FX_FAILURE)
2022-10-14 16:33:08 ERROR : 10R2/NL10_ACES_V26E10R2.db: Failed to copy: Update Create failed: sftp: "Failure" (SSH_FX_FAILURE)
2022-10-14 16:33:08 ERROR : 10R1/NL10_ACES_V26E10R1.db: Failed to copy: Update Create failed: sftp: "Failure" (SSH_FX_FAILURE)
2022-10-14 16:33:08 ERROR : sftp://root@10.5.17.74:22//NL10/: not deleting files as there were IO errors
2022-10-14 16:33:08 ERROR : sftp://root@10.5.17.74:22//NL10/: not deleting directories as there were IO errors
2022-10-14 16:33:08 ERROR : Attempt 1/3 failed with 4 errors and: Update Create failed: sftp: "Failure" (SSH_FX_FAILURE)
2022-10-14 16:33:09 ERROR : 9R2/NL10_ACES_V26E09R2.db: Failed to copy: Update Create failed: sftp: "Failure" (SSH_FX_FAILURE)
2022-10-14 16:33:09 ERROR : 10R1/NL10_ACES_V26E10R1.db: Failed to copy: Update Create failed: sftp: "Failure" (SSH_FX_FAILURE)
2022-10-14 16:33:09 ERROR : 9R1/NL10_ACES_V26E09R1.db: Failed to copy: Update Create failed: sftp: "Failure" (SSH_FX_FAILURE)
2022-10-14 16:33:09 ERROR : 10R2/NL10_ACES_V26E10R2.db: Failed to copy: Update Create failed: sftp: "Failure" (SSH_FX_FAILURE)
2022-10-14 16:33:09 ERROR : sftp://root@10.5.17.74:22//NL10/: not deleting files as there were IO errors
2022-10-14 16:33:09 ERROR : sftp://root@10.5.17.74:22//NL10/: not deleting directories as there were IO errors
2022-10-14 16:33:09 ERROR : Attempt 2/3 failed with 4 errors and: Update Create failed: sftp: "Failure" (SSH_FX_FAILURE)
2022-10-14 16:33:10 ERROR : 10R1/NL10_ACES_V26E10R1.db: Failed to copy: Update Create failed: sftp: "Failure" (SSH_FX_FAILURE)
2022-10-14 16:33:10 ERROR : 9R2/NL10_ACES_V26E09R2.db: Failed to copy: Update Create failed: sftp: "Failure" (SSH_FX_FAILURE)
2022-10-14 16:33:10 ERROR : 9R1/NL10_ACES_V26E09R1.db: Failed to copy: Update Create failed: sftp: "Failure" (SSH_FX_FAILURE)
2022-10-14 16:33:10 ERROR : 10R2/NL10_ACES_V26E10R2.db: Failed to copy: Update Create failed: sftp: "Failure" (SSH_FX_FAILURE)
2022-10-14 16:33:10 ERROR : sftp://root@10.5.17.74:22//NL10/: not deleting files as there were IO errors
2022-10-14 16:33:10 ERROR : sftp://root@10.5.17.74:22//NL10/: not deleting directories as there were IO errors
2022-10-14 16:33:10 ERROR : Attempt 3/3 failed with 4 errors and: Update Create failed: sftp: "Failure" (SSH_FX_FAILURE)

when you posted, there was a template of questions to be answered,
including the exact command, a full debug log (or at least the top 20 lines) and the output of rclone version.

fwiw, rclone can emulate an sftp server
rclone serve sftp remote:

if that does not work, then hopefully, we have enough detailed info for someone else to look at.

This was originally in features, not help and support. It got moved.

rclone v1.59.2
- os/version: rocky 8.5 (64 bit)
- os/kernel: 4.18.0-348.2.1.el8_5.x86_64 (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.18.6
- go/linking: static
- go/tags: none

Command:

/usr/bin/rclone sync /Array/datasync/NL10/ twhi3pss04:/NL10/ -P --transfers=1 --sftp-idle-timeout=0

I'm limited in the direction I can go here. I can push out to the other server, but I can't run rclone in the other direction, they aren't allowing it.

The openssh keepalive method didn't work. I think they're doing this at the network level, but scp seems to be working. What other subsystems does --sftp-subsystem support? It won't take "scp" as an option.
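
For context on the subsystem question: as far as I know, --sftp-subsystem only names an SSH subsystem that the server's sshd has to define (the default is "sftp"), and scp is not a subsystem; against an OpenSSH 8.0 server it runs as a remote command instead. A minimal sketch for listing what a server offers, assuming its config is at /etc/ssh/sshd_config. The function name list_subsystems and the sample file are made up for illustration:

```shell
# List the Subsystem lines from an sshd_config-style file.
# Usage on the server: list_subsystems /etc/ssh/sshd_config
list_subsystems() {
    grep -Ei '^[[:space:]]*Subsystem[[:space:]]' "$1"
}

# Demo against a throwaway file so nothing real is touched.
# The sftp-server path below is only an example value.
sample=$(mktemp)
cat > "$sample" <<'EOF'
#Subsystem example-commented-out /bin/false
Subsystem sftp /usr/libexec/openssh/sftp-server
EOF
list_subsystems "$sample"
rm -f "$sample"
```

On a stock OpenSSH install this usually shows a single sftp entry, which would explain why "scp" is not accepted as a value.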

I'll work on getting a more extensive log but I'm not sure if it'll be all that helpful. Here's what I have so far; the timeout will take at least 30-45 minutes to show up.

[root@acesstage02-cv2 ESACES.jb]# /usr/bin/rclone sync /Array/datasync/NL10/ twhi3pss04:/NL10/ -P --transfers=1 --sftp-idle-timeout=0 -vv
2022/10/14 18:46:39 DEBUG : rclone: Version "v1.59.2" starting with parameters ["/usr/bin/rclone" "sync" "/Array/datasync/NL10/" "twhi3pss04:/NL10/" "-P" "--transfers=1" "--sftp-idle-timeout=0" "-vv"]
2022/10/14 18:46:39 DEBUG : Creating backend with remote "/Array/datasync/NL10/"
2022/10/14 18:46:39 DEBUG : Using config file from "/root/.config/rclone/rclone.conf"
2022/10/14 18:46:39 DEBUG : Creating backend with remote "twhi3pss04:/NL10/"
2022/10/14 18:46:39 DEBUG : twhi3pss04: detected overridden config - adding "{IPeaW}" suffix to name
2022/10/14 18:46:41 DEBUG : sftp://root@10.5.17.74:22//NL10/: New connection 10.200.7.117:48582->10.5.17.74:22 to "SSH-2.0-OpenSSH_8.0"
2022/10/14 18:46:42 DEBUG : sftp://root@10.5.17.74:22//NL10/: Shell type "unix" from config
2022/10/14 18:46:43 DEBUG : sftp://root@10.5.17.74:22//NL10/: Using root directory "/NL10/"
2022/10/14 18:46:43 DEBUG : fs cache: renaming cache item "twhi3pss04:/NL10/" to be canonical "twhi3pss04{IPeaW}:/NL10/"
2022-10-14 18:46:46 DEBUG : sftp://root@10.5.17.74:22//NL10/: New connection 10.200.7.117:48584->10.5.17.74:22 to "SSH-2.0-OpenSSH_8.0"
2022-10-14 18:46:46 DEBUG : sftp://root@10.5.17.74:22//NL10/: New connection 10.200.7.117:48586->10.5.17.74:22 to "SSH-2.0-OpenSSH_8.0"
2022-10-14 18:46:46 DEBUG : sftp://root@10.5.17.74:22//NL10/: New connection 10.200.7.117:48588->10.5.17.74:22 to "SSH-2.0-OpenSSH_8.0"
2022-10-14 18:46:53 DEBUG : sftp://root@10.5.17.74:22//NL10/: Waiting for checks to finish
2022-10-14 18:46:53 DEBUG : sftp://root@10.5.17.74:22//NL10/: Waiting for transfers to finish

This might also just not be something rclone can solve as-is, I just prefer this tool over the other options.

as i never had an issue with any of the sftp servers i use on a daily basis, not sure what the exact issue here is.

yes, at this point, i agree.

tho, would be helpful if you can confirm that.

welcome to the fellowship of rcloners...
i am sure someone more experienced will stop by soon.

A keep alive is generally for a connection that isn't doing anything so it sends a 'beep' to keep the connection open every so often.

If you have a live transfer getting killed, a keepalive won't do much there.
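
For reference, a sketch of how to see which OpenSSH keepalive settings are active on the server side, since the server here is under the poster's control. ClientAliveInterval, ClientAliveCountMax and TCPKeepAlive are real sshd_config directives; the function name show_keepalive, the sample file and its values are assumptions for illustration. As far as I know, rclone uses its own built-in SSH client and does not read ~/.ssh/config, so a client-side ServerAliveInterval set there would not apply to it.

```shell
# Print the keepalive-related directives from an sshd_config-style file.
# Usage on the server: show_keepalive /etc/ssh/sshd_config
show_keepalive() {
    grep -Ei '^[[:space:]]*(ClientAliveInterval|ClientAliveCountMax|TCPKeepAlive)[[:space:]]' "$1" \
        || echo "no keepalive directives set in $1 (sshd defaults apply)"
}

# Demo against a throwaway file; the values are illustrative only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# sample sshd_config fragment
ClientAliveInterval 60
ClientAliveCountMax 3
EOF
show_keepalive "$sample"
rm -f "$sample"
```

This only confirms what is configured. As said above, it is unlikely to matter for a connection that is actively transferring data.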

Those messages "look" like the connection from source to destination doesn't work anymore. The other side closing the connection would look different.

I see two potential issues:

2022-10-14 16:33:08 ERROR : sftp://root@10.5.17.74:22//NL10/: Discarding closed SSH connection: read tcp 10.200.7.117:46182->10.5.17.74:22: i/o timeout
2022-10-14 16:33:08 ERROR : sftp://root@10.5.17.74:22//NL10/: Discarding closed SSH connection: read tcp 10.200.7.117:46178->10.5.17.74:22: i/o timeout
2022-10-14 16:33:08 ERROR : sftp://root@10.5.17.74:22//NL10/: Discarding closed SSH connection: read tcp 10.200.7.117:46180->10.5.17.74:22: i/o timeout

These are actually DEBUG messages and not really ERRORs. They just tell you that rclone is taking some unused and expired SFTP connections out of the connection pool.

This process, however, isn't 100% robust (there is a potential race condition), so it is best to avoid these messages or keep them to a minimum.

I guess it happens because your sync starts out by using all the available --checkers (default is 8), which take an SFTP connection each. Later, when the checks are finished, the connections are no longer needed and start to fall out of the pool as the SFTP server cleans up the unused connections. I see this on my Synology.

You can probably work around this by lowering --checkers. I think --checkers=2 would be fine when you have --transfers=1.
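
To check whether lowering --checkers changes anything, one option is to count the connection events in a -vv log before and after. A rough sketch, assuming the log wording matches the lines posted earlier in this thread; the function name connection_stats is made up, and the sample lines are copied from the two logs above and mixed together only to exercise it:

```shell
# Count SFTP connection events in an rclone -vv log.
# Usage: connection_stats rclone.log
connection_stats() {
    awk '
        /: New connection /                { opened++ }
        /Discarding closed SSH connection/ { discarded++ }
        /Failed to copy/                   { failed++ }
        END {
            printf "opened=%d discarded=%d failed_copies=%d\n", opened, discarded, failed
        }
    ' "$1"
}

# Demo with a few lines taken from the logs in this thread.
sample=$(mktemp)
cat > "$sample" <<'EOF'
2022/10/14 18:46:41 DEBUG : sftp://root@10.5.17.74:22//NL10/: New connection 10.200.7.117:48582->10.5.17.74:22 to "SSH-2.0-OpenSSH_8.0"
2022-10-14 18:46:46 DEBUG : sftp://root@10.5.17.74:22//NL10/: New connection 10.200.7.117:48584->10.5.17.74:22 to "SSH-2.0-OpenSSH_8.0"
2022-10-14 16:33:08 ERROR : sftp://root@10.5.17.74:22//NL10/: Discarding closed SSH connection: read tcp 10.200.7.117:46182->10.5.17.74:22: i/o timeout
2022-10-14 16:33:08 ERROR : 9R2/NL10_ACES_V26E09R2.db: Failed to copy: Update ReadFrom failed: sftp: "Failure" (SSH_FX_FAILURE)
EOF
connection_stats "$sample"
rm -f "$sample"
```

Writing each run to its own file with --log-file and comparing the "discarded" count between the default and --checkers=2 should show whether the pool cleanup is involved.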

The SSH_FX_FAILURE is a broad message from the SFTP server that basically means that you cannot create a file (at the given location). This is typically due to missing access rights or a lack of disk space.
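
Both causes can be checked directly on the server. A minimal sketch, meant to be run there as the same user rclone logs in as; the function name check_target_dir and the probe file name are made up for illustration:

```shell
# Check free space and write access for a directory.
# Meant to be run on the SFTP server, as the user rclone logs in as.
check_target_dir() {
    dir=$1
    # Show free space for the filesystem holding the directory
    df -P "$dir" || return 1
    # Try to create and remove a scratch file, roughly what an upload needs
    probe="$dir/.write-probe.$$"
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        echo "write OK: $dir"
    else
        echo "cannot create files in $dir" >&2
        return 1
    fi
}

# Demo on a temporary directory; in this thread the directory would be /NL10
demo=$(mktemp -d)
check_target_dir "$demo"
rmdir "$demo"
```

Note that root bypasses ordinary permission bits, so for a root login the free-space figure is the more interesting part of the output.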

Here are a few recent threads with hints:
https://forum.rclone.org/t/ssh-fx-failure-for-normal-usage/32836
https://forum.rclone.org/t/error-with-sftp-remote-mkdir/33499
