SFTP – FolderName: error reading destination directory: List failed: dirExists: couldn't connect SSH: ssh: handshake failed: EOF

Hi,

when syncing from local to remote VPS thru SFTP, I've got the following error message:

ERROR : FolderName: error reading destination directory: List failed: dirExists: couldn\'t connect SSH: ssh: handshake failed: EOF

Watching live /var/log/auth.log on the remove server, I see that rclone does tens of logins to sync my numerous folders. I thought that rclone would only login once to the server, then perform syncing afterwards.

Does the error relate to some kind of strict sshd_config I put in place?

MACs umac-128-etm@openssh.com,umac-128@openssh.com,hmac-sha2-256-etm@openssh.com

MaxSessions 3
MaxStartups 2

AllowTcpForwarding no
AllowUsers toto
MaxAuthTries 6

# Logging
SyslogFacility AUTH
LogLevel INFO

# Authentication:
LoginGraceTime 120
PermitRootLogin no
StrictModes yes

What rclone version?

Can you run the same command you are testing with -vv on it and share the output?

Thank you for your time.

Version:

rclone v1.47.0
- os/arch: darwin/amd64
- go version: go1.12.4

Ridiculously, it does not happen every time, but it does happen often. What I currently have on the VPS in terms of security are Fail2ban, psad and enabled UFW. None of them blocked the originating IP address from which rclone does the job.

Can you run the same command you are testing with -vv on it and share the output?

Yes, I will add this option to my cron job now. As the error cannot be replicated every time, I shall wait until it happens again before sending the log file. See you soon.

Also, I have updated the rule in UFW to ALLOW instead of LIMIT via SSH from a specific IP that I use. Indeed, LIMIT allows the connection, but an IP address trying to initiate 6 or more connections within 30 seconds will be denied.

Maybe, it'd be a nice idea to put this requirement in the doc. Please, see this solved issue. Thank you.

However, the updated UFW rule has nothing to do to with this current raised issue.

I forgot to say that I also use crypt on top.

By all means, for any doc updates, you can submit a pull request with the update :slight_smile:

After using option -vv, here's the digested version of the log file:

2019/05/31 20:01:45 DEBUG : rclone: Version "v1.47.0" starting with parameters ["rclone" "-vv" "--log-file=/tmp/rclone.log" "--filter-from" "/usr/local/etc/rclone/filter_mbp.txt" "sync" "/Users/N" "Safe:/MBP/" "--backup-dir" "Safe:/OLD/MBP/2019-05-31•20:00:00"]
2019/05/31 20:01:45 DEBUG : Using config file from "/Users/N/.config/rclone/rclone.conf"
2019/05/31 20:01:46 DEBUG : .zshrc-e: Excluded
2019/05/31 20:01:46 DEBUG : .nnnCHD4wZ: Excluded
2019/05/31 20:01:46 DEBUG : .gnuplot_history: Excluded
2019/05/31 20:01:46 DEBUG : Music: Excluded
2019/05/31 20:01:46 DEBUG : .keychain: Excluded
2019/05/31 20:01:46 DEBUG : .nnnPdJ43L: Excluded
2019/05/31 20:01:46 DEBUG : .bin: Excluded
2019/05/31 20:01:46 DEBUG : .docker: Excluded
2019/05/31 20:01:46 DEBUG : .distlib: Excluded
2019/05/31 20:01:46 DEBUG : perl5: Excluded
2019/05/31 20:01:46 DEBUG : Selected Pane: Excluded
2019/05/31 20:01:46 DEBUG : .archey-ip: Excluded
2019/05/31 20:01:46 DEBUG : .cshrc.pysave: Excluded
…
2019/05/31 20:01:46 DEBUG : .gitconfig: Size and modification time the same (differ by 0s, within tolerance 1s)
2019/05/31 20:01:46 DEBUG : .gitconfig: Unchanged skipping
2019/05/31 20:01:46 DEBUG : .tmux.conf: Size and modification time the same (differ by -675.771509ms, within tolerance 1s)
2019/05/31 20:01:46 DEBUG : .tmux.conf: Unchanged skipping
2019/05/31 20:01:46 DEBUG : .vimrc: Size and modification time the same (differ by 0s, within tolerance 1s)
2019/05/31 20:01:46 DEBUG : .vimrc: Unchanged skipping
2019/05/31 20:01:46 DEBUG : .zshrc: Size and modification time the same (differ by -626.490585ms, within tolerance 1s)
…
2019/05/31 20:01:46 DEBUG : .config/configstore: Excluded
2019/05/31 20:01:46 NOTICE: .ssh/socket-nfa@192.168.178.12:22: Cant transfer non file/directory
2019/05/31 20:01:46 NOTICE: .ssh/ssh-agent.sock: Cant transfer non file/directory
2019/05/31 20:01:46 DEBUG : .config/dconf: Excluded
2019/05/31 20:01:46 DEBUG : .ssh/.DS_Store: Excluded
…
…
…
2019/05/31 20:01:47 ERROR : .config: error reading destination directory: List failed: dirExists: couldnt connect SSH: ssh: handshake failed: read tcp 10.95.20.109:61067->191.03.60.45:9783: read: connection reset by peer
…
…
…
2019/05/31 20:01:47 DEBUG : ABI/Allowed JS.txt: Size and modification time the same (differ by -693.261768ms, within tolerance 1s)
2019/05/31 20:01:47 DEBUG : ABI/Critical Letter Set.txt: Size and modification time the same (differ by 0s, within tolerance 1s)
2019/05/31 20:01:47 DEBUG : ABI/Critical Letter Set.txt: Unchanged skipping
2019/05/31 20:01:47 DEBUG : ABI/.DS_Store: Size and modification time the same (differ by -72.790794ms, within tolerance 1s)
2019/05/31 20:01:47 DEBUG : ABI/.DS_Store: Unchanged skipping
2019/05/31 20:01:47 DEBUG : ABI/Anciens noms génériques.md: Size and modification time the same (differ by 0s, within tolerance 1s)
2019/05/31 20:01:47 DEBUG : ABI/Anciens noms génériques.md: Unchanged skipping
…
…
…
2019/05/31 20:02:27 ERROR : Encrypted drive 'Safe:/MBP/': not deleting files as there were IO errors
2019/05/31 20:02:27 ERROR : Encrypted drive 'Safe:/MBP/': not deleting directories as there were IO errors
2019/05/31 20:02:27 ERROR : Attempt 1/3 failed with 2 errors and: not deleting files as there were IO errors
…
…
…
Transferred:   	   35.541M / 35.541 MBytes, 100%, 639.818 kBytes/s, ETA 0s
Errors:                 0
Checks:              9546 / 9546, 100%
Transferred:            4 / 4, 100%
Elapsed time:       56.8s

2019/05/31 20:02:42 DEBUG : 138 go routines active
2019/05/31 20:02:42 DEBUG : rclone: Version "v1.47.0" finishing with parameters ["rclone" "-vv" "--log-file=/tmp/rclone.log" "--filter-from" "/usr/local/etc/rclone/filter_mbp.txt" "sync" "/Users/N" "Safe:/MBP/" "--backup-dir" "Safe:/OLD/MBP/2019-05-31•20:00:00"]
…
2019/05/31 20:02:42 DEBUG : rclone: Version "v1.47.0" starting with parameters ["rclone" "-Lvv" "--log-file=/tmp/rclone.log" "--filter-from" "/usr/local/etc/rclone/filter_etc.txt" "sync" "/usr/local/etc/" "Safe:/ETC/" "--backup-dir" "Safe:/OLD/ETC/2019-05-31•20:00:00"]
2019/05/31 20:02:42 DEBUG : Using config file from "/Users/Nathalie/.config/rclone/rclone.conf"
2019/05/31 20:02:43 DEBUG : bash_completion.d: Excluded
2019/05/31 20:02:43 INFO  : Privoxy.7z: Moved (server side)
2019/05/31 20:02:43 INFO  : brew_list: Moved (server side)
…
…
…
2019/05/31 20:02:44 ERROR : rclone: error reading destination directory: List failed: dirExists: couldnt connect SSH: ssh: handshake failed: EOF
2019/05/31 20:02:44 DEBUG : BitBar (off)/DNS_Status.30s.sh: Size and modification time the same (differ by 0s, within tolerance 1s)
2019/05/31 20:02:44 DEBUG : BitBar (off)/DNS_Status.30s.sh: Unchanged skipping
2019/05/31 20:02:44 INFO  : brew_list: Copied (new)
…
Transferred:   	    4.709M / 4.709 MBytes, 100%, 676.694 kBytes/s, ETA 0s
Errors:                 0
Checks:                54 / 54, 100%
Transferred:            2 / 2, 100%
Elapsed time:        7.1s

2019/05/31 20:02:49 DEBUG : 102 go routines active
2019/05/31 20:02:49 DEBUG : rclone: Version "v1.47.0" finishing with parameters ["rclone" "-Lvv" "--log-file=/tmp/rclone.log" "--filter-from" "/usr/local/etc/rclone/filter_etc.txt" "sync" "/usr/local/etc/" "Safe:/ETC/" "--backup-dir" "Safe:/OLD/ETC/2019-05-31•20:00:00"]

Overall, among 20,775 entries in the log file, there are only two errors in connexion with SSH:

2019/05/31 20:01:47 ERROR : .config: error reading destination directory: List failed: dirExists: couldnt connect SSH: ssh: handshake failed: read tcp 10.95.20.109:61067->191.03.60.45:9783: read: connection reset by peer
2019/05/31 20:02:44 ERROR : rclone: error reading destination directory: List failed: dirExists: couldnt connect SSH: ssh: handshake failed: EOF

Looking at closely the log file at time 20:01:47 there are 665 entries; one entry is the following error already shown above:

2019/05/31 20:01:47 ERROR : .config: error reading destination directory: List failed: dirExists: couldnt connect SSH: ssh: handshake failed: read tcp 10.95.20.109:61067->191.03.60.45:9783: read: connection reset by peer

The other 664 entries are DEBUG-related, each one containing either of both:

  • Unchanged skipping
  • Size and modification time the same (differ by …s, within tolerance 1s)

Questions

  1. Overall, does that mean we have 665 connexions to the server from at 20:01:47?
  2. Is there an rclone option to limit the number of concurrent connexions to the SFTP server?

Among those connexions, we might have several concurrent connexions at the same time.

Below, an excerpt of my sshd_config file:

MaxSessions 3
MaxStartups 2

AllowTcpForwarding no
AllowUsers toto
MaxAuthTries 6

I guess it might be a little too restrictive for rclone.

Looking at sshd_config man pages:

MaxAuthTries
Specifies the maximum number of authentication attempts permitted per connection. Once the num- ber of failures reaches half this value, additional failures are logged. The default is 6.

MaxSessions
Specifies the maximum number of open shell, login or subsystem (e.g. sftp) sessions permitted per network connection. Multiple sessions may be established by clients that support connection multi- plexing. Setting MaxSessions to 1 will effectively disable session multiplexing, whereas setting it to 0 will prevent all shell, login and subsystem sessions while still permitting forwarding. The default is 10.

MaxStartups
Specifies the maximum number of concurrent unauthenticated connections to the SSH daemon. Additional connections will be dropped until authentication succeeds or the LoginGraceTime expires for a connection. The default is 10:30:100.

Alternatively, random early drop can be enabled by specifying the three colon separated values start:rate:full (e.g. "10:30:60"). sshd(8) will refuse connection attempts with a probability of rate/100 (30%) if there are currently start (10) unauthenticated connections. The probability increases linearly and all connection attempts are refused if the number of unauthenticated connec- tions reaches full (60).

My guess is that I'd need to tweak those settings…

Or, looking at rclone man pages:

--checkers=N
The number of checkers to run in parallel. Checkers do the equality checking of files during a sync. For some storage systems (eg S3, Swift, Dropbox) this can take a significant amount of time so they are run in parallel.

The default is to run 8 checkers in parallel.

--retries int
Retry the entire sync if it fails this many times it fails (default 3).

Some remotes can be unreliable and a few retries help pick up the files which didn’t get transferred because of errors.

Disable retries with --retries 1.

--transfers=N
The number of file transfers to run in parallel. It can sometimes be useful to set this to a smaller number if the remote is giving a lot of timeouts or bigger if you have lots of bandwidth and a fast remote.

The default is to run 4 file transfers in parallel.


Questions

  1. If I were to tweak on the rclone side instead, which one of the above options would be more relevant?
  2. Anyone to give an novice-oriented explanation for --checkers=N?

Thanks for your suggestions.
I'll come back for an update after my tests.

For the SFTP backend, rclone will run one connection per "checker" and one connection per "transfer", so default settings you could expect 12 connections. It should make those connections and keep using them.

That is probably worth a note in the docs if you wanted to send a pull request or suggest some wording for me to put in!

Note that for the ftp backend we put in

  --ftp-concurrency int         Maximum number of FTP simultaneous connections, 0 for unlimited

We could do similar for the SFTP backend.

Checkers are used for things like directory listings and reading checksums, transfers are only used for file transfers.

Thanks for your nice and useful explanation.

After reverting MaxStartups and MaxSessions (sshd_config) back to their default values rclone's been running perfectly with no SSH-related connexion issue any more.

So, to recap, I had to do 3 things to make rclone runs flawlessly when choosing the SFTP route:

  1. Update my crontab with adding new environment variable, e.g. SSH_AUTH_SOCK = /Users/$USER/.ssh/ssh-agent.sock
  2. Update my firewall settings using UFW's ALLOW rule (more info here) instead of LIMIT, from specific IP addresses that I use.
  3. Revise MaxStartups and MaxSessions values in sshd_config file.
2 Likes

And thank you for your summary of the changes needed - I'm sure that will be useful to future visitors :slight_smile:

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.