2023/05/09 12:12:11 DEBUG : rclone: Version "v1.62.2" starting with parameters ["/home/ejackson/bin/rclone/rclone" "-vv" "ls" "provider:get/XYZ"]
2023/05/09 12:12:11 DEBUG : Creating backend with remote "provider:get/XYZ"
2023/05/09 12:12:11 DEBUG : Using config file from "/home/ejackson/.config/rclone/rclone.conf"
2023/05/09 12:12:12 DEBUG : sftp://myuser@mycompany.com@sftp.eu.provider.com:22/get/XYZ: New connection 10.10.16.59:38374->167.188.80.52:22 to "SSH-2.0-DataPowerSSH_1.1"
2023/05/09 12:12:12 DEBUG : sftp://myuser@mycompany.com@sftp.eu.provider.com:22/get/XYZ: Running shell type detection remote command: echo ${ShellId}%ComSpec%
2023/05/09 12:12:13 DEBUG : sftp://myuser@mycompany.com@sftp.eu.provider.com:22/get/XYZ: Remote command failed: ssh: command echo ${ShellId}%ComSpec% failed (stdout=[]) (stderr=[])
2023/05/09 12:12:13 DEBUG : sftp://myuser@mycompany.com@sftp.eu.provider.com:22/get/XYZ: Shell type "unix" detected (set option shell_type to override)
2023/05/09 12:12:13 DEBUG : Saving config "shell_type" in section "provider" of the config file
2023/05/09 12:12:13 DEBUG : sftp://myuser@mycompany.com@sftp.eu.provider.com:22/get/XYZ: Relative path resolved to "/apps/cdts/data_in_datapower/get/XYZ"
2023/05/09 12:12:14 Failed to create file system for "provider:get/XYZ": stat failed: sftp: "" (SSH_FX_FAILURE)
Details
We're trying to use rclone to pull from SFTP. We have a handful of SFTP configurations up and running successfully already for different servers. However, this backend server looks a little weird, and is causing rclone to raise an error.
In terms of what we have tried:
Options in the config including the below, in various combinations, with no luck
Absolute file paths in the ls request. Note in the log how the user home is /apps/cdts/data_in_datapower, so if we try that absolutely path we also get 2023/05/09 12:19:51 Failed to create file system for "provider:/apps/cdts/data_in_datapower/get/XYZ": stat failed: sftp: "" (SSH_FX_FAILURE)
SFTP command on the same machine, to check firewall, credentials, etc. It works, e.g.,
sftp myuser@mycompany.com@sftp.eu.provider.com
myuser@mycompany.com@sftp.eu.provider.com's password:
Connected to sftp.eu.provider.com.
sftp> ls
remote readdir("/apps/cdts/data_in_datapower/"): Failure
sftp> ls get/XYZ
... lots files listed here ...
sftp> pwd
Remote working directory: /apps/cdts/data_in_datapower/
sftp> ls /apps/cdts/data_in_datapower/get/XYZ
... lots files listed here ...
sftp>
What it looks like is happening is that when rclone is trying to work out whether the path you gave it is a directory or a file it stats the path. At that point the sftp server is returning SSH_FX_FAILURE rather than file not found for some reason I think.
Can you give this a go please?
This will
print an ERROR log with more info in
ignore that error which should allow everything to continue
Note also that it would be best to test with rclone lsf which doesn't recurse directories and does the simplest possible listing. So if you could do a before and after test with rclone lsf -vv that would be useful - thank you.
Thanks Nick, I'll get on straight on it.
I played around a bit more yesterday with different commands.
I can get it to produce List failed: dirExists stat failed depending on the different file paths I put it., and although Go isn't one of the languages I use I can see it's at least slightly different code paths at backend/sftp/sftp.go.
lsf and lsl produce the same error.
A sync doesn't error, but also does nothing.
2023/05/10 08:36:45 INFO :
Transferred: 0 B / 0 B, -, 0 B/s, ETA -
Elapsed time: 2m3.6s
2023/05/10 08:37:45 INFO :
Transferred: 0 B / 0 B, -, 0 B/s, ETA -
Elapsed time: 3m3.6s
2023/05/10 08:38:45 INFO :
Transferred: 0 B / 0 B, -, 0 B/s, ETA -
Elapsed time: 4m3.6s
2023/05/10 08:39:45 INFO :
Transferred: 0 B / 0 B, -, 0 B/s, ETA -
Elapsed time: 5m3.6s
2023/05/10 08:40:45 INFO :
Transferred: 0 B / 0 B, -, 0 B/s, ETA -
Elapsed time: 6m3.6s
2023/05/10 08:41:45 INFO :
Transferred: 0 B / 0 B, -, 0 B/s, ETA -
Elapsed time: 7m3.6s
Will get back to you with the trace from the new build shortly.
You used a trailing / on the tests above whereas you didn't on the first test. This may make a difference in that rclone doesn't need to check the root is a file if it has a trailing slash. Can you try it without the / and it should trigger the ERROR message I put it.
Either way though, it looks like stat does not work on the backend which is odd... Or maybe stat doesn't work on the root directory but it does elsewhere?
That was implemented following on from this forum thread
However I think this is likely a different issue though it is another IBM server so perhaps it is similar.
I removed the stat before doing a directory listing (can't remember why that is there anyway) which may at least get you a directory listing, so try this
Success! This build worked and lsf returns a directory listing now it skip the stat step.
ls also works.
That's with ~/bin/rclone-v1.63.0-beta.7035.497960d11.fix-sftp-error-linux-amd64/rclone -vv ls provider:get/XYZ/
With -vv, and the trailing slash (e.g., provider:get/XYZ/), I see
2023/05/10 13:49:32 DEBUG : sftp://myuser@mycompany.com@sftp.eu.provider.com:22/get/XYZ: Relative path resolved to "/apps/cdts/data_in_datapower/get/XYZ"
2023/05/10 13:49:33 ERROR : sftp://myuser@mycompany.com@sftp.eu.provider.com:22/get/XYZ: Unexpected error (f.root="get" f.absRoot="/apps/cdts/data_in_datapower/get", remote="XYZ"): assuming object does not exist: stat failed: sftp: "" (SSH_FX_FAILURE)
which is the expanded error trace you mentioned.
--sftp-use-fstat didn't help in this case either alas. No idea why IBM backends are so tricksy.
Perhaps we need a --sftp-skip-stat option then for the ls* commands (and perhaps others such as copy & sync if stat fails), so the behaviour doesn't change for existing sftp remotes but can be skipped if it's failing?
I don't know why rclone has a stat before listing. I did a bit of software archeology and its been there since the first commit of the SFTP backend in Jan 2017! It seems unecessary to me.
I tidied up that commit here.
I also made the sftp backend check for a trailing / so if you use provider:get/XYZ/ then rclone won't stat get/XYZ to see if it is a file or a directory which should work around the problem.
Mainframes are just so different to Windows/mac/Linux architectures it isn't surprising things get lost in translation!
Can you see if transfers work to and from the server with the binary above? I suspect that we aren't allowed to stat the root of the transfer for some security reason, but hopefully the other stats rclone does will work.
If not then this will need a flag, but I'm hoping to get away without one!
I don't have write permissions to that server, as it's a vendor one, but have tested several directories on the server with the five ls* commands and copy, and it seems to be functioning happily.