Strict sftp environment makes it impossible to do --checksum

Hi all, first of all thank you for a great piece of software and what seems to be a great community surrounding it.

This is less of a question and more of a thing I stumbled upon that made for a lot of confusion and testing. Maybe it should have a mention in the documentation.

What is the problem you are having with rclone?

It seems --checksum cannot generate md5sum in a strict sftp environment.

I have a Veracrypt container file that does not update modification time or size by default (though I have been forced to change the default behavior, both because of this and because of Nextcloud). So I naturally want to use --checksum in the command. However, after much testing it seems this is simply not possible, because I have the following set for the user "xyz" in the sshd_config of the destination, to lock the user down to a strict sftp environment.

Match User xyz
        ForceCommand internal-sftp
        ChrootDirectory /home/xyz/
        PasswordAuthentication no

It otherwise works as advertised, but the log says the source and destination have no hashes in common. If I comment out the changes in the sshd_config, everything works as advertised.

I just think it is confusing at best to say checksumming works over sftp. It does work when connecting over sftp rather than strictly ssh, just not in such a locked-down environment.
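
To illustrate why the fallback to --size-only matters for a file whose size never changes, here is a minimal sketch using two throwaway files as stand-ins for the container before and after an edit. The file names and helper functions are made up for the example, and it assumes md5sum from coreutils is installed locally.

```shell
# Two stand-in "container" files: identical size, different content.
dir=$(mktemp -d)
printf 'AAAAAAAA' > "$dir/old"
printf 'AAAABAAA' > "$dir/new"

# Size in bytes, which is all a size-only comparison looks at.
size_of() { wc -c < "$1" | tr -d ' '; }

# MD5 hex digest only (first field of md5sum output).
md5_of() { md5sum "$1" | cut -d ' ' -f 1; }

if [ "$(size_of "$dir/old")" = "$(size_of "$dir/new")" ]; then
    echo "sizes match, so a size-only check would skip the file"
fi
if [ "$(md5_of "$dir/old")" != "$(md5_of "$dir/new")" ]; then
    echo "hashes differ, so a checksum check would copy the file"
fi
```

With an unchanged modification time as well, the hash is the only thing left that tells the two versions apart.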

What is your rclone version (output from rclone version)

rclone v1.50.2

  • os/arch: linux/amd64
  • go version: go1.13.6

Which cloud storage system are you using? (eg Google Drive)

Self hosted raspberry pi with ssh/sftp.

The command you were trying to run (eg rclone copy /tmp remote:tmp)

Anything with --checksum

The rclone config contents with secrets removed.

Irrelevant

A log from the command with the -vv flag

Again, if I open up the environment and comment out the Match User block in the sshd_config of the destination, checksumming works. The important bit is this line.

--checksum is in use but the source and destination have no hashes in common; falling back to --size-only

That's an old version, you should update it.

There's no debug log so hard to tell what's going on since you deleted that part :frowning:

SFTP doesn't have any inherent checksum so you need access to a command to run on the host.

See:

https://rclone.org/sftp/#limitations

Thanks for the answer.

The version is what the default repos for Ubuntu Server 20.04 have, and yes, I know it is several versions behind. I can post the entire log if it helps. Side question, is -vv effectively the same as --log-level=DEBUG?

Regarding the access to run remote commands, I think that is the problem, and not something with rclone. Will test and report back later.

Can you pretty pretty please post the rclone version output? I still don't know what version you are running.

Yes -vv and --log-level DEBUG are the same.

I recommend you upgrade ASAP! I have discovered a bug (a race condition) in the SFTP package used by rclone versions before 1.56.0 that may cause loss of data in worst case scenarios.

Oh, isn't it this bit in the OP?

Here is a log where I try to do checksumming in the strict sftp environment.

2021/10/12 21:47:34 DEBUG : rclone: Version "v1.50.2" starting with parameters ["/usr/bin/rclone" "copy" "--checksum" "--log-file=/home/server/rclonesum.log" "--log-level=DEBUG" "--transfers" "30" "--checkers" "8" "--contimeout" "60s" "--timeout" "300s" "--retries" "3" "--low-level-retries" "10" "--stats" "1s" "--stats-file-name-length" "0" "/media/bzpool/nextcloud_storage/Max/files/Crypt/Crypt25" "xyz:/backup"]
2021/10/12 21:47:34 DEBUG : Using config file from "/home/server/.config/rclone/rclone.conf"
2021/10/12 21:47:35 DEBUG : sftp://xyz@192.168.89.10:22//backup: New connection 192.168.5.210:50478->192.168.89.10:22 to "SSH-2.0-OpenSSH_7.9p1 Raspbian-10+deb10u2+rpt1"
2021/10/12 21:47:36 NOTICE: sftp://xyz@192.168.89.10:22//backup: --checksum is in use but the source and destination have no hashes in common; falling back to --size-only
2021/10/12 21:47:36 DEBUG : Crypt25: Size of src and dst objects identical
2021/10/12 21:47:36 DEBUG : Crypt25: Unchanged skipping
2021/10/12 21:47:36 INFO  : 
Transferred:             0 / 0 Bytes, -, 0 Bytes/s, ETA -
Errors:                 0
Checks:                 1 / 1, 100%
Transferred:            0 / 0, -
Elapsed time:          0s

2021/10/12 21:47:36 DEBUG : 12 go routines active
2021/10/12 21:47:36 DEBUG : rclone: Version "v1.50.2" finishing with parameters ["/usr/bin/rclone" "copy" "--checksum" "--log-file=/home/server/rclonesum.log" "--log-level=DEBUG" "--transfers" "30" "--checkers" "8" "--contimeout" "60s" "--timeout" "300s" "--retries" "3" "--low-level-retries" "10" "--stats" "1s" "--stats-file-name-length" "0" "/media/bzpool/nextcloud_storage/Max/files/Crypt/Crypt25" "xyz:/backup"]

From my own testing and what I can read online, you cannot just run md5sum during an sftp session, this is with or without the strict environment I mentioned. So I have no idea how rclone does it.

Thank you Ole. I will look into updating to the latest version :slight_smile:

rclone doesn't do it per se, as the server runs the md5sum command. If you aren't able to execute md5sum on the server, there won't be any hashes in common since it can't run anything.

hello and welcome to the forum,

i was able to get this working on my pi4 running ubuntu server 21.04 64bit.
tho this has nothing to do with the pi itself. just a linux issue.

using this in /etc/ssh/sshd_config

Match user testuser
   ForceCommand internal-sftp
   ChrootDirectory /home/testuser
   PasswordAuthentication no

do the following commands
mkdir -p /home/testuser/bin
cp /bin/md5sum /home/testuser/bin

and here is the remote

[pi4-internal-sftp]
type = sftp
host = 192.168.62.114
user = testuser
pass = 
key_file = C:\data\c\ssh\keys\pi4\openssh.key
md5sum_command = /bin/md5sum

and here is the rclone copy output

rclone copy D:\source pi4-internal-sftp:test -vv 
DEBUG : rclone: Version "v1.56.0" starting with parameters ["c:\\data\\rclone\\scripts\\rclone.exe" "copy" "D:\\source" "pi4-internal-sftp:test" "-vv"]
DEBUG : Creating backend with remote "D:\\source"
DEBUG : Using config file from "c:\\data\\rclone\\scripts\\rclone.conf"
DEBUG : fs cache: renaming cache item "D:\\source" to be canonical "//?/D:/source"
DEBUG : Creating backend with remote "pi4-internal-sftp:test"
DEBUG : sftp://testuser@192.168.62.114:22/test: New connection 192.168.62.234:59751->192.168.62.114:22 to "SSH-2.0-OpenSSH_8.4p1 Ubuntu-5ubuntu1.1"
DEBUG : sftp://testuser@192.168.62.114:22/test: Using absolute root directory "/home/testuser/test"
DEBUG : sftp://testuser@192.168.62.114:22/test: Waiting for checks to finish
DEBUG : sftp://testuser@192.168.62.114:22/test: Waiting for transfers to finish
DEBUG : sftp cmd = /home/testuser/test/file01.txt
DEBUG : sftp output = "c4ca4238a0b923820dcc509a6f75849b  /home/testuser/test/file01.txt"
DEBUG : sftp hash = "c4ca4238a0b923820dcc509a6f75849b"
DEBUG : file01.txt: md5 = c4ca4238a0b923820dcc509a6f75849b OK
INFO  : file01.txt: Copied (new)

Hi, and thanks for taking the time to test it.

I have done as requested. Substituted your testuser with my user "xyz" and added the md5sum command to the rclone config. It did not work. Just to be sure, I chown'ed the executable to the user xyz. Which also did not work. Did you make sure to restart the sshd service after the sshd_config change? Just to be sure we're testing in the same environment.

Thinking it might somehow, even though unlikely, be an issue tied to the rclone version, I went ahead and updated.

Here is the log. Which looks identical on both rclone v1.50.2 and v1.56.2, and incidentally, it is the same as before trying the custom md5sum command.

2021/10/13 09:37:20 DEBUG : rclone: Version "v1.56.2" starting with parameters ["/usr/bin/rclone" "copy" "--checksum" "--log-file=/home/server/rclonesum.log" "--log-level=DEBUG" "--transfers" "30" "--checkers" "8" "--contimeout" "60s" "--timeout" "300s" "--retries" "3" "--low-level-retries" "10" "--stats" "1s" "--stats-file-name-length" "0" "/media/bzpool/nextcloud_storage/Max/files/Crypt/Crypt25" "xyz:/backup"]
2021/10/13 09:37:21 DEBUG : Creating backend with remote "/media/bzpool/nextcloud_storage/Max/files/Crypt/Crypt25"
2021/10/13 09:37:21 DEBUG : Using config file from "/home/server/.config/rclone/rclone.conf"
2021/10/13 09:37:21 DEBUG : fs cache: adding new entry for parent of "/media/bzpool/nextcloud_storage/Max/files/Crypt/Crypt25", "/media/bzpool/nextcloud_storage/Max/files/Crypt"
2021/10/13 09:37:21 DEBUG : Creating backend with remote "xyz:/backup"
2021/10/13 09:37:22 DEBUG : sftp://xyz@192.168.89.10:22//backup: New connection 192.168.5.210:51148->192.168.89.10:22 to "SSH-2.0-OpenSSH_7.9p1 Raspbian-10+deb10u2+rpt1"
2021/10/13 09:37:23 DEBUG : sftp cmd = /backup/Crypt25
2021/10/13 09:37:23 DEBUG : Crypt25: Failed to calculate md5 hash: Process exited with status 1 ()
2021/10/13 09:37:23 DEBUG : Crypt25: Size of src and dst objects identical
2021/10/13 09:37:23 DEBUG : Crypt25: Unchanged skipping
2021/10/13 09:37:23 INFO  : 
Transferred:              0 / 0 Byte, -, 0 Byte/s, ETA -
Checks:                 1 / 1, 100%
Elapsed time:         2.1s

2021/10/13 09:37:23 DEBUG : 11 go routines active

Maybe this could be a way to handle it, downloading the file and calculating the hash sum locally.

You are right md5sum is not part of the sftp protocol.

What rclone does is run an ssh shell session (not an sftp session) and runs the md5sum command.

Rclone attempts to auto detect whether md5sum and sha1sum are available and it caches their paths in the config file.

md5sum_command = md5sum
sha1sum_command = sha1sum

However you can set these to your own commands.

So what you'd need to do to allow rclone to run md5sum etc is to allow ssh shell commands to run md5sum, sha1sum and echo.

You could do this with an ssh authorized key limited to a single command. That could be just md5sum provided you don't mind configuring it manually in the rclone config.
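
A rough way to check this by hand, outside rclone, is to run md5sum over a plain ssh command and compare the result with a local hash. The sketch below only covers the comparison step; the function names are made up for the example, and the ssh line in the comment assumes the user, host and path from the logs above.

```shell
# Extract the hex digest from one line of md5sum output ("<hash>  <path>").
hash_from_md5sum_output() {
    printf '%s\n' "$1" | cut -d ' ' -f 1
}

# Succeed if the MD5 of a local file equals the digest found in a line
# of md5sum output captured from the remote side.
matches_remote() {
    local_hash=$(md5sum "$1" | cut -d ' ' -f 1)
    remote_hash=$(hash_from_md5sum_output "$2")
    [ "$local_hash" = "$remote_hash" ]
}

# Possible usage (not run here; needs a working ssh login on the remote):
#   remote_line=$(ssh xyz@192.168.89.10 md5sum /backup/Crypt25)
#   matches_remote /path/to/local/Crypt25 "$remote_line" && echo "same content"
```

If the ssh command itself is refused, for example because only sftp is allowed for that user, rclone will not be able to get a hash that way either.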

From: SFTP

SFTP supports checksums if the same login has shell access and md5sum or sha1sum as well as echo are in the remote's PATH. This remote checksumming (file hashing) is recommended and enabled by default. Disabling the checksumming may be required if you are connecting to SFTP servers which are not under your control, and to which the execution of remote commands is prohibited. Set the configuration option disable_hashcheck to true to disable checksumming.

I am not the expert in this, but it would be helpful to my understanding if you post the redacted output from

rclone config show xyz

I vaguely remember something about an issue where the md5sum command timed out before completion on large files. This may explain the different results seen by you and @asdffdsa

It would therefore be helpful to see the output from these commands:

rclone lsl xyz:backup --include Crypt25
rclone md5sum xyz:backup --include Crypt25 -vv

and to hear whether you can replicate the issue if you use a small testfile.

note: rclone can act as a sftp server using rclone serve sftp.
it supports checksums without the need for the external program md5sum.

i am not too knowledgeable about linux, and the source of your issue could be something else.
but using that locked down /etc/ssh/sshd_config, at first, checksum did not work, and then i made the changes and it did work.

this was the first time that i:

  • modified /etc/ssh/sshd_config
  • used ForceCommand internal-sftp
  • used ChrootDirectory

i made sure to systemctl restart sshd

this is what i did

  1. created a new user using adduser
  2. rclone was NOT able to run md5sum.
  3. copied md5sum to /home/testuser/bin
  4. changed the rclone remote to hardcode the path
  5. rclone was able to run md5sum

can you post the redacted config file?

Very interesting, I will have a look at this!


I read up on it and it seems I left out a pretty important detail that, if I understand it correctly, leaves ForceCommand internal-sftp ineffective. I am sorry I left it out; it was a long time ago that I set up this Raspberry Pi and I had simply forgotten about it.

In /etc/ssh/sshd_config the following line should be uncommented.

Subsystem sftp /usr/lib/openssh/sftp-server


Another thing: I had a second look at the output from your successful attempt and noticed this line:

Which indicates to me the ChrootDirectory /home/testuser did not work either, unless your path, as seen from root is /home/testuser/home/testuser/test/file01.txt

The chroot would make the client connecting think the directory /home/testuser is the absolute root directory. In my output, the path is omitting /home/xyz.

There is a detail about chrooting that I remember missing when I set it up, it is that the parent directory needs to be owned by root. So your /home/testuser would need to be owned by root. However it should throw an error when you try to log in and not work at all, so I don't know if it is even the case.
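
The ownership requirement can be checked with a few lines of shell. This is only a sketch under some assumptions: as far as I understand the sshd_config man page, every component of the ChrootDirectory path has to be a root-owned directory that is not writable by group or others. The function name is made up, it relies on GNU stat, and it assumes the path contains no symlinks.

```shell
# Succeed only if the directory and every parent up to / is owned by
# uid 0 and not writable by group or others. Uses GNU stat (-c).
chroot_path_ok() {
    dir=$1
    while :; do
        owner=$(stat -c '%u' "$dir") || return 1
        mode=$(stat -c '%a' "$dir") || return 1
        # The leading 0 makes the shell read the mode as an octal number.
        if [ "$owner" != "0" ] || [ $(( 0$mode & 022 )) -ne 0 ]; then
            echo "not usable as a chroot component: $dir (uid $owner, mode $mode)" >&2
            return 1
        fi
        [ "$dir" = "/" ] && return 0
        dir=$(dirname "$dir")
    done
}
```

For the setup discussed here one would call it as `chroot_path_ok /home/testuser` on the server and look at which component, if any, gets reported.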


If these two points are true, it baffles me that it worked at all, and that it would not work without the custom md5sum path and command. Also as @ncw mentions:

In my testing, an ssh session is impossible if ForceCommand internal-sftp is effective.


@Ole
The file is 25 MiB, so not a large file at all, I'd say.

Here is the output of rclone config show xyz

type = sftp
host = 192.168.89.10
user = xyz
md5sum_command = /bin/md5sum
sha1sum_command = none

The output of rclone lsl xyz:backup --include Crypt25 returned a list of around 25,000 lines, as it lists everything it is excluding. See, this is a device in (home) production, working as an off-site backup, only momentarily at home for some setup and sync. I think I'd need to replicate this in a more sterile lab environment to be able to get to the bottom of it all.

Also, read my answer to @asdffdsa above.


Hi @ncw and thanks for your comment. I tried different solutions to this, as I can understand there are a few different approaches to this. However, none was entirely fruitful. I will have to do a lot more tinkering to get this working. Using rclone itself as an sftp server as @asdffdsa mentions, is the most interesting possibility here, and one I have not yet tried.

What I did try was to leave the chroot and make the requested binaries available in the user directory, like @asdffdsa's approach. However, I found out an ssh session needs more than just /bin/bash to work. I ended up copying everything from /usr/lib into the chroot'ed directory, with inspiration from this guide. The important bit is this:

all of the required files and directories for whatever you want to be able to do within the chroot jail need to be available. To run a basic bash shell, the required files/directories are usually just the following:
/bin/bash
/lib
/lib64
However, in some cases you'll also need /usr

It turned out on raspbian, /lib was symlinked to /usr/lib, and /lib64 did not exist. It is also a possibility to bind certain directories from the root directory to the chroot'ed one, like mounting in a way, however as I understand it, this has to be done with care and the other approach is easier.
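
Rather than copying all of /usr/lib, it may be enough to copy just the libraries a given binary is linked against. The sketch below is one way to do that for the chroot-with-shell-access variant described here; it would not help while ForceCommand internal-sftp is active, since no external command runs then. The function name is made up, it assumes ldd is available, and writing into a root-owned jail would require running it as root.

```shell
# Copy a binary plus the shared libraries that ldd reports into a jail
# directory, keeping the same absolute paths inside the jail.
copy_with_libs() {
    bin=$1
    jail=$2
    # ldd lines look like "libc.so.6 => /lib/.../libc.so.6 (0x...)";
    # grep -o keeps only the absolute paths.
    for f in "$bin" $(ldd "$bin" 2>/dev/null | grep -o '/[^ ]*'); do
        mkdir -p "$jail$(dirname "$f")" || return 1
        # -L copies the file a symlink points to, not the link itself.
        cp -L "$f" "$jail$f" || return 1
    done
}
```

For example, `copy_with_libs /bin/md5sum /home/xyz` would place md5sum and its libraries under /home/xyz at the paths the binary expects once chrooted.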

This made it possible to initiate an ssh session, however, sftp timed out and was not possible. I have no idea why. Rclone timed out and did not work as it was not able to initiate an sftp session.

There is also the possibility to restrict the possible commands in the authorized_keys, also outlined in the guide above, however this also ruined the initiation of the session. So as I mentioned, I guess it's back to the drawing board, if I want a really strict environment and checksumming.

However, I will maybe try rclone as an sftp server, or leave it as is, because the updated time on the Veracrypt container is not that big of a deal to me, and another part of the equation is that Nextcloud needs to be able to sync the file too, and it also looks at time and size only. Which is an entirely different rabbit hole.

well, as i wrote, not an expert at this.
i followed this and this

i do not have that global setting, instead the setting is only for the user testuser
when i tried to open a terminal, it failed, as expected with "This service allows sftp connections only."

not sure that is correct, as per this

Okay well that confirms it worked as intended :slight_smile:

Oh, I think we misunderstand each other. My setup is: client pc through nextcloud client - - > nextcloud server - - > rclone sync to off-site RPi. So I was talking about Nextcloud's client, not the webdav implementation to something like rclone. Guess it wasn't obvious from my reply.

I agree, the issue isn't due to size.

Thanks! Nothing fancy and that was exactly what I wanted to confirm.

Very good idea. That will make it much easier to play with settings without risking your data, and also easier to ask for help. I would start with a testuser with a few testfiles and very relaxed security settings and then gradually increase the security/strictness while testing access/checksums etc. - that makes it much easier to identify the setting(s) causing issues.
