Crypt Documentation error?

What is the problem you are having with rclone?

I think the crypt documentation is incorrect. It says:

Each chunk contains:

  • 16 Bytes of Poly1305 authenticator
  • 1 - 65536 bytes XSalsa20 encrypted data

then gives the example

1 byte file will encrypt to

  • 32 bytes header
  • 17 bytes data chunk

49 bytes total

1 MiB (1048576 bytes) file will encrypt to

  • 32 bytes header
  • 16 chunks of 65568 bytes

1049120 bytes total (a 0.05% overhead). This is the overhead for big files.

I think this second example is wrong.

It should be 16 chunks of 65536 + 16 = 65552 bytes.

A quick test with a 1 byte and a 1 MiB (1048576 bytes) file (see below for commands):

        1 2023-01-10 14:47:18.892514542 1b
       49 2023-01-10 14:47:18.892514542 1b.bin
  1048576 2023-01-10 14:41:16.517542777 1kb
  1048864 2023-01-10 14:41:16.517542777 1kb.bin

1048864 and not 1049120.
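For anyone who wants to check the arithmetic, here is a minimal sketch of the size calculation as I understand it from the docs quoted above. The function and constant names are my own, not rclone's, and the handling of a 0 byte file (header only) is an assumption on my part.

```python
HEADER_SIZE = 32              # 8 byte magic string + 24 byte nonce
CHUNK_DATA_SIZE = 64 * 1024   # at most 65536 plaintext bytes per chunk
CHUNK_OVERHEAD = 16           # one Poly1305 authenticator per chunk


def encrypted_size(plain_size: int) -> int:
    """Return the expected size of a crypt file for a given plaintext size."""
    if plain_size < 0:
        raise ValueError("plain_size must be non-negative")
    full_chunks, remainder = divmod(plain_size, CHUNK_DATA_SIZE)
    # A partial final chunk still carries its own 16 byte authenticator.
    chunks = full_chunks + (1 if remainder else 0)
    return HEADER_SIZE + chunks * CHUNK_OVERHEAD + plain_size


print(encrypted_size(1))        # 1 byte file
print(encrypted_size(1048576))  # 1 MiB file
```

This gives 49 and 1048864, which matches the sizes of 1b.bin and 1kb.bin in the listing above.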

Please confirm and I will fix and create a PR

Run the command 'rclone version' and share the full output of the command.

rclone v1.61.1
- os/version: darwin 12.6.1 (64 bit)
- os/kernel: 21.6.0 (x86_64)
- os/type: darwin
- os/arch: amd64
- go/version: go1.19.4
- go/linking: dynamic
- go/tags: cmount

(current as of docs version too)

Which cloud storage system are you using? (eg Google Drive)

Crypt

The command you were trying to run (eg rclone copy /tmp remote:tmp)

These were used for the above test. Made up passwords (abcd, 1234)

rclone copy 1b crypt:
rclone copy 1kb crypt: 

The rclone config contents with secrets removed.

These were used for the above test. Made up passwords (abcd, 1234)

[crypt]
type = crypt
remote = .
password = vENjtZL-E-6OQ77fGY6H4WwF57s
password2 = DJrXvTm8658avycmzDjpASEMuiI
filename_encryption = off
directory_name_encryption = false

A log from the command with the -vv flag

N/A

This bit looks wrong indeed!

So I think you are right.

The calculation is here in the code if you want to double check

Thanks! I will take a look and update.

A bit of context: I am curious whether I can use Python to decode the encrypted files. There are lots of parts to this, including de-obscuring passwords, name encryption, etc. But for now, I am just working on the main decryption.

I hit some other snags. Notably in the "Key Derivation" section:

Derive the 32+32+16 = 80 bytes of key material

And above in the chunk section:

This uses a 32 byte (256 bit key) key derived from the user password.

In the Python NaCl library, nacl.hashlib.scrypt returns 64 bytes. But, using 64 bytes on the SecretBox object returns an error. From just trying it, truncating to 32 worked....

I will be the first to tell you I am an amateur with cryptography so maybe it is all clear and I am missing it.

If you're interested, the following is my current working code:

import hashlib
import nacl.hashlib
import nacl.bindings
import nacl.secret
password = b'abcd'
password2 = b'1234'
DEFAULT_SALT = b"\xA8\x0D\xF4\x3A\x8F\xBD\x03\x08\xA7\xCA\xB8\x3E\x58\x1F\x86\xB1"
sourcehash = hashlib.sha256()
with open('testdata','rb') as f:
    while (block := f.read(64 * 1024)): # use 64kb but can be anything
        sourcehash.update(block)
print(sourcehash.hexdigest())
f29c666d786b256a49a656d40a6762f5c51407e25964a62080093aefab977e94
key = nacl.hashlib.scrypt(password, 
                          salt=password2 if password2 else DEFAULT_SALT, 
                          n=16384, r=8, p=1)

# I am confused. The rclone docs say the key material should be 32+32+16=80 bytes,
# but the key comes out as 64 bytes.
# Then, nacl wants the key to be only 32 bytes.
# Trial and error shows that it works as follows:
box = nacl.secret.SecretBox(key[:32])
desthash = hashlib.sha256()

with open('testdata.bin','rb') as fobj:
    if not fobj.read(8) == b'RCLONE\x00\x00':
        raise ValueError('not rclone')
    Nonce = fobj.read(24)
    while (cipherblock := fobj.read(64 * 1024 + 16)):
        plainblock = box.decrypt(cipherblock,Nonce)
        desthash.update(plainblock)
        Nonce = nacl.bindings.sodium_increment(Nonce)
print(desthash.hexdigest())
f29c666d786b256a49a656d40a6762f5c51407e25964a62080093aefab977e94
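In case it helps anyone porting this without libsodium: as far as I can tell, sodium_increment treats the nonce as a little-endian unsigned number and adds one, wrapping around on overflow. A pure-Python sketch of the same step (increment_nonce is a name I made up, not part of PyNaCl or rclone) would be:

```python
def increment_nonce(nonce: bytes) -> bytes:
    """Add 1 to a nonce interpreted as a little-endian unsigned integer."""
    value = int.from_bytes(nonce, "little")
    # Wrap around instead of growing past the original nonce length.
    value = (value + 1) % (1 << (8 * len(nonce)))
    return value.to_bytes(len(nonce), "little")


nonce = bytes(24)                 # all-zero 24 byte nonce, for illustration only
nonce = increment_nonce(nonce)    # first byte becomes 0x01
```

Since the decryption above works with sodium_increment, the carry has to start at the first byte of the nonce, which is what this does.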

That is a great idea.

Yes that looks right. The rclone code is here

It uses the first len(c.dataKey) as the secretbox key, and len(c.dataKey) is 32

Having a second implementation of the crypt encoding is a great idea.

If you want to do the name encoding, I suspect that will be the bit you will struggle with as that uses a special wide cipher that will probably need porting to python.

So is the "32+32+16 = 80" line in the docs also wrong?

To be fair, this would be at least a third. See crypt_rclone.c.

Yeah... Doesn't look like fun. I will probably tackle that when I get a chance. For now, even the proof-of-concept above is good enough for me! I'd also need to pull in something for base32768 if I really want to support it all.

The same goes for de-obscuring the password, but the "How to retrieve a 'crypt' password from a config file" thread is also helpful there.

Do you mean this bit?

Key derivation

Rclone uses scrypt with parameters N=16384, r=8, p=1 with an optional user supplied salt (password2) to derive the 32+32+16 = 80 bytes of key material required. If the user doesn't supply a salt then rclone uses an internal one.

scrypt makes it impractical to mount a dictionary attack on rclone encrypted data. For full protection against this you should always use a salt.

In 32+32+16 the first 32 bytes are used for the secretbox key, the next 32 are used for the file name encryption and the next 16 are used for the IV for file name encryption.
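To make the split concrete, here is a rough sketch using the standard library's hashlib.scrypt with dklen=80. The function name and the three variable names are my own labels, not rclone's; the salt constant is the one from the Python code earlier in the thread, and maxmem is just set generously so OpenSSL does not refuse the parameters.

```python
import hashlib

# Internal salt used when password2 is empty (copied from the code above).
DEFAULT_SALT = bytes.fromhex("a80df43a8fbd0308a7cab83e581f86b1")


def derive_keys(password: bytes, password2: bytes = b""):
    """Derive 80 bytes of key material and split it 32 + 32 + 16."""
    salt = password2 if password2 else DEFAULT_SALT
    material = hashlib.scrypt(
        password, salt=salt, n=16384, r=8, p=1,
        maxmem=64 * 1024 * 1024, dklen=80,
    )
    data_key = material[:32]      # secretbox key for file contents
    name_key = material[32:64]    # key for file name encryption
    name_iv = material[64:80]     # IV for file name encryption
    return data_key, name_key, name_iv


data_key, name_key, name_iv = derive_keys(b"abcd", b"1234")
print(len(data_key), len(name_key), len(name_iv))
```

This also explains why truncating the 64 byte output to 32 bytes worked: scrypt's final step is PBKDF2, so asking for a shorter output gives a prefix of the longer one. The first 32 bytes are therefore the same whether you ask for 32, 64 or 80 bytes.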