Should Lack of MD5 on Encrypted S3 Concern Me?

What is the problem you are having with rclone?

Question: I read in the doc, "Note that files uploaded both with multipart upload and through crypt remotes do not have MD5 sums."

Should I be concerned about data integrity? I'm backing up some VMs to an S3 provider and encrypting with crypt.

  • Does "do not have MD5 sums" mean there is no integrity checking on multipart encrypted files, or is there some lesser integrity checking?
  • Is there a better work-around than cutting my large files into parts myself?

What is your rclone version (output from rclone version)

rclone v1.55.1

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Ubuntu Linux 20.04.02 64-bit

Which cloud storage system are you using? (eg Google Drive)

An S3-compatible provider (via the S3 interface)

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone copy /data cloudbackup:/

The rclone config contents with secrets removed.

type = s3
provider = Other
env_auth = false
access_key_id = aa
secret_access_key = bb
endpoint =
acl = private

type = crypt
remote = idrivecloud:
filename_encryption = off
directory_name_encryption = false
password = xx
password2 = yy

type = alias
remote = xidrivecloud:

A log from the command with the -vv flag

(Truly not applicable, and since I'm uploading a few hundred thousand files, -vv would generate a lot of irrelevant text.)

Let me enumerate the levels of protection we have normally:

  1. the S3 protocol has sha1 checksums for each chunk transferred
  2. the crypt backend will check the MD5SUM of the encrypted file after the transfer
  3. the crypt stream itself has a very strong message authenticator based on Poly1305, so downloading the data will definitely detect corruption
  4. you can use rclone cryptcheck to check the MD5SUMs of uploaded files. This has to locally encrypt the file, compute the MD5SUM, and compare it against the hash stored by S3.
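As a concrete sketch of point 4, using the paths and remote names from the question above (assuming cloudbackup: is the crypt remote), a verification run could look like:

```shell
# Re-encrypt each local file in memory to recompute its MD5SUM and
# compare it against the hash S3 stores for the uploaded object.
# No download of file data is needed - only the stored hashes.
rclone cryptcheck /data cloudbackup:/ -v
```

This catches corruption of the stored ciphertext without transferring the files back, but only works where an MD5SUM was stored at upload time.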

Let's talk about multipart uploads now. These don't have MD5SUMs when used with crypt, because crypt would have to encrypt the whole file locally first to compute the MD5SUM, then encrypt it again during the upload. This would be possible (for the local backend) and there is an issue about implementing it.
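One possible workaround, hedged on your file sizes: multipart upload only kicks in above the `--s3-upload-cutoff` threshold, so raising it keeps files under that size in a single-part upload, which does get an MD5SUM. This only helps for files at or below S3's 5 GiB single-part limit, so it may not cover large VM images:

```shell
# Files at or below the cutoff are uploaded in a single part, so an
# MD5SUM is computed and stored. 5G is the S3 single-part maximum;
# anything larger still falls back to multipart with no MD5SUM.
rclone copy /data cloudbackup:/ --s3-upload-cutoff 5G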

Given that, you still have protection from 1) and 3).

You could use the chunker backend to cut your large files into parts automatically rather than doing it by hand. This can also add MD5SUMs...
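A minimal sketch of such a config, layered over the crypt remote from the question (the remote name `bigfiles` and the chunk size are illustrative assumptions, not from the original post):

```
[bigfiles]
type = chunker
remote = cloudbackup:
chunk_size = 1G
hash_type = md5all
```

With `hash_type = md5all`, chunker stores an MD5SUM for every file, including large ones split into chunks, at the cost of reading each file twice on upload. You would then copy to `bigfiles:` instead of `cloudbackup:`.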

However, as far as integrity protection goes, the Poly1305 protection is very good, but it has the disadvantage that you need to download the file to check it.
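If you do want an end-to-end verification that exercises that Poly1305 authenticator, `rclone check` with `--download` fetches and compares the decrypted data byte-for-byte instead of relying on stored hashes. It is bandwidth-heavy, so perhaps best reserved for spot checks (remote name as in the question):

```shell
# Download each file from the crypt remote and compare its decrypted
# contents byte-for-byte against the local copy. Any tampering or
# corruption in the ciphertext is caught on decryption.
rclone check /data cloudbackup:/ --download
```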

That should be MD5, not sha1, correct?

"For each chunk transferred" is the key part of that sentence :slight_smile:

s3 uses sha1 for each chunk.

OK, thanks.

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.