What is the problem you are having with rclone?
We use rclone both to mount cloud drives on our workstations and to back up those S3 drives. Occasionally (and this has only started happening recently) we get "corrupted on transfer: md5 hashes differ src" errors on some files. The affected files don't seem to have anything in common: they are different file formats, they were uploaded by different users, and their sizes vary wildly from 20 to 120 MB.
Our workaround so far is to copy the affected files to any desktop, delete the files plus their delete markers on the backend, and then re-upload the files from the desktop copies.
I'd like to understand how this can even happen, because to me it doesn't make any sense. Suppose an uploaded file is somehow corrupted and is then picked up by the backup script: shouldn't it just be copied 1:1, corruption and all, without producing an md5 hash error? It also happens with the same files every single day unless I re-upload them as described above, so those files never get backed up.
I am aware of --ignore-checksum, but I'd rather not drop an additional safety check if I can avoid it.
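In case it helps, this is roughly how I've been spot-checking individual objects; the paths are placeholders, not our real layout:

# Hash as stored in the S3 metadata (what rclone compares during sync)
rclone md5sum <SOURCE-BUCKET>/path/to/<suspect file>
# Hash computed from the bytes actually downloaded
rclone md5sum --download <SOURCE-BUCKET>/path/to/<suspect file>

If those two disagree, the stored MD5 doesn't match the object's data, which I assume could trigger the error even on a byte-for-byte copy.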
Thank you!
Run the command 'rclone version' and share the full output of the command.
rclone v1.68.0
- os/version: ubuntu 22.04 (64 bit)
- os/kernel: 6.5.0-35-generic (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.23.1
- go/linking: static
- go/tags: none
Which cloud storage system are you using? (eg Google Drive)
An Amazon S3-compatible cloud storage provider
The command you were trying to run (eg rclone copy /tmp remote:tmp)
rclone sync <SOURCE-BUCKET> <TARGET-BUCKET> --backup-dir=<SCRIPT GENERATED BACKUP PATH> --progress --s3-no-check-bucket --error-on-no-transfer --fast-list --transfers=8 --checkers=32
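If a shorter log would help, I think I can reproduce the failure for a single object next time with something like this (the file path is a placeholder):

rclone sync <SOURCE-BUCKET> <TARGET-BUCKET> --backup-dir=<SCRIPT GENERATED BACKUP PATH> --s3-no-check-bucket --include "/path/to/<failing file>" --retries 1 -vv

That should touch only the one failing file and keep the -vv output small enough to share, minus anything NDA-relevant.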
Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.
[<Backup-User>]
type = s3
provider = Other
access_key_id = XXX
secret_access_key = XXX
endpoint = <endpoint>
acl = private
[<Default-User>]
type = s3
provider = Other
access_key_id = XXX
secret_access_key = XXX
endpoint = <endpoint>
acl = private
[<Secret-Backup-User>]
type = s3
provider = Other
access_key_id = XXX
secret_access_key = XXX
endpoint = <endpoint>
acl = private
[<Secret-Default-User>]
type = s3
provider = Other
access_key_id = XXX
secret_access_key = XXX
endpoint = <endpoint>
acl = private
A log from the command that you were trying to run with the -vv flag
The full log is way too long because it covers tens of thousands of files, no file is throwing the error right now, and I can't share it anyway due to an NDA.
What I can provide is this:
2024/09/16 16:13:04 ERROR : <File1>: corrupted on transfer: md5 hashes differ src(S3 bucket <Source Bucket>) "41a3ba32456e2babf01ed2d8f1329404" vs dst(S3 bucket <Backup Path>) "0388955d01960ab7fe921d086654ccf5"
2024/09/16 16:13:05 ERROR : <File2>: corrupted on transfer: md5 hashes differ src(S3 bucket <Source Bucket>) "a4656874ff9607cf242306fba3b45e95" vs dst(S3 bucket <Backup Path>) "37a4851917c44de1fcfc2a08a61ca6af"
2024/09/16 16:13:06 ERROR : <File3>: corrupted on transfer: md5 hashes differ src(S3 bucket <Source Bucket>) "ca6ea546dd5d53e2737b1bc24b210bf6" vs dst(S3 bucket <Backup Path>) "da025a2de3b05b7602272ad4854c7dc1"
2024/09/16 16:14:53 ERROR : S3 bucket <Backup Path>: not deleting files as there were IO errors
2024/09/16 16:14:53 ERROR : S3 bucket <Backup Path>: not deleting directories as there were IO errors
2024/09/16 16:14:53 ERROR : Attempt 1/3 failed with 3 errors and: corrupted on transfer: md5 hashes differ src(S3 bucket <Source Bucket>) "ca6ea546dd5d53e2737b1bc24b210bf6" vs dst(S3 bucket <Backup Path>) "da025a2de3b05b7602272ad4854c7dc1"
This then repeats twice more for attempts 2/3 and 3/3.
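Next time it happens I can also pull the listing metadata for one of the affected objects on both sides, in case the stored hashes matter (again, the paths are placeholders):

rclone lsjson --hash <SOURCE-BUCKET>/path/to/<File1>
rclone lsjson --hash <Backup Path>/path/to/<File1>

Happy to post that output if someone can tell me what to look for.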