"corrupted on transfer: md5 hashes differ src" on some files

What is the problem you are having with rclone?

We use rclone as a cloud drive mounting tool for our workstations and as a backup tool for those S3 drives. Occasionally, and this has only started happening recently, we get "corrupted on transfer: md5 hashes differ src" errors on some files. There doesn't seem to be anything the affected files have in common: they are different file formats, they were uploaded by different users, and their sizes vary wildly from 20 to 120 MB.

The workaround so far is to copy the affected files to any desktop, delete the files plus their delete markers on the backend, and then re-upload the files from the desktop copies.
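
For reference, the workaround roughly corresponds to something like this (bucket and file paths are placeholders, and I do the delete-marker cleanup through the provider's web console since I'm not sure rclone can do that part on its own):

rclone copy <Default-User>:<bucket>/path/to/file /tmp/restore/ -v    # pull the affected file down to a workstation
rclone deletefile <Default-User>:<bucket>/path/to/file -v            # delete the broken object (delete marker then removed via the provider console)
rclone copy /tmp/restore/file <Default-User>:<bucket>/path/to/ -v    # re-upload the local copy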

I'd like to investigate how this can even happen, because to me it doesn't make any sense. Let's assume the uploaded files are somehow corrupted and are then backed up by the backup script. Shouldn't it just copy the corrupted file 1:1 and NOT produce an md5 hash error? It also happens with the same files every single day unless I re-upload them as mentioned above, so these files never get backed up.

I am aware of --ignore-checksum, but I'd rather not drop an additional integrity check if I can avoid it.

Thank you!

Run the command 'rclone version' and share the full output of the command.

rclone v1.68.0
- os/version: ubuntu 22.04 (64 bit)
- os/kernel: 6.5.0-35-generic (x86_64)
- os/type: linux
- os/arch: amd64
- go/version: go1.23.1
- go/linking: static
- go/tags: none

Which cloud storage system are you using? (eg Google Drive)

Amazon S3 compliant Cloud Drive

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone sync <SOURCE-BUCKET> <TARGET-BUCKET> --backup-dir=<SCRIPT GENERATED BACKUP PATH> --progress --s3-no-check-bucket --error-on-no-transfer --fast-list --transfers=8 --checkers=32

Please run 'rclone config redacted' and share the full output. If you get command not found, please make sure to update rclone.

[<Backup-User>]
type = s3
provider = Other
access_key_id = XXX
secret_access_key = XXX
endpoint = <endpoint>
acl = private

[<Default-User>]
type = s3
provider = Other
access_key_id = XXX
secret_access_key = XXX
endpoint = <endpoint>
acl = private

[<Secret-Backup-User>]
type = s3
provider = Other
access_key_id = XXX
secret_access_key = XXX
endpoint = <endpoint>
acl = private

[<Secret-Default-User>]
type = s3
provider = Other
access_key_id = XXX
secret_access_key = XXX
endpoint = <endpoint>
acl = private

A log from the command that you were trying to run with the -vv flag

The full log is way too long due to tens of thousands of files, no file is throwing the error right now, and I can't share it due to an NDA anyway.

What I can provide is this:

2024/09/16 16:13:04 ERROR : <File1>: corrupted on transfer: md5 hashes differ src(S3 bucket <Source Bucket>) "41a3ba32456e2babf01ed2d8f1329404" vs dst(S3 bucket <Backup Path>) "0388955d01960ab7fe921d086654ccf5"
2024/09/16 16:13:05 ERROR : <File2>: corrupted on transfer: md5 hashes differ src(S3 bucket <Source Bucket>) "a4656874ff9607cf242306fba3b45e95" vs dst(S3 bucket <Backup Path>) "37a4851917c44de1fcfc2a08a61ca6af"
2024/09/16 16:13:06 ERROR : <File3>: corrupted on transfer: md5 hashes differ src(S3 bucket <Source Bucket>) "ca6ea546dd5d53e2737b1bc24b210bf6" vs dst(S3 bucket <Backup Path>) "da025a2de3b05b7602272ad4854c7dc1"
2024/09/16 16:14:53 ERROR : S3 bucket <Backup Path>: not deleting files as there were IO errors
2024/09/16 16:14:53 ERROR : S3 bucket <Backup Path>: not deleting directories as there were IO errors
2024/09/16 16:14:53 ERROR : Attempt 1/3 failed with 3 errors and: corrupted on transfer: md5 hashes differ src(S3 bucket <Source Bucket>) "ca6ea546dd5d53e2737b1bc24b210bf6" vs dst(S3 bucket <Backup Path>) "da025a2de3b05b7602272ad4854c7dc1"

This then repeats two more times after that.

sorry, a bit confused. the command you posted was rclone sync, not rclone mount?

focus on one single file. rclone copy that one single file and post a full debug log.
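
for example, something along these lines, with the placeholders swapped for a real file that failed (the extra flags are optional, just to keep the log small):

rclone copy <SOURCE-BUCKET>/path/to/<File1> <TARGET-BUCKET>/path/to/ -vv --retries 1 --log-file single-file-debug.log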

which provider?

We use rclone mount to mount drives in production. We use these drives for all kinds of things, from project files to asset libraries.

We use rclone sync in our backup script. That script syncs an S3 bucket to another bucket every night. The source buckets are the ones mounted for production as mentioned above; the backup buckets are only used for backups.

The sync is what fails now and then with the aforementioned error.
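
For context, the production mounts look roughly like this (bucket name, mount point and VFS options are simplified from memory, so they may differ slightly):

rclone mount <Default-User>:<production-bucket> /mnt/<drive> --vfs-cache-mode writes --daemon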

I had to fix the errors we had tonight and the night before for production; I can post another error log once it happens again. I just thought I'd post because maybe somebody has already had a similar problem, or I'm doing something fundamentally wrong.

23M.

can you post the full output of
rclone backend features <Backup-User>:

Sure thing, here it is:

	"Name": "local",
	"Root": "/home/s3b-user/Backup-User",
	"String": "Local file system at /home/s3b-user/Backup-User",
	"Precision": 1,
	"Hashes": [
		"md5",
		"sha1",
		"whirlpool",
		"crc32",
		"sha256",
		"dropbox",
		"hidrive",
		"mailru",
		"quickxor"
	],
	"Features": {
		"About": true,
		"BucketBased": false,
		"BucketBasedRootOK": false,
		"CanHaveEmptyDirectories": true,
		"CaseInsensitive": false,
		"ChangeNotify": false,
		"ChunkWriterDoesntSeek": false,
		"CleanUp": false,
		"Command": true,
		"Copy": false,
		"DirCacheFlush": false,
		"DirModTimeUpdatesOnWrite": true,
		"DirMove": true,
		"DirSetModTime": true,
		"Disconnect": false,
		"DuplicateFiles": false,
		"FilterAware": true,
		"GetTier": false,
		"IsLocal": true,
		"ListR": false,
		"MergeDirs": false,
		"MkdirMetadata": true,
		"Move": true,
		"NoMultiThreading": false,
		"OpenChunkWriter": false,
		"OpenWriterAt": true,
		"Overlay": false,
		"PartialUploads": true,
		"PublicLink": false,
		"Purge": false,
		"PutStream": true,
		"PutUnchecked": false,
		"ReadDirMetadata": true,
		"ReadMetadata": true,
		"ReadMimeType": false,
		"ServerSideAcrossConfigs": false,
		"SetTier": false,
		"SetWrapper": false,
		"Shutdown": false,
		"SlowHash": true,
		"SlowModTime": false,
		"UnWrap": false,
		"UserDirMetadata": true,
		"UserInfo": false,
		"UserMetadata": true,
		"WrapFs": false,
		"WriteDirMetadata": true,
		"WriteDirSetModTime": true,
		"WriteMetadata": true,
		"WriteMimeType": false
	},
	"MetadataInfo": {
		"System": {
			"atime": {
				"Help": "Time of last access",
				"Type": "RFC 3339",
				"Example": "2006-01-02T15:04:05.999999999Z07:00",
				"ReadOnly": false
			},
			"btime": {
				"Help": "Time of file birth (creation)",
				"Type": "RFC 3339",
				"Example": "2006-01-02T15:04:05.999999999Z07:00",
				"ReadOnly": false
			},
			"gid": {
				"Help": "Group ID of owner",
				"Type": "decimal number",
				"Example": "500",
				"ReadOnly": false
			},
			"mode": {
				"Help": "File type and mode",
				"Type": "octal, unix style",
				"Example": "0100664",
				"ReadOnly": false
			},
			"mtime": {
				"Help": "Time of last modification",
				"Type": "RFC 3339",
				"Example": "2006-01-02T15:04:05.999999999Z07:00",
				"ReadOnly": false
			},
			"rdev": {
				"Help": "Device ID (if special file)",
				"Type": "hexadecimal",
				"Example": "1abc",
				"ReadOnly": false
			},
			"uid": {
				"Help": "User ID of owner",
				"Type": "decimal number",
				"Example": "500",
				"ReadOnly": false
			}
		},
		"Help": "Depending on which OS is in use the local backend may return only some\nof the system metadata. Setting system metadata is supported on all\nOSes but setting user metadata is only supported on linux, freebsd,\nnetbsd, macOS and Solaris. It is **not** supported on Windows yet\n([see pkg/attrs#47](https://github.com/pkg/xattr/issues/47)).\n\nUser metadata is stored as extended attributes (which may not be\nsupported by all file systems) under the \"user.*\" prefix.\n\nMetadata is supported on files and directories.\n"

that is for the local file system, we need to see the output for one of the remotes you created.

is that the real name of the remote or not?
need to run the command using the real name of the remote, not a fake name, ok?

this time, post the command and the output of the command, ok?
note: if you want, before posting, it is ok to redact the real names.

Well I redacted the real name but of course I ran the command with the real remote name.

But it seems something went wrong anyway. I'll investigate and post again once I've fixed that issue.

Thank you!

check the command that i shared with you

i think you left out the : on the remote name.
without the :, rclone thinks you are asking about the local file system.
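
to illustrate the difference (same redacted remote name as before):

rclone backend features <Backup-User>     # no colon: rclone treats this as a local path called <Backup-User>
rclone backend features <Backup-User>:    # with colon: rclone queries the S3 remote from your config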

Got it, thank you very much for your patience.

Here's the output:

{
	"Name": "<Backup-User>",
	"Root": "",
	"String": "S3 root",
	"Precision": 1,
	"Hashes": [
		"md5"
	],
	"Features": {
		"About": false,
		"BucketBased": true,
		"BucketBasedRootOK": true,
		"CanHaveEmptyDirectories": false,
		"CaseInsensitive": false,
		"ChangeNotify": false,
		"ChunkWriterDoesntSeek": false,
		"CleanUp": true,
		"Command": true,
		"Copy": true,
		"DirCacheFlush": false,
		"DirModTimeUpdatesOnWrite": false,
		"DirMove": false,
		"DirSetModTime": false,
		"Disconnect": false,
		"DuplicateFiles": false,
		"FilterAware": false,
		"GetTier": true,
		"IsLocal": false,
		"ListR": true,
		"MergeDirs": false,
		"MkdirMetadata": false,
		"Move": false,
		"NoMultiThreading": false,
		"OpenChunkWriter": true,
		"OpenWriterAt": false,
		"Overlay": false,
		"PartialUploads": false,
		"PublicLink": true,
		"Purge": true,
		"PutStream": true,
		"PutUnchecked": false,
		"ReadDirMetadata": false,
		"ReadMetadata": true,
		"ReadMimeType": true,
		"ServerSideAcrossConfigs": false,
		"SetTier": true,
		"SetWrapper": false,
		"Shutdown": false,
		"SlowHash": false,
		"SlowModTime": true,
		"UnWrap": false,
		"UserDirMetadata": false,
		"UserInfo": false,
		"UserMetadata": true,
		"WrapFs": false,
		"WriteDirMetadata": false,
		"WriteDirSetModTime": false,
		"WriteMetadata": true,
		"WriteMimeType": true
	},
	"MetadataInfo": {
		"System": {
			"btime": {
				"Help": "Time of file birth (creation) read from Last-Modified header",
				"Type": "RFC 3339",
				"Example": "2006-01-02T15:04:05.999999999Z07:00",
				"ReadOnly": true
			},
			"cache-control": {
				"Help": "Cache-Control header",
				"Type": "string",
				"Example": "no-cache",
				"ReadOnly": false
			},
			"content-disposition": {
				"Help": "Content-Disposition header",
				"Type": "string",
				"Example": "inline",
				"ReadOnly": false
			},
			"content-encoding": {
				"Help": "Content-Encoding header",
				"Type": "string",
				"Example": "gzip",
				"ReadOnly": false
			},
			"content-language": {
				"Help": "Content-Language header",
				"Type": "string",
				"Example": "en-US",
				"ReadOnly": false
			},
			"content-type": {
				"Help": "Content-Type header",
				"Type": "string",
				"Example": "text/plain",
				"ReadOnly": false
			},
			"mtime": {
				"Help": "Time of last modification, read from rclone metadata",
				"Type": "RFC 3339",
				"Example": "2006-01-02T15:04:05.999999999Z07:00",
				"ReadOnly": false
			},
			"tier": {
				"Help": "Tier of the object",
				"Type": "string",
				"Example": "GLACIER",
				"ReadOnly": true
			}
		},
		"Help": "User metadata is stored as x-amz-meta- keys. S3 metadata keys are case insensitive and are always returned in lower case."
	}
}

are the source and dest the same provider, just different buckets or what?

can you do that, with as few redactions as possible?

which file has the correct md5, source or dest?
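
if you want to check, something like this should tell you (file path is a placeholder; --download hashes the actual object data instead of trusting the hash stored on the remote):

rclone hashsum MD5 <SOURCE-BUCKET>/path/to/<File1>
rclone hashsum MD5 <TARGET-BUCKET>/path/to/<File1>
rclone hashsum MD5 --download <SOURCE-BUCKET>/path/to/<File1>    # hash computed from the downloaded data
rclone hashsum MD5 --download <TARGET-BUCKET>/path/to/<File1>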