--checksum error but MD5 success?

What is the problem you are having with rclone?

I duplicated my Google Drive to another domain. I did this via Team Drives.

Both drives now have the exact same # of files and exact same bytes. I want to check the file integrity to make sure it was a bit-for-bit transfer.

However, when I run rclone sync gd1: gd2: --checksum --progress --log-file=gd1backup.txt -v I get the error that

Google drive root '': --checksum is in use but the source and destination have no hashes in common; falling back to --size-only

However, in the log every file that is checked has this notation:

Size and MD5 of src and dst objects identical

What does this mean?

What is your rclone version (output from rclone version)

rclone v1.52.1

  • os/arch: darwin/amd64
  • go version: go1.14.3

Which OS you are using and how many bits (eg Windows 7, 64 bit)

OS X

Which cloud storage system are you using? (eg Google Drive)

Google Drive

I'm guessing gd1 and gd2 are crypts? Crypts won't have checksums. And the checksums would be different if the files were copied at the crypt level, because they would have different nonces. The best way to duplicate a crypt is to copy it at the underlying drive level. Then the files keep the same nonces and you can compare checksums.

They are not crypts which is why it is weird. I copied the files with folderclone.

What happens if you try this?

rclone check gd1: gd2: 

That is a better way of checking all the checksums.
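
For reference, here is a minimal sketch of two variations on that check, wrapped in shell functions so the remote names are parameters. The function names are made up for illustration. --one-way and --download are existing rclone check flags, but it is worth confirming them with rclone check --help on your version.

```shell
# verify_copy SRC DST
# Compare sizes and hashes without transferring any data.
# --one-way only requires that every source file exists on the destination.
verify_copy() {
    rclone check "$1" "$2" --one-way
}

# verify_copy_download SRC DST
# Compare by downloading the data from both sides instead of trusting
# the hashes reported by the remotes.
verify_copy_download() {
    rclone check "$1" "$2" --download
}
```

For example, verify_copy gd1: gd2: checks that everything in gd1: is present in gd2: with matching hashes. The --download variant is much slower because it reads every file from both sides, but it does not depend on the remotes reporting hashes.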

Can you post the output of rclone backend features gd1: and rclone backend features gd2: then we can check those checksums!

If you could post the log file (or the start of it) with -vv when you did your sync that would be helpful too :)

Interesting. Output is below. I think everything is working correctly. The only issue I have right now is rclone outputting "--checksum is in use but the source and destination have no hashes in common; falling back to --size-only" when in fact it looks like the checksums are correctly identified.

rclone check gd1: gd2:
2020/06/26 09:39:17 ERROR : Movies/.mkv: File not in Google drive root ''
2020/06/26 09:49:57 NOTICE: Google drive root '': 1 files missing
2020/06/26 09:49:57 NOTICE: Google drive root '': 1 differences found
2020/06/26 09:49:57 NOTICE: Google drive root '': 1 hashes could not be checked
2020/06/26 09:49:57 NOTICE: Google drive root '': 119457 matching files
2020/06/26 09:49:57 Failed to check with 2 errors: last error was: 1 differences found

Unfortunately I did not save the original log file for my sync as it became 55MB and it all essentially said the same thing. I reran some of it, here's a sample:

2020/06/26 09:58:09 NOTICE: Google drive root '': --checksum is in use but the source and destination have no hashes in common; falling back to --size-only
2020/06/26 09:58:09 DEBUG : Games/readme.docx: Size of src and dst objects identical
2020/06/26 09:58:09 DEBUG : Games/readme.docx: Unchanged skipping
2020/06/26 09:58:09 DEBUG : .config/rclone/rclone.conf: MD5 = 3bfdc4d078308aa8d2a1e47c7812e752 OK
2020/06/26 09:58:09 DEBUG : .config/rclone/rclone.conf: Size and MD5 of src and dst objects identical
2020/06/26 09:58:09 DEBUG : .config/rclone/rclone.conf: Unchanged skipping
2020/06/26 09:58:09 DEBUG : Education/RunMe.bat: MD5 = 7dc43494df25bc8597110f25ca066602 OK
2020/06/26 09:58:09 DEBUG : Education/RunMe.bat: Size and MD5 of src and dst objects identical
2020/06/26 09:58:09 DEBUG : Education/RunMe.bat: Unchanged skipping
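
One way to see how much of the sync really fell back to size-only is to count the two kinds of DEBUG lines in the log. This is a rough sketch that assumes the -vv log wording shown in the sample above; summarize_log is a made-up helper name.

```shell
# summarize_log LOGFILE
# Tally how rclone compared each file, based on the DEBUG lines of a -vv log.
summarize_log() {
    log="$1"
    # Files where both the size and the MD5 matched.
    md5=$(grep -c 'Size and MD5 of src and dst objects identical' "$log")
    # Files where only the size was compared.
    size=$(grep -c 'Size of src and dst objects identical' "$log")
    echo "md5_checked=$md5 size_only=$size"
}
```

Run as summarize_log gd1backup.txt on a log written with -vv. On the nine-line sample above it would report md5_checked=2 size_only=1, so most files are being hash-checked despite the notice.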

Here are the backend features

rclone backend features gd1:
{
	"Name": "gd1",
	"Root": "",
	"String": "Google drive root ''",
	"Precision": 1000000,
	"Hashes": [
		"MD5"
	],
	"Features": {
		"About": true,
		"BucketBased": false,
		"BucketBasedRootOK": false,
		"CanHaveEmptyDirectories": true,
		"CaseInsensitive": false,
		"ChangeNotify": true,
		"CleanUp": true,
		"Command": true,
		"Copy": true,
		"DirCacheFlush": true,
		"DirMove": true,
		"Disconnect": false,
		"DuplicateFiles": true,
		"GetTier": false,
		"IsLocal": false,
		"ListR": true,
		"MergeDirs": true,
		"Move": true,
		"OpenWriterAt": false,
		"PublicLink": true,
		"Purge": true,
		"PutStream": true,
		"PutUnchecked": true,
		"ReadMimeType": true,
		"ServerSideAcrossConfigs": true,
		"SetTier": false,
		"SetWrapper": false,
		"UnWrap": false,
		"UserInfo": false,
		"WrapFs": false,
		"WriteMimeType": true
	}
}

rclone backend features gd2:
{
        "Name": "gd2",
        "Root": "",
        "String": "Google drive root ''",
        "Precision": 1000000,
        "Hashes": [
                "MD5"
        ],
        "Features": {
                "About": true,
                "BucketBased": false,
                "BucketBasedRootOK": false,
                "CanHaveEmptyDirectories": true,
                "CaseInsensitive": false,
                "ChangeNotify": true,
                "CleanUp": true,
                "Command": true,
                "Copy": true,
                "DirCacheFlush": true,
                "DirMove": true,
                "Disconnect": false,
                "DuplicateFiles": true,
                "GetTier": false,
                "IsLocal": false,
                "ListR": true,
                "MergeDirs": true,
                "Move": true,
                "OpenWriterAt": false,
                "PublicLink": true,
                "Purge": true,
                "PutStream": true,
                "PutUnchecked": true,
                "ReadMimeType": true,
                "ServerSideAcrossConfigs": true,
                "SetTier": false,
                "SetWrapper": false,
                "UnWrap": false,
                "UserInfo": false,
                "WrapFs": false,
                "WriteMimeType": true
        }
}
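
Both outputs list MD5 under Hashes, which can also be checked mechanically. The sketch below assumes each output was saved to a file (the names gd1.json and gd2.json are hypothetical) and that the JSON is pretty-printed with one value per line, as above. common_hashes is a made-up helper; a real JSON tool would be more robust than this awk script.

```shell
# common_hashes FILE1 FILE2
# Print the hash types that appear in the "Hashes" array of both files.
common_hashes() {
    awk '
        /"Hashes": \[/  { in_list = 1; next }   # start of the Hashes array
        in_list && /\]/ { in_list = 0; next }   # end of the array
        in_list {
            gsub(/[ \t",]/, "")                 # strip whitespace, quotes, commas
            if (FILENAME == ARGV[1]) seen[$0] = 1
            else if ($0 in seen) print
        }
    ' "$1" "$2"
}
```

After rclone backend features gd1: > gd1.json and the same for gd2:, running common_hashes gd1.json gd2.json should print MD5 for the outputs shown here, which is why the "no hashes in common" notice looks wrong.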

That is good!

That is confusing, certainly!

Can you make me a repro of that? I have tried to make it do it locally but I haven't yet.

No problem. Just tell me what you need and I'll do it :)

Sure, just make a simple set of steps I can follow to reproduce the problem here.

Once I've got that then I can fix it :)

Sure. It's below. Maybe it's too simple? lol

  1. set up 2 Google Drive accounts on different domains, gd1 and gd2
  2. add config option for gd1 and gd2: server_side_across_configs = true
  3. share gd1 folders with gd2
  4. run rclone sync gd1: gd2: --checksum --progress --log-file=gd1backup.txt -v
  5. open the log file; at the beginning it says: Google drive root '': --checksum is in use but the source and destination have no hashes in common; falling back to --size-only
  6. in the same log file, each transfer says: Size and MD5 of src and dst objects identical

Bottom line: the log output in steps 5 and 6 is contradictory

I'll have a go with that tomorrow.

I don't have access to two domains, but I'll try two personal accounts.

Thanks

@ncw I know you're super busy with bigger things and probably haven't gotten a chance to test this yet but I want to add a few notes for some larger problems that might be coming from this issue.

  1. I renamed some files in gd1. When I redid the sync from gd1 to gd2, instead of recognizing the rename it completely re-transferred all the renamed files. Is this expected behavior? This happens whether I use --size-only or --checksum

  2. Files that are now duplicates in the destination gd2 (because of a rename from 1) are not removed. I thought sync was supposed to make source and destination identical? Does this change when we do a server-side transfer?

For point 1. Use --track-renames

For point 2. Use rclone dedupe first. The sync algorithm doesn't deal with duplicates yet.

Got it, thanks! Any reason why --track-renames isn't default? From my perspective it should be, esp on sync.

Also just to clarify, for point 2, I'll use rclone dedupe after, not before, because the files are now duplicated in the destination.

It uses more resources - memory mostly. It is a holdover from rsync that it isn't the default. I suspect it could be and no-one would notice!

I would dedupe the source before the transfer. If the destination has dupes then do that too!
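
Putting the two suggestions together, a sketch of the order of operations might look like the following. sync_with_renames is a made-up wrapper name; the flags (--dedupe-mode skip, --track-renames, --checksum, --dry-run) are existing rclone flags, but this exact combination is an assumption rather than something tested in this thread.

```shell
# sync_with_renames SRC DST [extra rclone flags...]
# Dedupe both sides, then sync with rename tracking.
# "--dedupe-mode skip" removes identical duplicates and leaves anything else alone.
# Extra flags (for example --dry-run) are passed to every rclone command.
sync_with_renames() {
    src="$1"; dst="$2"; shift 2
    rclone dedupe --dedupe-mode skip "$src" "$@" &&
    rclone dedupe --dedupe-mode skip "$dst" "$@" &&
    rclone sync "$src" "$dst" --track-renames --checksum -v "$@"
}
```

Running sync_with_renames gd1: gd2: --dry-run first shows what would be removed, moved or copied without changing anything; drop --dry-run once the plan looks right.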
