Check command gives unreliable results

Hi all,

I seem to have found an unreliability on the check command with a b2 remote. If I call the command from the backup root folder (78xSfeQ3AvFBktnp7DqiZg==/), it finds a number of differences between the files according to their sha1 sum. Ignore the directory/file names, I’m using gocryptfs to encrypt files in my local drive:

[…]
2017/02/11 21:39:17 78xSfeQ3AvFBktnp7DqiZg==/odBS5PWKtpepYDnGhdgzBw==/TGGKb9R74-LI_T2tTpyXtKXg8a6O54KoSRLyPoiQJoA=/KCcKJIBY5Hs-lU3VLTqR_A==/RhHL6MH8dZ8xY4ShZiTblg==: SHA-1 differ
2017/02/11 21:46:30 B2 bucket path : 202 differences found
2017/02/11 21:46:30 Failed to check: 202 differences found

(I’m just showing the last failing file for reference)

However, if I check just that particular folder (78xSfeQ3AvFBktnp7DqiZg==/odBS5PWKtpepYDnGhdgzBw==/TGGKb9R74-LI_T2tTpyXtKXg8a6O54KoSRLyPoiQJoA=/KCcKJIBY5Hs-lU3VLTqR_A==/) I obtain a totally fine check:

rclone check 78xSfeQ3AvFBktnp7DqiZg==/odBS5PWKtpepYDnGhdgzBw==/TGGKb9R74-LI_T2tTpyXtKXg8a6O54KoSRLyPoiQJoA=/KCcKJIBY5Hs-lU3VLTqR_A== remote:bucket/78xSfeQ3AvFBktnp7DqiZg==/odBS5PWKtpepYDnGhdgzBw==/TGGKb9R74-LI_T2tTpyXtKXg8a6O54KoSRLyPoiQJoA=/KCcKJIBY5Hs-lU3VLTqR_A==
[…]: 0 differences found

Also, I’ve tried obtaining the sha1sums of the local file and the remote file and they are exactly the same, so it seems that when traversing large lists of files, rclone check functionality acts weirdly:

rclone sha1sum remote:bucket/78xSfeQ3AvFBktnp7DqiZg==/odBS5PWKtpepYDnGhdgzBw==/TGGKb9R74-LI_T2tTpyXtKXg8a6O54KoSRLyPoiQJoA=/KCcKJIBY5Hs-lU3VLTqR_A==/RhHL6MH8dZ8xY4ShZiTblg==
SHA1: 96098182b85689792a77a37f06add629e8fe533b

sha1sum 78xSfeQ3AvFBktnp7DqiZg==/odBS5PWKtpepYDnGhdgzBw==/TGGKb9R74-LI_T2tTpyXtKXg8a6O54KoSRLyPoiQJoA=/KCcKJIBY5Hs-lU3VLTqR_A==/RhHL6MH8dZ8xY4ShZiTblg==
SHA1: 96098182b85689792a77a37f06add629e8fe533b

I have run the check command several times and obtained the exact same results, always 202 differences and the same set of files (or a big overlap, I didn’t check them all). Anyone else have noticed this behaviour?

Thanks

That is strange!

Can you reproduce with the latest beta?

I suspect it is B2 weirdness rather than rclone weirdness, but maybe that is wishful thinking :wink:

I have also noticed that check (on my encrypted GDrive) said that all files matched. But running a subsequent sync (with exact same source and destination) immediately after the check resulted in the sync copying over a bunch of my files to the encrypted GDrive.

I know that checksums don’t work on crypt but there must be some weirdness with file dates or something, because the files it uploaded were definitely not modified any time recently.

I wish I had more info or more specific data to provide on this. I only started using rclone a week or so ago.

Can you try b2 with the latest beta please?

If you can still get the problem, then can you run the check with -vv --dump-bodies --logfile b2.log and post the log somewhere so I can look at it? You might want to check the log to make sure there aren’t any secrets in it.

Sorry, I just use GDrive and ACD, not B2.

(on GDrive) After the check didn’t detect any changes, I ran a sync which did sync over some changes. A subsequent sync no longer updated anything (because nothing had changed since last sync).

So as of right now check and sync say no changes.

Best I can really do as I move forward is always run a check before my sync and log it to see if check ever misses any files that sync ends up picking up.

Looks like B2 is the other guy, the OP :slight_smile:

Sorry, missed that it wasn’t the OP!

If you run with -v it will say exactly why each file got transferred - that should give a clue!

Hi again

I have been able to test with the latest beta, and the same issue can be observed. I also tried to compare the data from the original local folder and a copy downloaded from the server. This check command works fine, so it does not seem to be a problem with filenames, and I guess is related to the remote.

Not sure how to debug this, -v just reports differences in SHA-1 that are not true. Any ideas?

I think this might be related to: https://github.com/ncw/rclone/issues/578

I think the only way to debug this is if you send me the logs.

So can you do the two rclone checks with the latest beta, one over everything from the root and one over the directory that gave an error in the first check but not in the second.

And can you add to those commands -vv --dump-bodies --log-file test1.log

If you could email them to me at nick@craig-wood.com with subject “rclone check-command-gives-unreliable-results logs”

Hi!
well, I run the check command with the -vv --dump-bodies arguments and the relevant part of the output is the following:

{
“action”: “upload”,
“contentLength”: 255732836,
“contentSha1”: null,
“contentType”: “application/octet-stream”,
“fileId”: “4_z63eef2e0f9fdb3985b7d041a_f201145fad8b43667_d20170129_m175845_c001_v0001032_t0009”,
“fileInfo”: {
“large_file_sha1”: “676ae975903b9629f10d2d75cd3db30bf9222724”,
“src_last_modified_millis”: “1382701286000”
},
“fileName”: “crypt/Essentials/78xSfeQ3AvFBktnp7DqiZg==/ns3-Qjw6TQkHS_SZ02Mz_g==/-8APGQQ1tOP4ThKjcSeYQw==/MVqaoIYUUu2Cv_UdMbkLfUx0SnFvFrx6YRdKCaW5PQs=/-KcNDOMvSfghMg7_T-oZaQ==/_BkUNvxqxjLZHl3emoNZCw==”,
“size”: 255732836,
“uploadTimestamp”: 1485712725000
}

As you can see, the contentSha1 field is null, hence the difference in SHA-1 sums. The most interesting part is that this behaviour seems pretty consistent, always the same files fail to match the hash.

Thanks

It is normal for the "contentSha1" to be null for files bigger than 100MB - these will have been uploaded in parts. In these cases the sha1 will be in the metadata "large_file_sha1".

What I’d be really interested in is that exact dump of data, but for

* a file which fails its check test
* the same file when it passed its test

If you could try the latest beta http://beta.rclone.org/v1.35-137-ge2f0fee/ (uploaded in 15-30 mins) this will show the sha1 values that differed (with -vv). If you could report those lines too - ie which sha1 is being calculated wrong - the one from the local fs or the one from b2. If you could try it more than once to see if the values change that would be useful.

The fact that rclone sha1sum always gives the correct sha1sum is maybe indicative that it isn’t a b2 problem but a local problem.

Hi again,

find the relevant lines of the output log using the latest beta below:

{
“action”: “upload”,
“contentLength”: 259549257,
“contentSha1”: null,
“contentType”: “application/octet-stream”,
“fileId”: “4_z63eef2e0f9fdb3985b7d041a_f200f99297b16ae68_d20170129_m184954_c001_v0001025_t0044”,
“fileInfo”: {
“large_file_sha1”: “43a0869b81faeb9fa078dacc930b186f79f588ba”,
“src_last_modified_millis”: “1367647362000”
}
“fileName”: “crypt/Essentials/78xSfeQ3AvFBktnp7DqiZg==/ns3-Qjw6TQkHS_SZ02Mz_g==/-8APGQQ1tOP4ThKjcSeYQw==/oMdN_MU_NRMyqnsZcTMQ0w==/kdb9afgsrIJf9vATFmxrRg==/Yitvt9XNyBaQrmjRSDV4jw==”,
“size”: 259549257,
“uploadTimestamp”: 1485715794000
}

2017/02/23 20:38:56 DEBUG : 78xSfeQ3AvFBktnp7DqiZg==/ns3-Qjw6TQkHS_SZ02Mz_g==/-8APGQQ1tOP4ThKjcSeYQw==/oMdN_MU_NRMyqnsZcTMQ0w==/kdb9afgsrIJf9vATFmxrRg==/Yitvt9XNyBaQrmjRSDV4jw==: SHA-1 = 43a0869b81faeb9fa078dacc930b186f79f588ba (Local file system at [LOCAL_PATH])
2017/02/23 20:38:56 DEBUG : 78xSfeQ3AvFBktnp7DqiZg==/ns3-Qjw6TQkHS_SZ02Mz_g==/-8APGQQ1tOP4ThKjcSeYQw==/oMdN_MU_NRMyqnsZcTMQ0w==/kdb9afgsrIJf9vATFmxrRg==/Yitvt9XNyBaQrmjRSDV4jw==: SHA-1 = 3c8aaea5af7799bd5f1f638fbf62d8ef2d830dd4 (B2 BUCKET)
2017/02/23 20:38:56 ERROR : 78xSfeQ3AvFBktnp7DqiZg==/ns3-Qjw6TQkHS_SZ02Mz_g==/-8APGQQ1tOP4ThKjcSeYQw==/oMdN_MU_NRMyqnsZcTMQ0w==/kdb9afgsrIJf9vATFmxrRg==/Yitvt9XNyBaQrmjRSDV4jw==: SHA-1 differ

These results are a bit puzzling, the first part of the log shows the correct hash, while in the second the hash is different. I also looked for the wrong hash in the log and found another file with exactly that hash:

{
“action”: “upload”,
“contentLength”: 16290057,
“contentSha1”: “3c8aaea5af7799bd5f1f638fbf62d8ef2d830dd4”,
“contentType”: “application/octet-stream”,
“fileId”: “4_z63eef2e0f9fdb3985b7d041a_f111da6b8d49b4cb0_d20170129_m191512_c001_v0001004_t0041”,
“fileInfo”: {
“src_last_modified_millis”: “1405683194000”
},
“fileName”: “crypt/Essentials/78xSfeQ3AvFBktnp7DqiZg==/ns3-Qjw6TQkHS_SZ02Mz_g==/-8APGQQ1tOP4ThKjcSeYQw==/BfRVr_S-HRO6pI6z3580yA==/6goVyhjEyIPEUvU2i7jEsQ==/Ppb6SvllJ3LPUgnnqvUMIw==/Qz_dC9uNRXirw-_i172EOA==”,
“size”: 16290057,
“uploadTimestamp”: 1485717312000
}

Hope this helps to debug this issue!

That is very interesting…

I’m don’t know what is happening, but let’s rule something out.

I’ve compiled a version of rclone with the race detector. It is for linux 64 bit - I hope that will work for you - if not let me know.

http://pub.rclone.org/rclone-v1.35-147-ge48d19f-race.zip

Can you run your failing check without much logging, but put the standard output and standard error to a file

rclone check /path/to/local remote:bucket >log 2>&1

If that comes up with anything which says race or RACE it in then can you post it please!

Hi,

I tried your new “RACE” version of rclone, but it’s having problems processing symlinks. It basically exits with an error on encountering the first symlink with the message “/path/to/XXX is a not a regular file”. This does not happen with the current beta version (rclone v1.35-137-ge2f0feeβ) which is what I was using. This looks like a regression. Any tips on how to make this work?

Do you use --copy-links the command was added here: https://github.com/ncw/rclone/commit/94947f2523a0f137d9b741221c5bf271eed214c9

If you supply this flag then rclone will follow the symlink and copy
the pointed to file or directory.

This flag applies to all commands.

p.s. Also use the latest beta as something was fixed recently: https://github.com/ncw/rclone/commit/5fba9132071b8696cb9ae595f2bf700f2ec62403

Well, the point is I didn’t want to copy the files pointed by symlinks. In previous versions symlinks would generate a Warning and continue, in this “RACE” version I’m getting an error and the process is terminated.

I’ve fixed that problem with the symlinks.

Here is another race enabled version: http://pub.rclone.org/rclone-v1.35-149-g5fba913-race.zip - can you have a go with that please!

Thanks

Nick

Arg, 3rd time lucky - here is a race enabled version for you to try - ignore the previous one I forgot to build in the race detector

http://pub.rclone.org/rclone-v1.35-149-g5fba913-race-2.zip

Just run this race-2 version and got no “RACE” logged in the output. I did not enable verbose logging (-v) while running the command, should I try again with this enabled?

The race if any will appear on stderr - -v shouldn’t affect it.

Did it still give the problem? You might need to try it a few times…

Same problem, 202 files differ and still no RACE in the logs. I’ll run it again a couple of times tomorrow and report back.