I seem to have found that the check command is unreliable with a b2 remote. If I call the command from the backup root folder (78xSfeQ3AvFBktnp7DqiZg==/), it reports a number of files whose SHA-1 sums differ. Ignore the directory/file names; I’m using gocryptfs to encrypt files on my local drive:
[…]
2017/02/11 21:39:17 78xSfeQ3AvFBktnp7DqiZg==/odBS5PWKtpepYDnGhdgzBw==/TGGKb9R74-LI_T2tTpyXtKXg8a6O54KoSRLyPoiQJoA=/KCcKJIBY5Hs-lU3VLTqR_A==/RhHL6MH8dZ8xY4ShZiTblg==: SHA-1 differ
2017/02/11 21:46:30 B2 bucket path : 202 differences found
2017/02/11 21:46:30 Failed to check: 202 differences found
(I’m just showing the last failing file for reference)
However, if I check just that particular folder (78xSfeQ3AvFBktnp7DqiZg==/odBS5PWKtpepYDnGhdgzBw==/TGGKb9R74-LI_T2tTpyXtKXg8a6O54KoSRLyPoiQJoA=/KCcKJIBY5Hs-lU3VLTqR_A==/) I obtain a totally fine check:
rclone check 78xSfeQ3AvFBktnp7DqiZg==/odBS5PWKtpepYDnGhdgzBw==/TGGKb9R74-LI_T2tTpyXtKXg8a6O54KoSRLyPoiQJoA=/KCcKJIBY5Hs-lU3VLTqR_A== remote:bucket/78xSfeQ3AvFBktnp7DqiZg==/odBS5PWKtpepYDnGhdgzBw==/TGGKb9R74-LI_T2tTpyXtKXg8a6O54KoSRLyPoiQJoA=/KCcKJIBY5Hs-lU3VLTqR_A==
[…]: 0 differences found
Also, I’ve obtained the sha1sums of the local file and the remote file and they are exactly the same, so it seems that rclone check misbehaves when traversing large lists of files.
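For anyone who wants to cross-check a single file independently of rclone, here is a minimal sketch of computing a local SHA-1 with Python's standard library. The function name `sha1_of_file` and the chunk size are my own choices for illustration; the result is meant to be compared by hand against whatever sha1sum the remote reports.

```python
import hashlib


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-1 of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops as soon as read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use flat even for very large files.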
I have run the check command several times and obtained the exact same results: always 202 differences and the same set of files (or a big overlap, I didn’t check them all). Has anyone else noticed this behaviour?
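To verify whether the same files fail on every run rather than eyeballing it, the logs of two runs can be compared. This is only a sketch: it assumes the log lines look like the "SHA-1 differ" line quoted earlier (timestamp, path, then ": SHA-1 differ"), and the function names are hypothetical.

```python
import re

# Assumed line shape, taken from the log excerpt in this thread:
#   2017/02/11 21:39:17 <path>: SHA-1 differ
DIFF_LINE = re.compile(
    r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} (?P<path>.+): SHA-1 differ$"
)


def differing_paths(log_text):
    """Collect the paths that a check log reports as differing."""
    paths = set()
    for line in log_text.splitlines():
        match = DIFF_LINE.match(line.strip())
        if match:
            paths.add(match.group("path"))
    return paths


def compare_runs(first_log, second_log):
    """Split the reported paths into those seen in both runs or only one."""
    first = differing_paths(first_log)
    second = differing_paths(second_log)
    return {
        "both": first & second,
        "only_first": first - second,
        "only_second": second - first,
    }
```

If "only_first" and "only_second" come back empty, the failures are fully repeatable, which is a different kind of bug from one where the failing set shifts between runs.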
I have also noticed that check (on my encrypted GDrive) said that all files matched. But running a sync (with the exact same source and destination) immediately after the check resulted in the sync copying a bunch of my files over to the encrypted GDrive.
I know that checksums don’t work on crypt but there must be some weirdness with file dates or something, because the files it uploaded were definitely not modified any time recently.
I wish I had more info or more specific data to provide on this. I only started using rclone a week or so ago.
If you can still reproduce the problem, can you run the check with -vv --dump-bodies --log-file b2.log and post the log somewhere so I can look at it? You might want to check the log to make sure there aren’t any secrets in it.
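As a rough aid for scrubbing a log before posting it, something like the sketch below could mask the most obvious credentials. It assumes that secrets show up as HTTP `Authorization:` header lines and as `"authorizationToken"` fields in JSON bodies; a real log may contain other sensitive values, so this does not replace reading the log yourself. The function name `redact` is hypothetical.

```python
import re

# Assumption: credentials appear as "Authorization: <value>" header lines...
HEADER = re.compile(r"^(Authorization:[ \t]*)[^\r\n]+", re.IGNORECASE | re.MULTILINE)
# ...and as "authorizationToken": "<value>" fields inside JSON bodies.
JSON_TOKEN = re.compile(r'("authorizationToken"\s*:\s*")[^"]*(")')


def redact(log_text):
    """Mask header and JSON token values, leaving the rest of the log intact."""
    log_text = HEADER.sub(r"\g<1>REDACTED", log_text)
    return JSON_TOKEN.sub(r"\g<1>REDACTED\g<2>", log_text)
```

Only the values are replaced, so the structure of each request and response stays readable for debugging.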
(on GDrive) After the check didn’t detect any changes, I ran a sync which did sync over some changes. A subsequent sync no longer updated anything (because nothing had changed since last sync).
So as of right now check and sync say no changes.
The best I can do going forward is to always run a check before my sync and log it, to see if check ever misses any files that sync ends up picking up.
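That check-then-sync routine could be wrapped in a small script so both logs are always kept. The following is a sketch only: the log file names and function names are made up, and it assumes `rclone` is on the PATH. It relies on the `check` and `sync` subcommands and the `-v` and `--log-file` flags already mentioned in this thread.

```python
import subprocess


def build_commands(source, dest, check_log="check.log", sync_log="sync.log"):
    """Build the argv lists for a logged check followed by a logged sync."""
    check = ["rclone", "check", source, dest, "-v", "--log-file", check_log]
    sync = ["rclone", "sync", source, dest, "-v", "--log-file", sync_log]
    return check, sync


def check_then_sync(source, dest, run=subprocess.run):
    """Run check first, then sync, and return both exit codes.

    The sync runs even if check reports differences, so the two logs
    can be compared afterwards.
    """
    check_cmd, sync_cmd = build_commands(source, dest)
    check_result = run(check_cmd)
    sync_result = run(sync_cmd)
    return check_result.returncode, sync_result.returncode
```

A check exit code of zero followed by a sync log that shows copied files would be exactly the mismatch described above.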
I have been able to test with the latest beta, and the same issue can be observed. I also tried comparing the data in the original local folder with a copy downloaded from the server. That check works fine, so it does not seem to be a problem with filenames, and I guess it is related to the remote.
I’m not sure how to debug this; -v just reports SHA-1 differences that are not real. Any ideas?
I think the only way to debug this is if you send me the logs.
So can you do the two rclone checks with the latest beta, one over everything from the root and one over the directory that gave an error in the first check but not in the second?
And can you add -vv --dump-bodies --log-file test1.log to those commands?
Please email the logs to me at nick@craig-wood.com with the subject “rclone check-command-gives-unreliable-results logs”.
As you can see, the contentSha1 field is null, hence the difference in SHA-1 sums. The most interesting part is that this behaviour seems pretty consistent: the same files always fail to match the hash.
It is normal for the "contentSha1" to be null for files bigger than 100MB - these will have been uploaded in parts. In these cases the sha1 will be in the metadata "large_file_sha1".
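The fallback described here can be sketched as a small lookup. This is not rclone's actual code. It assumes the file entry is a parsed JSON object with a top-level `contentSha1` and a `fileInfo` dictionary holding `large_file_sha1`; treating the string "none" as missing is an extra assumption on my part. The function name is hypothetical.

```python
def effective_sha1(file_entry):
    """Pick the SHA-1 to compare against for a B2 file entry.

    Prefer contentSha1; if it is missing (null, empty or "none"), fall back
    to the large_file_sha1 metadata that multi-part uploads carry.
    """
    sha1 = file_entry.get("contentSha1")
    if sha1 and sha1 != "none":
        return sha1.lower()
    info = file_entry.get("fileInfo") or {}
    large = info.get("large_file_sha1")
    # None means there is no hash at all, so a comparison should be skipped
    # rather than reported as a difference.
    return large.lower() if large else None
```

Returning `None` instead of an empty string makes "no hash available" easy to tell apart from a genuine mismatch.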
What I’d be really interested in is that exact dump of data, but for
* a file which fails its check test
* the same file when it passed its test
If you could try the latest beta http://beta.rclone.org/v1.35-137-ge2f0fee/ (uploaded in 15-30 mins), it will show the sha1 values that differed (with -vv). Could you report those lines too, i.e. which sha1 is being calculated wrong: the one from the local fs or the one from b2? If you could try it more than once to see if the values change, that would be useful.
The fact that rclone sha1sum always gives the correct sha1sum is maybe indicative that it isn’t a b2 problem but a local problem.
These results are a bit puzzling: the first part of the log shows the correct hash, while in the second the hash is different. I also looked for the wrong hash in the log and found another file with exactly that hash.
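Searching for a wrong hash by hand gets tedious across 202 files, so here is a sketch of doing the same lookup in bulk. It assumes you have already extracted (path, sha1) pairs from the -vv log yourself; the log format is not shown in this thread, so the parsing step is deliberately left out. The function name is hypothetical.

```python
from collections import defaultdict


def paths_sharing_hash(pairs):
    """Group paths by reported SHA-1 and keep hashes seen on more than one path.

    `pairs` is an iterable of (path, sha1) tuples extracted from a log.
    """
    by_hash = defaultdict(set)
    for path, sha1 in pairs:
        by_hash[sha1.lower()].add(path)
    return {
        sha1: sorted(paths)
        for sha1, paths in by_hash.items()
        if len(paths) > 1
    }
```

If a wrong hash reported for one file turns out to be the correct hash of another, that suggests the hashes are being attached to the wrong files rather than computed incorrectly.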
I tried your new “RACE” version of rclone, but it’s having problems processing symlinks. It basically exits with an error on encountering the first symlink with the message “/path/to/XXX is a not a regular file”. This does not happen with the current beta version (rclone v1.35-137-ge2f0feeβ) which is what I was using. This looks like a regression. Any tips on how to make this work?
Well, the point is I didn’t want to copy the files pointed to by symlinks. In previous versions symlinks would generate a warning and rclone would continue; in this “RACE” version I’m getting an error and the process is terminated.
I just ran this race-2 version and got no “RACE” logged in the output. I did not enable verbose logging (-v) while running the command. Should I try again with it enabled?