Rclone check - SUM

trying to understand your use-case.

what is lacking with rclone copy to copy the missing files from cloud to local
or
rclone check with one of these flags

--missing-on-dst string   Report all files missing from the destination to this file
--missing-on-src string   Report all files missing from the source to this file

Hi - yes, nothing is "missing" per se, but I need to run rclone check to see what files need to be copied...
using
rclone check C:\Users\Tony\sum2.txt remote:/ --checkfile=md5 --filter-from "C:\Program Files (x86)\rclone\filter-file.txt" --missing-on-src C:\Users\Tony\missing_on_src.txt

there's an entry in the missing_on_src.txt file called:
My Pictures/Tony/Automatic Upload/Tony’s iPhone/2021-02-02 08-53-03.heic
and in my sum file:
1dce89040d9ac5e3bc07d813d7a4eacb Pictures/2021-02-02 08-53-03.heic

if the goal is to find missing files, why take the time to run rclone hashsum to create a static, point-in-time sum2.txt?

this would be a real-time and quicker check.
rclone check remote: --filter-from "C:\Program Files (x86)\rclone\filter-file.txt" --missing-on-src C:\Users\Tony\missing_on_src.txt

Yes, that's what I'm doing, but I haven't been able to get it to successfully FIND anything; it keeps giving the sum file errors shown above. That's why I generated a sum file directly from rclone for a few files, to ensure the syntax was correct, but that produces the same errors!

2021/08/24 17:06:15 NOTICE: sum2.txt: improperly formatted checksum line 1

not sure what you mean?
the last command i shared does not use a sum file, so there would not be checksum errors in the log such as you just posted.
if you want to find missing files, what is the purpose of taking the time to create a checksum file?

I just want to get the above to work, but it won't run without throwing the "improperly formatted checksum line #" error.

never had a problem with check and sum files, so not sure what the problem is...

perhaps re-create the sum2.txt file and try again...

if the problem happens again, need to post the full debug log

Here's the command/debug showing the file exists in the remote:

C:\Users\Tony>rclone md5sum remote:/"My Pictures"/2001/12/ -vv
2021/08/24 19:55:53 DEBUG : rclone: Version "v1.56.0" starting with parameters ["rclone" "md5sum" "remote:/My Pictures/2001/12/" "-vv"]
2021/08/24 19:55:53 DEBUG : Creating backend with remote "remote:/My Pictures/2001/12/"
2021/08/24 19:55:53 DEBUG : Using config file from "C:\\Users\\Tony\\.config\\rclone\\rclone.conf"
2021/08/24 19:55:54 DEBUG : fs cache: renaming cache item "remote:/My Pictures/2001/12/" to be canonical "remote:My Pictures/2001/12"
1f4673e70daaea746df45ae65c2f9c85  2001_12_05_01.jpg
97abfa62d536dc94061171341239975d  2001_12_05_55.jpg
2021/08/24 19:55:59 DEBUG : 3 go routines active

My SUM file (I've tried changing the file path in this sum file to each of the following; all produce the same result):

"My Pictures"/... 
"My Pictures/..."
My Pictures
/My Pictures

SUM File:
97abfa62d536dc94061171341239975d /My Pictures/2001/12/2001_12_05_55.jpg

The rclone check command:
C:\Users\Tony>rclone check P:\scripts\test.csv remote:/"My Pictures"/2001/12/ --checkfile=md5 -vvv
2021/08/24 19:51:27 DEBUG : rclone: Version "v1.56.0" starting with parameters ["rclone" "check" "P:\scripts\test.csv" "remote:/My Pictures/2001/12/" "--checkfile=md5" "-vvv"]
2021/08/24 19:51:27 DEBUG : Creating backend with remote "P:\scripts\test.csv"
2021/08/24 19:51:27 DEBUG : Using config file from "C:\Users\Tony\.config\rclone\rclone.conf"
2021/08/24 19:51:27 DEBUG : fs cache: adding new entry for parent of "P:\scripts\test.csv", "//?/P:/scripts"
2021/08/24 19:51:27 DEBUG : Creating backend with remote "remote:/My Pictures/2001/12/"
2021/08/24 19:51:28 DEBUG : fs cache: renaming cache item "remote:/My Pictures/2001/12/" to be canonical "remote:My Pictures/2001/12"
2021/08/24 19:51:28 NOTICE: test.csv: improperly formatted checksum line 0
2021/08/24 19:51:28 NOTICE: test.csv: improperly formatted checksum line 1
2021/08/24 19:51:28 NOTICE: test.csv: improperly formatted checksum line 2
2021/08/24 19:51:28 NOTICE: test.csv: more warnings suppressed...
2021/08/24 19:51:28 ERROR : 2001_12_05_66 (1).jpg: sum not found
...
2021/08/24 19:51:28 NOTICE: 73 hashes missing
2021/08/24 19:51:28 NOTICE: pcloud root 'My Pictures/2001/12': 73 differences found
2021/08/24 19:51:28 NOTICE: pcloud root 'My Pictures/2001/12': 73 errors while checking
2021/08/24 19:51:28 INFO :
Transferred: 0 / 0 Byte, -, 0 Byte/s, ETA -
Errors: 73 (retrying may help)
Checks: 73 / 73, 100%
Elapsed time: 1.0s

2021/08/24 19:51:28 DEBUG : 3 go routines active
2021/08/24 19:51:28 Failed to check with 73 errors: last error was: 73 differences found

i noticed that to create the sum file, you use rclone md5sum and i use rclone hashsum

rclone check and rclone checksum against the sum files works for me.
including folders and files with space characters in the sum file, no need to quote them....

rclone md5sum is just a short form of rclone hashsum md5

rclone sha1sum is short for rclone hashsum sha1

we provide two short forms because md5 and sha1 are widespread and used most of the time.

94918a22d08f0af48da351e517ad9ae6  NTUSER.DAT{53b39e88-18c4-11ea-a811-000d3aa4692b}.TM.blf
                           ERROR  NTUSER.DAT
6f7d4c882b008493c20d4b590e548de3  _netrc
                           ERROR  ntuser.dat.LOG1
                           ERROR  ntuser.dat.LOG2
6fc234ad3752e1267b34fb12bcd6718b  ntuser.ini
b4f52fc75ea9380f652e0bb17055e1f4  

Windows prohibited reading the contents of NTUSER.DAT and two more files. As rclone could not sum up the contents, it produced the word ERROR instead of the sum. The SUM file writer in rclone blindly streamed the word into the output file.

It's a deficiency in my code. I will fix it by rclone 1.57. For the time being please remove ERROR lines manually.
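
For anyone who would rather not edit the file by hand, here is a minimal sketch of one way to drop those lines with standard tools. It assumes a POSIX shell with awk (on Windows, something like Git Bash or WSL); the function name strip_error_lines and the output file name in the usage line are hypothetical and not part of rclone.

```shell
# Hypothetical helper (not an rclone feature): read a sum file on stdin and
# print every line whose first whitespace-separated field is not the literal
# word ERROR. Real hashes are never "ERROR", so valid lines pass through as-is.
strip_error_lines() {
    awk '$1 != "ERROR"'
}
```

Usage would look like `strip_error_lines < sum2.txt > sum2.clean.txt`, after which the cleaned file can be passed to rclone check in place of the original.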

If I now use that file to try and run rclone check, it complains that there are invalid lines??

rclone check C:\Users\Tony\sum2.txt remote:/ --checkfile=md5 --filter-from "C:\Program Files (x86)\rclone\filter-file.txt"
2021/08/24 17:06:15 NOTICE: sum2.txt: improperly formatted checksum line 1

rclone encountered the line with ERROR and reported it loudly. Valid sum files cannot have ERROR in the sum field. It's a consequence of the above blunder.
Additionally, I can see that the reported line number is zero-based, which is inconvenient for humans. The errors actually appeared on lines 2, 4 and 5 (for NTUSER.DAT et al).
I will fix it by rclone 1.57.
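
Until then, a rough way to locate suspicious lines with human-friendly (1-based) numbers is sketched here. It assumes the md5sum-style layout seen in the rclone md5sum output earlier in the thread (hex hash, one space, then a space or `*`, then the file name) and hex-only hashes; it is not rclone's own parser, and the name find_bad_sum_lines is made up for illustration.

```shell
# Hypothetical helper (not rclone's parser): read a sum file on stdin and print
# "<1-based line number>: <line>" for every line that does NOT look like
#   <hex hash><space><space or *><file name>
# Assumption: hashes are hexadecimal (md5, sha1, ...); Base64 hashes would be
# flagged by this check even if they are valid.
find_bad_sum_lines() {
    awk '!/^[0-9A-Fa-f]+ [ *]./ { print NR ": " $0 }'
}
```

Running `find_bad_sum_lines < sum2.txt` prints nothing when every line has the expected shape. Lines with ERROR in the hash field, or with only a single space between hash and name, show up in the output.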

2021/08/24 17:06:16 ERROR : My Videos/100_1324.MOV: sum not found
2021/08/24 17:06:16 ERROR : My Videos/100_1325.MOV: sum not found

The checker noticed brand new files which weren't in the sum file. That's ok.

@tb582
Please prepend your filters file with:

- /NTUSER.DAT
- /ntuser.dat.*

This workaround makes rclone skip these files without attempting to read them (Windows rejects the read because the files contain secrets and can be read only by system processes), so you avoid the ERROR lines in the sum file and the misleading notice later.

Thanks all for thorough testing.

Thanks! - going to do some more testing of things tomorrow and will attempt with removing ERROR lines and report back.

@tb582

I think I fixed your bug in the beta build v1.56.2-nobadhash01 (a release in ivandeex/rclone on GitHub).

Could you verify?

Thanks - looks good.

@ivandeex I noticed another odd thing: the sum file check is case-sensitive. Not sure if that's necessary? The checksums in the sum file need to be lowercase, as that's what rclone checks against.

hi,

a file can contain uppercase/lowercase as well as unicode characters, therefore so must the sum file.

rclone ls D:\files\check
        0 ╬♫❽_UPPER_lower.txt

rclone md5sum D:\files\check --output-file=sum.txt

rclone check sum.txt D:\files\check --checkfile=md5 -vv
DEBUG : ╬♫❽_UPPER_lower.txt: md5 = d41d8cd98f00b204e9800998ecf8427e OK

if i misunderstood the issue, can you post a detailed example?

I think this is the problem @ivandeex noticed

$ touch empty
$ echo "d41d8cd98f00b204e9800998ecf8427e  empty" > ok
$ rclone md5sum -C ok empty
= empty
2021/10/11 10:51:43 NOTICE: Local file system at /tmp: 0 differences found
2021/10/11 10:51:43 NOTICE: Local file system at /tmp: 1 matching files
$ echo "D41D8CD98F00B204E9800998ECF8427E  empty" > bad
$ rclone md5sum -C bad empty
2021/10/11 10:52:48 ERROR : empty: files differ
* empty
2021/10/11 10:52:48 NOTICE: Local file system at /tmp: 1 differences found
2021/10/11 10:52:48 NOTICE: Local file system at /tmp: 1 errors while checking
2021/10/11 10:52:48 Failed to md5sum: 1 differences found

thanks, i misunderstood the issue.

Well, it was the topic starter who noticed, not me.

tl;dr I admit the problem but I don't have an immediate solution.

Hex-based checksums (md5, sha1, sha256) are in fact case-insensitive.
But Base64 ones aren't.
In the general case a checksum is an opaque string and we must preserve it.
If this gets fixed, the fix will need case-sensitivity info for each particular hash type.
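
In the meantime, if a sum file is known to contain only hex hashes, one possible workaround is to lowercase the hash field before handing the file to rclone. The sketch below assumes a POSIX shell with awk and a file where every hash is hexadecimal; it must not be used on Base64 sums, since those are case-sensitive. The function name lowercase_hex_sums is hypothetical.

```shell
# Hypothetical helper (not part of rclone): read a sum file on stdin and
# lowercase only the leading hex hash of each line. The rest of the line (the
# file name) is left untouched, so names keep their upper/lowercase characters.
# Assumption: all hashes in the file are hexadecimal, never Base64.
lowercase_hex_sums() {
    awk '{
        # Match a run of hex digits at the start of the line, followed by a space.
        if (match($0, /^[0-9A-Fa-f]+ /))
            print tolower(substr($0, 1, RLENGTH)) substr($0, RLENGTH + 1)
        else
            print    # anything else is passed through unchanged
    }'
}
```

With the "bad" file from the example above, `lowercase_hex_sums < bad > fixed` would produce a file with the same line as "ok", which `rclone md5sum -C fixed empty` should then accept.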