Greetings, everyone.
I would like to (cordially) request some information/help.
As expected, by default, rclone converts Google documents when they are downloaded (e.g., with the copy command) and adds the corresponding extension (docx, pptx...) to the file names [https://rclone.org/drive/#import-export-of-google-documents].
I use the following command to copy my whole Google Drive to my local device: rclone copy gdrive: ~/A-Drive --progress --exclude "Google Photos/**" (sometimes with the --checksum flag).
As someone who wants to be really sure about things, I rely on the check command and its logging-related flags (specifically, --combined, --differ, --error, --missing-on-dst, --missing-on-src) [https://rclone.org/commands/rclone_check/].
NOTE: I omitted these flags from the commands below for the sake of keeping things organized.
I tried 3 "variations" of the check command: 1- with the --download flag; 2- with the --checksum flag (which is redundant in that case, I assume); and 3- with no extra flags.
I also generated logs with --log-file and --log-level INFO.
NOTE: the files generated by the flags --error, --missing-on-dst and --missing-on-src contained zero entries.
=======================================
Using the --download flag (rclone check gdrive: ~/A-Drive --download --progress --exclude "Google Photos/**") led to the following results:
2021/01/21 18:53:23 NOTICE: Local file system at /home/myuser/A-Drive: 153 differences found
2021/01/21 18:53:23 NOTICE: Local file system at /home/myuser/A-Drive: 153 errors while checking
2021/01/21 18:53:23 NOTICE: Local file system at /home/myuser/A-Drive: 4740 matching files
2021/01/21 18:53:23 INFO :
Transferred: 28.781G / 28.781 GBytes, 100%, 6.014 MBytes/s, ETA 0s
Errors: 154 (retrying may help)
Checks: 4893 / 4893, 100%
Transferred: 9786 / 9786, 100%
Elapsed time: 1h22m0.3s
2021/01/21 18:53:23 Failed to check with 154 errors: last error was: 153 differences found
The file generated by the --differ flag contained 153 file names.
So, 154 errors were found, and 153 of them were related to differences, which makes me wonder what the missing error ("number 154", so to speak) is.
Also, even though they were referred to as "errors", the file generated by the --error flag had zero entries, as stated above. Maybe I didn't properly understand the "terminology".
Searching for "*" followed by a space (which indicates different files) in the file generated by the --combined flag fittingly yielded 153 results.
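
Rather than searching for each symbol by hand, the --combined report can be tallied in one pass. The following is a minimal sketch, assuming the one-entry-per-line "symbol, space, path" layout described in the rclone check documentation; the function name tally_combined and the sample paths are made up for illustration.

```python
from collections import Counter

# Leading symbols documented for `rclone check --combined`.
SYMBOLS = {
    "=": "match",
    "-": "missing on source",
    "+": "missing on destination",
    "*": "differ",
    "!": "error",
}


def tally_combined(lines):
    """Count report entries per leading symbol."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\r\n")
        # Each entry looks like "<symbol> <path>"; anything else is skipped.
        if len(line) >= 3 and line[1] == " " and line[0] in SYMBOLS:
            counts[SYMBOLS[line[0]]] += 1
    return counts


# Hypothetical sample entries, not taken from the real report.
sample = [
    "= notes/a.txt\n",
    "* docs/report.docx\n",
    "* docs/budget.xlsx\n",
    "= pics/b.png\n",
]
print(tally_combined(sample))
```

For a real report, the same function can be fed an open file, e.g. tally_combined(open("combined.txt")), where combined.txt stands for whatever path was passed to --combined.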
=======================================
Using the --checksum flag (rclone check gdrive: ~/A-Drive --checksum --progress --exclude "Google Photos/**") led to the following results:
2021/01/21 21:03:16 NOTICE: Local file system at /home/myuser/A-Drive: 0 differences found
2021/01/21 21:03:16 NOTICE: Local file system at /home/myuser/A-Drive: 153 hashes could not be checked
2021/01/21 21:03:16 NOTICE: Local file system at /home/myuser/A-Drive: 4893 matching files
2021/01/21 21:03:16 INFO :
Transferred: 0 / 0 Bytes, -, 0 Bytes/s, ETA -
Checks: 4893 / 4893, 100%
Elapsed time: 8m47.9s
So, unlike the previous command, no "errors" were found, but 153 hashes could not be checked.
Also, no entries were found in the file generated by the --differ flag.
Searching for "*" in the --combined file, as described above, yielded no results.
However, searching for "=" led to 4893 results, which corresponds to the total number of files checked.
As such, even though 153 hashes weren't checked, all files were reported as matching.
The same results were observed without the --checksum flag (rclone check gdrive: ~/A-Drive --progress --exclude "Google Photos/**").
NOTE2: only the command containing the --download flag actually provided a list with the names of the "different" (--differ) files, which amounted to 153. The other variations only mentioned that 153 hashes could not be checked. Since the number is the same (153), I assume that the "different" files are the same ones whose hashes could not be checked.
=======================================
All of those different files were in one of the following formats after copying them to my local drive: docx, xlsx, pptx.
Considering I am not aware of an efficient ("automated") way of comparing all of them, I did some manual work.
As expected, the files I compared were all Google documents (in the cloud) that were converted when copied to the local drive.
Just to be sure, I manually downloaded some files directly from my Google Drive (which got converted to the same aforementioned formats) and compared their hashes with the ones of the corresponding files downloaded and converted by rclone.
For some reason, none of the hashes matched. In other words, a document copied to my local drive with rclone had different hashes than the same document downloaded manually from my drive.
I inspected the log generated by using the -vv flag with the check command (without --download), compared some files and noticed that: a) 153 hashes could not be checked;
b) the files whose hashes could not be checked match the ones listed as "different" when using the --download flag (at least the ones I manually compared).
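
The cross-check in (a) and (b) could be automated along these lines. This is only a sketch under two assumptions that should be verified first: that the output of rclone lsjson -R gdrive: (run with the same --exclude filter) reports Google documents with a Size of -1, and that the paths it prints match the ones written to the --differ file. The function names and the sample data are hypothetical.

```python
import json


def google_doc_paths(lsjson_text):
    """Paths listed with Size -1, assumed here to be Google documents."""
    entries = json.loads(lsjson_text)
    return {
        entry["Path"]
        for entry in entries
        if not entry.get("IsDir") and entry.get("Size") == -1
    }


def unexplained(differ_lines, doc_paths):
    """Entries of the --differ file that are NOT in the Google documents set."""
    differ = {line.rstrip("\r\n") for line in differ_lines}
    differ.discard("")
    return sorted(differ - doc_paths)


# Hypothetical listing and --differ content, for illustration only.
sample_listing = json.dumps([
    {"Path": "docs", "IsDir": True, "Size": -1},
    {"Path": "docs/report.docx", "IsDir": False, "Size": -1},
    {"Path": "docs/budget.xlsx", "IsDir": False, "Size": -1},
    {"Path": "notes/a.txt", "IsDir": False, "Size": 120},
])
sample_differ = ["docs/report.docx\n", "docs/budget.xlsx\n"]

docs = google_doc_paths(sample_listing)
print(unexplained(sample_differ, docs))
```

An empty result would mean every file reported as different is one that the listing marks as a Google document; any path that is printed would deserve a closer look.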
=======================================
The questions:
1- how can I ensure (aside from manually checking) that those 153 files whose hashes could not be checked really are the converted Google Docs documents?
2- when the --download flag is provided to the check command, how are the files compared (e.g. MD5 hashes calculated on the go and compared)?
3- while using the --download flag, rclone warned about "153 errors while checking", but the corresponding --error file contained zero entries. Is this behavior expected?
4- still about the --download flag: the message displayed after completing the operation ("Failed to check with 154 errors: last error was: 153 differences found") states that 154 errors were found, but only accounts for 153 of those. What could the missing error be? Or is the existence of errors counted as "one" error by itself?
5- the results of the check differ according to the use of the --download flag: using it leads to errors being mentioned and to files being listed as different in the corresponding logs; not using it just leads to hashes that "could not be checked", but files are still reported as matching. Is this behavior expected?
6- would setting rclone to get the links to the Google documents (instead of downloading and converting them) avoid this kind of error in the future?
=======================================
Sorry for the long post. I wanted to be as specific as possible. I hope the provided information is enough.
Thanks a lot for your help and attention.
rclone version:
v1.53.2 (latest version is 1.53.4)
os/arch: linux/amd64
go version: go1.15.3
OS:
Ubuntu 20.04 LTS 64-bit
Storage system
Google Drive + local (HDD)