I'm trying to run cryptcheck to compare my local and remote for any issues (e.g. bit rot). If any of my local files have changed I can download them from the remote.
My guess is that those files can still be checked? Did you try running the check again, only for those files?
I’ve found that, for example, Google Drive will return roughly 2-5 errors in 200,000 files even when everything is fine. The error will be a Google server request error, something like a 500 (internal server error) or a 503 (service unavailable).
Just running rclone cryptcheck --fast-list --verbose /path/to/files/ encryptedremote:path
In order to get the filenames, just include a log file, then search that log file for errors. You may or may not prefer -vv over -v.
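For example (a rough sketch reusing the paths from above; the log file name is just an example):

rclone cryptcheck --fast-list -vv --log-file=cryptcheck.log /path/to/files/ encryptedremote:path
grep ERROR cryptcheck.log

Failures show up as ERROR lines in the log, so the grep pulls out just the affected filenames.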
That said… it seems maybe you already tried this and are sure this isn’t the issue? If so, sorry.
Any suggestions on how to identify the files that are missing md5sums? I’ve tried adding -v, -vv, -vvv, -vvvv and -vvvvv but it only lists the files that are “OK”.
If you run rclone --fast-list md5sum s3remote:bucket it will show files with and without hashes. The ones missing hashes will be blank. Run that on the underlying remote, not on the crypt remote.
Run rclone md5sum to get a list of files that are missing an md5sum hash (must be run against the non-crypt remote). Note the names of the files that are missing an md5sum.
rclone md5sum --fast-list remote:path
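Since files missing a hash show up with a blank hash column, one way to pull out just those names is to filter out the lines that start with a 32-character hex MD5 (a sketch; the output file name is just an example):

rclone md5sum --fast-list remote:path | grep -vE '^[0-9a-f]{32}' > missing-md5.txt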
Run rclone lsd with --crypt-show-mapping to identify the files that are missing the md5sum hash. I tried piping this to grep and searching for the filename, but rclone doesn’t seem to send the result of --crypt-show-mapping to stdout/stderr?
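If the mapping lines are going to rclone’s log output (stderr) rather than stdout, a plain pipe to grep won’t see them; redirecting stderr into the pipe may be what’s missing. A sketch, with the filename just a placeholder:

rclone ls encryptedremote:path --crypt-show-mapping -v 2>&1 | grep "name-of-file"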
What’s interesting is that I’m finding small files that were uploaded with rclone that are also missing an md5sum hash. I’m planning to try deleting them from the remote and running a sync again to re-upload them. Is there a better way to do this?
The threshold for multipart uploading for S3 is quite small - 5 MB maybe? So you will see quite small files without a hash.
You could add the md5 metadata without re-uploading, but that would require a bit of custom coding... So if you haven't got too many files, I'd just re-upload them.
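If re-uploading, one way to force a re-copy of just the affected files without deleting them first (a sketch; it assumes the decrypted names found via --crypt-show-mapping are collected in a file, one per line, with paths relative to the source root):

rclone copy /path/to/files/ encryptedremote:path --files-from files-to-reupload.txt --ignore-times -v

--ignore-times makes rclone transfer the listed files even though their size and modification time haven’t changed.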
I think I misunderstood. Is the limit on hashes 5 MB, so any file larger than 5 MB will NOT have a hash?
I uploaded 15 files to S3 yesterday and "rclone cryptcheck" is reporting "13 hashes could not be checked". Below is the output of "rclone ls /path/to/files" with the filenames removed. Only the first two files, of size 426 and 43783 bytes, have hashes per "rclone md5sum --fast-list remote:path".
That was correct for rclone 1.39. For rclone 1.40 all files should have a hash if you uploaded them with rclone sync/copy/move. If you uploaded them by copying them via a mount then only the files smaller than 5MB will have a hash - is that what you are doing?
So as of now running “rclone cryptcheck --fast-list --verbose /path/to/files encryptedremote:path/to/files” is expected to return “X hashes could not be checked” for any files over 5 MB?
Is there anything else that I can do to verify the integrity of my local and remote files? My biggest concern is bit rot.
Thank you for clarifying. I’ve been reading over Option “--s3-disable-checksum” #2213 and it’s great to see that you are actively discussing a resolution. Let me know if I can do any testing.
If I were to use the latest beta, would it create issues to do “--s3-chunk-size 5 GB”? I try to avoid syncing any files to S3 that are 5 GB or larger to avoid multipart uploads, but the reality is that I do have some. My goal is for all files in my crypt remote to have hashes and for cryptcheck to succeed.
@ncw with the release of 1.41 I see that “--s3-disable-checksum” was added. I wanted to get your insight on how to use the new flag to address my issue. Do I need to delete all my files that are missing a hash and re-upload them using “--s3-disable-checksum”?