Thank you, @ncw, for sharing your thoughts and ideas on this complicated issue.
Your broad view of the data integrity question gave me a good starting point to create a bullet-proof test case and find out which one of the two guns is still smoking.
This was the very first step I took to check whether my backup data was corrupted. The hashes of the local and downloaded files turned out to be the same, which is a clear sign that, most likely, only the metadata in the cloud (the ETag) contains an error.
The following is a brief description of the test scenario that I carried out.
Prerequisites
- Focus on a single file, VTS_01_1.VOB, stored on the FreeBSD NAS.
- Create a twin copy, VTS_01_1dp.VOB, and keep it on the MS Windows PC only.
Sequence of steps
1. Calculate local MD5
1st env: FreeBSD NAS
tools: openssl dgst -MD5, rclone md5sum local:
file: VTS_01_1.VOB
2nd env: MS Windows PC
tool: HashTab
file: VTS_01_1dp.VOB
Step result
Identical for both files, in all environments and with all tools.
27ba7feaa6eaffda8b4c51be0375333d VTS_01_1.VOB
27ba7feaa6eaffda8b4c51be0375333d VTS_01_1dp.VOB
2. Upload to Yandex.Cloud
1st env: FreeBSD NAS
tool: rclone copy local: remote:
file: VTS_01_1.VOB
2nd env: MS Windows PC
tool: WinSCP copy
file: VTS_01_1dp.VOB
Step result
Both files were sent to the cloud from independent sources using independent tools.
3. Get cloud MD5 (ETag object property)
1st env: FreeBSD NAS
tool: rclone md5sum remote:
files: VTS_01_1.VOB, VTS_01_1dp.VOB
2nd env: MS Windows PC
tool: aws s3api list-objects
files: VTS_01_1.VOB, VTS_01_1dp.VOB
Step result
Identical for both files, in independent environments and with independent tools.
36bd5ce679ce97325b9973c6a850a6ac VTS_01_1.VOB
36bd5ce679ce97325b9973c6a850a6ac VTS_01_1dp.VOB
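For anyone repeating step 3, the ETag values can be pulled out of the JSON that aws s3api list-objects prints. This is only a minimal sketch: it assumes the usual S3-style output (a Contents array whose entries carry Key and ETag, with the ETag wrapped in double quotes), the helper name etags_from_listing is hypothetical, and the sample listing is rebuilt by hand from the values reported above rather than captured from a real run.

```python
import json


def etags_from_listing(listing_json):
    """Map object key -> ETag from list-objects JSON, stripping the surrounding quotes."""
    listing = json.loads(listing_json)
    return {
        obj["Key"]: obj["ETag"].strip('"')
        for obj in listing.get("Contents", [])
    }


# Hand-built sample in the assumed output shape, using the values from step 3.
sample = json.dumps({
    "Contents": [
        {"Key": "VTS_01_1.VOB", "ETag": '"36bd5ce679ce97325b9973c6a850a6ac"'},
        {"Key": "VTS_01_1dp.VOB", "ETag": '"36bd5ce679ce97325b9973c6a850a6ac"'},
    ]
})

for key, etag in etags_from_listing(sample).items():
    print(etag, key)
```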
Overall result
Files with identical hashes, uploaded to Yandex.Cloud via independent paths and with independent tools, get an identical but invalid ETag (MD5) object property.
36bd5ce679ce97325b9973c6a850a6ac invalid cloud ETag
27ba7feaa6eaffda8b4c51be0375333d correct MD5
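The comparison behind this result can be sketched in a few lines of Python. The function names (file_md5, classify_etag, etag_matches_file) are hypothetical and not part of rclone or the AWS CLI. On AWS S3, an ETag is the object's MD5 only for plain single-part uploads, while multipart uploads get an ETag of the form hex-N; whether Yandex.Cloud follows exactly the same rules is an assumption here, so the sketch only reports what the ETag looks like and whether it equals the local MD5.

```python
import hashlib
import re


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 of a file, reading it in chunks to keep memory use low."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify_etag(etag):
    """Return 'md5-like' for 32 hex digits, 'multipart' for the hex-N form, else 'other'."""
    value = etag.strip('"').lower()
    if re.fullmatch(r"[0-9a-f]{32}", value):
        return "md5-like"
    if re.fullmatch(r"[0-9a-f]{32}-\d+", value):
        return "multipart"
    return "other"


def etag_matches_file(etag, path):
    """True only when the ETag looks like a plain MD5 and equals the file's MD5."""
    value = etag.strip('"').lower()
    return classify_etag(value) == "md5-like" and value == file_md5(path)


# The two values reported above: the ETag has the shape of a plain MD5,
# yet differs from the locally computed hash.
cloud_etag = "36bd5ce679ce97325b9973c6a850a6ac"
local_md5 = "27ba7feaa6eaffda8b4c51be0375333d"
print(classify_etag(cloud_etag), cloud_etag == local_md5)
```

Since the reported ETag has no -N suffix, a multipart upload would not explain the mismatch under the AWS convention.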
Solution
If I understand the test results correctly, there is no way to use the rclone --checksum option while working with Yandex.Cloud.
The corrupted hash has resulted in corrupted trust in this particular cloud service.
Anyway, my dialogue with the Yandex tech support guys is not yet complete, and I will keep the rclone community informed.