What is the problem you are having with rclone?
MD5 hash mismatch after files are uploaded
What is your rclone version (output from rclone version)
1.52.0, 1.52.1, 1.52.2
Which OS you are using and how many bits (eg Windows 7, 64 bit)
FreeBSD 11.4, 64 bit
Which cloud storage system are you using? (eg Google Drive)
Yandex.Cloud (S3 API)
The command you were trying to run (eg rclone copy /tmp remote:tmp)
/path/to/rclone/rclone --config /path/to/rclone/rclone.conf --log-file /path/to/logs/rclone.log --no-update-modtime --timeout 30m --bwlimit 6M --checksum --exclude-from /path/to/rclone/exl.rule -vv --stats-one-line check /long/path/to/VIDEO_TS yandexcloud:/bucket-name/long/path/to/VIDEO_TS
The rclone config contents with secrets removed.
[yandexcloud]
type = s3
provider = Other
env_auth = false
access_key_id = <...>
secret_access_key = <...>
endpoint = storage.yandexcloud.net
storage_class = STANDARD_IA
A log from the command with the -vv flag
2020/07/09 21:30:34 DEBUG : rclone: Version "v1.52.2" starting with parameters ["/path/to/rclone/rclone" "--config" "/path/to/rclone/rclone.conf" "--log-file" "/path/to/logs/rclone.log" "--no-update-modtime" "--timeout" "30m" "--bwlimit" "6M" "--checksum" "--exclude-from" "/path/to/rclone/exl.rule" "-vv" "--stats-one-line" "check" "/long/path/to/VIDEO_TS" "yandexcloud:/bucket-name/long/path/to/VIDEO_TS"]
2020/07/09 21:30:34 DEBUG : Using config file from "/path/to/rclone/rclone.conf"
2020/07/09 21:30:34 INFO : Starting bandwidth limiter at 6MBytes/s
2020/07/09 21:30:34 DEBUG : fs cache: renaming cache item "yandexcloud:/bucket-name/long/path/to/VIDEO_TS" to be canonical "yandexcloud:bucket-name/long/path/to/VIDEO_TS"
2020/07/09 21:30:34 DEBUG : S3 bucket bucket-name path long/path/to/VIDEO_TS: Waiting for checks to finish
2020/07/09 21:30:34 DEBUG : VIDEO_TS.BUP: MD5 = 9ecacb7b6f6fc7f2080b09ea913ccc26 OK
2020/07/09 21:30:34 DEBUG : VIDEO_TS.BUP: OK
2020/07/09 21:30:34 DEBUG : VIDEO_TS.IFO: MD5 = 9ecacb7b6f6fc7f2080b09ea913ccc26 OK
2020/07/09 21:30:34 DEBUG : VIDEO_TS.IFO: OK
2020/07/09 21:30:34 DEBUG : VIDEO_TS.VOB: MD5 = 741261a71ed8873a90b73374220ea569 OK
2020/07/09 21:30:34 DEBUG : VIDEO_TS.VOB: OK
2020/07/09 21:30:34 DEBUG : VTS_01_0.BUP: MD5 = 384630d9636da1f6a82b4e7093000ddd OK
2020/07/09 21:30:34 DEBUG : VTS_01_0.BUP: OK
2020/07/09 21:30:34 DEBUG : VTS_01_0.IFO: MD5 = 384630d9636da1f6a82b4e7093000ddd OK
2020/07/09 21:30:34 DEBUG : VTS_01_0.IFO: OK
2020/07/09 21:30:35 DEBUG : VTS_01_1.VOB: MD5 = 27ba7feaa6eaffda8b4c51be0375333d (Local file system at /long/path/to/VIDEO_TS)
2020/07/09 21:30:35 DEBUG : VTS_01_1.VOB: MD5 = 36bd5ce679ce97325b9973c6a850a6ac (S3 bucket bucket-name path long/path/to/VIDEO_TS)
2020/07/09 21:30:35 ERROR : VTS_01_1.VOB: MD5 differ
2020/07/09 21:30:36 DEBUG : VTS_02_0.BUP: MD5 = 0575c6db8e20fb8e388062eddfb79e2c OK
2020/07/09 21:30:36 DEBUG : VTS_02_0.BUP: OK
2020/07/09 21:30:36 DEBUG : VTS_02_0.IFO: MD5 = 0575c6db8e20fb8e388062eddfb79e2c OK
2020/07/09 21:30:36 DEBUG : VTS_02_0.IFO: OK
2020/07/09 21:30:37 DEBUG : VTS_02_1.VOB: MD5 = aa8f09e25e1c0326451654de8cc5396a (Local file system at /long/path/to/VIDEO_TS)
2020/07/09 21:30:37 DEBUG : VTS_02_1.VOB: MD5 = 5410878ab81ad276863192eaabb7232d (S3 bucket bucket-name path long/path/to/VIDEO_TS)
2020/07/09 21:30:37 ERROR : VTS_02_1.VOB: MD5 differ
2020/07/09 21:30:37 NOTICE: S3 bucket bucket-name path long/path/to/VIDEO_TS: 2 differences found
2020/07/09 21:30:37 NOTICE: S3 bucket bucket-name path long/path/to/VIDEO_TS: 2 errors while checking
2020/07/09 21:30:37 NOTICE: S3 bucket bucket-name path long/path/to/VIDEO_TS: 7 matching files
2020/07/09 21:30:37 DEBUG : 7 go routines active
2020/07/09 21:30:37 Failed to check with 3 errors: last error was: 2 differences found
Looking through the rclone logs, I found a bunch of unexpected upload records. In short, they look like this:
07 July rclone 1.52.2 - Massive 50 GB upload
30 June rclone 1.52.2 - Massive 50 GB upload
23 June rclone 1.52.1 - Regular small upload
16 June rclone 1.52.1 - Regular small upload
09 June rclone 1.52.0 - Regular small upload
02 June rclone 1.52.0 - Regular small upload
On June 07, to collect more detailed information, I narrowed the uploads down to a single folder (containing a home-made DVD image). The folder holds 9 files in total, and only 2 of them are bigger than 200 MB (the default multipart cutoff).
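To double-check which files in a folder would go through multipart upload, here is a minimal Python sketch. The 200 MiB value and the "at or above the cutoff" comparison reflect my understanding of rclone's S3 defaults and are assumptions; `files_at_or_over_cutoff` is a hypothetical helper name, not part of rclone.

```python
import os

# Assumption: 200 MiB, which I understand to be rclone's default S3 upload cutoff.
UPLOAD_CUTOFF = 200 * 1024 * 1024


def files_at_or_over_cutoff(folder, cutoff=UPLOAD_CUTOFF):
    """Return the sorted names of regular files in `folder` whose size is
    at or above `cutoff` bytes (assumed to be the multipart candidates)."""
    names = []
    for entry in os.scandir(folder):
        # Only look at regular files directly inside the folder.
        if entry.is_file() and entry.stat().st_size >= cutoff:
            names.append(entry.name)
    return sorted(names)
```

Calling `files_at_or_over_cutoff("/long/path/to/VIDEO_TS")` lists the candidates, which can then be compared with the files reported as `MD5 differ` in the log.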
I tried rclone 1.52.2, 1.51.0 and 1.50.0 and got an MD5 mismatch in every case, followed by the files being uploaded.
The latest answer from Yandex.Cloud technical support doesn't shed much light on the root cause. Here is what they said, translated from Russian:
rclone does not use the cloud's MD5 calculation algorithms if they differ from a plain MD5 of the entire file.
That is, for small files uploaded as a simple object, the ETag is used, which is calculated compatibly as the MD5 of the file; for large files, multipart upload is used, and the checksum is calculated on the client side and stored in the metadata.
The problem arises precisely for large files, where the cloud's algorithms are not used. Judging by our investigation, the object metadata is stored correctly.
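To make the distinction in that answer concrete, here is a minimal Python sketch of the three values involved. It assumes the widely used S3 multipart ETag convention (MD5 of the concatenated binary part digests, plus a `-N` part count); whether Yandex.Cloud follows it exactly is an assumption. The 5 MiB part size is my understanding of rclone's default S3 chunk size, and the base64 encoding of the MD5 stored in metadata is likewise my understanding of rclone's behaviour. All function names are hypothetical.

```python
import base64
import hashlib

# Assumption: 5 MiB parts, which I understand to be rclone's default S3 chunk size.
CHUNK_SIZE = 5 * 1024 * 1024


def plain_md5(path):
    """Hex MD5 of the whole file, i.e. what `rclone check` computes locally."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1024 * 1024), b""):
            h.update(block)
    return h.hexdigest()


def multipart_etag(path, chunk_size=CHUNK_SIZE):
    """ETag under the common S3 multipart convention (assumed here):
    MD5 over the concatenated binary MD5s of each part, then '-<part count>'."""
    part_digests = []
    with open(path, "rb") as f:
        for part in iter(lambda: f.read(chunk_size), b""):
            part_digests.append(hashlib.md5(part).digest())
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"


def metadata_md5_to_hex(value):
    """Convert a base64-encoded MD5 (as assumed to be stored in object
    metadata for multipart uploads) to the hex form shown in rclone logs."""
    return base64.b64decode(value).hex()
```

If the hex value from the object metadata differs from `plain_md5` of the local file, the mismatch was already present in what the client stored; if they agree but `rclone check` still reports a difference, the question is which value the remote side is returning.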
The ticket is still open, with a couple of my questions to the tech support guys pending:
- What is the reason for the MD5 mismatch even with older rclone versions? Hashes matched before June 23, even for objects larger than 200 MB, but have failed to match since.
- Before the massive upload of large files on July 7, there was a similar upload on June 30. The multipart explanation does not fit it, since the log shows uploads even of small JPG photos, 3 to 7 MB in size.
I would be very grateful for any hint that helps figure out what actually happened, and in whose court the ball is.