I'm assuming you broke the file into 1 MiB chunks, i.e. each 1048576 bytes long, so chunk 3419 will start at 0x356c00.
The corruption appears to start at 0xd3ff7 and end at 0xd5b0c inclusive, so it is 6934 (0x1b16) bytes long. In absolute terms it starts at 0x42abf7 and ends at 0x42c70c.
If this was an rclone corruption I'd expect it to be one of:
- lined up with the crypt blocks, which are 64 KiB + 16 = 0x10010 bytes long, starting at 0x20. That corruption is in the middle of a block, neither at the start nor the end.
- lined up with the upload block size, which is 320 KiB for OneDrive. The corruption is about 1/3 of the way through a 320 KiB block.
So nothing which points any fingers at rclone internals unless I got my maths wrong! I can't think where a pile of base64 data came from either.
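For anyone who wants to double-check the maths, here is a minimal sketch of the alignment arithmetic in bash. It takes the chunk start 0x356c00 and the relative offsets exactly as quoted above (it does not recompute the chunk start from a chunk size), and the variable names are purely illustrative.

```shell
#!/bin/bash
# Offsets are copied from the post; variable names are illustrative only.
chunk_start=$((0x356c00))   # start of the chunk, taken as given
rel_start=$((0xd3ff7))      # first corrupted byte, relative to the chunk
rel_end=$((0xd5b0c))        # last corrupted byte (inclusive), relative to the chunk

abs_start=$((chunk_start + rel_start))
abs_end=$((chunk_start + rel_end))
length=$((rel_end - rel_start + 1))

crypt_header=$((0x20))      # crypt blocks start after a 0x20 byte header
crypt_block=$((0x10010))    # 64 KiB of data + 16 bytes
upload_block=$((320 * 1024))

# Position of the first corrupted byte inside its crypt block and upload block.
crypt_offset=$(( (abs_start - crypt_header) % crypt_block ))
upload_offset=$(( abs_start % upload_block ))

printf 'absolute range: 0x%x-0x%x (%d bytes)\n' "$abs_start" "$abs_end" "$length"
printf 'offset within crypt block:  %d of %d\n' "$crypt_offset" "$crypt_block"
printf 'offset within upload block: %d of %d\n' "$upload_offset" "$upload_block"
```

This gives 42935 of 65552 for the crypt block and 109559 of 327680 for the upload block, which matches "in the middle of a block" and "about 1/3 of the way through".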
I guess the next thing to do would be to either find an existing bug which seems relevant or report a new one in the OneDrive/onedrive-api-docs issue tracker on GitHub. I've had reasonable success reporting bugs there and Microsoft fixing them. I didn't find a relevant one but there may be one somewhere!
If we are going to report a bug then we need a reasonably reliable way of reproducing, preferably as simple as possible.
What is the smallest file we've seen the corruptions on? 134,217,728 bytes (128 MiB) seems to be it? I've made a script which keeps uploading a file of that size until it breaks - let's see if I can reproduce!
Here is the script if anyone else wants to have a go:
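#!/bin/bash
# Upload a fresh random file of $size bytes, up to 100 times, keeping any file whose upload fails.
size=134217728
destination=TestOneDrive:thrashfiles/
for round in $(seq 100); do
    name="test-${round}-${RANDOM}${RANDOM}${RANDOM}.bin"
    echo
    echo "--------------- $(date -Is) - round $round - $name ------------------"
    echo
    # Write slightly more than $size bytes of random data, then trim to exactly $size.
    dd if=/dev/urandom of="$name" bs=1M count=$((size / 1048576 + 1))
    truncate -s "$size" "$name"
    sha1sum "$name"
    # Allow only one attempt so a failed upload is reported instead of being retried.
    rclone --low-level-retries 1 --retries 1 -vv --dump responses copy "${name}" "${destination}"
    error=$?
    if [ "$error" -ne 0 ]; then
        echo "ERROR $error on $name"
    else
        rm "$name"
        rclone -v deletefile "${destination}${name}"
    fi
    sleep 5
done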
Just wanted to say thank you to everyone looking into this issue.
It's well beyond my expertise at this point so I have nothing to add beyond the fact that I appreciate all the effort being put in here to diagnose a (likely) Microsoft bug.
(Firstly - apologies for the deleted posts above!)
I've also been seeing these types of errors when syncing about 15 x MP4 files from my Linux machine to OneDrive. Each file is between 2.5GB and 3.7GB in size.
I decided to try a few different ways of copying one of the files that had been failing with rsync.
The first attempt was by uploading the file via the OneDrive website (via a Windows 11 machine - where the sha1sum was checked and correct). This seemed to work, but when checking the sha1sum via rclone the files were different. I then downloaded the file back to my Windows 11 machine and the sha1sum was again incorrect (but the same as the rclone sha1sum).
I checked this suspect downloaded MP4 file with ffmpeg, which did indeed find errors. Comparing it with the original file using 'cmp' also found differences. Doing similar checks with hexdump as in a previous post showed a definite difference in the 'look' of the data at the point indicated by cmp.
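For anyone wanting to repeat that kind of comparison, here is a rough sketch of how the extent of the differences could be found with cmp and then inspected with hexdump. The function names diff_extent and dump_at are hypothetical helpers made up for this example, and the sketch only looks at bytes within the common length of the two files.

```shell
#!/bin/bash
# Hypothetical helpers; the names are illustrative, not from the thread.

# Print the zero-based offsets of the first and last differing bytes and
# the number of differing bytes, or say that none were found.
diff_extent() {
    local original=$1 suspect=$2 first last count
    # cmp -l prints one line per differing byte: "<1-based offset> <octal> <octal>".
    read -r first last count < <(
        cmp -l "$original" "$suspect" 2>/dev/null |
            awk 'NR == 1 { first = $1 } { last = $1 } END { if (NR) print first, last, NR }'
    )
    if [ -z "$count" ]; then
        echo "no differing bytes found"
        return 0
    fi
    printf 'first=0x%x last=0x%x differing=%d\n' "$((first - 1))" "$((last - 1))" "$count"
}

# Dump 64 bytes of a file starting at the given zero-based offset.
dump_at() {
    hexdump -C -s "$2" -n 64 "$1"
}
```

Something like `diff_extent original.mp4 downloaded.mp4` followed by `dump_at downloaded.mp4 <offset>` would then show where the data starts to look different (the file names here are placeholders).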
I am now attempting to let the native Windows OneDrive program sync a copy of the file. I'm not seeing any errors, but so far it has not been successful, and it seems to be on its third attempt now.
So to me this definitely looks like a OneDrive issue, and somewhat concerning that I managed to upload a file to OneDrive that seemed to succeed but was in fact corrupt.
Hopefully Microsoft will come back to you on the bug that was raised with them.
Since my test case seems to be useless in the meantime, I did a check on the web upload.
I can confirm that 4 out of 100 generated test files have a different SHA-1.
2022/04/06 20:43:15 ERROR : sewewot7c/mefipuz7ka/qoyureb1ne: sha1 differ
2022/04/06 20:43:19 ERROR : sewewot7c/nubotog9x/lazapa: sha1 differ
2022/04/06 20:43:23 ERROR : sewewot7c/nubotog9x/xacuseb4: sha1 differ
2022/04/06 20:43:23 ERROR : sewewot7c/nubotog9x/luyif: sha1 differ
2022/04/06 20:43:24 NOTICE: One drive root 'testfiles': 4 differences found
2022/04/06 20:43:24 NOTICE: One drive root 'testfiles': 4 errors while checking
2022/04/06 20:43:24 NOTICE: One drive root 'testfiles': 96 matching files
2022/04/06 20:43:24 Failed to check with 4 errors: last error was: 4 differences found
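The log above appears to be the output of rclone check. A test along these lines could be sketched as below, assuming the files are generated locally, uploaded through the OneDrive website by hand, and then compared against the remote. The directory name, remote name, file count and file size in the usage line are assumptions and not the poster's actual setup.

```shell
#!/bin/bash
# Sketch only: directory, remote name and sizes are assumptions.

# Create COUNT files of SIZE bytes of random data in DIR.
make_testfiles() {
    local dir=$1 count=$2 size=$3 i
    mkdir -p "$dir"
    for i in $(seq "$count"); do
        head -c "$size" /dev/urandom > "$dir/test-$i.bin"
    done
}

# Compare the local files with the copies on the remote.
# rclone check compares sizes and hashes and reports any differences.
check_uploads() {
    rclone check "$1" "$2"
}
```

For example: run `make_testfiles testfiles 100 134217728`, upload the testfiles folder via the web interface, then run `check_uploads testfiles TestOneDrive:testfiles`.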
I'm reading through the issue here and on GitHub. @ncw said there:
presumably there are thousands or millions of corrupted files people have uploaded to OneDrive over the period the problem was active. Will Microsoft be issuing a statement and/or contacting affected users?
Am I correct though that if all of my uploads were with rclone (and no flags to disable checks) that I should be fine? I definitely had the issues discussed here but on retries, it eventually worked.
"Three days ago, Nick Craig-Wood, creator of rclone, posted a bug report to the GitHub repo for Microsoft's OneDrive. "Sometimes (maybe one time in 20) multipart uploads of a 128MiB file get corrupted," his post explains."
Agree, it is already mentioned on the GitHub issue, so we just need the log.
@automaton82 I'm not sure how familiar you are with the steps to troubleshoot this issue, so here is a little guide for you and everybody else who sees “corrupted on transfer: SHA-1 hash differ” when copying/syncing to OneDrive, even if the file transfers successfully on a later attempt/retry.
These are the three things we would like to see:
1. An extract of your ordinary or debug log looking something like this:
2022/04/09 05:40:46 ERROR : somefolder/somefile: corrupted on transfer: SHA-1 hash differ "4ed4780309e547e848a6db94b3e563c89ddea7c3" vs "ab5b31734aaad82c6948c4e26b508831db5dd3cd"
2022/04/09 05:40:46 INFO : somefolder/somefile: Removing failed copy
2. Confirmation that you have access to your OneDrive recycle bin via the web interface and can see the file that was deleted in 1., i.e. that the filename and time of deletion match.
Then we will return with (simple) instructions to find the information needed by Microsoft to troubleshoot the issue. We will need the original file and the uploaded file in your OneDrive recycle bin, so please don’t delete, recover or move them.
PS: If your log disappeared (or scrolled away), then we would still like to hear from you if you are able to see the deletion of the corrupt file in your OneDrive recycle bin.
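If the log was saved to a file (for example with rclone's --log-file flag), the extract requested above can be pulled out with grep. This is only a sketch: the helper name extract_corruption and the log file name are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print each "corrupted on transfer" error together
# with the line that follows it (normally the "Removing failed copy" line).
extract_corruption() {
    grep -A 1 'corrupted on transfer: SHA-1 hash differ' "$1"
}
```

Usage would be something like `extract_corruption rclone.log`, where rclone.log is whatever file the log was written to.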