I've run into an odd problem when copying Microsoft document files to SharePoint: the copy fails because the copied file size doesn't match the source.
It seems to be related to the filename extension, as .txt and .pdf files copy fine. The problem only occurs with .docx or .xlsx files; if I rename the destination file to .doc (or .xls), the copy works perfectly. If I use --ignore-size --ignore-checksum the copy succeeds, but the destination file is bigger than the source, and opening the .docx with 7z shows that an extra folder called [trash] has been added. Not sure where to go with this...
What is the problem you are having with rclone?
Run the command 'rclone version' and share the full output of the command.
rclone v1.74.3
- os/version: Microsoft Windows 11 Pro 24H2 24H2 (64 bit)
- os/kernel: 10.0.26100.8655 (x86_64)
- os/type: windows
- os/arch: amd64
- go/version: go1.26.4
- go/linking: static
- go/tags: cmount
Many thanks for getting in touch, I really appreciate it. That's all very useful information. The thing I find most odd is that it all comes down to the filename extension, almost as if Graph is recognising an Office document and mucking about with it. For example, rclone copyto "D:\sharepoint backup\test.xlsx" "Planetcs:test.xlsx" fails, but rclone copyto "D:\sharepoint backup\test.xlsx" "Planetcs:test.xls" works without error. Using --ignore-size --ignore-checksum does allow the copy to succeed, but the files have been changed in the process, which isn't good. Interestingly, the .docx has to be a real Office file: if I create a .txt file and rename it to .docx, the copy proceeds without error.
For backup use I would be careful with --ignore-size/--ignore-checksum here. It makes the transfer finish, but it also hides exactly the case you noticed: SharePoint/Graph has changed the Office file after upload. If keeping the original bytes matters, I would either store those Office files inside a zip/crypt container, or upload with a neutral extension/name and only restore the real name after download if needed. A quick test is to copy one file up, copy it back to a new local folder, and compare the hashes with the original.
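The round-trip test suggested above could be scripted along these lines. This is only a minimal sketch: the function names `file_digest` and `compare_round_trip` are made up for illustration, and it assumes you have already copied the file up and back down with rclone, so that both the original and the downloaded copy exist as local files.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_round_trip(original, round_tripped):
    """Compare size and hash of the original and the copy downloaded again."""
    original, round_tripped = Path(original), Path(round_tripped)
    size_before = original.stat().st_size
    size_after = round_tripped.stat().st_size
    return {
        "size_original": size_before,
        "size_round_tripped": size_after,
        "same_size": size_before == size_after,
        "same_hash": file_digest(original) == file_digest(round_tripped),
    }

# Example call (both paths are hypothetical):
# print(compare_round_trip(r"D:\sharepoint backup\test.xlsx", r"D:\roundtrip\test.xlsx"))
```

If `same_size` or `same_hash` comes back False for an Office file but True for the same file uploaded under a neutral extension, that points at the server-side modification described in this thread rather than a transfer error.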
Yes, I totally agree. I'm not comfortable using --ignore-size/--ignore-checksum: it does 'fix' the problem, but it could allow real corruption to go unchecked. Unpacking the .docx file, I can see that the copied version has gained a [trash] folder containing a single 3k file called 0000.dat, and the customXml folder also gets slightly changed. It would be great if you could just turn this 'feature' off...
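Since .docx and .xlsx files are zip archives, the kind of comparison described above can also be done without unpacking by hand. Here is a rough sketch using Python's standard zipfile module; the helper names `zip_members` and `diff_office_files` are hypothetical, and it assumes the modified file is still a zip archive that zipfile can read (7z could open it, so that seems likely, but I have not confirmed it).

```python
import zipfile


def zip_members(path):
    """Map each member name in a zip archive to (uncompressed size, CRC-32)."""
    with zipfile.ZipFile(path) as archive:
        return {info.filename: (info.file_size, info.CRC) for info in archive.infolist()}


def diff_office_files(original, copied):
    """List members that were added, removed, or changed between two archives."""
    before = zip_members(original)
    after = zip_members(copied)
    common = set(before) & set(after)
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "changed": sorted(name for name in common if before[name] != after[name]),
    }

# Example call (both paths are hypothetical):
# print(diff_office_files("test_original.docx", "test_from_sharepoint.docx"))
```

For the file described above, you would expect the [trash] entry to show up under "added" and the customXml parts under "changed".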
Not sure of your exact setup, but it looks like you have a OneDrive account.
So, for backups, create another OneDrive remote for that same account and use that.
OneDrive will not muck up the file, so rclone can verify the file transfer using checksums and size.
Thanks for sharing this discussion. Problems copying DOCX files to SharePoint can be tricky because the issue isn't always the file itself. It could be related to SharePoint restrictions, filename or path length limits, permissions, or even how the transfer tool handles metadata during the upload. Verifying whether other file types upload successfully is often a good way to narrow down the root cause.
I have a question for anyone who has dealt with this before: Did the issue affect only specific DOCX files, or was it happening with every Word document? Also, were you able to resolve it by changing any SharePoint settings or rclone options, or did the files themselves require modification before they could be copied successfully?
During upload, Microsoft modifies Office files. That would apply to every document.
That is why rclone cannot verify file transfers using size and checksum.
This has been a known issue in the forum for many years, affecting many users.
If there were a solution, it would have been mentioned in the forum and the rclone documentation.
Not sure about your setup and goals, but here is one potential workaround, as I mentioned above:
copy files to OneDrive, not SharePoint
run something like rclone mount onedrive: x: pointing to OneDrive
now you can open, edit, save, and create new Office documents and Microsoft will not muck them up.