Rclone sync failing with XMLs


#1

I set up rclone on a headless CentOS 7 machine to sync to an S3-compatible storage system. Running rclone sync with the -vv option, I noticed that most of the XML copying is failing with a SignatureDoesNotMatch error (see below)!

Any recommendations?

2018/12/27 14:03:41 INFO : 1080i 59.94/Mac Pro 4/WaveformCache/WaveformCache_262144x.awf: Copied (new)
2018/12/27 14:03:41 INFO : 1080i 59.94/NBCZ800MC892/WaveformCache/WaveformCache_65536x.awf: Copied (new)
2018/12/27 14:03:41 INFO : 1080i 59.94/Mac Pro 4/WaveformCache/WaveformCache_65536x.awf: Copied (new)
2018/12/27 14:03:41 INFO : 1080i 59.94/SG-Z840-BASE/SearchData/SearchDB: Copied (new)
2018/12/27 14:03:41 ERROR : 1080i 59.94/Z440-IMAGE/1080i 59.94 Settings.xml: Failed to copy: SignatureDoesNotMatch: The request signature we calculated does not match the signature you provided. Check your AWS Secret Access Key and signing method. For more information, see REST Authentication and SOAP Authentication for details.
status code: 403, request id: be7e97cd-e163-15e9-8561-d8c49756f210, host id:
2018/12/27 14:03:41 INFO : 1080i 59.94/PROTOOLS001/WaveformCache/WaveformCache_16384x.awf: Copied (new)


#2

Which s3 storage are you using?

Can you try the latest release and, if that doesn’t work, the latest beta?


#3

It’s an on-premise S3-compliant storage system called Cloudian. I’m running rclone 1.45. Very strange: everything syncs with no issues except for XMLs!


#4

Same behavior with the new beta version! I tried changing the authentication from V4 to V2 with no change.


#5

That is odd indeed! v2 vs v4 auth can cause this problem but you’ve tried that.

You could also try --s3-force-path-style which may help.

I note there are spaces in the xml file names which might also be the problem so it might be worth trying to upload some different files with spaces in, eg “hello world.txt”

I suspect it might be a bug in Cloudian (rclone has found a whole raft of bugs in s3 compatible interfaces!).

Can you make a log with -vv --dump bodies uploading a very small XML file - that would be very interesting and might shed some light on matters.


#6

Hi ncw
Forcing the path style didn’t help either!
It doesn’t have an issue with copying files that have spaces. I tried syncing/copying files with spaces, and all the other non-XML files with spaces in their names copy with no issues.
I created a log file with -vv --dump bodies for more details on the issue.

Thanks a lot for your help
dump bodies log


#7

What it looks like to me is that Cloudian is somehow interpreting the XML upload as an XML request.

You could try using a different tool to upload that XML file, say Cyberduck, and see if that works?


#8

I used S3 Browse and it uploaded with no issues!
I’ll enable debug logging on Cloudian for more logs, but I created a small XML file and rclone failed to move it as well!


#9

Interesting…

What happens if you rename that .xml file to a .txt file say and try to upload that?


#11

Nothing changes, same behavior.
I’ll get the debugger going on Cloudian to see how S3 Browse uploads the files with no issues.


#12

Hi ncw
After getting the debugger going on the Cloudian storage, I believe I know what’s causing the issue. rclone is specifying the character set, which is causing the authentication mismatch failures. I tried the same XML with S3 Browse and s3cmd; both uploads were successful and the logs didn’t show a character-set flag.
I set the same XML to an unknown type (.bk) and rclone uploaded it with no issues (the logs didn’t show a character set).
I’m not sure if this is correct S3 behavior, nor do I know if Amazon supports it, but Cloudian is built completely off Amazon (what works for Amazon works for Cloudian).
I started a case with Cloudian, but are there any flags in rclone to prevent it from setting the character set?


#13

Do you mean the Content-Type field? Can you paste a good one vs a bad one? (I can’t see your logs any more).

You can see what Content-Type rclone would use if you do rclone lsjson /path/to/file.xml

The content types are configurable in your OS probably in /etc/mime.types


#14

{"Path":"Mick 8.6 Settings.xml","Name":"Mick 8.6 Settings.xml","Size":2647481,"MimeType":"text/xml; charset=utf-8","ModTime":"2017-01-19T11:29:13-05:00","IsDir":false},

The charset=utf-8 is causing the signature mismatch.

{"Path":"Mick 8.6 Settings - Copy.bk","Name":"Mick 8.6 Settings - Copy.bk","Size":2647481,"MimeType":"application/octet-stream","ModTime":"2017-01-19T11:29:13-05:00","IsDir":false},

Same XML renamed to an unknown file type copied with no issues.


#15

This is where it comes from: https://golang.org/pkg/mime/#TypeByExtension

Text types have the charset parameter set to “utf-8” by default.

I made a version of rclone which strips “; charset=utf-8” from mime types. This isn’t something I want to merge, but you can give it a go to see if it fixes the problem!
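For illustration only, here is a minimal Go sketch of what "stripping the charset" could look like. It is not the code from the experimental rclone build; `stripCharset` is a hypothetical helper name, and the approach (parse the media type, drop the `charset` parameter, re-format) is an assumption about one reasonable way to do it with the standard `mime` package.

```go
package main

import (
	"fmt"
	"mime"
)

// stripCharset is a hypothetical helper (not rclone's actual code).
// It removes the "charset" parameter from a Content-Type value and
// returns the input unchanged if it cannot be parsed.
func stripCharset(contentType string) string {
	mediaType, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return contentType
	}
	delete(params, "charset")
	return mime.FormatMediaType(mediaType, params)
}

func main() {
	// What TypeByExtension returns depends on the system's MIME tables
	// (for example /etc/mime.types), so no fixed output is assumed here.
	detected := mime.TypeByExtension(".xml")
	fmt.Printf("detected: %q\n", detected)
	fmt.Printf("stripped: %q\n", stripCharset(detected))

	// The value reported by rclone lsjson earlier in this thread.
	fmt.Println(stripCharset("text/xml; charset=utf-8"))
}
```

Types without a charset, such as application/octet-stream, pass through unchanged, which matches the observation that the renamed .bk file uploaded fine.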

https://beta.rclone.org/branch/v1.45-064-ga5863650-experiment-no-charset-beta/ (uploaded in 15-30 mins)

My suspicion is it is either the space or the ; that Cloudian doesn’t like - I suspect it is expecting them URL encoded whereas rclone is sending them in the clear (which is correct I think).
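To show why a small difference in how the Content-Type value is treated would be enough to cause SignatureDoesNotMatch, here is a deliberately simplified Go sketch. It is not the full AWS Signature Version 4 algorithm: it only HMACs a single canonical header line, the key is a placeholder, and the "server" variant (space removed) is just one hypothetical way a server might normalise the header differently from the client.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign returns the hex-encoded HMAC-SHA256 of a canonical string.
// Real SigV4 signs a much larger canonical request with a derived key;
// this is only a stand-in to show sensitivity to the input bytes.
func sign(key []byte, canonical string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(canonical))
	return hex.EncodeToString(mac.Sum(nil))
}

func main() {
	key := []byte("example-secret") // placeholder, not a real credential

	// What the client signed: the header value exactly as sent.
	client := sign(key, "content-type:text/xml; charset=utf-8\n")

	// Hypothetical server-side view with the space dropped.
	server := sign(key, "content-type:text/xml;charset=utf-8\n")

	fmt.Println("client:", client)
	fmt.Println("server:", server)
	fmt.Println("match: ", client == server)
}
```

Because content-type is one of the signed headers, any byte-level disagreement about its value between client and server changes the computed signature, even though the credentials are correct.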


#16

I’m getting a 404 Not Found error on any upload (XML or non-XML)! Looking at the dump bodies output, the PUT request carries the options as query parameters and not headers!

PUT /projects-bkup/SG05-CS_PROJECTS/Avid%20User%20Settings/Mick%208.6%20Settings.xml?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=009975676cca1515ffbb%2F20190108%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20190108T205247Z&X-Amz-Expires=900&X-Amz-SignedHeaders=content-md5%3Bcontent-type%3Bhost%3Bx-amz-acl%3Bx-amz-meta-mtime&X-Amz-Signature=f4a2ab1cf538cf980c67a0f1f166d6aa146126d765f882b5806deca91ae01516 HTTP/1.1


#17

OK, that is a consequence of a different change… If you use --s3-upload-cutoff 0 then it will use the old upload method…


#18

Thanks, that worked. I really appreciate it.
The --s3-upload-cutoff 0 option doesn’t seem to be documented: I could tell that you had changed something, but I couldn’t find any documentation about it.
Do you think you can roll the MIME type change into an official build?


#19

Great. It is a slight worry that you needed to use it though. I’ve changed the single part upload method to be more efficient which works well with s3/digital ocean/ceph/minio but maybe doesn’t work for all s3 “compatible” solutions.

That is still in the “beta” - the docs will be in sync come the release. You can look at tip.rclone.org for a sneak preview: https://tip.rclone.org/s3/#multipart-uploads

I assume you are saying that v1.45-064-ga5863650-experiment-no-charset-beta/ worked OK for you?

I don’t want to merge that - it was just an experiment to prove that your suspicions are correct.

I think you should be reporting the issue to Cloudian probably!


#20

Agreed. I did report it to them as a bug but haven’t heard back yet.
Thanks again for all the time and effort you put in.


#21

Let me know what happens. I could put some workaround flags into rclone, but I’d rather not if they will produce a fix for you!