I'm trying to sync a MinIO source to an AWS S3 destination:
rclone sync my_minio_source: my_aws_destination:
This was working perfectly until the MinIO server I am copying from enabled compression.
I have reproduced the error on my own MinIO server to confirm that compression is what triggers it.
The error seems to be caused by AWS calculating an MD5 checksum that is different from the one rclone tells it to expect.
What is the best way to get around this?
I've tried the --s3-disable-checksum flag, which did not change anything.
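For completeness, the full command was something like this (same remotes as above):
rclone sync --s3-disable-checksum my_minio_source: my_aws_destination: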
MinIO compression is a bit funny in that the files are compressed on disk, but when you download them you get the uncompressed version. My suspicion is that this is where the mismatch comes from.
ERROR : site-1v2-phase-a0-traffic-movements.geojson: Failed to copy: s3 upload: 400 Bad Request:
<?xml version="1.0" encoding="UTF-8"?>
<Error>
  <Code>BadDigest</Code>
  <Message>The Content-MD5 you specified did not match what we received.</Message>
  <ExpectedDigest>57d9468c8b2e44f1873c48b2d5e0c9b3</ExpectedDigest>
  <CalculatedDigest>xqmjKJaxsHCLdt8YI9Ia4w==</CalculatedDigest>
  <RequestId>8B0C0A0D253B0166</RequestId>
  <HostId>d7YPtGWTiDe/qLjrmQ4XjbSW4GtYCE/xmVOita0yCXTIITdhLbPRydjSeufiXh3VUZ0mXYkuCxU=</HostId>
</Error>
I have also run the following to confirm the MD5 checksum, where "himas-test-bucket" only contains the uncompressed file:
rclone md5sum local-minio:/himas-test-bucket/
This is MinIO returning the checksum of the compressed file with Content-Encoding: gzip or similar. There are some issues about this sort of thing, but I haven't found a satisfactory solution.
What is happening here is that S3 needs a hash before starting the upload, so rclone asks the source to provide one. If the source can supply a hash, rclone uses it rather than calculating it again.
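If you want to see the hash rclone is reading from the source, you can run something like this (using your test bucket from above):
rclone lsjson --hash local-minio:/himas-test-bucket/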
You could probably fix this by setting --s3-upload-cutoff 0, which will make all files be uploaded with multipart uploads; multipart uploads don't need the hash in advance.
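So something like this (same remotes as in your first post):
rclone sync --s3-upload-cutoff 0 my_minio_source: my_aws_destination: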
Just an update on this: it seems that when I use --s3-upload-cutoff 0, the files that still fail are those smaller than some size x. When I use --s3-upload-cutoff 0 --ignore-checksum, all files are copied. I don't know whether this has side effects I am unaware of, but it seems to have done the trick.
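For anyone who finds this later, the full command that worked for me was along these lines (remotes as in my first post):
rclone sync --s3-upload-cutoff 0 --ignore-checksum my_minio_source: my_aws_destination:
Since --ignore-checksum skips the hash verification, a size-only comparison afterwards can serve as a basic sanity check:
rclone check --size-only my_minio_source: my_aws_destination: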
Again, thanks for your help, ncw.
I hit the same issue when trying to move SSE objects from AWS S3 to IBM COS. The workaround works, but it makes me nervous about data integrity, since the flags effectively disable the checking. Reading the issues, I see a possible fix is on the roadmap but not getting traction. One comment worried about the extra overhead of a HEAD call, but if it solves the data integrity problem, that overhead is worth it. If I understand it correctly, this workaround also adds overhead, since all objects are now uploaded as multipart.
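If the loss of checksum verification is the worry, one option is rclone check with the --download flag, which downloads both sides and compares the actual bytes rather than relying on stored hashes. It is slower since it re-reads everything, but it sidesteps the hash mismatch entirely (source: and dest: below are placeholders for your remotes):
rclone check --download source: dest: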