[HELP] Integrity Data Check: Compare ETAG/md5 hash local to s3 Bucket


#1

Hi there guys, I’m having some issues trying to compare hashes (MD5 or ETag) of local files with those in a remote S3 bucket.
My files are bigger than 5 GB, which is why I’m uploading them as multipart, but the flag:
--s3-chunk-size 100M
doesn’t seem to work in the current version (maybe it was only in a separate branch of the original version…).

I have tried the following command, since the uploaded files are bigger than 5 GB and every sync to S3 is slow.

rclone sync -vv --dump-headers --s3-chunk-size 100M origin_folder/ s3:s3-bucket/upload/

I’m on Windows with many, many files. I have found plenty of Python functions to achieve this, but no simple PowerShell script to automate it.

Experienced rclone users: how do you check the data integrity of a local file against the remote bucket in S3?

Thank you very much, folks!

Manu


#2

Hi there. Any help on this issue with ETags on multipart files?


#3

What version are you using? rclone -V should tell you.

You don’t need to set chunk size any more - rclone figures it out adaptively.

If you try the latest beta you will find that rclone adds metadata to chunked uploads so they retain their MD5SUM. Normally S3 doesn’t give chunked uploads an ETag which is an MD5.

You can then use rclone check to check the hashes between your local copy and the remote copy.
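For example, reusing the folder and bucket names from the command in the first post:

```shell
# Compare hashes between the local folder and the S3 copy.
# --one-way ignores files that exist only on the destination.
rclone check origin_folder/ s3:s3-bucket/upload/ --one-way -v
```

With the stored MD5 metadata from the beta, this works for multipart uploads as well.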


#4

The version installed on my Mac is: 1.3.8
But I’m having issues on Windows and Linux as well.
I’m going to try the beta version and see if multipart uploads add the MD5 hash.
Thanks @ncw Nick!