Files that are very large (over 500 GB) are taking a long time to upload to Wasabi, and that seems to be caused by the calculation of the MD5 checksum. It's literally taking longer to calculate the checksum than it does to upload the file. So, I am pondering the use of --s3-disable-checksum.
If I disable the MD5 checksum, what exactly am I losing? Does rclone exclusively use the checksum to ensure the file is not corrupted in transit? I would think TCP itself is already ensuring transit integrity, right?
So, I want to know the exact disadvantages of not using the MD5 checksum. I don't want to disable it and regret it, but I also need files uploaded faster than they currently are. I'm trying to determine if the MD5 checksums are needed for my use case.
Which OS you are using and how many bits (eg Windows 7, 64 bit)
Windows Server 2019 Standard, 64-bit
Which cloud storage system are you using? (eg Google Drive)
Wasabi
The command you were trying to run (eg rclone copy /tmp remote:tmp)
I use rclone to upload large Veeam backup files to Wasabi.
My local server runs the free Windows Server 2019 Hyper-V edition with the ReFS file system, which is soft-RAID.
It takes rclone much longer to calculate the MD5 checksum of the local file than to upload that file to Wasabi.
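For reference, the upload looks something like this (the source path, remote name, and bucket are placeholders for my setup, not the exact values):

```shell
rclone copy "D:\VeeamBackups" wasabi:backups --progress
```

The question is what I give up if I add --s3-disable-checksum to that command.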
As I understand it, there is only one way to make sure a file was uploaded correctly:

1. rclone calculates the checksum of a local file named 500GB.file.
2. rclone uploads 500GB.file to Wasabi.
3. Wasabi calculates the MD5 checksum of its copy of 500GB.file.
4. rclone compares the MD5 checksum of the local 500GB.file to the MD5 checksum of the corresponding 500GB.file in Wasabi.
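That compare step can be sketched with plain md5sum on two local files. This is an illustration of the principle only, not rclone's actual code: rclone compares the local hash against the hash stored in the object's metadata on Wasabi, and the file names here are made up.

```shell
# Illustration only: 'remote.file' stands in for the copy on Wasabi.
printf 'backup data' > local.file
cp local.file remote.file
local_sum=$(md5sum local.file | cut -d' ' -f1)
remote_sum=$(md5sum remote.file | cut -d' ' -f1)
[ "$local_sum" = "$remote_sum" ] && echo "upload verified"
```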
The biggest is about 1 TB. We get an average of about 95 MBytes/sec. Once the upload starts, it gets done quickly. But it spends the majority of the time calculating the MD5 checksum before it ever starts uploading.
I think the question I'd like to ask is: why is calculating the MD5SUM so slow? It should run roughly as fast as your disk can deliver data. On my laptop with an SSD, rclone md5sum can do about 500 MB/s.
Can you do some tests with rclone md5sum on big files and calculate how many MB/s they are doing?
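One way to separate raw hashing speed from disk speed (assuming GNU coreutils are available, e.g. under WSL on the Windows server) is to hash data straight from memory:

```shell
# Hash 256 MiB of zeros piped from memory, so disk throughput is not a
# factor; time this and compare against 'rclone md5sum' on a real file.
sum=$(dd if=/dev/zero bs=1M count=256 status=none | md5sum | cut -d' ' -f1)
echo "$sum"
```

If this is fast but rclone md5sum on a real file is slow, the bottleneck is the disk (or ReFS soft-RAID) rather than the hashing itself.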
If you do use --s3-disable-checksum, what you are missing is the metadata on the S3 object that holds the md5sum. For small objects S3 provides this as the ETag, but for large objects uploaded in chunks it doesn't, so rclone calculates it.
Rclone provides this at the start of the upload. If it wanted to add it at the end of the upload (which would save the delay, as the hash could be calculated while streaming), rclone would have to COPY the object to add the metadata, which takes time and costs money.
So --s3-disable-checksum only applies to large objects that are uploaded in chunks. Each chunk is uploaded with an SHA1 hash which S3 checks, so it is extremely unlikely that corruption could pass undetected in a multipart upload even with --s3-disable-checksum.
What you do lose with --s3-disable-checksum is the ability to do rclone check. Without the md5sum in the metadata, rclone can't find out the MD5SUM of an object, so it can't check it properly. This means you can't detect bitrot on your local disks, or bad RAM that flipped a few bits during the upload.
So if you just want to be sure that the upload was OK, then you can use --s3-disable-checksum just fine. However, for long-term archiving and full end-to-end checking, you want the checksum.
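What that end-to-end check buys you can be sketched locally. The file names are hypothetical, and the .md5 file stands in for the hash rclone stores in the object's metadata:

```shell
# Record the MD5 at backup time (stand-in for the S3 metadata).
printf 'archive contents' > archive.file
md5sum archive.file > archive.file.md5
# Time passes; simulate silent corruption (bitrot) of the local copy.
printf 'archive c0ntents' > archive.file
# Re-checking against the stored hash catches the corruption.
md5sum -c archive.file.md5 || echo "bitrot detected"
```

Without the stored hash there is nothing to re-check against later, which is exactly the rclone check capability you give up.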