Amazon Drive repeated "corrupted on transfer: MD5 hash differ"

rclone v1.35

I get uploads with these messages:
command used: rclone copy --stats 15m --transfers 4 --acd-upload-wait-per-gb 4m --max-size 50G --no-update-modtime --bwlimit 15M --no-gzip-encoding --checksum

2017/02/27 07:11:31 EXIPWLAE4HQHQ4I6RGTG34UD2JF6NUFCNL2EO3IOP6EWU2DTL43YRED/DL3IANDDZYONNNSPB2E66FWTR2QDSARCIY4KOZHJQQL34PVEQA252ZA/2NUQDUVDYNVB5CIKVHJ7H7JM5EDO4GOMYVSYGEA4HKEQ463I2GFSPDA65BL26IALIH5MYTM56IEMKIKE: corrupted on transfer: MD5 hash differ “b08356f5da8807b71b206105f62d7cd4” vs “b360f824c10b97638b96170bbfa806f3”

The upload results are always something like this:
2017/02/27 07:22:31
Transferred: 10.559 GBytes (10.004 MBytes/s)
Errors: 0
Checks: 187
Transferred: 88
Elapsed time: 18m0.8s

If I upload the same directory with --size-only I get:
command used: rclone copy --stats 15m --transfers 4 --acd-upload-wait-per-gb 4m --max-size 50G --no-update-modtime --bwlimit 15M --no-gzip-encoding --size-only

2017/02/27 07:50:05 Starting bandwidth limiter at 15MBytes/s
2017/02/27 07:50:08 amazon drive root ‘skynet-main/8OW72MIKQ9EY0G96QBCSSFMREX7PDVFS/NWYIE2A5VDCM2ZJ2GX4LDIERHKPVK’: Waiting for checks to finish
2017/02/27 07:50:08 amazon drive root ‘skynet-main/8OW72MIKQ9EY0G96QBCSSFMREX7PDVFS/NWYIE2A5VDCM2ZJ2GX4LDIERHKPVK’: Waiting for transfers to finish
2017/02/27 07:50:08
Transferred: 0 Bytes (0 Bytes/s)
Errors: 0
Checks: 99
Transferred: 0
Elapsed time: 2.7s

Then if I do the original upload again, I expect no errors and rclone skipping the previously uploaded files… but I get errors again.
command used: rclone copy --stats 15m --transfers 4 --acd-upload-wait-per-gb 4m --max-size 50G --no-update-modtime --bwlimit 15M --no-gzip-encoding --checksum
2017/02/27 07:52:48 3EIUAMOHHBPKRPPI6LACZGXU5A2T23G64KTDP77UZASEEMWSUJ4IWZB/I7BFB5AVNYGZEDIZEP2MYMXBKA4NK7TMVRPXTIGUS72H7UQJ6GDRIMD/576W3G4YXMHLLOTNG4KA43KZBUK5OIRFQVV6PZRXFSZDGPYRPQ7D4WRG3KCYBM7SQWWEL7HZJXAE5AO2: corrupted on transfer: MD5 hash differ “3314812dcf1a8ac36e8f293c9d78698b” vs “5dc25be2fa5799a5928114cb09f86973”

It might be different files each time that say “corrupted on transfer: MD5 hash differ” and get re-uploaded… but why is it saying files are corrupted on back-to-back uploads of the same directory and the same local files?

Is there anything I can change to make the file transfer work correctly?

Hello Shortbus,

Not sure whether it will fix your case, but the latest rclone betas have fixes for data corruption on ACD; check here: https://github.com/ncw/rclone/issues/999#issuecomment-281047967

Failing that, I would strongly suspect RAM hardware errors on your machine; if it doesn’t have ECC, that could very well be what is happening. If I were you, I would subject the machine to at least a 48-hour burn-in test; see https://github.com/ncw/rclone/issues/999#issuecomment-271189602 for a brief description of my burn-in procedure.

Cheers,

Durval.

Thank you, Durval. I will look at the beta code. The source files are coming off a FreeBSD platform with a ZFS filesystem. ECC RAM is present.

I used rclone v1.35-92-g18c75a8β, retried back-to-back uploads, and see the same issue.

The files flagged as corrupted seem to be different and random each time, with something like 10% of the files reported as “corrupted on transfer”.

I am looking for a trial cloud storage service or some way I can test locally. I am not saying this is an rclone issue, I am just reporting what I see. I obviously need more/better validation.

Hi Shortbus,

The source files are coming off a FreeBSD platform with a ZFS filesystem. ECC RAM is present.

I agree that this makes for a very reliable setup; therefore, local problems (RAM, etc.) are much less probable than I thought at first.

I used rclone v1.35-92-g18c75a8β, retried back-to-back uploads, and see the same issue.

This version, as pointed out by @ncw, should have the fix for the known ACD corruption problem. So either you have hit another one, or it’s a non-rclone issue (or perhaps even a non-issue, see below).

I am looking for a trial cloud storage service or some way I can test locally. I am not saying this is an rclone issue, I am just reporting what I see. I obviously need more/better validation.

The next step, regarding rclone, would be for you to try with the latest beta and the “-v -v --log-file=XXX” options and then post the parts around the errors, like you did above for the latest stable (but look around the error lines for other interesting/suspicious stuff).
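
Something like this, where the source directory, the acd: remote name and the log path are just examples to adapt to your own setup:

rclone copy -v -v --log-file=/var/log/rclone-acd.log --checksum /local/source acd:backup
# then pull out the context around each error for posting:
grep -B 5 -A 5 'corrupted on transfer' /var/log/rclone-acd.log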

Humrmrmr… another thing: I just noticed that you posted two stat blocks and that they both say “Errors: 0”. Perhaps the data at ACD is OK, and the message you posted is just some transient error which rclone was able to solve on its own. To check for that, I suggest you use “rclone copy” on the file logged with the error, to download it from the ACD to your local file system (with another name or to another dir, of course), and then compare it to your original file. If they are exactly the same, then this last hypothesis would hold.
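
For instance (FILE and the directory names below are just placeholders for the actual file that was flagged):

rclone copy acd:backup/FILE /tmp/acd-verify/
md5sum /tmp/acd-verify/FILE /local/source/FILE
# or compare byte for byte:
cmp /tmp/acd-verify/FILE /local/source/FILE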

Cheers,

Durval.

Durval, thank you for another reply. It is much appreciated.

One of the first things on my list of tests is to download the file and check the md5sum/sha1sum I get back. That will surely be good feedback.

I have been creating log files, but not with the double -v. I will do that. In the logs I have created I can see 4-7 failures out of 124 files in the test set. The odd thing is that a particular file will be fine on one ‘rclone copy’ and then fail two ‘rclone copy’ runs later.

I fully expect to discover some Amazon weirdness, but I need to trust the uploads and therefore need the validation.

Hi Shortbus,

thank you for another reply. It is much appreciated.

No problem, we are all here for each other! :slight_smile:

One of the first things on my list of tests is to download the file and check the md5sum/sha1sum I get back. That will surely be good feedback.

At the end of the day, IMHO, nothing beats a direct MD5 check.

What I do here, which has helped me iron out a lot of problems, is to locally generate a .md5 file with hashes for all files (recursively) in a directory, then upload it along with the files.

Later, when the upload is finished, I use “rclone mount” to make that directory available locally, then cd to that directory and use “md5sum -c” on that file to check everything.
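
Roughly like this, where the local directory, the mount point and the remote: name are just examples to adapt:

cd /local/source
find . -type f ! -name checksums.md5 -exec md5sum {} + > checksums.md5
rclone copy /local/source remote:backup
mkdir -p /mnt/rclone
rclone mount remote:backup /mnt/rclone &
cd /mnt/rclone && md5sum -c checksums.md5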

Please note that the double “-v” is needed only with the latest betas (after @ncw separated messages into DEBUG, INFO, ERROR, etc. “categories”), so it will also log the DEBUG ones.

Unrepeatable, intermittent crap like this is, IME, usually the cloud provider’s fault…

ACD is full of weirdnesses… the ones that bothered me the most before I moved to Google Drive are the dir/file name length limits (which you hit pretty fast if you are using rclone’s encryption) and the strange way it behaves with files over 20GB or so (sometimes failing, sometimes not, but failing more frequently the larger the file is, until it hits 40GB or so, at which point it fails all the time). But then there’s the useless modification time metadata, and lots of other things… I’m glad I’m done with them, and that I was able to find out about these issues during the trial (so no money wasted).

Good luck, and please keep us posted! :slight_smile:

Cheers,

Durval.

I will adopt your md5sum upload and rclone mount approach… I looked at Google Drive prices and I am not ready for $299 a month for 30TB. Dropbox seems like a better deal, but I am still not ready for those dollar amounts.

Hello Shortbus,

GDrive is really expensive for large amounts of storage, when you are using a standard account.

But you can get a “Business GSuite account” for $10/mo ($120/yr), and it features unlimited storage; see here: https://gsuite.google.com/pricing.html (they also offer a 14-day trial period).

Still double the $60/yr price of an unlimited Amazon Drive account, but for me it would have been worth it (I lucked out and got GDrive for free when I discovered that the University I attend here has an agreement with Google for unlimited accounts for all students, so I’m paying nothing – but in retrospect, $60 more per year to get rid of all the issues I was getting from Amazon Drive would have been well worth it).

Cheers,

Durval.

Durval, I think I am going to skip learning my own lessons through frustration. I am creating the GSuite account now. I will also add md5sum to my scripting and figure out rclone’s built-in encryption. Thanks for the homie hookup.

Hello @shortbus,

I think the unlimited Google Drive from GSuite will get you going faster and with fewer issues; at least it was that way for me. But I advise you to take advantage of their trial so you can be sure it really does it for you before you have to put up any money (and even then, the GSuite payment is monthly, so you can cancel later with minimal expenditure if it turns out not to be a good fit for your case).

And md5sum -c is a really good thing; it is not perfect yet the way I use it, because rclone mount gets some transient errors (under 0.1% in my tests) when you are dealing with lots of files, and then I have to go and check those errored files individually (also with md5sum) to make sure they are alright. Perhaps the new rclone cryptcheck is a better option going forward (I still have to test it), and of course the perfect, all-around solution would be to implement #637. But md5sum -c on the rclone mount works well enough for me right now; I’ve already verified a lot of TBs and many files using it.
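
For reference, the way I pick out the errored files for individual re-checking is roughly this (checksums.md5 being the hash file from the earlier sketch, and the paths just placeholders):

cd /mnt/rclone
md5sum -c checksums.md5 2>&1 | grep -v ': OK$' > /tmp/md5-failures.txt
# then re-check each listed file on its own, e.g.:
md5sum ./some/errored-file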

Good luck, and keep us posted on your progress.

Cheers,

Durval.

I am testing on GSuite.

Just a thought… Are your files being modified after the rclone sync starts? That will give that error.

Hello @ncw,

Ah… glad to know it. I always use rclone copying from read-only file-systems, so I would not know.

Anyway, not only rclone but no copying utility I’m aware of (including the classics, tar and cpio) would succeed in such a situation…

Cheers,

Durval.

ncw, Durval,

The files are not being accessed or modified… it is not a read-only file system, but I can make it one. Probably just the access time is changing as rclone does its thing. The files themselves are not changing.

I am going to try the built-in rclone crypt.

Previously, I was using EncFS and uploading the ciphered version. Using rclone crypt is cleaner in the long run, so I am rewriting my scripts and retesting. I will report back after changes and testing.
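
Roughly, the new flow will look something like this (the secret: crypt remote name and the paths are placeholders for whatever I end up configuring):

rclone config    # set up a crypt remote, e.g. secret:, wrapping the cloud remote
rclone copy /local/source secret:backup
rclone cryptcheck /local/source secret:backup    # verify the uploaded, encrypted copies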

Kind regards.

Hi @shortbus,

The files are not being accessed or modified… it is not a read-only file system, but I can make it one.

If you are entirely sure they aren’t being modified, there’s no need – but it could help to make sure they really aren’t.

Probably just the access time is changing as rclone does its thing. The files themselves are not changing.

Off-topic, but I would mention that, on ZFS, my experience is that setting atime to off on the datasets that are frequently scanned (for copying/syncing to the cloud, for example) avoids a lot of writes and improves performance on the pool as a whole (not directly related to your problem, of course, just a note).
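
Something like this, where tank/backups is just an example dataset name:

zfs set atime=off tank/backups
zfs get atime tank/backups    # confirm the property is now off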

Cheers,

Durval.

Hi @shortbus,

Previously, I was using EncFS and uploading the ciphered version.

This was standard fare until rclone implemented its own encryption. EncFS has given me some trouble over the years, so I avoid it as much as possible (I only started using rclone intensively after @ncw implemented its built-in encryption).

Using rclone crypt is cleaner in the long run, so I am rewriting my scripts and retesting. I will report back after changes and testing.

Even if not directly related to your issue (I do not think it is), in my opinion moving away from EncFS is going to pay you back handsomely in the long run: rclone’s encryption works very well, and some features (like the recently implemented --crypt-show-mapping option and the cryptcheck command) will obviously only work if you are using rclone’s built-in encryption.

I’m usually of the opinion that it’s better to pair different tools to handle different parts of a larger job, so each tool can be as specialized, small, simple, and efficient as possible. But regarding rclone’s built-in encryption vs rclone+EncFS, I made an exception.

Cheers,

Durval.

I have made this change, good advice, thanks.

I hope so; it does reduce the complexity by a considerable margin (which is always a beautiful thing).

I have the framework complete. I just have some real work to do before I can complete this project.