Corruption when copying to Amazon Drive

I’m copying about 200,000 photos from my Synology NAS to Amazon Cloud Drive. Most have uploaded just fine (about 1.1TB in the last week :)).

Today I noticed that some of the files were reported as corrupted:

2017/01/23 18:12:41 Wedding May 09/IMG_6858.JPG: corrupted on transfer: sizes differ 2046150 vs 108615
2017/01/23 18:12:45 Wedding May 09/IMG_6796.JPG: corrupted on transfer: sizes differ 3606963 vs 347388
2017/01/23 18:12:46 Wedding May 09/IMG_6740.JPG: corrupted on transfer: sizes differ 2687949 vs 153310
2017/01/23 18:12:47 Wedding May 09/IMG_6802.JPG: corrupted on transfer: sizes differ 3014577 vs 216327
2017/01/23 18:12:47 Wedding May 09/IMG_6832.JPG: corrupted on transfer: sizes differ 2211188 vs 125573
2017/01/23 18:12:49 Wedding May 09/IMG_6828.JPG: corrupted on transfer: sizes differ 2559260 vs 155067
2017/01/23 18:13:09 Wedding May 09/IMG_6845.JPG: corrupted on transfer: sizes differ 2595737 vs 166502

Any reason why that might happen? Where do I begin to look?

The rclone command I’m running is super-simple:

rclone copy /volume1/pictures amazon:Pictures --log-file=/var/services/homes/admin/rclone.log

Also, is it possible to have rclone produce a report of these issues, so when I’m running a huge sync job I can see them? I would have missed this if I hadn’t happened to have tailed the logfile at just the right moment.

Thanks!

I’ve not seen that with ACD before.

I’d probably check that those files can be read off the disk properly.
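A rough way to do that check is to read every file end to end and compare the byte count with what the filesystem reports. This is only a sketch in Python, not anything rclone does itself; the function name `find_unreadable` and the chunk size are my own choices, and the path in the usage comment is just the one from the command above.

```python
import os
from pathlib import Path


def find_unreadable(root, chunk_size=1 << 20):
    """Return (path, reason) pairs for files that cannot be read in full.

    Hypothetical helper: walks `root`, reads each file completely, and
    compares the number of bytes read with the size reported by stat().
    """
    problems = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                expected = path.stat().st_size
                read = 0
                with open(path, "rb") as fh:
                    while True:
                        chunk = fh.read(chunk_size)
                        if not chunk:
                            break
                        read += len(chunk)
                if read != expected:
                    problems.append(
                        (str(path), f"read {read} bytes, stat reports {expected}")
                    )
            except OSError as exc:
                # Disk or permission errors show up here.
                problems.append((str(path), f"I/O error: {exc}"))
    return problems


# Example use (path taken from the rclone command in this thread):
# for path, reason in find_unreadable("/volume1/pictures"):
#     print(f"{path}: {reason}")
```

An empty result only says the files are readable and have a stable size; it says nothing about whether the image data itself is intact.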

I’m currently sorting out the logging so these sorts of things will be easier to find.

You can use rclone check to gain confidence that your backup is correct. You can also always just run another sync and rclone will tidy up.

In fact, those “corrupted on transfer” errors will cause rclone to retry anyway, so rclone will likely sort it out for you if it isn’t a persistent error.

[quote]
I’m currently sorting out the logging so these sorts of things will be easier to find.
[/quote]

@ncw: this is great news! rclone output is sometimes hard to deal with.

@webreaper: what I do (until Nick gets us a version with better logging) is to run the following in another window, on the logfile:

tail -f LOGFILE | grep -B5 'Elapsed time: '

And then watch the "Errors: " line. Whenever it increments, I know I will have to go into LOGFILE between this and the previous timestamp and look for what happened. OTOH, reaching the end of your transfers with an “Errors: 0” line would indicate (barring cloud-side corruption) that your files uploaded OK.
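If you would rather get a list of the affected files after the fact, something along these lines could pull them out of the logfile. It is a minimal sketch that assumes the log lines look exactly like the ones posted in this thread; the names `SizeMismatch` and `find_size_mismatches` are made up for the example, and the two sizes are kept in the order the log prints them (which side is source and which is destination is not something this script knows).

```python
import re
from typing import NamedTuple


class SizeMismatch(NamedTuple):
    timestamp: str
    path: str
    first_size: int   # first number printed after "sizes differ"
    second_size: int  # second number printed after "vs"


# Assumed line shape, based only on the excerpts in this thread:
# 2017/01/23 18:12:41 <path>: corrupted on transfer: sizes differ A vs B
_PATTERN = re.compile(
    r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"(.+): corrupted on transfer: sizes differ (\d+) vs (\d+)\s*$"
)


def find_size_mismatches(lines):
    """Yield a SizeMismatch for every matching log line."""
    for line in lines:
        match = _PATTERN.match(line)
        if match:
            yield SizeMismatch(
                match.group(1),
                match.group(2),
                int(match.group(3)),
                int(match.group(4)),
            )


# Example use, with LOGFILE being whatever you passed to --log-file:
# with open(LOGFILE, encoding="utf-8", errors="replace") as fh:
#     for hit in find_size_mismatches(fh):
#         print(hit.timestamp, hit.path, hit.first_size, hit.second_size)
```

This only catches the "sizes differ" flavour of the message; other error lines would still need the "Errors: " counter or a manual look.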

Cheers,

Durval.

So far I have never had “corrupted on transfer” errors myself.

Thanks for the help, guys. It would be great if rclone could have more ‘traditional’ style logging for this kind of job; I’m less interested in the periodic transfer stats, and more interested in a log entry per transferred file or per error (with maybe a log entry for skipped files in verbose mode), including why the file was transferred (e.g., size changed/date changed/something else). That way I can skim the log file quickly and check everything’s working as expected.

One other thing, if you’re visiting the logging code: can you add an option to roll over the log file at the start of each new sync job? I’d like each sync run to have a separate log file, maybe purging after n log files have been created (so I could opt to keep the last, say, 10 logfiles). :)
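Until something like that exists, a small wrapper can approximate it by choosing a fresh log file name per run and deleting the oldest ones. This is just a sketch of the idea, not an rclone feature; `new_run_log`, the `rclone-` prefix and the timestamp format are all assumptions of mine. The returned path is meant to be handed to the existing --log-file option.

```python
import time
from pathlib import Path


def new_run_log(log_dir, keep=10, prefix="rclone-"):
    """Pick a log path for this run and purge old logs.

    Hypothetical helper: after it returns, at most `keep - 1` older logs
    remain in `log_dir`, so the new log brings the total to `keep`.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)

    stamp = time.strftime("%Y%m%d-%H%M%S")
    new_log = log_dir / f"{prefix}{stamp}.log"

    # Timestamped names sort chronologically, so the oldest come first.
    old_logs = sorted(p for p in log_dir.glob(f"{prefix}*.log") if p != new_log)
    excess = len(old_logs) - (keep - 1)
    for stale in old_logs[: max(excess, 0)]:
        stale.unlink()

    return new_log


# Example use with the copy command from this thread:
# import subprocess
# log = new_run_log("/var/services/homes/admin/rclone-logs", keep=10)
# subprocess.run(["rclone", "copy", "/volume1/pictures", "amazon:Pictures",
#                 f"--log-file={log}"])
```

The helper does not create the new file itself; rclone does that when it starts writing to the path.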

New here, just starting using rclone today, and it works exactly as I want so far except for this “corrupted on transfer”. This is the only thread I could find related to it.

I have my photographs on both OneDrive and Dropbox (and music only on OneDrive) at the moment, synced using the desktop apps and Windows file junctions/links. I’m moving to ACD because the OneDrive 1TB limit is apparently coming into force on March 1st, and the ACD desktop sync app is appalling.

Anyway, I am copying music from onedrive -> acd and getting lots of these errors for the small folder.jpg and cover.jpg images. I have tried on both Linux and Windows. I then tried, as a test, sync’ing just my photos from onedrive to dropbox to check for differences (mostly seem to be filename case sensitivity that the desktop apps on Windows ignored) but I still get these corrupted transfers on mostly smaller jpg and png files.

examples

$ rclone sync onedrive:Music amazon:Music ->

2017/02/24 09:36:25 Soundtracks/Crow; City of Angels, The/Folder.jpg: corrupted on transfer: sizes differ 13753 vs 10951

$ rclone sync onedrive:Pictures dropbox:Pictures ->

2017/02/24 09:41:51 2011/2011-04-19/IMG_20110419_224226.jpg: corrupted on transfer: sizes differ 35365 vs 35206
2017/02/24 09:41:53 2011/2011-04-19/IMG_20110419_224206.jpg: corrupted on transfer: sizes differ 35473 vs 35206
2017/02/24 09:41:54 2013/2013-03-29/Young Einstein.png: corrupted on transfer: sizes differ 1840 vs 1516

It appears to be small files only for me. Larger jpg and png files are all ok.

Apart from reporting this here, what can I provide / do to help?

Looks like the problem is in OneDrive; I found this answer from 2014:

http://stackoverflow.com/questions/26906007/onedrive-wrong-size-for-png-files

Ryan from OneDrive here. We looked into this and have a good understanding of what’s going wrong with the size. OneDrive computes the “space” a file takes up in our system by using the size of the largest data stream associated with a file. When an image is uploaded to OneDrive, we also create thumbnails for images so that we can quickly show various views in our clients and website.

In the case of this particular file, one of the JPG thumbnails we create for the PNG file is actually larger than the original file (due to the JPG compression not being as effective as PNG for this image). As a result, the thumbnail is actually the largest stream on the file. As you can imagine, that doesn’t happen very often, but for this image (and others like it) we have this bug.

We have a bug tracking the issue and are investigating how we can fix the API to return the size of the “default” stream – the stream that represents the actual contents of the file. I don’t have an ETA for the fix, but we’re working on it.
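To make that explanation concrete, here is a toy model of the behaviour as I understand it from the quote above. It is not OneDrive's code or API; the stream names and the byte counts are hypothetical, picked only to show how a thumbnail that is larger than the original would make the listed size disagree with what actually gets downloaded.

```python
def reported_size(streams):
    """Toy model: the listed size is the largest stream attached to the file."""
    return max(streams.values())


def content_size(streams, default_stream="content"):
    """Size of the stream holding the actual file contents."""
    return streams[default_stream]


# Hypothetical small PNG whose JPG thumbnail compresses worse than the original.
small_png = {"content": 1516, "jpg_thumbnail": 1840}

# Hypothetical large photo: the original is far bigger than any thumbnail.
large_jpg = {"content": 2046150, "jpg_thumbnail": 35206}

for name, streams in (("small_png", small_png), ("large_jpg", large_jpg)):
    listed = reported_size(streams)
    actual = content_size(streams)
    status = "mismatch" if listed != actual else "ok"
    print(f"{name}: listed {listed}, downloaded {actual} -> {status}")
```

If the model is right, it would also explain why only small images are affected: for large files the original is always the biggest stream, so the listed size and the content size agree.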

Sigh. I may just have to ignore size when transferring for now, but I hope this helps anyone else searching for this message.

Update: Just browsing the rclone codebase. I don’t know enough yet, but it may be enough to fetch the /content stream’s metadata for non-folder items in the onedrive.go file. If I get my act together and have time to dive in, I may give this a go at home over the weekend. For now, I’m just using --ignore-size for the copy/sync.

See this issue for the full story! If you can fix it properly that would be great.

https://github.com/ncw/rclone/issues/399

Thanks, they appear to be the same, yes. At some point, other time pressures allowing, I will try to get an rclone source tree ready and go play. That said, I am cancelling my OneDrive sub as they are now lowering storage limits against the original sales agreements…