Hello, I have found a possible bug when uploading photos from certain types of cameras to Google Photos using Rclone. Images from some cameras are regularly uploaded in a damaged format, often overwriting previous good versions on Google Photos.
I see this happening on some, but not all, photos from compact cameras and iPhones that I use or have used in the past. There are three types of corruption that I have seen:
Movement of the edge of the image vertically and horizontally, causing the image to appear cut in half.
Grey bars at the top or bottom of an image.
Total corruption, causing colorful rainbow-type effects.
I have attached an example of each. EDIT: As a new user I am not allowed to attach images, so I have added a Dropbox link instead.
Of all the images from a camera that Rclone uploaded, about 1 in 10 shows this corruption. When opened from the local disk, the image in question appears fine in both photo editing software and the file browser. When downloaded from Google Photos, the corruption is also present in the downloaded file.
The command that I am using to upload is as follows in crontab:
0 */20 * * * /usr/bin/rclone copy /mnt/cephfs/Fotomateriaal/ gphotos:album/Fotomateriaal --log-file=/opt/logs/rclone-upload-gphotos.log >/dev/null 2>&1
I have disabled this for the time being.
My file system is CephFS. Rclone runs in an LXC container on Proxmox 6. CephFS is accessed read-only via a bind mount to the host's CephFS kernel mount.
go version: go1.12.10
The camera models that have shown these types of corruption are:
Olympus TG-6 in jpeg mode
Olympus TG-6 in raw mode
Fujifilm Finepix F30
The examples can be seen here as I am not allowed to attach images:
There are a number of questions and concerns that I have:
What could be the cause of this? Why is Rclone overwriting good images on Google Photos? Have I misconfigured Rclone? How can I fix the already overwritten images on Google Photos?
Thanks in advance to anyone willing to lend their assistance.
Hello @ncw thanks a lot for your reply, I really appreciate the support!
I have a few other LXC containers with Rclone backing up other parts of our files from CephFS, in this case my business files. I have not seen similar corruption there. The corruption has also not appeared in any Canon raw files (CR2), even though those make up the bulk of my photos. This could be coincidental, of course.
The command I use in my other LXC containers (variations thereof):
0 */8 * * * rclone sync /mnt/cephfs/Google\ Drive/ cephfsbackup:drive --backup-dir cephfsbackup:drive-backup/`date -I` -v --transfers=20 --checkers=50 --min-age 30m --log-file=/opt/logs/rclone-upload-cephfs.log >/dev/null 2>&1
So that makes daily backups of changed files as well (which is a super useful and awesome feature!).
As you suggested, I have added original files from the file system in the dropbox folder:
Hopefully that can give some insight!
As for your question: Yes, I have copies of all my originals (of course) and I have several layers of backups of all these originals using a combination of Rclone syncs and Duplicacy to backup to various clouds. Rclone is a huge part of my copying and backup strategy and I really would not be able to do much without it!
Possibly unrelated: as I was browsing for images, I came across another oddity just now. The Rclone Google Photos upload seems to create many duplicate albums on Google Photos. I have an Ingress folder which contains to-be-processed images, and Rclone seems to have made dozens of albums with this name, each containing one (1) different image.
I will also run a -vv version of my copy command but I am slightly worried that it will overwrite more images.
Hello @ncw thanks again for your help and suggestions.
The first thing I tried was your idea of using md5sum to see if misreads occur. Over the past ~18 hours I manually ran the command at various points from three different places: the Proxmox host (straight to the kernel Ceph mount), the LXC where the Rclone Google Photos job lives, and another LXC that I use for uploading parts of CephFS to Google Drive. I took care to point them all at the same CephFS directory.
In all 15 of these cases, the MD5 sums came out identical.
However, this was done on a subfolder, and I worried slightly that the problem might only occur on large folders. So I also ran three checksums from the same three hosts against a much larger folder, just to make sure it was not somehow related to the number of files in a folder.
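For reference, the check from each host boiled down to something like this (a sketch, not the exact commands I ran; the path is an example from my setup, and it assumes GNU coreutils):

```shell
# From each host/LXC, build a sorted checksum list for the same folder,
# then diff the lists to spot any inconsistent reads.
cd /mnt/cephfs/Fotomateriaal/Ingress
find . -type f -print0 | sort -z | xargs -0 md5sum > /tmp/md5-$(hostname).txt
# After collecting the lists from all hosts:
#   diff /tmp/md5-hostA.txt /tmp/md5-hostB.txt && echo "reads are consistent"
```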
Yes, the CR2 files are around 20-30 MB each. Most JPEG files are much smaller, between 2-8 MB each. The ORF files are around 5-12 MB.
Well, I could point Rclone at my Ingress directory, which contains some 10,000 images at the moment, and sort Google Photos by 'recently added'. That is how I first discovered the corrupted images. I think this should show pretty quickly whether any new images are getting corrupted.
Ok, I will have it take a run on a specific folder tonight and see what kind of messages it gives me. Again, I really appreciate the help.
@ncw As you suggested, I set Rclone to upload using -vv, and this shows some errors. This time I am using the path gphotos:upload, and I left it running for two days. Here is an example of an image that got damaged:
2019/10/17 20:09:09 DEBUG : _7190137.ORF: >Update: err=failed to create media item: Quota exceeded for quota metric 'photoslibrary.googleapis.com/write_requests' and limit 'WritesPerMinutePerUser' of service 'photoslibrary.googleapis.com' for consumer 'project_number:202264815644'. (429 RESOURCE_EXHAUSTED)
2019/10/17 20:09:09 DEBUG : _7190137.ORF: Received error: failed to create media item: Quota exceeded for quota metric 'photoslibrary.googleapis.com/write_requests' and limit 'WritesPerMinutePerUser' of service 'photoslibrary.googleapis.com' for consumer 'project_number:202264815644'. (429 RESOURCE_EXHAUSTED) - low level retry 1/10
2019/10/17 20:09:09 DEBUG : Google Photos path "upload": Put: src=_7190137.ORF
2019/10/17 20:09:09 DEBUG : _7190137.ORF: Update: src=_7190137.ORF
* _7190137.ORF:154% /13.226M, 120.193k/s, -
2019/10/17 20:10:22 DEBUG : _7190137.ORF: >Update: err=<nil>
2019/10/17 20:10:22 DEBUG : _7190137.ORF: Size:
2019/10/17 20:10:22 DEBUG : _7190137.ORF: >Size:
2019/10/17 20:10:22 INFO : _7190137.ORF: Copied (new)
I have added the image to the Dropbox folder, both the version from Google Photos and the version pulled locally from the filesystem.
The same happens with JPEG images:
2019/10/17 18:45:58 DEBUG : P9211403.JPG: >Update: err=failed to create media item: Quota exceeded for quota metric 'photoslibrary.googleapis.com/write_requests' and limit 'WritesPerMinutePerUser' of service 'photoslibrary.googleapis.com' for consumer 'project_number:202264815644'. (429 RESOURCE_EXHAUSTED)
2019/10/17 18:45:58 DEBUG : P9211403.JPG: Received error: failed to create media item: Quota exceeded for quota metric 'photoslibrary.googleapis.com/write_requests' and limit 'WritesPerMinutePerUser' of service 'photoslibrary.googleapis.com' for consumer 'project_number:202264815644'. (429 RESOURCE_EXHAUSTED) - low level retry 1/10
2019/10/17 18:45:58 DEBUG : Google Photos path "upload": Put: src=P9211403.JPG
2019/10/17 18:45:58 DEBUG : P9211403.JPG: Update: src=P9211403.JPG
* P9211403.JPG:197% /5.469M, 85.343k/s, -
2019/10/17 18:47:10 DEBUG : P9211403.JPG: >Update: err=<nil>
2019/10/17 18:47:10 DEBUG : P9211403.JPG: Size:
2019/10/17 18:47:10 DEBUG : P9211403.JPG: >Size:
2019/10/17 18:47:10 INFO : P9211403.JPG: Copied (new)
This image contains people, so I have not added it to Dropbox, but I wanted to show you the log in case it helps.
In the past I have often done this when I wanted to quickly upload a small set of images: basically drag and drop into the web interface. To my knowledge, this has never led to this kind of corruption.
But the corruption is not present for most of the uploads. At the risk of confusing correlation with causation, there does appear to be some kind of link: of the 10 images with this error that I checked manually, 2 were corrupted.
I also checked for the >100% completed transfers; these show up a few times as well:
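For anyone wanting to scan their own debug log for the same symptoms, a grep sketch (the log file name is a placeholder; the regex matches progress percentages above 100, like the `:154%` and `:197%` lines quoted above):

```shell
# List transfers whose progress line went past 100%, e.g.
# "* _7190137.ORF:154% /13.226M, 120.193k/s, -"
grep -E ':(10[1-9]|1[1-9][0-9]|[2-9][0-9]{2})% /' rclone-gphotos.log

# Count the 429 quota errors in the same log
grep -c 'RESOURCE_EXHAUSTED' rclone-gphotos.log
```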
Is there any chance you could send me the complete log? If you don't want to make it public then email it to me firstname.lastname@example.org with a link to this forum page - thanks!
I can't see a mechanism for how retries could cause corruption unless something funny is happening at Google. Though I can't work out why the insert upload item isn't being retried, which I'm hoping the full logs will reveal!
When you upload to "gphotos:upload" it will duplicate things if you upload them again. I tried to make this clear in the docs but maybe didn't succeed!
The upload directory is for uploading files you don’t want to put into albums. This will be empty to start with and will contain the files you’ve uploaded for one rclone session only, becoming empty again when you restart rclone. The use case for this would be if you have a load of files you just want to once off dump into Google Photos. For repeated syncing, uploading to album will work better.
If you copy the same thing again and again, then you want to upload to an album, which prevents you from uploading the same thing twice.
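To illustrate with the paths from this thread (a sketch of the two destinations, not a recommendation of exact flags):

```shell
# One-off dump: files go to "upload"; running this twice uploads duplicates
rclone copy /mnt/cephfs/Fotomateriaal/Ingress gphotos:upload

# Repeated syncing: files in an album are matched by name, so a second run
# skips anything already uploaded
rclone copy /mnt/cephfs/Fotomateriaal/Ingress gphotos:album/Ingress
```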
Back to the corruption...
Could you try slowing down the upload rate a bit with --tpslimit 10 or maybe --tpslimit 1 to try to reduce the number of 429 errors? Then we can see if they really are correlated with the corruption?
No, this is clear; I only used gphotos:upload this time for the log. My crontab script uses albums.
What I do not understand is why it performed multiple copy operations, as I only ran the command for this log once (unless I am very mistaken, which is a possibility). It is certainly not your docs. The duplication was also seen in my regular crontab command, which uses albums.
But this is of lesser concern, the corruption is more problematic.
I will do that right away with an age limit (so it will not take another 2 days). The command will be /usr/bin/rclone copy /mnt/cephfs/Fotomateriaal/Ingress/ gphotos:upload -vv --log-file=upload-gphotos-again.log --max-age 2M --tpslimit 1
@ncw this is taking a long time, because I had not quite realized that two months would still cover quite a few images, and obviously it only transfers about one file per second now.
However! None of the files that have been transferred so far show any corruption!
The log still shows the 429 errors though: 57 so far.
But that does not really matter, I think, as long as images do not get damaged.
Hello @ncw, apologies for the late reply, I was away for a few days (taking pictures).
Your assumption is correct: I look at the 'recently added' search option on Google Photos and scroll down until I see corruption. It is not a very efficient way of doing it, but it is how I first discovered the corruption, and the really badly corrupted images are very easy to spot. The partly shifted images are similarly easy to find.