Rclone, Google Photos & duplicates

Hi all

I’m using rclone on my Raspberry Pi to get a very large photo library on my WD MyCloud NAS (mapped through SMB) onto Google Photos.

It’s working very well and the vast majority has uploaded successfully, but there are a few behaviours I’m confused about and haven’t been able to find answers for.

I’m using the copy command. It should be a one-off, one-way transfer so no need to use sync.

My queries are as follows:

  1. What does rclone do when a duplicate photo already exists on Google Photos (one that was itself previously uploaded by rclone)? My understanding of the copy command had been that rclone would scan the destination and, where a duplicate exists, skip uploading the photo. When I re-ran the copy command (after it was interrupted the first time round), at first it seemed that this was happening: the log showed a high number of ‘Checks’, which I assumed was confirmation of already uploaded files. But as the very large upload has continued, I’m seeing files being uploaded that I know were previously uploaded. Any filtering of duplicates would appear to be happening post-upload on Google’s side. This makes me nervous about re-running the copy command once this process is complete (to capture a small number of failures), because I don’t particularly want to have hundreds of GB pointlessly reuploaded.

  2. Perhaps a linked point: on this large upload, each time it nears completion, the number of files to upload increases significantly again, and the log appears to go back over folders it has previously dealt with and upload specific images I can verify are already on the Google platform.

I’ve attached a picture of the process when paused, so you can see my initial command and its current status.

Any thoughts appreciated.

There was a template that you deleted before posting that asks for information we require to assist you.

General info:
https://rclone.org/googlephotos/#duplicates

Thanks. I hoped the information and photograph above were sufficient, given I’m asking general questions. The process is ongoing, which I believe limits my ability to give all the required data.

I’ve read that link several times. I must admit it’s never made total sense to me. It appears to say that rclone will duplicate photos and that Google does the de-duplicating and that there is a minor naming issue as a result. So in respect of Google Photos, rclone doesn’t perform any skipping of duplicates prior to upload. Can you confirm this is the case?

Correct. With Google Photos it will re-upload the data and Google will just ignore already uploaded content. You can use something like --max-age or --min-age to reduce the amount of data transferred after that initial upload. The API doesn't allow rclone to detect that duplication itself.
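
As a rough sketch of the --max-age idea (not taken from the original poster's command): the source path /mnt/nas/photos and the remote name gphotos are assumptions, so substitute your own. The script only prints the command, so nothing is uploaded until you run the printed line yourself, and --dry-run is included so the first real run only reports what would be transferred.

```shell
#!/bin/sh
# Hypothetical names: adjust both to match your own setup.
SRC="/mnt/nas/photos"    # assumed mount point of the NAS share
DEST="gphotos:upload"    # assumed remote name "gphotos"

# Print a follow-up copy command that only considers source files
# modified within the given age (for example 7d = last seven days).
build_cmd() {
    echo rclone copy --max-age "$1" --dry-run "$SRC" "$DEST"
}

build_cmd 7d
```

Note that --max-age filters on the modification time of the source files, so this helps for photos added to the NAS after the initial upload. It will not single out older files that simply failed to transfer.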

OK, thanks very much. That seems to align with the behaviour I’m seeing!

Lastly, the fact that rclone seems to be going over folders again: could this be rclone attempting to “retry” failures? Or does the retrying of failures happen immediately following a failure?

Without looking at your log I'd be guessing. My guess would be yes.
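
For background on that guess: rclone's documented --retries flag (default 3) re-runs the entire copy when some transfers failed, while --low-level-retries retries an individual failed operation straight away. A whole-run retry could look like rclone going back over folders it has already handled, though without the log that remains an assumption. A minimal sketch of a single-pass run that also writes a log is below; the paths, remote name and log file name are hypothetical, and the script only prints the command.

```shell
#!/bin/sh
# Hypothetical names: adjust all three to match your own setup.
SRC="/mnt/nas/photos"    # assumed mount point of the NAS share
DEST="gphotos:upload"    # assumed remote name "gphotos"
LOG="rclone-copy.log"    # assumed log file name

# Print a copy command that makes one pass only (--retries 1 disables
# the whole-run retry) and records progress and errors in a log file.
retry_cmd() {
    echo rclone copy --retries 1 --log-file "$LOG" --log-level INFO "$SRC" "$DEST"
}

retry_cmd
```

With a log file in place, the failed files can be identified afterwards rather than guessed at.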
