Backing up Google Photos to Google Drive: missing files

So I'm trying to copy my Google Photos to Google Drive.

I still have the files from the time when Google did this sync itself. Comparing older years such as 2014 and 2010 against the folders rclone copied, many files are missing.

There is even a folder where the Google copy is much bigger than the rclone copy.

This is the command I ran:

rclone copy photos:media/by-year gdrive:Photos -v --stats 2s --transfers 30

My original 2014 folder has 340 files, and the rclone 2014 folder has only 337.

Can you find out what is missing?

Note that exporting through the Google Photos API reduces the quality of videos in particular - check the docs here: https://rclone.org/googlephotos/#limitations

I compared lots of random files. The files rclone downloaded and the original Google files are nearly the same size, differing by just a few kB.

How can I find out what is missing?

That size difference is probably the GPS data in the EXIF tags, which the API strips - see the link above.

The Google Photos API isn't capable of doing a full bit-identical backup which is really annoying. It is better than nothing, but a long way from perfect.

I'd do

rclone lsf -R --files-only drive:path/to/original/2014 | sort > orig
rclone lsf -R --files-only drive:path/to/new/2014 | sort > new
diff orig new

to see which files are missing
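If the raw diff output is hard to read, the two listings can also be split into "only in the original" and "only in the new copy" with comm. This is just a sketch: the function names are made up for this example, and it assumes orig and new are the listing files written by the commands above.

```shell
# compare_listings FLAG FILE1 FILE2
# FLAG is passed straight to comm:
#   -23 prints lines that are only in FILE1
#   -13 prints lines that are only in FILE2
# Both files are re-sorted in the C locale so comm sees a consistent order.
compare_listings() {
    flag=$1
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    LC_ALL=C sort "$2" > "$a"
    LC_ALL=C sort "$3" > "$b"
    LC_ALL=C comm "$flag" "$a" "$b"
    rm -f "$a" "$b"
}

# Names present in the original listing but absent from the new one.
only_in_orig() { compare_listings -23 "$1" "$2"; }

# Names present in the new listing but absent from the original one.
only_in_new() { compare_listings -13 "$1" "$2"; }
```

For example, only_in_orig orig new should list the files that never made it into the rclone copy.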

How do you explain my Google 2019 folder having 1910 files at 7 GB, while the 2019 folder rclone created has 2272 files but only 4.7 GB?

It's not image compression, since I compared a few files and the difference is not enough for that...

Do you want to see the output of my rclone lsf?

I would guess the size difference is made up by videos rather than images.

I don't know why there are more files though!

The output of the diff would be interesting
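One way to test the guess about videos would be to total the sizes per file extension in each folder and compare. A rough sketch, assuming the listing comes from rclone lsf with --format "sp" (size, then path, separated by ";"); size_by_extension is a name invented for this example.

```shell
# Reads "size;path" lines on stdin, as produced by
#   rclone lsf -R --files-only --format "sp" remote:path
# and prints one line per lower-cased file extension:
#   extension  file-count  total-bytes
size_by_extension() {
    awk '{
        i = index($0, ";")            # split at the first separator only
        if (i == 0) next              # skip lines without a size field
        size = substr($0, 1, i - 1)
        path = substr($0, i + 1)
        sub(/.*\//, "", path)         # keep only the file name
        n = split(path, parts, ".")
        ext = (n > 1) ? tolower(parts[n]) : "(none)"
        count[ext]++
        bytes[ext] += size
    }
    END {
        for (e in count) printf "%s %d %.0f\n", e, count[e], bytes[e]
    }' | LC_ALL=C sort
}
```

Piping the listing of each 2019 folder through size_by_extension and comparing the lines for the video extensions should show whether the missing gigabytes are in the videos.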

rclone lsf -R --files-only gdrive:"Google Photos/2019" | sort > orig
rclone lsf -R --files-only gdrive:Photos/2019 | sort > new
diff orig new > diff.txt

I'd upload the files here, but it seems you don't allow uploading .txt files to your Discourse instance. I think allowing it would be nice, as it would stop people uploading logs etc. to temporary links that will expire.

https://pastebin.com/K3NtXw4Z

Good idea - I didn't know you could allow that! I've allowed .txt and .log files to be uploaded

I can see some duplicates in there, so it might be worth trying rclone dedupe on the destination directory.
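To see which names are duplicated before running dedupe, a listing can be filtered down to the paths that occur more than once. A minimal sketch, assuming one path per line as rclone lsf prints them; duplicate_names is a made-up helper name.

```shell
# Reads a listing (one path per line, e.g. from `rclone lsf -R --files-only`)
# on stdin and prints each path that occurs more than once.
duplicate_names() {
    LC_ALL=C sort | uniq -d
}
```

For example, the output of rclone lsf -R --files-only gdrive:Photos/2019 could be piped into duplicate_names.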

That may be true, but I'm accessing the files via an rclone mount, so duplicated files should appear as just one file, right?

But the rclone 2019 folder still has more files than the Google 2019 folder! About 362 more.

I note in your original post you used rclone copy which won't delete any excess files in the destination. If you use rclone sync it will. Try first with --dry-run.

I have tried rclone dedupe gdrive: --dedupe-mode skip but it deleted fewer than 10 files. Why? I don't understand what's happening, and I need to be sure I can trust the photos remote to keep my photos & videos in sync with my desktop.

Absolutely!

Did you see this comment?

rclone sync photos:media/by-year gdrive:Photos -v --stats 2s --transfers 30 --fast-list

I ran this, but when I open the two 2019 folders on my mount using my file manager, I still see 300 more files than in the original folder.

The dedupe command didn't fix this, the sync didn't... I don't know what's happening here.

No, me neither!

Did the sync end with an error? If so it won't have done the deletions.

No. There weren't any errors with rclone sync or dedupe.

I wonder if we are talking at cross purposes here?

I think you are syncing gphotos -> gdrive, but you are comparing the number of files in gdrive:"Google Photos/2019" with gdrive:Photos/2019 - is that correct?

And what you are saying is that there are 300 more files in gdrive:Photos/2019 than in gdrive:"Google Photos/2019".

Is that correct? If so, that would mean that Google didn't sync some of your 2019 files. Since they turned off the syncing in July 2019, that doesn't seem too surprising.

OK that makes sense.

But the same is happening with my 2018 and 2017 folders.

Google Photos 2018 folder = 1383 files
Rclone 2018 folder = 1223 files

Google Photos 2017 folder = 2955 files
Rclone 2017 folder = 2078 files

Those folders appear to be the other way round with more in the "Google Photos" folders this time?

Did you delete stuff from Google Photos at any point? You could try using rclone check or diffing two rclone lsf to work out what is missing to see if that gives you any ideas.
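For the rclone check route, something along these lines might work. It is only a sketch: report_missing and the output file names are made up, and it assumes an rclone version that supports the --missing-on-src and --missing-on-dst flags.

```shell
# report_missing ORIGINAL NEW OUTDIR
# Runs `rclone check` between two remote paths and writes the names that are
# missing on each side to text files in OUTDIR (file names are illustrative).
report_missing() {
    original=$1
    new=$2
    outdir=$3
    mkdir -p "$outdir" || return 1
    # --missing-on-dst: present in ORIGINAL but absent from NEW
    # --missing-on-src: present in NEW but absent from ORIGINAL
    rclone check "$original" "$new" \
        --missing-on-dst "$outdir/only-in-original.txt" \
        --missing-on-src "$outdir/only-in-new.txt"
}
```

It could be called as report_missing gdrive:"Google Photos/2018" gdrive:Photos/2018 ./check-2018. Since the Google Photos API alters the files, expect check to also report files that exist on both sides as differing, so the two missing lists are the useful part.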