The Google Photos API doesn't let you check whether a file exists by size (let alone by checksum), as stated in the documentation:
"The Google Photos API does not return the size of media. This means that when syncing to Google Photos, rclone can only do a file existence check."
Even the file existence check is not 100% accurate: if the file was previously uploaded under another name, you would upload the whole file again, only to have it de-duplicated by Google Photos (the service stores files with the same binary data only once).
Since Google Photos de-duplicates files, all you really need is a way to check whether a file with the same binary data already exists in Google Photos. If it does, you don't need to upload it.
This seems to be pretty much impossible with the API, but it's exactly what the Google Photos web app does. When you upload a new file via the web interface, even a huge 1GB video, it only uploads the file if it doesn't already exist. If it does exist, the web app figures this out very quickly and moves on to the next file you're trying to upload.
I don't completely understand how it does this, but it seems to make a request containing some sort of hash of the file's data, and gets a response indicating whether the file exists or not. Here are some examples (where sde3Dddead+dRTOD343Dsfas3d would be the hash of the data; I haven't figured out how the web app's JS computes it):
Request:
Request Method: POST
Status Code: 200 (from ServiceWorker)
Referrer Policy: origin
Form Data:
f.req: [[["swbisb","[[\"sde3Dddead+dRTOD343Dsfas3d=\"],null,3]",null,"generic"]]]
Response when file exists:
[["wrb.fr","swbisb","[[[\"sde3Dddead+dRTOD343Dsfas3d\\u003d\",[\"XXXXXXXXXXXX\",[\"<Link to photo>\",1197,1596,null,null,null,null,null,[1197,1596,3]\n,[99999999]\n]\n,99999999,\"sde3Dddead+dRTOD343Dsfas3d\",-14400000,999999999999,null,null,2,null,null,null,null,null,999999]\n]\n]\n]\n",null,null,null,"generic"]
Response when file doesn't exist:
[["wrb.fr","swbisb","[]\n",null,null,null,"generic"]
As you can see, when the file exists you get the URL of the file in Google Photos (I had to replace it with <Link to photo> since this forum doesn't allow me to post links).
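For reference, here is a minimal Python sketch of how the f.req form value could be built and how the two responses could be told apart. The function names `build_f_req` and `file_exists` are made up for illustration, the hash is just the placeholder from the examples, and the sketch assumes the response body parses as a complete JSON array whose entries look like the ones above (any extra framing the real endpoint adds is not handled).

```python
import json


def build_f_req(file_hash: str, rpc_id: str = "swbisb") -> str:
    """Build the f.req form value seen in the request example."""
    # The inner arguments are JSON-encoded first, then embedded as a string.
    inner = json.dumps([[file_hash], None, 3], separators=(",", ":"))
    return json.dumps([[[rpc_id, inner, None, "generic"]]], separators=(",", ":"))


def file_exists(response_text: str, rpc_id: str = "swbisb") -> bool:
    """Return True if the response carries a non-empty payload for rpc_id."""
    for entry in json.loads(response_text):
        # Entries look like ["wrb.fr", rpc_id, "<JSON-encoded payload>", ...].
        if isinstance(entry, list) and len(entry) >= 3 and entry[:2] == ["wrb.fr", rpc_id]:
            return bool(json.loads(entry[2]))  # "[]\n" decodes to an empty list
    raise ValueError(f"no {rpc_id!r} entry found in response")


if __name__ == "__main__":
    print(build_f_req("sde3Dddead+dRTOD343Dsfas3d="))
    # Assumed complete form of the "file doesn't exist" response:
    print(file_exists(r'[["wrb.fr","swbisb","[]\n",null,null,null,"generic"]]'))
```

The payload is JSON nested inside JSON, which is why both functions encode or decode twice.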
I was thinking this could be a pretty interesting thing to investigate for sync purposes. It would really speed things up when syncing folders with Google Photos: all you need to do is make a request for each of your files, check existence, and then upload those that don't exist. I have tested this in the web app with thousands of existing files and it runs pretty quickly.
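The flow could look roughly like this in Python. Both `compute_key` (the still-unknown hash) and `exists_remote` (the existence request) are hypothetical callables supplied by the caller, so this only sketches the control flow, not a working client.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def files_to_upload(
    paths: Iterable[Path],
    compute_key: Callable[[Path], str],    # hypothetical: hash the web app sends
    exists_remote: Callable[[str], bool],  # hypothetical: one existence request
) -> List[Path]:
    """Return the files whose key is not already known to the remote."""
    missing = []
    for path in paths:
        key = compute_key(path)
        if not exists_remote(key):
            missing.append(path)  # only these would need a full upload
    return missing
```

Only the files returned here would be uploaded; everything else costs a single small request per file.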
Still, there are quite a few problems I would need to investigate:
- How to compute the hash from the file.
- How to make the request using the API token rclone has (instead of the web session I'm using for the requests shown earlier).
I'm willing to keep investigating this, but I would love some feedback on whether this looks promising or not (so I don't waste time if it doesn't). For example:
- Do you see some roadblock that would make this not work as I'm imagining it?
- Do you know of anyone who has tried this before and failed?
- Is this a feature that would be interesting for rclone?
Thanks!