Rclone dedupe change hashing function

Is it possible to change the default hashing algorithm that dedupe uses to something else, for example SHA1, and also allow us to specify which hashing function to use?

MD5 collisions are possible: two different pieces of data can produce the same MD5 hash.
https://www.mscs.dal.ca/~selinger/md5collision/

I’d personally like SHA1 to be the new hashing function, but collisions are possible with it too. https://shattered.io/

The last link mentions that safer alternatives, such as SHA-256 or SHA-3, should be used.

Google only keeps MD5SUMs on the objects, so if we don’t want to download them then we are stuck with MD5SUMs.

Note that these collisions are not as useful as you might think. You can generate two documents A and B which have the same hash, but if I give you a doc C it is computationally infeasible for you to generate a doc D which has the same hash as C. That means you can’t substitute arbitrary files in my Google Drive and keep the hashes the same.

Oh right, I see. I wasn’t aware of that; I thought rclone must have been making the hashes itself via a stream. How difficult and time-consuming would this be to implement, maybe also with the option of choosing the hashing algorithm?

On a different note, how easy or difficult is Go to learn for someone (me) with a loose background in programming, but for the web, such as PHP? Do you have any recommendations for books or other content to learn it?

If you want to check the integrity of the upload without using hashes, you can use rclone check with the --download flag.

It would be possible to add a --download flag to rclone dedupe too - however I don’t think it would get much use.

If you’ve used any C-derived languages (C, Java, C++, C#, JavaScript) then learning Go is pretty easy. By and large Go is a reasonably simple language, so it isn’t too hard to pick up. I usually recommend people start with the Go tour.